The Secret Life of Bugs

Reviewed by Greg Wilson / 2021-09-23
Keywords: Faults, Research Methods

Twelve years ago, Aranda2009 forced me to rethink a lot of what I thought about software engineering research. Its authors (one of whom later co-founded this site) looked closely at the histories of ten randomly selected bugs at Microsoft, comparing the electronic records (like version control commits and email messages) with detailed interviews of 26 people. Their conclusion:

"The most striking lesson from our cases is the deep unreliability of electronic traces, and particularly of the bug records, which were erroneous or misleading in seven of our ten cases, and incomplete and insufficient in every case. In fact, even considering all of the electronic traces of a bug that we could find (repositories, email conversations, meeting requests, specifications, document revisions, and organizational structure records), in every case but one the histories omitted important details about the bug." [emphases in the original]

I think we can learn a lot about software and the people who create it by mining and analyzing digital data. Since reading this paper, though, one of the first things I check when reviewing a data mining paper is whether the authors thought carefully about what that data didn't capture. If they don't address that issue (which unfortunately many don't), I'm much more skeptical about the paper's claims. If this site ever does turn into the evidence-based introduction to software engineering that I've wanted for over a decade, this paper will be one of the first required readings.

Aranda2009 Jorge Aranda and Gina Venolia: "The secret life of bugs: Going past the errors and omissions in software repositories". 2009 IEEE 31st International Conference on Software Engineering, 10.1109/icse.2009.5070530.

Every bug has a story behind it. The people that discover and resolve it need to coordinate, to get information from documents, tools, or other people, and to navigate through issues of accountability, ownership, and organizational structure. This paper reports on a field study of coordination activities around bug fixing that used a combination of case study research and a survey of software professionals. Results show that the histories of even simple bugs are strongly dependent on social, organizational, and technical knowledge that cannot be solely extracted through automation of electronic repositories, and that such automation provides incomplete and often erroneous accounts of coordination. The paper uses rich bug histories and survey results to identify common bug fixing coordination patterns and to provide implications for tool designers and researchers of coordination in software development.
