A case history analysis of software error cause-effect relationships

Reviewed by Geoff Cramer / 2022-02-28
Keywords: Development Process, Program Analysis

Where do software errors come from? Just over thirty years ago, Nakajo1991 analyzed approximately 700 software errors in four commercial measuring-control products and used a restricted version of fault tree analysis (FTA) to capture meaningful cause-effect failure patterns. The authors collected data at chosen observation points along the path from the cause of a software error to its effect; selected points are listed below:

  1. inappropriate management of work system
  2. work system flaw (process, individual, environmental flaws)
  3. human error
  4. program fault
  5. system failure

The paper's methodology provides a systematic way to trace software failures back to their ultimate causes and uses statistical analysis to draw conclusions about the rates of particular errors across projects. However, the study analyzed only two system products and two firmware products, and it was not able to examine the individual or environmental flaws in its own cause-effect process.

Nakajo1991 concluded that:

  1. Two major error patterns resulted from hardware or software interface misunderstandings.
  2. Two more common error sources emerged from system and module function misconceptions.
  3. Structured Analysis and Structured Design (SA/SD) methods were effective in mitigating the first two but not the last two sources of errors.

Thirty years is a long time in our industry, but this work shows that systematic analysis of errors, and of strategies to mitigate them, was possible long before the internet made "big data" about software available. The question the paper doesn't ask, but we should, is, "Why aren't analyses like these an everyday thing?"

Nakajo1991 Takeshi Nakajo and Hitoshi Kume: "A case history analysis of software error cause-effect relationships". IEEE Transactions on Software Engineering, 17(8), 1991, 10.1109/32.83917.

Approximately 700 errors in four commercial measuring-control software products were analyzed, and the cause-effect relationships of errors occurring during software development were identified. The analysis method used defined appropriate observation points along the path leading from cause to effect of a software error and gathered the corresponding data by analyzing each error using fault tree analysis. Each observation point's data were categorized, and the relationships between two adjoining points were summarized using a cross-indexing table. Four major error-occurrence mechanisms were identified; two are related to hardware and software interface specification misunderstandings, while the other two are related to system and module function misunderstandings. The effects of structured analysis and structured design methods on software errors were evaluated.
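
The cross-indexing step described in the abstract is simple enough to sketch in a few lines. The Python below is only an illustration of the idea, not the authors' tooling: the records, the category names, and the helper functions `cross_index` and `row_totals` are all invented for this example. Each record pairs the category seen at one observation point (the cause) with the category seen at the adjoining point (the effect).

```python
from collections import Counter

# Invented records for illustration only; the category names are
# assumptions, not the paper's classification scheme.
ERRORS = [
    ("interface misunderstanding", "wrong data format"),
    ("interface misunderstanding", "wrong data format"),
    ("interface misunderstanding", "missing condition"),
    ("function misunderstanding", "missing condition"),
    ("function misunderstanding", "missing condition"),
]


def cross_index(records):
    """Count each (cause, effect) pair observed at two adjoining points."""
    return Counter(records)


def row_totals(table):
    """Sum the counts per cause category (the table's row margins)."""
    totals = Counter()
    for (cause, _effect), count in table.items():
        totals[cause] += count
    return totals


if __name__ == "__main__":
    table = cross_index(ERRORS)
    for (cause, effect), count in sorted(table.items()):
        print(f"{cause:28} -> {effect:20} {count}")
```

Cells with large counts point at cause-effect pairs worth investigating first, which is roughly how a table like this would surface a dominant error mechanism.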