Abbreviated vs. Full-Word Identifier Names
Reviewed by Greg Wilson / 2021-08-09
Keywords: Programming Languages
Three studies from 2017 gave conflicting results about the readability of short and long names in programs. Scanniello2017 found no difference in effort, effectiveness, or efficiency when developers fixed faults in code that used abbreviated identifier names as opposed to full-word names. This is consistent with the findings of Beniamini2017, which looked specifically at single-letter variable names and found that controlled experiments failed to show a significant difference in programmers' ability to modify the code.
However, Hofmeister2017 found that developers located defects faster when identifiers were full words than when they were single letters or abbreviations, and found no significant difference between the latter two. The difference was still fairly small (19%), so on balance this issue probably falls under the heading of "not worth worrying about".
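To make the three naming styles concrete, here is a minimal sketch of one small function written three ways. The function and every name in it are hypothetical illustrations chosen for this note; none of this code comes from the studies, which used C, Java, and C# rather than Python.

```python
# Hypothetical illustration of the three naming styles compared in
# Hofmeister2017 (letters, abbreviations, words). Not taken from any study.

def f(a, t):
    # Style 1: single-letter names.
    c = 0
    for x in a:
        if x > t:
            c += 1
    return c


def cnt_abv(vals, thr):
    # Style 2: abbreviated names.
    cnt = 0
    for val in vals:
        if val > thr:
            cnt += 1
    return cnt


def count_above(values, threshold):
    # Style 3: full-word names.
    count = 0
    for value in values:
        if value > threshold:
            count += 1
    return count
```

All three versions compute the same result; the only thing that varies is how much of the intent a reader has to reconstruct from context, which is the variable these studies tried to measure.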
Scanniello2017 Giuseppe Scanniello, Michele Risi, Porfirio Tramontana, and Simone Romano: "Fixing Faults in C and Java Source Code". ACM Transactions on Software Engineering and Methodology, 26(2), 2017, 10.1145/3104029.
We carried out a family of controlled experiments to investigate whether the use of abbreviated identifier names, with respect to full-word identifier names, affects fault fixing in C and Java source code. This family consists of an original (or baseline) controlled experiment and three replications. We involved 100 participants with different backgrounds and experiences in total. Overall results suggested that there is no difference in terms of effort, effectiveness, and efficiency to fix faults, when source code contains either only abbreviated or only full-word identifier names. We also conducted a qualitative study to understand the values, beliefs, and assumptions that inform and shape fault fixing when identifier names are either abbreviated or full-word. We involved in this qualitative study six professional developers with 1–3 years of work experience. A number of insights emerged from this qualitative study and can be considered a useful complement to the quantitative results from our family of experiments. One of the most interesting insights is that developers, when working on source code with abbreviated identifier names, adopt a more methodical approach to identify and fix faults by extending their focus point and only in a few cases do they expand abbreviated identifiers.
Beniamini2017 Gal Beniamini, Sarah Gingichashvili, Alon Klein Orbach, and Dror G. Feitelson: "Meaningful Identifier Names: The Case of Single-Letter Variables". 2017 IEEE/ACM 25th International Conference on Program Comprehension (ICPC), 10.1109/icpc.2017.18.
It is widely accepted that variable names in computer programs should be meaningful, and that this aids program comprehension. "Meaningful" is commonly interpreted as favoring long descriptive names. However, there is at least some use of short and even single-letter names: using i in loops is very common, and we show (by extracting variable names from 1000 popular GitHub projects in 5 languages) that some other letters are also widely used. In addition, controlled experiments with different versions of the same functions (specifically, different variable names) failed to show significant differences in ability to modify the code. Finally, an online survey showed that certain letters are strongly associated with certain types and meanings. This implies that a single letter can in fact convey meaning. The conclusion from all this is that single letter variables can indeed be used beneficially in certain cases, leading to more concise code.
Hofmeister2017 Johannes Hofmeister, Janet Siegmund, and Daniel V. Holt: "Shorter identifier names take longer to comprehend". 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), 10.1109/saner.2017.7884623.
Developers spend the majority of their time comprehending code, a process in which identifier names play a key role. Although many identifier naming styles exist, they often lack an empirical basis and it is not quite clear whether short or long identifier names facilitate comprehension. In this paper, we investigate the effect of different identifier naming styles (letters, abbreviations, words) on program comprehension, and whether these effects arise because of their length or their semantics. We conducted an experimental study with 72 professional C# developers, who looked for defects in source-code snippets. We used a within-subjects design, such that each developer saw all three versions of identifier naming styles and we measured the time it took them to find a defect. We found that words lead to, on average, 19% faster comprehension speed compared to letters and abbreviations, but we did not find a significant difference in speed between letters and abbreviations. The results of our study suggest that defects in code are more difficult to detect when code contains only letters and abbreviations. Words as identifier names facilitate program comprehension and can help to save costs and improve software quality.