Automatically Assessing Method Names
Reviewed by Greg Wilson / 2023-03-15
Keywords: Natural Language, Programming Style
Amidst the excitement about using large language models to generate code,
it's easy to lose sight of all the other ways that
the things programmers have built can be used to make programming better.
One example is this work,
which looks at whether we can use natural language processing
to assess the quality of method names.
The authors collected ten rules (shown in the table below)
and used them to score names such as setIconItemStatus()
from several software projects.
(That name gets a score of 10 out of 10, by the way.)
The authors recognize that the rules are not fully objective (for example, they are split on whether the first letter after an acronym should be capitalized),
and that automatic tools sometimes struggle with grammatical ambiguities
such as words that can be both nouns and verbs.
Even so, their work points the way toward a new generation of code-checking tools.
| # | Standard Name | Rule |
|---|---|---|
| 1 | Naming Style | A single standard naming style is used. |
| 2 | Grammatical Structure | If there are multiple words, they form a grammatically correct sentence structure. |
| 3 | Verb Phrase | It is a verb or a verb phrase. |
| 4 | Dictionary Terms | Only natural language dictionary words and/or familiar/domain-relevant terms are used. |
| 5 | Full Words | Full words are used rather than a single letter. |
| 6 | Idioms and Slang | It does not contain personal expressions, idioms, or slang. |
| 7 | Abbreviations | It only contains known or standard abbreviated terms. All abbreviations are well known or part of the problem domain. |
| 8 | Acronyms | It only contains standard acronyms. All acronyms are well known or part of the problem domain. |
| 9 | Prefix/Suffix | It does not contain a prefix/suffix that is a term from the system. This standard does not apply to languages such as C that do not have namespaces. |
| 10 | Length | Maximum number of words is no greater than 7. |
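To make the scoring concrete, here is a minimal sketch of how a checker might split setIconItemStatus() into words and test a few of the standards (naming style, full words, dictionary terms, and length). The splitting regex, the helper names, and the tiny word list are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' tool) of checking a few of the standards.
import re

# Stand-in for the dictionary/domain-term lookup the paper describes.
KNOWN_WORDS = {"set", "icon", "item", "status", "get", "name", "value"}

def split_camel_case(name: str) -> list[str]:
    """Split an identifier such as setIconItemStatus into its words."""
    return [w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)]

def check_standards(name: str) -> dict[str, bool]:
    words = split_camel_case(name)
    return {
        "naming_style": bool(re.fullmatch(r"[a-z]+(?:[A-Z][a-z0-9]*)*", name)),  # one style: camelCase
        "full_words": all(len(w) > 1 for w in words),                            # no single-letter words
        "dictionary_terms": all(w in KNOWN_WORDS for w in words),                # dictionary/domain lookup
        "length": len(words) <= 7,                                               # at most seven words
    }

print(split_camel_case("setIconItemStatus"))   # ['set', 'icon', 'item', 'status']
print(check_standards("setIconItemStatus"))    # all four checks pass
```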
Reem S. Alsuhaibani, Christian D. Newman, Michael J. Decker, Michael L. Collard, and Jonathan I. Maletic. An approach to automatically assess method names. In Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, May 2022. doi:10.1145/3524610.3527780.
An approach is presented to automatically assess the quality of method names by providing a score and feedback. The approach implements ten method naming standards to evaluate the names. The naming standards are taken from work that validated the standards via a large survey of software professionals. Natural language processing techniques such as part-of-speech tagging, identifier splitting, and dictionary lookup are required to implement the standards. The approach is evaluated by first manually constructing a large golden set of method names. Each method name is rated by several developers and labeled as conforming to each standard or not. These ratings allow for comparing the results of the approach against expert assessment. Additionally, the approach is applied to several systems and the results are manually inspected for accuracy.
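The abstract mentions part-of-speech tagging as one of the techniques needed to implement the standards. As a rough illustration rather than the authors' pipeline, here is how the verb-phrase standard (rule 3) might be approximated with an off-the-shelf tagger; the use of NLTK and the function name are assumptions made for the sketch.

```python
# A rough sketch (not the authors' pipeline) of the "verb phrase" check using an
# off-the-shelf part-of-speech tagger. Assumes NLTK is installed along with its
# tagger data, e.g. nltk.download('averaged_perceptron_tagger').
import nltk

def starts_with_verb(words: list[str]) -> bool:
    """Check whether the first word of a split method name is tagged as a verb."""
    tags = nltk.pos_tag(words)            # list of (word, part-of-speech) pairs
    return tags[0][1].startswith("VB")    # VB, VBD, VBG, ... all count as verbs

# "set" can be tagged as either a noun or a verb depending on context,
# which is exactly the kind of ambiguity the review notes above.
print(starts_with_verb(["set", "icon", "item", "status"]))
```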