It Will Never Work in Theory

Short summaries of recent results in empirical software engineering research

2021-10-24: What's Missing from 'The Missing README'
Keywords: Editorial, Professional Ethics, Social Responsibility
Reviewed by: Greg Wilson

As we noted in our review of The Tech Worker Handbook, software companies have done a lot of damage to society. From automating discrimination and eroding privacy to spreading disinformation and fuelling racial hatred, those companies have amplified old problems and created new ones. But companies are legal fictions: responsibility actually lies with the executives who run those companies, and with the programmers who choose to write software for them. One reason for this is that programming isn't a profession like nursing, accounting, or law. You can't just call yourself a nurse, an accountant, or a lawyer: instead, you must prove...

2021-10-21: The Impact of Sleep Deprivation
Keywords: Productivity, Sleep Deprivation
Reviewed by: Greg Wilson

Last month we wrote that we wouldn't review any more papers on test-driven development, but Fucci2020 isn't really about TDD. Instead, the authors measured how well students wrote tests in order to gauge the effects of going a night without sleep. By comparing those who slept with those who didn't, they found that a single sleepless night reduced code quality by 50%. This is consistent with what we know from a century of other studies (see here for a short summary, and here for a shorter one); I don't expect companies or universities will suddenly start paying attention to the...

2021-10-20: The Tech Worker Handbook
Keywords: Editorial, Professional Ethics, Social Responsibility
Reviewed by: Greg Wilson

The bigger tech companies get, the harder they work to avoid responsibility for the harm they cause. From trying to cover up sexual assault, to keeping alt-right hate sites in business, to stirring up genocidal hatred, the executives who run tech companies work overtime to avoid facing consequences for their actions. To date, most elected officials seem bewildered by the scale and speed of the problem; when companies are held accountable, it's often because a whistleblower with inside knowledge has come forward, but people who do that often face sustained harassment or worse. The Tech Worker Handbook is...

2021-10-19: What's Wrong With my Benchmark Results?
Keywords: Benchmarking
Reviewed by: Greg Wilson

Years ago, I briefly worked with a team that cared a lot about software performance. They had a comprehensive set of unit tests to check the correctness of their software, but unlike any other team I'd ever worked with, they automatically recorded the runtime of each of those tests so that they could spot performance regressions right away. It was a clever idea, and they'd put a lot of work into it, but unfortunately it didn't save them from releasing a new version of their code that ran half as fast for their users as the previous version. Their unit...
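The idea of recording each test's runtime to spot performance regressions can be sketched in a few lines. This is a minimal illustration only, not the team's actual tooling: the function names `time_call` and `find_regressions` and the 20% tolerance are assumptions made for the example.

```python
import time


def time_call(func, repeats=5):
    """Return the fastest wall-clock time (in seconds) over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def find_regressions(baseline, current, tolerance=0.2):
    """Return the names of tests that got slower than the baseline allows.

    `baseline` and `current` map test names to runtimes in seconds.
    A test counts as a regression if its current runtime exceeds the
    baseline by more than `tolerance` (a fraction; 0.2 is an assumed default).
    """
    return sorted(
        name
        for name, seconds in current.items()
        if name in baseline and seconds > baseline[name] * (1 + tolerance)
    )
```

A check like this only sees what the tests exercise, so it can pass while real workloads slow down, which is consistent with the story above.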

2021-10-19: Restarted and Flaky Builds on Travis CI
Keywords: Continuous Integration
Reviewed by: Greg Wilson

As noted yesterday, the spread of continuous integration (CI) has changed software development just as much as reliance on Q&A sites. The study of failing and restarted CI jobs reported in Durieux2020 gives us yet more insight into how it actually works: More mature and more complex projects are more likely to include restarted builds. Builds are mostly restarted because of a failing test, network problem, or Travis CI limitation such as execution timeout. In over half of the restarted builds, the developers analyze and restart a build within an hour of the initial build execution, which suggests developers interrupt...

2021-10-18: Bad Practices in Continuous Integration
Keywords: Continuous Integration
Reviewed by: Greg Wilson

The slow but steady normalization of continuous integration (CI) has changed software development just as profoundly as reliance on Q&A sites: by the time I have merged a pull request into the main branch of one of the projects I'm working on, a dozen different checks and actions have run automatically, at least half of which could reject the merge. I believe this automation has made developers more productive, but like any tool it can be used badly, so Zampetti2020 used interviews and mined Stack Overflow posts to find out how. Their complete catalog divides 79 distinct smells into seven...

2021-10-17: Demystifying 'Bad' Error Messages in Data Science Libraries
Keywords: Data Science, Error Messages
Reviewed by: Eddie Antonio Santos

As a computing educator and an occasional software library developer, I often feel that a big portion of novice frustration when coding can be resolved by developers writing "better" error messages. Tao2021 provides evidence that dashes my dreams, but also provides actionable suggestions. The authors mined errors that were raised by six popular Python data science libraries: NumPy, pandas, SciPy, scikit-learn, TensorFlow, and Gensim. With this large list of possible errors, they found possible fixes by doing what most people I know would do: search for that error message on Stack Overflow. They categorize each error message as being either clear, uninformative,...

2021-10-16: Open Source Projects in Baidu, Alibaba, and Tencent
Keywords: Diversity, Open Source
Reviewed by: Greg Wilson

You don't realize what barriers people face if you've never had to get over them. I used to tell people that open source was a great leveller because everyone could contribute. I now realize that claim needs to be qualified with, "…provided they're affluent enough to have free time and good internet connectivity, speak English well enough to join a mostly monolingual conversation, and if they're not white and/or not male, willing to put up with a steady drizzle of disparagement or harassment." Han2021 is therefore a very welcome look at some of what's happening outside my bubble. As the...

2021-10-15: Authorship Attribution of Source Code
Keywords: Authorship, Machine Learning
Reviewed by: Greg Wilson

The question, "Who actually wrote this code?" comes up in many contexts, from plagiarism detection in schoolwork to design recovery in legacy systems. Bogomolov2021 presents two machine learning approaches to the problem using neural networks and random forests. Unlike most earlier work, these models operate on paths through the source code's abstract syntax tree (AST). The authors find that: their random forest approach outperforms the previous best result on C++, it matches the best performance of previous systems on Python, and both of their approaches outperform previous results on Java. I have reservations about how eagerly and uncritically some researchers...
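To give a feel for what "paths through the AST" means, here is a simplified sketch that lists root-to-leaf paths of node types using Python's standard `ast` module. It is an illustration under stated assumptions, not the representation used in Bogomolov2021: the function name `root_to_leaf_paths` is hypothetical, and the paper's models may define paths differently.

```python
import ast


def root_to_leaf_paths(source):
    """Return every root-to-leaf path in the AST of `source`.

    Each path is a tuple of node type names, e.g. ("Module", "Assign", ...).
    """
    tree = ast.parse(source)
    paths = []

    def walk(node, prefix):
        prefix = prefix + [type(node).__name__]
        children = list(ast.iter_child_nodes(node))
        if not children:
            # A node with no child nodes is a leaf: record the full path.
            paths.append(tuple(prefix))
        for child in children:
            walk(child, prefix)

    walk(tree, [])
    return paths
```

Counting how often each path occurs in a file yields a feature vector that a classifier such as a random forest could consume.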

2021-10-14: Exploring Programmers' API Learning Processes
Keywords: Cognition
Reviewed by: Greg Wilson

Today was my eighth day in my new job, and I have already had to come to grips with the APIs of half a dozen packages and web services. Figuring out what's available to call and what it will do is central to modern programming, so any research that helps us do it more efficiently is very welcome. Gao2020 is a preliminary observational study designed to help create a theoretical framework for that task. It draws on cognitive load theory, information foraging theory, and research into external memory (i.e., the ways in which people jot things down, draw sketches, and otherwise...