Denae Ford, Justin Smith, Philip J. Guo, and Chris Parnin: "Paradise Unplugged: Identifying Barriers for Female Participation on Stack Overflow". FSE'16, https://denaeford.wordpress.com/2016/07/20/paradise-unplugged-barriers-to-stack-overflow-use/. It is no secret that females engage less in programming fields than males. However, in online communities, such as Stack Overflow, this gender gap is even more extreme: only 5.8% of contributors are female. In this paper, we use a mixed-methods approach to identify contribution barriers females face in online communities. Through 22 semi-structured interviews with a spectrum of female users ranging from non-contributors to a top 100 ranked user of all time, we identified 14 barriers preventing...
Five From ICER'16
These papers were all presented at the 12th Annual International Computing Education Research conference in Melbourne earlier this month, and give a good sense of what CS education researchers are looking at and what they're finding. Elizabeth Patitsas, Jesse Berlin, Michelle Craig, and Steve Easterbrook: "Evidence That Computer Science Grades Are Not Bimodal", 10.1145/2960310.2960312. Although it has never been rigorously demonstrated, there is a common belief that CS grades are bimodal. We statistically analyzed 778 distributions of final course grades from a large research university, and found only 5.8% of the distributions passed tests of multimodality. We then devised a...
I noted earlier this week that the ACM was making papers from this year's International Computing Education Research conference freely available—but only for two weeks, which isn't what anyone else means by "open access" (and is frankly ridiculous). I'm therefore grateful to Neil Brown for pointing out bullet #4 on this page, which says that authors and owners permanently hold the right to: Post the Accepted Version of the Work on (1) the Author's home page, (2) the Owner's institutional repository, (3) any repository legally mandated by an agency funding the research on which the Work is based, and (4)...
You Keep Using That Word...
Keywords: Opinion, Open Access
We decided in 2012 that we would only review material that is openly available. I was therefore pleased to discover earlier this week that I could actually download papers from this year's International Computing Education Research conference from the ACM's site. ...until I read that "#ICER2016 papers are open access for next two weeks" (emphasis added). If that's true, it's a real shame: a lot of very cool things are being presented at ICER that deserve to be more widely known, but there's no point posting links that are going to 403 by the time most people outside the Great...
Jonathan L. Krein, Lutz Prechelt, Natalia Juristo, Aziz Nanthaamornphong, Jeffrey C. Carver, Sira Vegas, Charles D. Knutson, Kevin D. Seppi, and Dennis L. Eggett: "A Multi-Site Joint Replication of a Design Patterns Experiment Using Moderator Variables to Generalize Across Contexts". IEEE Trans. Software Engineering, 42(4), April 2016, 10.1109/TSE.2015.2488625. Context. Several empirical studies have explored the benefits of software design patterns, but their collective results are highly inconsistent. Resolving the inconsistencies requires investigating moderators—i.e., variables that cause an effect to differ across contexts. Objectives. Replicate a design patterns experiment at multiple sites and identify sufficient moderators to generalize the results across...
Helen Sharp, Yvonne Dittrich, and Cleidson R.B. de Souza: "The Role of Ethnographic Studies in Empirical Software Engineering". IEEE Trans. Software Engineering, 42(8), August 2016, 10.1109/TSE.2016.2519887. Ethnography is a qualitative research method used to study people and cultures. It is largely adopted in disciplines outside software engineering, including different areas of computer science. Ethnography can provide an in-depth understanding of the socio-technological realities surrounding everyday software development practice, i.e., it can help to uncover not only what practitioners do, but also why they do it. Despite its potential, ethnography has not been widely adopted by empirical software engineering researchers, and...
Felienne Hermans and Efthimia Aivaloglou: "Do Code Smells Hamper Novice Programming?" TUD-SERG-2016-006, 2016. Recently, block-based programming languages like Alice, Scratch and Blockly have become popular tools for programming education. There is substantial research showing that block-based languages are suitable for early programming education. But can block-based programs be smelly too? And does that matter to learners? In this paper we explore the code smells metaphor in the context of block-based programming language Scratch. We conduct a controlled experiment with 61 novice Scratch programmers, in which we divided the novices into three groups. One third receive a non-smelly program, while the...
Perspectives on Data Science for Software Engineering presents the best practices of seasoned data miners in software engineering. Its goal is to transfer the knowledge of experts from seasoned software engineers and data scientists to newcomers in the field. While there are many books covering data mining and software engineering basics, they present only the fundamentals and lack the perspective that comes from real-world experience. This book offers unique insights into the wisdom of the community’s leaders. Ideas are presented in digestible chapters designed to be applicable across many domains. Topics included cover data collection, data sharing, data mining, and...
You are invited to participate in a survey on software licensing designed to investigate how well software developers understand common open source software licenses. We are looking for software developers that have built or are currently building on open source software in their projects (and I am personally interested in hearing from people building open source software for research). The study is being conducted by Prof. Gail Murphy (email@example.com) and graduate student Daniel Almeida (firstname.lastname@example.org); participating in the anonymous online survey will take approximately 30 minutes. If you are interested in participating, please go to: https://survey.ubc.ca/surveys/danielalmeida/software-licensing-survey/ If you have any...
Have you read any empirical software engineering research papers recently
that you think a wider audience would enjoy?
If so, please send us pointers:
we'd be happy to feature them.
(But please note that we only discuss work that is openly available—nothing paywalled, please.)
An Interview with Andreas Stefik
Keywords: Programming Languages
Functional Geekery's interview with Andreas Stefik
is a great summary of what we actually know about the usability of programming languages—it's worth listening to the whole thing,
especially the detailed discussion of studies of statically and dynamically typed languages.
Polymorphism in Python
Keywords: Programming Languages
Beatrice Åkerblom and Tobias Wrigstad: "Measuring Polymorphism in Python Programs". SPLASH'15, October 2015, https://people.dsv.su.se/~beatrice/python/dls15_large_images.pdf. In a break from our usual practice of quoting abstracts, here are this paper's conclusions: Our results show that while Python’s dynamic typing allows unbounded polymorphism, Python programs are predominantly monomorphic, that is, variables only hold values of a single type. This is true for program start-up and normal runtime, in library code and in program-specific code. Nevertheless, most programs have a few places which are megamorphic, meaning that variables in those places contain values of many different types at different times or in different contexts....
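To make the terminology in these conclusions concrete, here is a minimal sketch that labels a program location by how many distinct types it has been observed to hold. It is an illustration only, not the paper's instrumentation: the `classify_sites` helper, the site names, and the `mega_threshold` cutoff for "many different types" are all assumptions made for this example.

```python
from collections import defaultdict


def classify_sites(observations, mega_threshold=5):
    """Label each site by the number of distinct types seen there.

    observations: iterable of (site_name, value) pairs.
    mega_threshold: hypothetical cutoff; more distinct types than this
    counts as "megamorphic" in this sketch.
    """
    types_seen = defaultdict(set)
    for site, value in observations:
        types_seen[site].add(type(value))

    labels = {}
    for site, types in types_seen.items():
        if len(types) == 1:
            labels[site] = "monomorphic"
        elif len(types) > mega_threshold:
            labels[site] = "megamorphic"
        else:
            labels[site] = "polymorphic"
    return labels


# Made-up trace: "count" only ever holds ints, "flag" holds a bool and an int,
# and "payload" holds six different types.
observations = [
    ("count", 1), ("count", 2), ("count", 3),
    ("flag", True), ("flag", 0),
    ("payload", 1), ("payload", "a"), ("payload", 2.0),
    ("payload", None), ("payload", [1]), ("payload", {"k": 1}),
]
labels = classify_sites(observations)
```

In this sketch a site that only ever sees one type is monomorphic, which is the case the paper reports as predominant; the rare sites that see many types are the megamorphic ones.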
David Pritchard: "Frequency Distribution of Error Messages". PLATEAU'15, October 2015, http://2015.splashcon.org/event/plateau2015-frequency-distribution-of-error-messages. Which programming error messages are the most common? We investigate this question, motivated by writing error explanations for novices. We consider large data sets in Python and Java that include both syntax and run-time errors. In both data sets, after grouping essentially identical messages, the error message frequencies empirically resemble Zipf-Mandelbrot distributions. We use a maximum-likelihood approach to fit the distribution parameters. This gives one possible way to contrast languages or compilers quantitatively. Based on a large corpus of error messages, the 5 most common errors in Python programs...
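As a rough illustration of what a maximum-likelihood fit of a Zipf-Mandelbrot distribution involves, the sketch below scores candidate parameters against rank-ordered message counts and keeps the best pair. It assumes the usual form, where the probability of the message at rank k is proportional to 1/(k + q)^s. The coarse grid search is a simplification chosen for readability and is not necessarily the procedure used in the paper; the function names and the synthetic counts are made up for this example.

```python
import math


def zm_log_likelihood(counts, s, q):
    """Log-likelihood of rank-ordered counts under Zipf-Mandelbrot(s, q).

    counts[i] is the frequency of the message with rank i + 1.
    """
    n_ranks = len(counts)
    log_norm = math.log(sum((k + q) ** -s for k in range(1, n_ranks + 1)))
    return sum(
        c * (-s * math.log(k + q) - log_norm)
        for k, c in enumerate(counts, start=1)
    )


def fit_zipf_mandelbrot(counts):
    """Return the (s, q) pair on a coarse grid that maximizes the likelihood."""
    best = None
    for i in range(5, 31):        # s from 0.5 to 3.0 in steps of 0.1
        s = i / 10
        for j in range(0, 11):    # q from 0.0 to 5.0 in steps of 0.5
            q = j / 2
            ll = zm_log_likelihood(counts, s, q)
            if best is None or ll > best[0]:
                best = (ll, s, q)
    return best[1], best[2]


# Synthetic counts generated from known parameters (not real error-message data).
true_s, true_q = 1.5, 2.0
weights = [(k + true_q) ** -true_s for k in range(1, 51)]
total = sum(weights)
counts = [10000 * w / total for w in weights]

s_hat, q_hat = fit_zipf_mandelbrot(counts)
```

A real analysis would replace the grid with a numerical optimizer and would first group essentially identical messages, as the abstract describes.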
Marc Kiefer, Daniel Warzel, and Walter Tichy: "An Empirical Study on Parallelism in Modern Open-Source Projects". SEPS'15, October 2015, https://ps.ipd.kit.edu/backend/index.php/veroeffentlichungen-details/items/3803.html. We present an empirical study of 135 parallel open-source projects in Java, C# and C++ ranging from small (<1000 lines of code) to very large (>2M lines of code) codebases. We examine the projects to find out how language features, synchronization mechanisms, parallel data structures and libraries are used by developers to express parallelism. We also determine which common parallel patterns are used and how the implemented solutions compare to typical textbook advice. The results show that similar parallel constructs...
Amjad Altadmri and Neil C. C. Brown: "37 Million Compilations: Investigating Novice Programming Mistakes in Large-Scale Student Data". SIGCSE'15, March 2015, http://dx.doi.org/10.1145/2676723.2677258, https://kar.kent.ac.uk/46742/1/fp1187-altadmri.pdf. Educators often form opinions on which programming mistakes novices make most often – for example, in Java: "they always confuse equality with assignment", or "they always call methods with the wrong types". These opinions are generally based solely on personal experience. We report a study to determine if programming educators form a consensus about which Java programming mistakes are the most common. We used the Blackbox data set to check whether the educators' opinions matched data from...
Too Many Knobs
Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker: "Hey, You Have Given Me Too Many Knobs! Understanding and Dealing with Over-Designed Configuration in System Software". ESEC/FSE'15, August 2015, http://dx.doi.org/10.1145/2786805.2786852, http://cseweb.ucsd.edu/~tixu/papers/fse15.pdf. This paper makes a first step in understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings to motivate...
A cortical homunculus is a graphical representation showing how much of the brain is devoted to different parts of the body. It would be fascinating to see a similar diagram showing how much software engineering research effort is devoted to which parts of the things software developers actually do. I strongly suspect, for example, that the ratio of research to practice is much greater than unity for software product families, while there is much (much) less research into installation and package management than there is practical wrangling. If someone has a grad student looking for a project, I'd be happy...
David Lo, Nachiappan Nagappan, and Thomas Zimmermann: "How Practitioners Perceive the Relevance of Software Engineering Research". ESEC/FSE'15, August 2015, http://thomas-zimmermann.com/publications/files/lo-esecfse-2015.pdf. The number of software engineering research papers over the last few years has grown significantly. An important question here is: how relevant is software engineering research to practitioners in the field? To address this question, we conducted a survey at Microsoft where we invited 3,000 industry practitioners to rate the relevance of research ideas contained in 571 ICSE, ESEC/FSE and FSE papers that were published over a five year period. We received 17,913 ratings by 512 practitioners who labelled ideas...
Michael Eichberg, Ben Hermann, Mira Mezini, and Leonid Glanz: "Hidden Truths in Dead Software Paths". ESEC/FSE'15, August 2015, http://dx.doi.org/10.1145/2786805.2786865, http://www.thewhitespace.de/publications/ehmg15-deadpath.pdf. Approaches and techniques for statically finding a multitude of issues in source code have been developed in the past. A core property of these approaches is that they are usually targeted towards finding only a very specific kind of issue and that the effort to develop such an analysis is significant. This strictly limits the number of kinds of issues that can be detected. In this paper, we discuss a generic approach based on the detection of infeasible paths in...
Goto in C
Keywords: Programming Languages
Meiyappan Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, and Ahmed E. Hassan: "An Empirical Study of Goto in C Code from GitHub Repositories". ESEC/FSE'15, August 2015, http://www.se.rit.edu/~mei//publications/publications/FSE2015-Nagappan.pdf. It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and if it is 'harmful' enough to be a part of a post-release bug. We, therefore, conduct a...