It Will Never Work in Theory

Shaping the Next Generation (or, the exam defines the course defines the discipline)

Posted Sep 1, 2012 by Greg Wilson

| Education |

As we reported a few days ago, one of our contributors, Greg Wilson, gave a keynote at the MSR Vision 2020 workshop in Kingston on August 20. In it, he explored why there's still a gulf between software engineering researchers and the people who actually build software for a living (see the slides or the discussion on Reddit for details). He also said that:

  1. there's no easy way to close that gap, because most of the people in industry that researchers want to collaborate with have never encountered empirical software engineering studies, and therefore don't understand their scope or value; so
  2. researchers—many of whom are professors—should pivot the software engineering classes they teach to focus on how to analyze real-world data, and what past analyses have told us, so that the next generation of developers will understand (and listen, and want to collaborate).

To make this more concrete, Greg asked the workshop participants to make up some assignments and exam questions for such a course. Some of the suggestions are listed below; we would welcome other ideas as well (please post them as comments). We'd also like to know who'd be interested in trying to teach such a course at their institution, and what you think the prerequisites would have to be: statistics, obviously, but would a database course that introduced students to SQL be necessary? What about a natural language processing course? Or something else we haven't thought of?

Group 1

Give two examples of success stories in studies of the social aspects of software engineering.
  1. Reorganization based on social structures
  2. Identifying the "big players" in a software project
What are three sources of social interaction in software projects?
  1. Email
  2. IRC
  3. bug comments
  4. source code comments
Name three challenges in preprocessing emails.
  1. signatures
  2. code snippets
  3. stack traces
  4. fake/multiple email addresses
  5. identifying email headers and inline replies
  6. typos
  7. chat acronyms
  8. non-native speakers
  9. use of multiple languages
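
As a rough illustration of a few of the preprocessing challenges listed above (signatures, inline replies, and stack traces), here is a minimal Python sketch. It assumes plain-text message bodies, the conventional "-- " signature separator, ">"-prefixed quoted replies, and Java- or Python-style stack trace lines; real mailing list archives will break all of these assumptions at some point. The function name `clean_email_body` is made up for this example.

```python
import re

# Heuristic: lines that look like Java stack frames or Python traceback lines.
STACK_TRACE_LINE = re.compile(
    r'^\s*(at \S+\(.*\)|File ".*", line \d+.*|Traceback \(most recent call last\):)\s*$'
)


def clean_email_body(body: str) -> str:
    """Drop quoted replies, stack trace lines, and the trailing signature."""
    kept = []
    for line in body.splitlines():
        # Assumed convention: a line holding only "--" (optionally followed
        # by whitespace) starts the signature; ignore everything after it.
        if line.rstrip() == "--":
            break
        # Assumed convention: inline replies are quoted with a leading ">".
        if line.lstrip().startswith(">"):
            continue
        if STACK_TRACE_LINE.match(line):
            continue
        kept.append(line)
    return "\n".join(kept).strip()


if __name__ == "__main__":
    sample = (
        "Hi all,\n"
        "> previous message\n"
        "I think the fix works.\n"
        "  at com.example.Foo.bar(Foo.java:42)\n"
        "-- \n"
        "Jane\n"
        "jane@example.org"
    )
    print(clean_email_body(sample))
```

Students could be asked to find messages on which each heuristic fails, for example a body that uses "--" as a horizontal rule, or a reply that quotes without ">" markers.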
Group 2

  1. You are given a dataset A of OSS projects and a subset of it B. Evaluate whether a hypothesis H can be rejected on A and B. Design the question in such a way that the result is significant (at the 0.05 level) on A but not on B. Discuss the discrepancy.
  2. Given a dataset and a specific question, perhaps from existing MSR papers, discuss which data mining approach is best suited for that question.
  3. Given a specific question (e.g., bug finding) what repositories should you use to solve it? Illustrate it with Bugzilla. How do you adapt this to Jira?
  4. Given that two variables A and B correlate, can you say "A causes B"? Why or why not?
  5. Repeat an existing analysis from an MSR paper. Do you get the same results? Vary a number of variables. How different are the results?
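
One way to make the correlation-versus-causation question in this group concrete is a small simulation. The sketch below uses entirely synthetic data and hypothetical variable names: a hidden factor (`size`) drives both `authors` and `defects`, so the two correlate strongly even though neither causes the other. Once the hidden factor is regressed out, the correlation largely disappears. This is an illustration under those stated assumptions, not an analysis of any real project.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def residuals(ys, xs):
    """Residuals of ys after fitting a least-squares line on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=2000, seed=0):
    """Return (raw correlation, correlation after controlling for size)."""
    rng = random.Random(seed)
    size = [rng.gauss(0, 1) for _ in range(n)]         # hidden confounder
    authors = [s + rng.gauss(0, 0.5) for s in size]    # driven by size only
    defects = [s + rng.gauss(0, 0.5) for s in size]    # driven by size only
    raw = pearson(authors, defects)
    controlled = pearson(residuals(authors, size), residuals(defects, size))
    return raw, controlled


if __name__ == "__main__":
    raw, controlled = simulate()
    print(f"raw correlation:        {raw:.2f}")
    print(f"controlling for size:   {controlled:.2f}")
```

A follow-up exercise could ask students what they would conclude if the confounder had not been recorded in the dataset at all.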

Group 3

  • Statistics
    1. What is wrong with this claim: "Files with a large number of committers/authors have more defects/bugs, so we conclude that more authors cause more bugs, and we recommend that the number of committers be reduced."
    2. A tool is 99% accurate in detecting defective lines of code. Should developers use the tool? Why or why not?
    3. What are the internal validity issues and external validity issues with this method? "Researcher X finds that a lack of modularity leads to more defects in Windows, and Y is going to apply that to predict defects in Eclipse."
    4. Design a study to see whether people who go to lunch together have fewer build defects in their software.
    5. Which would produce fewer false positives: 90% recall and 10% precision, or 10% precision and 90% recall?
  • Data
    1. Given a table of bug reports with severity, etc. and another table of users with qualifications, etc., determine whether experience and bug report frequency are correlated, and if so, how strongly.
    2. Define: evolutionary coupling, tokenizing, word nets, stemming, n-gram, entropy.
    3. List 10 sources of data that could be mined to estimate the risks to a software project, and describe the limitations of each.
  • Interpretation and Actionability
    1. Your boss has asked you to generate documentation for a legacy system that doesn't have any. What approach(es) would you use to automatically generate some useful documentation for each class and method?
    2. Given a set of version control logs, how would you tell which commits were bug fixes (vs. adding new features)?
    3. What technique(s) would you use to correlate email messages from a mailing list archive with related version control commits?
  • Ethics
    1. Given a data set (mailing list archive, bug reports, and version control log), anonymize it so that it can be shared without risk.
    2. Is it ethical to do an experiment to find out whether one race or gender produces more bugs than another? Justify your answer. How about graduates of one university vs. another?
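
For the Statistics question about the tool that is "99% accurate", a short worked calculation shows why the answer depends on how rare defects are. The sketch below assumes that "99% accurate" means the tool flags 99% of defective lines and clears 99% of clean ones, and it assumes a hypothetical defect rate of 1 line in 1,000. Neither number comes from the question itself. Under those assumptions only about 9% of flagged lines are actually defective.

```python
def precision_of_flagged_lines(defect_rate, sensitivity, specificity):
    """Fraction of flagged lines that are truly defective (Bayes' rule)."""
    true_positives = defect_rate * sensitivity
    false_positives = (1 - defect_rate) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Assumed values for illustration only.
    p = precision_of_flagged_lines(
        defect_rate=0.001,   # hypothetical: 1 defective line per 1,000
        sensitivity=0.99,    # tool flags 99% of defective lines
        specificity=0.99,    # tool clears 99% of clean lines
    )
    print(f"share of flagged lines that are defective: {p:.1%}")
```

Changing `defect_rate` shows how quickly the picture improves as defects become more common, which is one direction a model answer could take.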