Reading Answers on Stack Overflow: Not Enough!

Reviewed by Greg Wilson / 2021-09-19
Keywords: Crowdsourcing, Stack Overflow

I spoke with someone earlier this year who had been using the Unix shell for several years but had never used the man command. Whenever they had a question they went to Stack Overflow: experience had taught them that they could find their answer there more quickly than by hunting through comprehensive breadth-first documentation written by people who are guessing what the reader wants to know rather than responding to the actual gaps in their knowledge. (See this post for more on documentation types and their audiences.)

The answers on Stack Overflow are only part of the story, though. As Zhang2020 shows, a lot of the value is in the comments. Most of these are added after an answer is accepted; more than half are fast responses (added within one day of an answer being posted), but later comments tend to be more informative. Comments are rarely integrated back into the answers they discuss, even when they are informative, and inexperienced users tend to raise limitations and concerns while experienced users tend to enhance the answer.

Stack Overflow is not a level playing field: as Ford2016 showed, it is a less friendly place for women than it is for men, which means its growing importance has only exacerbated tech's deep-seated biases. Hopefully, studies like Zhang2020 that give us more insight into how it's used will also help us make it fairer.

Zhang2020 Haoxiang Zhang, Shaowei Wang, Tse-Hsun Chen, and Ahmed E. Hassan: "Reading Answers on Stack Overflow: Not Enough!". IEEE Transactions on Software Engineering, 2020, 10.1109/tse.2019.2954319.

Stack Overflow is one of the most active communities for developers to share their programming knowledge. Answers posted on Stack Overflow help developers solve issues during software development. In addition to posting answers, users can also post comments to further discuss their associated answers. As of Aug 2017, there are 32.3 million comments that are associated with answers, forming a large collection of crowdsourced repository of knowledge on top of the commonly-studied Stack Overflow answers. In this study, we wish to understand how the commenting activities contribute to the crowdsourced knowledge. We investigate what users discuss in comments, and analyze the characteristics of the commenting dynamics, (i.e., the timing of commenting activities and the roles of commenters). We find that: 1) the majority of comments are informative and thus can enhance their associated answers from a diverse range of perspectives. However, some comments contain content that is discouraged by Stack Overflow. 2) The majority of commenting activities occur after the acceptance of an answer. More than half of the comments are fast responses occurring within one day of the creation of an answer, while later comments tend to be more informative. Most comments are rarely integrated back into their associated answers, even though such comments are informative. 3) Insiders (i.e., users who posted questions/answers before posting a comment in a question thread) post the majority of comments within one month, and outsiders (i.e., users who never posted any question/answer before posting a comment) post the majority of comments after one month. Inexperienced users tend to raise limitations and concerns while experienced users tend to enhance the answer through commenting. Our study provides insights into the commenting activities in terms of their content, timing, and the individuals who perform the commenting. For the purpose of long-term knowledge maintenance and effective information retrieval for developers, we also provide actionable suggestions to encourage Stack Overflow users/engineers/moderators to leverage our insights for enhancing the current Stack Overflow commenting system for improving the maintenance and organization of the crowdsourced knowledge.