Reuse and Maintenance Among Divergent Forks
Reviewed by Greg Wilson / 2023-02-27
Keywords: Maintenance, Reuse
I have forked dozens of repositories on GitHub over the years, but only once with the intention of creating a new project rather than contributing back to the original. While my fork didn't survive, many others do, becoming new projects in their own right.
How common is that? And how do related projects (called "software families") interact, if at all, after they diverge? This paper finds that they mostly go their own way, and that when they do share code, they do so directly through Git rather than via pull requests on GitHub.
The analysis scripts used in the paper are available online.
John Businge, Moses Openja, Sarah Nadi, and Thorsten Berger. Reuse and maintenance practices among divergent forks in three software ecosystems. Empirical Software Engineering, Mar 2022. doi:10.1007/s10664-021-10078-2.
With the rise of social coding platforms that rely on distributed version control systems, software reuse is also on the rise. Many software developers leverage this reuse by creating variants through forking, to account for different customer needs, markets, or environments. Forked variants then form a so-called software family; they share a common code base and are maintained in parallel by same or different developers. As such, software families can easily arise within software ecosystems, which are large collections of interdependent software components maintained by communities of collaborating contributors. However, little is known about the existence and characteristics of such families within ecosystems, especially about their maintenance practices. Improving our empirical understanding of such families will help build better tools for maintaining and evolving such families.