Challenges with Checked-In Secrets

Reviewed by Greg Wilson / 2023-04-06
Keywords: DevOps, Security

Like many programmers, I have accidentally committed authentication keys and other secrets to Git, then scrambled to delete and replace them before anyone noticed. To find out exactly what kinds of mistakes people like me make, the authors of this paper mined questions from Stack Overflow, grouped them, and ranked the results. They found that the four most common problems are:

  1. storing and versioning secrets during deployment;
  2. storing and versioning secrets in source code;
  3. ignoring or hiding secrets in source code (which was surprising to me); and
  4. cleaning up version control history after an oops.

The most common solutions recommended were:

  1. move secrets out of source code and use template configuration files;
  2. use some form of secret management in deployment; or
  3. rely on environment variables.
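
To make the first and third recommendations concrete, here is a minimal sketch of how a template configuration file and environment variables can work together. It is an illustration under stated assumptions rather than anything taken from the paper: the template text, the variable names `DB_USER` and `DB_PASSWORD`, and the function `render_config` are all hypothetical.

```python
import os
from string import Template

# Hypothetical template: this is what would be committed to version
# control in place of a config file containing real credentials.
CONFIG_TEMPLATE = Template("db_user=$DB_USER\ndb_password=$DB_PASSWORD\n")


def render_config(env=None):
    """Fill the template from environment variables.

    Raises RuntimeError if a required variable is missing, so a
    misconfigured deployment fails early instead of running with
    an empty or placeholder secret.
    """
    env = os.environ if env is None else env
    try:
        return CONFIG_TEMPLATE.substitute(env)
    except KeyError as missing:
        raise RuntimeError(
            f"missing required environment variable: {missing}"
        ) from None


if __name__ == "__main__":
    # Demo with an explicit mapping; real code would call render_config()
    # with no argument and rely on the process environment.
    print(render_config({"DB_USER": "alice", "DB_PASSWORD": "example-only"}))
```

The repository then holds only the template, while the actual values live in the environment of whoever runs the code (or in whatever secret manager populates that environment during deployment).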

One of the most frequently ignored rules of engineering is, "Figure out what the problem is before you start trying to solve it," and this paper is a welcome example of a systematic, evidence-based attempt to do that.

Setu Kumar Basak, Lorenzo Neil, Bradley Reaves, and Laurie Williams. What challenges do developers face about checked-in secrets in software artifacts? 2023. arXiv:2301.12377.

Throughout 2021, GitGuardian’s monitoring of public GitHub repositories revealed a two-fold increase in the number of secrets (database credentials, API keys, and other credentials) exposed compared to 2020, accumulating more than six million secrets. To our knowledge, the challenges developers face to avoid checked-in secrets are not yet characterized. The goal of our paper is to aid researchers and tool developers in understanding and prioritizing opportunities for future research and tool automation for mitigating checked-in secrets through an empirical investigation of challenges and solutions related to checked-in secrets. We extract 779 questions related to checked-in secrets on Stack Exchange and apply qualitative analysis to determine the challenges and the solutions posed by others for each of the challenges. We identify 27 challenges and 13 solutions. The four most common challenges, in ranked order, are: (i) store/version of secrets during deployment; (ii) store/version of secrets in source code; (iii) ignore/hide of secrets in source code; and (iv) sanitize VCS history. The three most common solutions, in ranked order, are: (i) move secrets out of source code/version control and use template config file; (ii) secret management in deployment; and (iii) use local environment variables. Our findings indicate that the same solution has been mentioned to mitigate multiple challenges. However, our findings also identify an increasing trend in questions lacking accepted solutions substantiating the need for future research and tool automation on managing secrets.