What is "secret sprawl" and why is it harmful?
Having secrets sprawled throughout your system in plain text creates several roadblocks to application security. Learn what secret sprawl looks like, and how you can overcome it.
Founder & Co-CTO, HashiCorp
When we talk about secret sprawl, what we're first talking about is secrets themselves. What are secrets?
» Defining "secrets"
We talk about secrets as anything that gives us access to a system or allows us to authenticate or authorize ourselves. Examples are a username and password, an API token, or TLS certificates: any of these things we can provide to a database or to a cloud to authenticate ourselves and then be authorized to perform actions, such as reading and writing data in the database or reading and writing objects in S3.
These things are very sensitive because if I, as a third party or an attacker, get access to one of these secrets, I can use it to authenticate and authorize myself and do something that I shouldn't be doing. These mechanisms are really critical. Anyone who gets ahold of a secret can authenticate with it, whether or not they're trusted or should be doing so.
That's what a secret is.
» Defining "secret sprawl"
Now when we talk about secret sprawl, what we're really talking about is the distribution of these things. They're sort of all over. They're littered about our infrastructure.
What we classically see is: You have a database username and password that's hard-coded into the source code of an application. It's in plaintext in the configuration file. It's in plaintext in config management. It's in plaintext in version control. It's in a Dropbox and it's in a Wiki. It's sprawled all over our infrastructure in different places.
» Challenges of secrets management
The challenge with sprawl is a few-fold:
- The first challenge is, we don't actually even know what's where. It's not like we're keeping track of what's in the source code, what's in a repo, what's in GitHub. There's a level of unknowability here because it's all over the place. So, at the first level, we don't know which credentials are where.
- The second challenge is, we generally have limited access control over this. These are not systems that are designed for secret management. A Wiki, Dropbox, or a version control system really has no idea that you're storing secrets. They're not maintaining fine-grained audit logs of who's doing what and which credentials they read. So, it gives us very little auditability and very little access control.
- The third one is, what do we do when there's a breach? Something bad happens and we find our database username and password on the public internet. Now what? First, in this world, we don't even know where that username and password came from. Was it hardcoded in the source code? Was it in a config file somewhere? Second, we don't really know how to remediate the breach. Was it an insider who leaked it? We don't necessarily have the access logs to help us do that forensic work. We also don't have a good story for how to rotate and change the credential. If it's hardcoded in the application source, now what? We have to change the source code, recompile the application, and redeploy it. Rolling out that change becomes a complex process to orchestrate.
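To make the first challenge concrete, here is a minimal sketch of what "finding out what's where" might look like: a script that walks a directory tree and flags lines that look like plaintext credentials. The two patterns, the file names, and the `hunter2` password are assumptions made up for illustration; real scanners rely on much larger rule sets, and a match here is only a hint, not proof of a leak.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical patterns for illustration only; not an exhaustive rule set.
PATTERNS = {
    "password assignment": re.compile(
        r"(password|passwd|pwd)\s*[=:]\s*['\"]?[^\s'\"]+", re.IGNORECASE
    ),
    "AWS-style access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def scan_tree(root):
    """Return (relative path, line number, pattern name) for each suspicious line."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name, pattern in PATTERNS.items():
                if pattern.search(line):
                    findings.append((str(path.relative_to(root)), lineno, name))
    return findings


def demo():
    """Build a tiny throwaway tree with the same secret in two places and scan it."""
    with tempfile.TemporaryDirectory() as root:
        (Path(root) / "app.py").write_text('DB_USER = "app"\nDB_PASSWORD = "hunter2"\n')
        (Path(root) / "config.ini").write_text("[db]\npassword: hunter2\n")
        (Path(root) / "README.txt").write_text("No credentials here.\n")
        return scan_tree(root)


if __name__ == "__main__":
    for file_name, lineno, kind in demo():
        print(f"{file_name}:{lineno}: possible {kind}")
```

Even in this toy tree the same password shows up in two files, which is the sprawl problem in miniature: a scan like this can tell you where copies sit today, but not who has read them or how to rotate them.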
So, secret sprawl comes with a number of different problems, but it really comes down to a lack of visibility and a lack of control, which means we don't have good answers when something happens. When we talk about secret management, the goal is to solve that.
» The solution to "secret sprawl"
The first-level answer is centralization. We need to move from secrets living everywhere, all over the place in different systems, to living in a single place where they're tightly access-controlled, tightly audited, and encrypted, so that they aren't readable by just anyone who has access to version control.
Even if you have access to a system like Vault, we have fine-grained access controls restricting which secrets you have access to on a need-to-know basis. In this sense, we now have good answers: if there's a breach, we have audit logs of who had access and when they had it. We can change the secret in a central place and distribute it to our infrastructure. It simplifies a lot of the lifecycle around credential management.
Those are the challenges of secret sprawl and why focusing on centralization and using formal secret management reduces the associated risks.
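The ideas in this section (one central place, need-to-know access, an audit trail, and rotation in a single spot) can be sketched with a toy in-memory model. This is not Vault's API and it performs no encryption; the class name `CentralSecretStore`, its methods, and the identities and paths in the usage example are all hypothetical, chosen only to show how the pieces relate.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class AccessDenied(Exception):
    """Raised when an identity reads a path it has not been granted."""


@dataclass
class CentralSecretStore:
    """Toy model of a central secret store; illustrative only, not a real product's API."""

    _secrets: dict = field(default_factory=dict)   # path -> current value
    _policies: dict = field(default_factory=dict)  # identity -> set of readable paths
    audit_log: list = field(default_factory=list)  # (time, identity, action, path, allowed)

    def _record(self, identity, action, path, allowed):
        self.audit_log.append((datetime.now(timezone.utc), identity, action, path, allowed))

    def grant(self, identity, path):
        """Need-to-know: allow one identity to read one path."""
        self._policies.setdefault(identity, set()).add(path)

    def write(self, operator, path, value):
        # Writing to an existing path rotates the secret in one place.
        # Authorization of writers is omitted to keep the sketch short.
        self._secrets[path] = value
        self._record(operator, "write", path, True)

    def read(self, identity, path):
        allowed = path in self._policies.get(identity, set())
        self._record(identity, "read", path, allowed)  # denied attempts are logged too
        if not allowed:
            raise AccessDenied(f"{identity} may not read {path}")
        return self._secrets[path]

    def readers_of(self, path):
        """Forensics helper: identities that successfully read this path."""
        return sorted(
            {who for _, who, action, p, ok in self.audit_log
             if action == "read" and p == path and ok}
        )


if __name__ == "__main__":
    store = CentralSecretStore()
    store.write("ops", "db/prod", "old-password")
    store.grant("billing-app", "db/prod")
    store.read("billing-app", "db/prod")
    try:
        store.read("wiki-bot", "db/prod")  # never granted, so this is refused
    except AccessDenied as err:
        print("denied:", err)
    store.write("ops", "db/prod", "new-password")  # rotation after a suspected leak
    print("readers:", store.readers_of("db/prod"))
```

Because every consumer fetches the credential from the store instead of carrying its own copy, rotating it is one write, and the audit log answers the "who had access, and when" question that sprawl leaves open.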