PLEASE NOTE: This post is kept for historical purposes only and to avoid breaking inbound links. The paradigms discussed here have been extended, formalized, and standardized into a built-in Vault feature called Response Wrapping. Check out the Response Wrapping concept page for more information.
Last Updated: 2016-04-27
In the Vault 0.3 release post,
introduced, along with an example workflow showing how it can be used for
secure authentication to Vault. Here at HashiCorp, we believe that
Cubbyhole-based authentication is the best approach for authenticating to Vault
in a wide variety of use-cases. In this post we will explain why Cubbyhole may
be the right authentication model for you, and present multiple considerations
around Cubbyhole authentication to help fit a number of real-world deployment
This post will first explain the motivation behind developing the Cubbyhole authentication model, then describe the model itself, and finally present some considerations for designing deployment scenarios.
Please note: Cubbyhole authentication provides a set of useful security primitives. The information in this post is intended to be useful, but it is not exhaustive and should not be taken as a recommendation of one approach or another. It is up to the implementor to ensure that any given authentication workflow meets their organization's policy.
In Vault, there are two main types of authentication backends available:
User-oriented authentication backends: These generally rely on knowledge of a shared secret, such as a password for
ldapor a GitHub API token for
github. This shared knowledge is distributed out-of-band.
Machine-oriented authentication backends: These rely on some intrinsic property of the client (such as a machine MAC address for
app-id) or a secret distributed out-of-band (such as a TLS private key/certificate for
cert) in order to facilitate automatic authentication of machines and applications.
The MAC address of a machine (an intrinsic property) is almost certainly known to operators.
The other half of
app-idauthentication is a unique UUID for the machine (an out-of-band distributed secret), suggested to be stored in configuration management. However, storing this value in configuration management often means that it is stored in plaintext on-disk (likely in many places if using a DVCS to version configuration management information, as is very common). Even if it is encrypted on-disk (as most configuration management utilities are able to do), any operator or service that is used to deploy to a machine must know the decryption key, which itself is a secret that must now be protected and securely distributed.
There is no perfect answer to automatic secret distribution, which in turn
means that there is no perfect answer to automatic machine authentication.
However, risk and exposure can be minimized, and the difficulty of successfully
using a stolen secret can be heightened. Using the
cubbyhole backend along
with some of Vault's advanced token properties enables an authentication model
that is in many cases significantly more secure than other methods both within
Vault and when compared to other systems. This in turns enhances Vault's
utility as an access control system to secrets stored both within Vault itself
and other systems (such as
even for directly accessing other systems (for instance, via
SSH or TLS
The Cubbyhole Authentication Model
As a quick refresher, the
cubbyhole backend is a simple filesystem
abstraction similar to the
generic backend (which is mounted by default at
secret/) with one important twist: the entire filesystem is scoped to a
single token and is completely inaccessible to any other token.
Since many users are using Vault secrets to establish secure channels to other machines, we wanted to develop an authentication model that itself does not rely on a secure channel (but of course can be enhanced by a secure channel).
Instead, the model takes advantage of some of the security primitives available with Vault tokens and the Cubbyhole backend:
Limited uses: tokens can be valid for only a certain number of operations
Limited duration: tokens can be revoked after a configurable duration
Using these primitives, and with input and feedback from security experts both in the field and at several commercial companies, we constructed a new authentication model. Like all authentication models, it cannot guarantee perfect security, but it sports a number of desirable properties.
The model works as follows, supposing that we are trying to get a Vault token to an application (which could be a machine or container or VM instead):
A process responsible for creating Vault authentication tokens creates two tokens: a permanent (
perm) token and a temporary (
temp) token. The
permtoken contains the final set of policies desired for the application in the container. The
temptoken has a short lease duration (e.g. 15 seconds) and a maximum use count of 2.
temptoken is used to write the
permtoken into the cubbyhole storage specific to
temp. This requires a single write operation, reducing the remaining uses of
The invoked process gives the
temptoken to the application management engine (for instance, Docker), which starts the application and injects the
temptoken value into the application environment.
The application reads the
temptoken from the environment and uses it to fetch the
Let's take a look at some of the properties of this method:
Even if the value of the
temptoken is passed in cleartext to a process, the value of the
permtoken will be (should be!) covered by TLS.
The value of the
temptoken written to disk in order to pass it to an application presents no long-term security threat.
If the value is logged (for instance, because it was handed to a container in its environment variable settings), the value in the log presents no long-term security threat.
Because accesses to the cubbyhole are logged in Vault's audit log, an operator can then use the audit log to discover whether an application took too long to start (only one instance of the cubbyhole being accessed will appear in the audit logs before revocation) or whether another process used the token (two instances will appear in the audit logs) to steal the
Of course, depending on the particular setup, there is still the potential for malfeasance, but there are also mitigations:
- If the disclosure of the
Some of the considerations detailed in the next section provide defense against this.
If the sniffer actually fetches the value of the
permtoken, this is detectable.
This is detectable as well.
No authentication mechanism is perfect. However, limited uses, limited duration, and limited access together form a powerful set of security primitives. The ability to detect accesses by bad actors is also an extremely important property. By combining them, the Cubbyhole authentication model provides a way to authenticate to Vault that likely meets the needs of even the most stringent security departments.
Cubbyhole Authentication Considerations
A natural question at this point is "who or what is responsible for creating
perm tokens?" Because this authentication workflow is simply a
model, there isn't a single structured way to use it. The flip side, however,
is that there is a lot of flexibility to easily craft the solution that meets
This section will present a few considerations targeted to different deployment and operation models. They are not exhaustive, but should provide ideas for implementing Cubbyhole authentication in your own Vault deployment.
Token Distribution Mechanisms
In a normal push model, tokens are generated and pushed into an application,
its container, or its virtual machine, usually upon startup. A common example
of this would be generating the
perm tokens and placing the
token into a container's environment.
This is a convenient approach, but has some drawbacks:
- Environment information is often logged at application or container startup,
and may be accessible by others within a short time frame.
However, as with all Cubbyhole models, malfeasance is detectable; if a bad actor gains access to the environment and manages to consume the
temptoken before the intended target, the target can log a security alert that its given token was invalid.
- If an application fails and it (or its container) is restarted, it may see an
Additionally, it may be difficult or impossible to configure the provisioning and/or orchestration system to perform this functionality, especially without writing a framework, executor, or other large chunk of code.
As a result, this is a conceptually easy approach but can be difficult to implement. Often, a hybrid approach with a push model as well as a pull or coprocess model will be the right approach.
In a pull model, the application, upon startup, reaches out to a
token-providing service to fetch a
temp token. Alternately, rather than
coding this logic into each application or container, a small bootstrapping
application could perform this task, then start the final application and pass
the value of the
perm token in. The same bootstrapping application could be
used across machines or containers.
The pull model provides some benefits:
It requires constructing a simple token providing service rather than potentially modifying an application scheduling system.
If an application fails and it (or its container) is restarted, it can pull another token in rather than seeing an out-of-date
However, there is a major drawback to the pull model as well: it requires a token providing service to implement some logic and heuristics to determine whether a request is valid. For instance, upon receiving a request from a specific IP and with a specified application, it could check with the application or container scheduler to determine if, in fact, that application was just spun up on that node. This may not work well if the application has a runtime manager that restarts it locally if it fails.
A third approach uses a coprocess model: applications implement a known HTTP endpoint, then an agent, usually on the local machine, pushes a token into the application.
At first, this approach might seem like it combines the complexity of implementing both a push and a pull model. However, its advantages are enough to warrant serious consideration (the following points assume that the token-providing service is on each local machine, although it does not have to be):
Unlike the pull model, the coprocess does not need to trust that the calling application is what it says it is -- it sends a token to the known good location of the application. If that application did not request a new token, it can send an alert to trigger an investigation.
If the application is running in a container, a local coprocess could write the token to a directory on the host file system created specifically for that container and bind-mount it into the container upon startup, rather than pushing it in via environment variables that are more likely to be logged.
- If the application is running in a chroot, a local coprocess could write the token into the chroot before application startup.
An example of a coprocess-based approach would be a coprocess that watches the
local Docker event API to determine when new containers have started, then
perm token pair and sends the
temp token to the
well-known endpoint of the application in the new container. The coprocess
itself would need a token; this could also use Cubbyhole authentication, with
temp token injected via a push approach to the host, perhaps from a
central coprocess that manages host authentication.
Overall, a coprocess-based approach requires little support from the provisioning and orchestration systems and minimal changes to applications, which can usually be implemented with a common library. As a result, it generally offers the best combination of flexibility and operational robustness.
Token-based Authentication Pros and Cons
Finally, some remarks on the differences between using Cubbyhole authentication, which is token-based, versus a normal authentication backend, which is credential-based.
Tokens carry with them some restrictions:
When the TTL of a token expires or is otherwise revoked, any child tokens it created will be revoked as well.
- Child tokens must contain a subset of their parent's policies.
Compare this to an authentication backend, which can create tokens up to a lifetime determined by the mount properties or system configuration, and with any set of policies as configured by a backend administrator.
Of course, whether these restrictions are pros or cons depends on your own needs. However, looking at the restrictions another way:
A parent token's TTL expiring forces all tokens to eventually rotate, preventing accidental long-lived tokens (at least, longer than the token used to create the
- A backend cannot accidentally or maliciously be configured to associate more (or more permissive) policies with a token, because the set of available policies is constrained by the parent. In fact, a token-providing service could have multiple tokens each associated with a limited set of policies, and use the correct token to create the correct child token depending on the needs of the application.
If you want the benefits of the restricted set of policies, but not tie child
tokens to the lifetime of the parent token, Vault 0.4 will contain a
auth/token/create-orphan endpoint which will allow tokens with policies
granting them access to this endpoint to create orphan tokens without requiring
sudo access to
Share the Knowledge
We think that the Cubbyhole authentication model will be extremely useful to many organizations. We hope that as our users deploy their authentication solutions based on Cubbyhole, they will provide feedback, tips, and useful code to the community. Please be sure to spread the knowledge around on the vault-tool mailing list!