See how the Vault Sidecar Agent Injector and the CSI provider Kubernetes integration methods for Vault can be used to seamlessly secure Kubernetes Secrets.
Hello everyone. Eric forgot to tell you that I'm going to speak in French as well. In my defense, that's a joke. I admit that my French knowledge is not that good. Hopefully, I will get a chance to practice with some friends here. Welcome everyone. Both joining us in person and virtually.
The title mainly emphasizes the Vault Agent. When I started thinking about this talk, the Vault Secrets Operator had not actually been released. By the time it was, I included it in this talk as well. So this talk is not limited to the Vault Agent; we will explore the approaches we can use to integrate Kubernetes and Vault.
A few words about myself, I am Elif. I'm a HashiCorp ambassador in Romania. Also very happy to build up the HashiCorp community in Romania alongside two other amazing organizers. I am trying to write as much as possible to pass down knowledge, and I invite you to look at the blog.
As a quick overview, let's start with a question. How many of you are using Kubernetes? A great deal. How many of you have encountered issues when using Kubernetes Secrets? Quite some. It is no surprise that Kubernetes adoption has been interesting over the past years due to the adoption of microservices — also taking into account that organizations have acknowledged the benefits of using containers and containerized applications.
However, as many of you may already know, Kubernetes Secrets has certain limitations — how are we able to address these? This is the time when Vault comes into action, and we'll see how HashiCorp Vault is able to address these limitations and improve and make our systems more secure. Afterward, we'll dive deep into how we can integrate Vault and Kubernetes — also doing some hands-on time showcasing these approaches and then draw some conclusions.
A lot of workloads are currently sitting on top of Kubernetes. There are a considerable number of studies — an example would be the Gartner 2022 Hype Cycle for Agile and DevOps — which indicate that Kubernetes adoption will continue to increase over the next years.
This comes as no surprise because it is beneficial to development teams and also to DevOps, as readiness and time to production have dropped considerably. On the other side, there are reports showing that security concerns in the Kubernetes ecosystem are an interesting point.

A recent study from Red Hat — the 2023 State of Kubernetes Security report — shows that a lot of professionals are concerned about the amount of money invested into security: a 7% rise in comparison to the previous year. That being considered, we are still witnessing a steady growth of Kubernetes, as efficiency has increased over time.
Kubernetes Secrets are stored unencrypted in etcd, the key-value data store of Kubernetes. This brings some major disadvantages.

Despite the fact that there are encryption providers able to help with encrypting the data at rest, this is not sufficient, because this approach might not meet the specific security requirements of the companies we work with. Moreover, configuring such providers requires certain actions by the cluster administrator, so it is not an easy task.
We are not able to track down who accessed certain data or when this data has been accessed. Moreover, in case a person gets access to the cluster, it is easy to get access to all the data stored within. So, security breaches are easy to encounter.
Natively, Kubernetes Secrets are not designed to be rotated. Moreover, a great drawback of such an approach is that we need to restart all the Pods using such a secret, which results in a service outage. This is not desirable.
This is self-explanatory. Who accessed the data, and when? And how are these actions recorded? This is not possible using old plain Kubernetes Secrets.
Taking all this into account and coming back to Vault, why choose such a tool? There are a lot of advantages. But first, what is Vault? Vault is a platform developed by HashiCorp which aims to help us secure, store, and manage sensitive data such as passwords, API keys, certificates, and Kubernetes Secrets.

This means every secret or piece of sensitive data is stored in a single place. This reduces the likelihood of teams managing secrets independently or having them stored in several places.

Vault comes natively with fine-grained access control mechanisms, which allow us to set privileges for certain data on a per-application, per-team, or per-individual basis.

In addition to plain or static secrets, Vault is also able to help us with dynamic secrets. Due to the limited lifetime of these secrets, security breaches are less likely to happen.
Data is encrypted natively in transit and at rest, which goes beyond only encrypting etcd at rest. So Vault additionally helps us to encrypt data in transit.

We know at every single point in time who accessed certain data.
I quite like this diagram because it showcases the effort HashiCorp has put over the years to align Vault with industry trends. As microservices adoption has increased over the years, we see how Vault has evolved to support these movements.
As an example, in 2017, with the release of Vault version 0.8.3, we see the release of the Kubernetes authentication method. Afterward, the Vault helm chart for the community and enterprise versions was released.
In our case, the attention will be drawn over the CSI driver — the Container Storage Interface driver — released in 2021, the Kubernetes Secrets Engine in 2022, the Vault Agent Sidecar Injector, and, very recently, the Vault Secrets Operator. I need to update this diagram because it has become generally available.
The first one — I thought the best approach would be to take them chronologically — would be the Container Storage Interface. In a nutshell, the Container Storage Interface allows Pods in Kubernetes to consume storage backed by Vault.

This means that, using Kubernetes-native resources such as persistent volumes and persistent volume claims, Pods can consume data from Vault. Although it is not recommended, we could sync these secrets from Vault natively as Kubernetes Secrets. However, this behavior is not desirable because, in the end, the data from Vault will end up being stored in etcd. The Pods using the CSI driver are able to consume secrets by reading them from files at a certain path in the Pod's file system.
The second approach, the Vault Agent Sidecar Injector, is a mutating admission webhook controller that alters the Pod template definition and automatically injects secrets into Pods. Say a Pod starts with a single container. Due to this mutating webhook controller, the Pod definition is changed: a Vault Agent init container is introduced — and, as you might know, an init container runs before every other container in the Pod — and there is additionally a Vault Agent sidecar container that runs alongside the main one.

It is important to notice the sidecar pattern, which is the key point of the injector, because we apply the principle of separation of concerns. Our application doesn't need to be aware of Vault, because the injected containers take care of providing the data the application needs to function properly.
This also applies to the CSI driver. Previously, the easiest approach — or handiest — was for an application to simply integrate with the Vault API. As applications and architectures have evolved — and the effort to alter the codebase of applications was great and very costly — we needed different approaches so that we simply inject the data, and applications do not have any Vault knowledge.
The third approach and, I think, the newest and the nicest — and I will argue why — is the Vault Secrets Operator. The Vault Agent Sidecar Injector has a small disadvantage. I say disadvantage because there is no universally good tool — there is just the right tool for the right job.
The disadvantage of the previous approach is the Vault Agent Sidecar Injector gets deployed as a DaemonSet. A DaemonSet is a Kubernetes resource that will ensure there is a copy of the same Pod running on every node in our cluster. However, this means we need to allocate more resources to have the injector up and running — and properly intercepting events in our cluster. Moreover, having an init container and a sidecar container also requires more resources for the whole system to work well.
In contrast, the Vault Secrets Operator comes as a deployment with custom resources, which simply watches for resources in the configured namespaces that ask for certain sets of data. We'll see very soon in the demonstration that the Vault Secrets Operator is deployed using the helm chart. It is configured to authenticate with Vault — I need to check the changelog for the latest list, but I would say that, at the time, only the Kubernetes authentication method was supported — and the Pod in the app namespace is able to read the data using a static secret.
Very soon, in the demonstration, you'll see I have preconfigured the Kubernetes authentication method, which allows a service account within a namespace to authenticate to Vault using its token. This token gets automatically mounted into the Pod.
However, I need to point out a peculiarity starting with Kubernetes version 1.26: whenever we create a service account, a token associated with it is no longer created automatically. This means we actually need to create the token ourselves so that it gets associated with the service account. So, pay attention when working with the latest versions of Kubernetes. In a nutshell, this would be the Kubernetes authentication method. Let's get to the fun side of it.
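As a rough sketch of that manual step — the namespace and names here are hypothetical — a token can be associated with a service account by creating a Secret of type kubernetes.io/service-account-token:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa            # hypothetical service account name
  namespace: hashidays
---
# Explicitly request a token for the service account,
# since recent Kubernetes versions no longer create one automatically
apiVersion: v1
kind: Secret
metadata:
  name: app-sa-token
  namespace: hashidays
  annotations:
    kubernetes.io/service-account.name: app-sa
type: kubernetes.io/service-account-token
```

The control plane populates this Secret with a token that the service account can then use to authenticate against Vault's Kubernetes auth method.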
I have very quickly enabled the KV version 2 (KV-v2) Secrets Engine and written a dummy secret. I have installed Vault using the official helm chart. You'll see in the first example that the CSI (Container Storage Interface) support is enabled in the helm chart.
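For reference, that preparation looks roughly like the commands below — the mount path and key names are assumptions based on this demo, and the commands presume an unsealed Vault with an authenticated CLI:

```shell
# Enable a KV version 2 secrets engine at a custom mount path
vault secrets enable -path=hashidays kv-v2

# Write a dummy secret holding the PostgreSQL credentials used in the demo
vault kv put hashidays/postgresql pgUser=developer pgPassword='not-a-real-password'

# Verify the secret is readable
vault kv get hashidays/postgresql
```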
Also, I have installed the Secrets Store CSI driver; the helm chart was not customized much. The Secrets Store CSI driver works with a custom resource called SecretProviderClass, which is configured with certain parameters to read secrets from Vault. Then we'll see how the deployment is configured to use this.
Let's check that the SecretProviderClass exists in the namespace — and it does. This is a rather simple deployment that uses the Nginx container. You see that everything is mounted as a volume, and that volume is backed by the CSI driver. We have created a deployment, and the next step is to check that everything is in place. The Pod is up and running within the HashiDays namespace, and we'll soon see the files mounted at the given path.
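A SecretProviderClass for Vault looks roughly like the following — the Vault address, role, and key names are assumptions for illustration; note that KV-v2 API paths include a /data/ segment:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: vault-postgres          # hypothetical name
  namespace: hashidays
spec:
  provider: vault
  parameters:
    vaultAddress: "http://vault.vault:8200"   # assumed in-cluster address
    roleName: "reader"                        # Kubernetes auth role in Vault
    objects: |
      - objectName: "pgUser"
        secretPath: "hashidays/data/postgresql"
        secretKey: "pgUser"
      - objectName: "pgPassword"
        secretPath: "hashidays/data/postgresql"
        secretKey: "pgPassword"
```

The deployment then references this class through a CSI volume (`driver: secrets-store.csi.k8s.io`, with `volumeAttributes.secretProviderClass: vault-postgres`) mounted at the path where the Pod reads the files.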
By default, these are read from /mnt/secrets-store, and I expect to see the PG user and the PG password. Indeed, the first one is there, and it is developer. And the password is — I have destroyed the Vault instance, so don't try to hack it. I do not want to give you the impression that I'm just pasting passwords.
I have deleted the deployment and the Secret Provider Class previously created, and I want to show you how we can create, or sync, a secret. I am going to configure the Secret Provider Class so that a Kubernetes Secret object is created as well — one reflecting the PG user and the PG password. The secret will be named postgresql-credentials and will be a plain Kubernetes Secret.
The Secret Provider Class is in place. However, at this point there should be no actual secret created in that particular namespace — and there are none. Let's alter the deployment file. We need to define an env list in which we specify the environment variables to be injected — and their source, by which I mean the secret that is to be created.

And we'll notice something very interesting: despite the provider class being in place, no secret was created. However, upon the creation of the deployment, which defines environment variables read from that secret, the secret actually gets created.

They are in place — and the secret was created. This is just for demonstration purposes. I need to say that the secret I created ends up in etcd, which is a highly discouraged practice — the documentation says as much. However, we need to be aware that this is possible. Decoding the secret — not decrypting it — it is indeed the value I set myself.
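Sketching the two pieces involved — the names are hypothetical, modeled on the demo — the sync is declared through secretObjects on the SecretProviderClass, and the workload consumes the resulting Kubernetes Secret through env:

```yaml
# In the SecretProviderClass: ask the CSI driver to also create a K8s Secret
spec:
  provider: vault
  secretObjects:
    - secretName: postgresql-credentials
      type: Opaque
      data:
        - objectName: pgUser      # must match an objectName in parameters.objects
          key: pgUser
        - objectName: pgPassword
          key: pgPassword
---
# In the deployment's container spec: read env vars from that Secret
env:
  - name: PG_USER
    valueFrom:
      secretKeyRef:
        name: postgresql-credentials
        key: pgUser
  - name: PG_PASSWORD
    valueFrom:
      secretKeyRef:
        name: postgresql-credentials
        key: pgPassword
```

As the demo shows, the Secret only materializes once a Pod actually mounts the CSI volume — the SecretProviderClass on its own creates nothing.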
Let's go on to the second one, the Vault Agent Sidecar Injector demonstration. As you have noticed, I have already deployed Vault with the injector flag enabled in the helm chart values file, and in this case, the deployment looks much simpler. We have certain annotations which actually trigger the injection — agent-inject. We signal the fact that we want our Pod to consume secrets from Vault.
We define where the secret will be and, using Consul templating capabilities, how the data will be injected into the file. In this case, I'm creating the connection string for a PostgreSQL database, and the role is a role called reader. As I was mentioning, I configured the Kubernetes authentication method beforehand. The reader role allows only read capabilities on the key-value Secrets Engine.
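The annotations in question look roughly like this — the secret path, role name, and connection-string details are assumptions modeled on the demo:

```yaml
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "reader"
        # Renders the secret to /vault/secrets/pg-connect inside the Pod
        vault.hashicorp.com/agent-inject-secret-pg-connect: "hashidays/data/postgresql"
        vault.hashicorp.com/agent-inject-template-pg-connect: |
          {{- with secret "hashidays/data/postgresql" -}}
          postgresql://{{ .Data.data.pgUser }}:{{ .Data.data.pgPassword }}@postgres:5432/app
          {{- end -}}
```

By default the injector renders each templated secret to a file under /vault/secrets/, which the application can read like any other file.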
The agent is running. You should see here two out of two ready containers. Despite the fact that I defined only one container, exec-ing into the Pod, it is important to notice we have three containers — the Nginx one, which is expected because I defined it in my deployment template, the Vault Agent init container, and the sidecar container. This is the pattern the Vault Agent Sidecar Injector is based on.
The third and newest approach would be the Vault Secrets Operator — at the time of this demonstration, it was still in beta. VaultStaticSecret is a custom resource that comes with the Vault Secrets Operator. It is configured to read a secret from the mount called HashiDays. The Secrets Engine is of type KV version 2. The secret's name is postgresql, and the destination secret is postgres. In case the secret does not already exist within my cluster — as you can see at line 12 — it will be created.
Important to note is the refreshAfter setting, which is 10 seconds. This means that, in case the secret gets updated, the Vault Secrets Operator can refresh the Kubernetes Secret, and then a rolling deployment will start so that our Pods begin using the new version of that secret. Up to this point, there was no way to make Pods aware of a change in the contents of a secret. Now there is.
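A VaultStaticSecret along these lines might look like the following — the rollout target is an assumption for illustration, while the mount, path, and destination match what was described:

```yaml
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
  name: postgresql
  namespace: hashidays
spec:
  mount: hashidays        # KV-v2 mount
  type: kv-v2
  path: postgresql        # secret name within the mount
  refreshAfter: 10s       # re-read the secret from Vault every 10 seconds
  destination:
    name: postgres        # Kubernetes Secret to write
    create: true          # create it if it does not already exist
  rolloutRestartTargets:  # trigger a rolling restart when the secret changes
    - kind: Deployment
      name: app           # hypothetical deployment name
```

The rolloutRestartTargets list is what makes Pods pick up a changed secret without manual restarts.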
Let's see what happens when creating the resource. This is created in the HashiDays namespace, and it is indeed in place, and we should also see the secret. The final step would be confirming that the contents of the secret are as expected and as written in the KV version 2 Secrets Engine. Again, decoding the contents — and reading this, I promise the server no longer exists. I would discourage everyone from displaying sensitive information. With this, the demonstration wraps up. Let's draw some conclusions.
Kubernetes will continue to grow in usage because the efficiency and effectiveness of teams and applications continue to grow. Whether using on-premises Kubernetes, which is a bit rarer, or managed services, teams and large organizations are migrating their workloads into a more containerized approach.
The advantages of using Kubernetes are obvious. However, Kubernetes — and especially Kubernetes Secrets — has a lot of limitations, the greatest of them being that our data is exposed in etcd unless it is encrypted.

However, encryption is difficult. This is why Vault comes with at least three different approaches. I'm sure we'll witness how HashiCorp continues supporting our applications, workflows, and systems to become more secure from the secrets management perspective. We have the CSI driver, which was released a few years ago, the Vault Agent Sidecar Injector, and the Vault Secrets Operator.
Each of these has advantages and disadvantages. As I was saying, the Vault Agent Sidecar Injector needs more resources, whereas the Vault Secrets Operator simply watches what happens in a Kubernetes cluster and — using the custom resources — creates secrets, which in turn will be used by our Pods or applications.

The CSI driver, on the other hand, doesn't natively sync these secrets as Kubernetes Secrets, because the Pods read the data from files mounted in the Pod's file system. I would say there are various approaches and no single right tool; rather, it is important to be aware of all of these so that one can choose the right tool for the right job to solve a certain problem.
That being said, thank you, and I hope you'll enjoy the upcoming labs. And before that, the break, if I'm not mistaken.