How the HashiCorp Stack Integrates with Kubernetes
Mar 29, 2021
In this video, HashiCorp co-founder and CTO Armon Dadgar will show where each HashiCorp product fits into the Kubernetes ecosystem.
- Armon Dadgar, Co-founder & CTO, HashiCorp
What does a Kubernetes cluster—and multiple clusters—look like when they're used in conjunction with HashiCorp tools? You get a much smoother experience in many ways using:
- HashiCorp Terraform for cluster creation and management as code
- HashiCorp Vault for more robust and safe secrets management
- HashiCorp Consul for multi-cluster service discovery and service mesh.
See HashiCorp co-founder and CTO Armon Dadgar draw out this architecture in this whiteboard video.
Today, I want to spend a little bit of time talking about HashiCorp tools in the context of using Kubernetes. I think we often get asked across the whole portfolio of HashiCorp tools, how are they meant to be used?
Are they used in conjunction with Kubernetes? Does it make sense to use them if some of that functionality is already built into the Kubernetes platform? Where do the different edges and use cases of the tools belong? I thought we'd spend a little bit of time going across the whole HashiCorp portfolio and where the different tools fit in alongside Kubernetes.
I think it makes sense to start to think about it through the lifecycle of managing a Kubernetes cluster. If we think about a Kubernetes cluster — it generally comes in maybe two different flavors. One flavor is a cloud-managed solution — think GKE, AKS, EKS. These are solutions where a cloud vendor is providing the Kubernetes cluster for us in a fully managed configuration.
Using Terraform To Define a Cluster
In a mode like that, one of the first problems we have is how do we define that cluster and stand it up? Oftentimes, this is where Terraform is used. Terraform can be used in what I'll call the southbound setting of Kubernetes — southbound in the sense that it's the operator's responsibility. We're setting up a Kubernetes cluster that's going to be used — let's say — by an application team or a separate team internal to the company.
On the southbound — in terms of standing up Kubernetes — we use the cloud-specific providers that Terraform has. This could be, for example, the GCP provider. It could be the AWS provider or the Azure provider, etc. — which provide us a resource to define a GKE cluster or an EKS cluster or an AKS cluster, etc. In this sense — on the southbound — we'll use it to stand up and deploy an actual Kubernetes installation.
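As a rough illustration of the southbound case, here is a minimal Terraform sketch that stands up a GKE cluster with the Google provider. The project ID, region, and cluster name are placeholders invented for this example, and a real cluster would normally need node pool, networking, and IAM settings as well.

```hcl
terraform {
  required_providers {
    google = {
      source = "hashicorp/google"
    }
  }
}

provider "google" {
  project = "example-project" # hypothetical project ID
  region  = "us-central1"     # hypothetical region
}

# A managed Kubernetes cluster provided by the cloud vendor.
resource "google_container_cluster" "primary" {
  name     = "example-cluster" # hypothetical name
  location = "us-central1"

  # With a regional location, this is the node count per zone.
  initial_node_count = 1
}
```

The AWS and Azure providers expose comparable resources for EKS and AKS, so the same pattern applies there.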
Another common way that Kubernetes is stood up is self-managed. You're operating it yourself: it might be on-premises with something like OpenShift, or you might be using the vanilla distribution. In that case, you might be authoring Terraform directly. You're writing Terraform modules or, in the case of OpenShift, you're using the OpenShift installer, which leverages Terraform. Here, you're writing a Terraform module, or using one that exists already, but still using it to stand up Kubernetes on the southbound.
Using Terraform To Configure a Cluster
Now we have a Kubernetes cluster. Before we hand it to our application teams, the next challenge is a bunch of setup we might want to do on that cluster. This might be things like defining namespaces, defining policy, defining quotas, things like that.
I'll call this the cluster setup, if you will. We're not quite yet at deploying applications, but these are the more policy-level, governance-type things we want to do.
At this level, we can use the Terraform Kubernetes provider, which allows us to specify things like namespaces, quotas, different policies, etc. As an operations team, we might still use Terraform here, maybe with a module that defines how we want to configure our cluster when we stand it up.
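A minimal sketch of this cluster setup step with the Kubernetes provider might look like the following. The namespace name, quota values, and kubeconfig path are assumptions made for illustration, not values from the talk.

```hcl
provider "kubernetes" {
  config_path = "~/.kube/config" # assumes a local kubeconfig for the cluster
}

# A namespace for one application team (hypothetical name).
resource "kubernetes_namespace" "team_a" {
  metadata {
    name = "team-a"
  }
}

# A quota limiting what that team can consume (illustrative limits).
resource "kubernetes_resource_quota" "team_a" {
  metadata {
    name      = "team-a-quota"
    namespace = kubernetes_namespace.team_a.metadata[0].name
  }

  spec {
    hard = {
      pods              = "20"
      "requests.cpu"    = "8"
      "requests.memory" = "16Gi"
    }
  }
}
```

Wrapping resources like these in a module is one way an operations team could apply the same baseline to every cluster it stands up.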
Using Terraform To Define The Workload
Now at this point, we're ready for our applications to land on the cluster. We've stood it up, we've configured it, we have the policy in place. When we want to define an actual workload, Application 1 and Application 2 running on top of this cluster, there are a bunch of different tooling choices.
We might define a workload through a YAML file and directly use the Kubernetes CLI tools like kubectl to apply and create these things, defining a deployment and the other objects we might need. It might be that this application is already packaged up as a Helm package — so we're using Helm to deploy and install this. There are a variety of different ways.
Our philosophy comes down to, we want to capture all of this as code. We really don't want to have people directly creating things through a UI or doing it through a CLI. Then we don't have a source of truth.
If we need to recreate it or rebuild it, we don't have it living as infrastructure as code to easily be able to recreate.
Whether it lives as a Helm configuration, whether it lives in a YAML file, whether it lives in native Terraform, the idea for us is, again with Terraform, you have multiple options. You can do it through the Kubernetes provider. This would let you directly specify a deployment object, a DaemonSet, a CRD, etc., much like you would model it in a YAML configuration. You can also use the Helm provider. If you're leveraging existing Helm packages, you can use that without having to refactor it out of Helm.
It doesn't matter. Either way, you can still specify a Terraform configuration that's going to use both of these providers possibly at the same time — and you can even have this provider live in conjunction with these other resources. For example, if I'm running on top of GKE, in addition to my app running on Kubernetes, maybe I have a database that I need.
Within the same Terraform configuration, I can specify the database that I'm consuming as a managed service from Google, in addition to maybe using Helm and the Kubernetes provider to provision the application itself running on top of Kubernetes. In this sense, you can see how Terraform can be used with Kubernetes in almost three different layers.
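To make the combination concrete, here is a hedged sketch of one Terraform configuration that provisions a managed database from Google and installs an application with the Helm provider. The chart repository URL, chart name, namespace, instance tier, and the `database.connectionName` chart value are all hypothetical. Provider configuration blocks are omitted because their syntax differs between provider versions.

```hcl
# Managed database consumed as a cloud service (illustrative settings).
resource "google_sql_database_instance" "app1_db" {
  name             = "app1-db"
  database_version = "POSTGRES_13"
  region           = "us-central1"

  settings {
    tier = "db-f1-micro"
  }
}

# The application itself, installed from an existing Helm chart.
resource "helm_release" "app1" {
  name       = "app1"
  repository = "https://charts.example.com" # hypothetical chart repository
  chart      = "app1"                       # hypothetical chart name
  namespace  = "team-a"

  # Pass the database's connection name to the chart. The value key is
  # an assumption about how this hypothetical chart is parameterized.
  values = [
    yamlencode({
      database = {
        connectionName = google_sql_database_instance.app1_db.connection_name
      }
    })
  ]
}
```

Because the release references the database resource, Terraform orders the two automatically: the database is created before the chart is installed.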
There's the southbound, operator-facing layer, where we stand up the cluster. Then there's the northbound side, which is still operator-facing in terms of setup and configuration. And then all the way up, this is more developer-facing in terms of how we consume the applications that run on top of Kubernetes and the other dependencies we might have, such as a database, a blob store, etc. All of those can be provisioned and managed through Terraform.
This is one piece of how we define the lifecycle; day one — stand this up; day two — if we need to modify and evolve it over time. Then day three, if we want to decommission it, whether we're tearing down this one app, we can use Terraform. Whether we're evolving the setup or destroying the entire cluster, we can use Terraform for all of that.
Using Vault To Manage Secrets
Moving into runtime further down into the lifecycle, the next challenge is often these applications frequently need sensitive credentials. For example, how does A1 — which maybe requires this database — get that database credential?
Kubernetes itself has a notion of secrets. The challenge is that these secrets have relatively limited role-based access control around them, relatively limited audit, and in many configurations are stored unencrypted at rest.
For a variety of reasons, Kubernetes secrets tend to be a little bit on the weak side in terms of the guarantees we want of everything being encrypted at rest, everything encrypted in transit, tight access control, tight visibility.
It's very common that alongside a Kubernetes cluster, people will leverage Vault. Vault has a number of different use cases. Most commonly, it's used as a secret manager. If we're storing things like database credentials, certificates, API keys, those kinds of things, we store them within Vault and fetch them, but Vault also does more sophisticated things like dynamic secrets.
Instead of storing a static database password or a static certificate, Vault can generate them on demand by integrating with the database or acting as a certificate authority. This lets us get much more sophisticated and move to a model where we have constantly rotating, ephemeral credentials rather than long-lived, static ones.
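As a sketch of what dynamic database credentials can look like when managed with the Terraform Vault provider, the configuration below enables the database secrets engine and defines a role that issues short-lived PostgreSQL users. The hostnames, role names, admin user, and TTLs are assumptions chosen for illustration.

```hcl
variable "db_admin_password" {
  type      = string
  sensitive = true
}

# Enable the database secrets engine at the path "database".
resource "vault_mount" "db" {
  path = "database"
  type = "database"
}

# Tell Vault how to reach the database with a privileged account.
resource "vault_database_secret_backend_connection" "postgres" {
  backend       = vault_mount.db.path
  name          = "app1-postgres"
  allowed_roles = ["app1"]

  postgresql {
    connection_url = "postgresql://{{username}}:{{password}}@db.example.internal:5432/app1"
    username       = "vault-admin" # hypothetical admin user
    password       = var.db_admin_password
  }
}

# Each read of database/creds/app1 creates a fresh, expiring user.
resource "vault_database_secret_backend_role" "app1" {
  backend = vault_mount.db.path
  name    = "app1"
  db_name = vault_database_secret_backend_connection.postgres.name

  creation_statements = [
    "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';",
  ]

  default_ttl = 3600  # seconds
  max_ttl     = 86400 # seconds
}
```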
Using Vault for Data Protection
In addition to that, we can use Vault to do data protection, treating it as a key manager that lets us encrypt data, sign transactions, tokenize things, etc., behind a simple application-level API.
For a number of different reasons, we might want to use Vault alongside to provide either core secrets for our applications or to act as middleware to actually protect application data. We tend to tie this application to a service account, or more precisely a service account JWT. We'll have a dedicated Kubernetes service account that identifies, in this case, application A1. When this application comes up, it can then authenticate with Vault using the JWT identity that it has. Vault then integrates back with Kubernetes and validates it. We know this is application one, or this is application two.
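The Vault side of this identity mapping can be sketched with the Terraform Vault provider as follows. The service account name, namespace, policy name, and secret path are hypothetical, and the Kubernetes host shown assumes Vault runs inside the same cluster.

```hcl
# Enable the Kubernetes auth method at the default path "kubernetes".
resource "vault_auth_backend" "kubernetes" {
  type = "kubernetes"
}

# Where Vault calls back to validate the presented service account JWT.
resource "vault_kubernetes_auth_backend_config" "cluster" {
  backend         = vault_auth_backend.kubernetes.path
  kubernetes_host = "https://kubernetes.default.svc" # assumes in-cluster Vault
}

# What application A1 is allowed to read once authenticated.
resource "vault_policy" "app1" {
  name   = "app1"
  policy = <<-EOT
    path "database/creds/app1" {
      capabilities = ["read"]
    }
  EOT
}

# Bind the Kubernetes service account identity to that policy.
resource "vault_kubernetes_auth_backend_role" "app1" {
  backend                          = vault_auth_backend.kubernetes.path
  role_name                        = "app1"
  bound_service_account_names      = ["app1"]   # hypothetical service account
  bound_service_account_namespaces = ["team-a"] # hypothetical namespace
  token_policies                   = [vault_policy.app1.name]
  token_ttl                        = 3600
}
```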
Based on the identity of that application, we can then give it access to different credentials. It might get access to the database credential and a certificate and an API key. There are a number of ways this integration can work. We have a Vault agent that can act as an init container and a sidecar. The application is generally unaware of it. The init container does this workflow to authenticate against Vault, fetch the secrets, and put them "on disk," which is really a shared in-memory segment.
When it boots, the application reads its credentials from /secret, and it's none the wiser that the credentials came from Vault. This gives us a seamless way to integrate an application without it having to know about Vault. It boots and reads its credentials like normal, and those things happen to be sourced from Vault.
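For a sense of what the agent does in this workflow, here is a minimal Vault agent configuration sketch. The Vault address, role name, secret path, token sink location, and output file name are assumptions; only the /secret directory comes from the description above.

```hcl
# Run once and exit, as an init container would.
exit_after_auth = true

vault {
  address = "https://vault.example.internal:8200" # hypothetical address
}

auto_auth {
  # Authenticate with the pod's Kubernetes service account JWT.
  method "kubernetes" {
    mount_path = "auth/kubernetes"
    config = {
      role = "app1" # hypothetical Vault role
    }
  }

  # Leave the resulting Vault token where the application can find it.
  sink "file" {
    config = {
      path = "/home/vault/.vault-token"
    }
  }
}

# Render the fetched secret into the shared volume the app reads from.
template {
  destination = "/secret/db-creds"
  contents    = <<EOT
{{ with secret "database/creds/app1" }}
username={{ .Data.username }}
password={{ .Data.password }}
{{ end }}
EOT
}
```

Running the same configuration without `exit_after_auth` is one way to get the sidecar behavior, where the agent stays alive and keeps the rendered file current.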
If the application is slightly more sophisticated and it wants to directly leverage the Vault API to do data encryption or tokenization or things like that, then once it's been authenticated through that Vault agent and the sidecar, it has a Vault token already available to it. It can use that token to directly query Vault and use its API and do things like data encryption and tokenization.
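An application calling Vault directly for encryption still needs a policy attached to its token that permits those calls. A minimal Vault policy sketch, assuming the transit secrets engine is mounted at `transit` and a key named `app1` exists (both assumptions), might be:

```hcl
# Allow encrypting and decrypting with one named transit key only.
path "transit/encrypt/app1" {
  capabilities = ["update"]
}

path "transit/decrypt/app1" {
  capabilities = ["update"]
}
```

With this in place the application sends plaintext to the encrypt endpoint and stores only the returned ciphertext, so it never handles the key itself.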
This gives us multiple options for simple use cases where we don't want our application to have to be aware of Vault and have to modify all of our applications. We can use the Vault agent and have a transparent workflow. In the other cases where we want the more advanced capabilities, we can still use the agent to do the base secrets and then leverage Vault API and SDK to get more sophisticated. This tends to be how Vault fits into the picture.
Using Consul for Service Networking
The next piece tends to be Consul, and I think this is about service networking and then service mesh. You might say great, if I'm within Kubernetes already, why do I need a service networking or a service discovery solution? I already have it built in. Kubernetes has a notion of label-based selectors, I can do discovery, and I have DNS within my cluster, and that's often true. If my use case is only service discovery within a single Kubernetes cluster, I probably would be satisfied with the available primitives.
Cross-Cluster Service Discovery With Consul
I think the area where it starts to become more relevant, and where we see Consul get frequently used, is when I'm spanning multiple Kubernetes clusters. I might have multiple because each of my app teams has a different cluster. This is app team A, and I have a different app team B. They have their own cluster, so their workloads are separate. Maybe application A1 needs to be able to reach B1. Now I'm crossing the barrier of the Kubernetes cluster, and that doesn't work particularly well out of the box.
This is one classic example; maybe it's because there are different teams. It might be a scale thing. Kubernetes clusters tend to work better on the smaller side. If I have a very large application, I'm sharding my workload across multiple different Kubernetes clusters. It could be that this is East and this is West, or cloud and on-prem; there are a number of different reasons here. In this scenario, we often end up saying that you can treat Consul as a common discovery layer across all of this. All of these clusters get bridged in so that Consul has a bird's-eye view of all the different services that are available across them.
Firstly, this enables cross-cluster service discovery. If I'm A2 and I need to discover B1, I can query it and do that service discovery and then connect to this instance across in a different Kubernetes cluster. This is level one and what you might consider that service discovery use case.
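One way to picture a cross-cluster lookup is a query against the Consul catalog that names the other cluster's datacenter. The sketch below uses the Terraform Consul provider's `consul_service` data source; the service name `b1` follows the whiteboard example, while the datacenter name `cluster-b` and the assumption that each Kubernetes cluster maps to one Consul datacenter are made up for illustration.

```hcl
# Look up service B1 as registered in the other cluster's datacenter.
data "consul_service" "b1" {
  name       = "b1"
  datacenter = "cluster-b" # hypothetical Consul datacenter name
}

# Expose the discovered instances (addresses and ports) for inspection.
output "b1_instances" {
  value = data.consul_service.b1.service
}
```

At runtime, applications would more typically resolve the same information through Consul's DNS interface or HTTP API rather than through Terraform.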
Consul, Service Mesh and Traffic Authorization
As we go one level deeper and talk about service mesh, I think what that starts to bring in is how do I do traffic authorization between all of this?
Service mesh has a number of different sophisticated use cases. One of them is routing. If I want to get fancy and say, great, when traffic comes in, I want 90% to go to version one and 10% to go to version two, I can do that traffic management.
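In Consul, a 90/10 split like this can be expressed with configuration entries. The sketch below shows two separate entries, normally kept in separate files: a service-resolver that defines the version subsets and a service-splitter that weights traffic between them. The service name `b1` and the `version` metadata key are assumptions for illustration.

```hcl
# File 1 (hypothetical name: b1-resolver.hcl)
# Define subsets of b1 based on a "version" value in service metadata.
Kind          = "service-resolver"
Name          = "b1"
DefaultSubset = "v1"
Subsets = {
  v1 = {
    Filter = "Service.Meta.version == v1"
  }
  v2 = {
    Filter = "Service.Meta.version == v2"
  }
}
```

```hcl
# File 2 (hypothetical name: b1-splitter.hcl)
# Send 90% of traffic to version one and 10% to version two.
Kind = "service-splitter"
Name = "b1"
Splits = [
  {
    Weight        = 90
    ServiceSubset = "v1"
  },
  {
    Weight        = 10
    ServiceSubset = "v2"
  },
]
```

Shifting the weights over time is a common way to roll out a new version gradually.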
From a security aspect, it also gives me the ability to say, great, A2 is allowed to talk to B1. A2 might not be allowed to talk to B2. I can express those more granular policies — in terms of what services are allowed to talk to one another — and the mesh will enforce that.
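The allow and deny rules described here map onto Consul service-intentions configuration entries, one per destination service. A minimal sketch using the service names from the whiteboard example, again as two separate entries:

```hcl
# File 1: A2 is allowed to talk to B1.
Kind = "service-intentions"
Name = "b1"
Sources = [
  {
    Name   = "a2"
    Action = "allow"
  },
]
```

```hcl
# File 2: A2 is not allowed to talk to B2.
Kind = "service-intentions"
Name = "b2"
Sources = [
  {
    Name   = "a2"
    Action = "deny"
  },
]
```

Because intentions are keyed on service identity rather than IP address, the same rules can apply whichever cluster the instances happen to run in.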
Using Consul to Interface With Kubernetes and Non-Kubernetes Workloads
Consul can act as a service mesh that spans not only multiple Kubernetes clusters, but other non-Kubernetes workloads as well. If I also have a set of VM-based applications that need to interface with Kubernetes-based applications, Consul can provide a way of spanning all of that. We can start by just using it for discovery and routing, then incrementally add the service mesh capability to get the more sophisticated traffic routing, security, observability, etc.
Stepping back to look at the common core HashiCorp portfolio, you can see how Terraform is used throughout the lifecycle, everything from stand-up and configuration to application deployment. Vault can be used as a secret management solution as well as to provide a way for applications to do their data protection. Then Consul can act as a powerful service networking tool when you're spanning multiple Kubernetes clusters, cloud and on-prem, multiple regions, etc. It adds additional richness, not only in service discovery but in the service mesh as well. Hopefully, that gives you a sense of how the HashiCorp tools can be used within the Kubernetes ecosystem. Thanks.