How does Consul provide a unified registry for Kubernetes and other workloads?

HashiCorp co-founder and CTO Mitchell Hashimoto illustrates how Consul can act as a service mesh and service registry that connects containerized Kubernetes clusters and other types of workloads such as VMs and bare metal.


For a newer whiteboard video on this topic, watch Armon Dadgar's How Consul and Kubernetes work together.


Let's talk about how Consul works with Kubernetes. To start, let's see how Consul runs in Kubernetes. The major components of Kubernetes are the nodes. There are multiple nodes within Kubernetes, and these are usually mapped to VMs or physical machines and so on. Within the nodes, there are a number of pods. Pods can contain multiple containers, but they run isolated within each node. And then all the nodes put together collectively form what is generally viewed as a single Kubernetes cluster within a single region, data center, availability zone, and so on.

This is the standard Kubernetes layout. The way we deploy Consul in Kubernetes is we use what's called a StatefulSet for the Consul servers, because the Consul servers have state associated with them. They have to store data, back up that data, persist the data. So we run a multi-node StatefulSet to run the Consul servers.

Let's say we run three of those so that they could perform leader election. We make sure that they run on separate physical nodes so that if one node is wiped out you can still perform the leader election; you don't get stuck with all three running on a single node. And then with the server setup we also run all the agents on every single node as well.
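The reasoning behind running three servers on separate nodes can be made concrete with a little arithmetic. The sketch below assumes majority-based quorum, as in the Raft consensus protocol that Consul servers use for leader election. The function names are illustrative only and are not part of any Consul API.

```python
def quorum_size(servers: int) -> int:
    """Smallest majority of a server set (assumed Raft-style quorum)."""
    return servers // 2 + 1


def failures_tolerated(servers: int) -> int:
    """How many servers can be lost while a majority still remains."""
    return servers - quorum_size(servers)


def can_elect_leader(servers: int, failed: int) -> bool:
    """True if the surviving servers still form a majority."""
    return servers - failed >= quorum_size(servers)


# Three servers spread over three nodes: losing one node loses one server.
spread_out_ok = can_elect_leader(servers=3, failed=1)

# Three servers packed onto one node: losing that node loses all three.
packed_ok = can_elect_leader(servers=3, failed=3)
```

With three servers the quorum is two, so the cluster survives the loss of one server, which is why spreading the servers across nodes matters.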

Let's add some more nodes in here. Every single node is going to run an agent and three nodes in particular will run a server as well.

The agent is a really important part of Consul. The agent does a lot of caching, does a lot of permission-based things, does a lot of health checking. It forms an important backbone of the Consul cluster. So it's important that you have an agent on every node, but we do keep in line with the normal Kubernetes model, where we run the agents directly in a pod. So you don't need to install anything special on the Kubernetes node. You just deploy it like any other Kubernetes application. This means that you could deploy Consul on any hosted environment. You don't need direct access to the VM underneath or any of that. You can run Consul anywhere.

The other benefit of having Consul on every host is that we expose the API directly onto the node and so when any pod on that node wants to talk to the agent, it uses the host IP to talk directly to that local agent. That's where you get some of that performance benefit and that's also how you simplify how to find Consul. You don't need to use Kubernetes service discovery to find Consul. It's always the local node and talking directly to the agent. And the agent itself knows how to talk to the Consul servers.
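As a rough illustration of "use the host IP to talk to the local agent", here is a minimal Python sketch. It assumes the pod spec exposes the node's IP in an environment variable named HOST_IP (for example through the Kubernetes downward API field status.hostIP) and that the agent's HTTP API listens on port 8500, the Consul default. The helper names are hypothetical, and the variable name and port may differ in your deployment.

```python
import json
import os
import urllib.request


def local_agent_url(env=os.environ, port=8500):
    """Build the base URL of the Consul agent running on this pod's node.

    Assumes HOST_IP is injected into the pod; 8500 is Consul's default
    HTTP API port and may be configured differently.
    """
    host_ip = env.get("HOST_IP")
    if not host_ip:
        raise RuntimeError("HOST_IP is not set; cannot locate the local agent")
    return f"http://{host_ip}:{port}"


def list_local_services(base_url, timeout=2.0):
    """Ask the local agent which services it has registered.

    Uses the agent endpoint /v1/agent/services. Only works when an
    agent is actually reachable at base_url.
    """
    url = f"{base_url}/v1/agent/services"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Inside a pod, `list_local_services(local_agent_url())` would query the agent on the same node, with no Kubernetes service discovery involved in finding Consul.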

Consul also has a lot of networking requirements. Every single agent has to be able to communicate with every other agent, but luckily we just use the standard Kubernetes networking model that's required by every Kubernetes cluster to make this work across any Kube cluster. So all the pods just connect directly to each other to form their gossip protocols. And the Kubernetes networking model requires that all pods within a single cluster are routable, so this works no matter what overlays and underlays you're using with Kubernetes.

With all this here, the biggest take away is that Consul is deployed onto Kubernetes like any other application on Kubernetes. We don't do anything weird or special or unique. We're using the primitives that Kubernetes gives to us in order to make this possible.

So once you have Consul running on Kubernetes, one of the biggest benefits you get from that is being able to do service discovery, service configuration, and service segmentation across other clusters, both Kubernetes and non-Kubernetes.

It's very common for large companies to have multiple Kubernetes clusters, which let's just represent by another large box. It's also very common that the company has non-Kubernetes workloads. These might be VMs running elsewhere, and we'll just make those little boxes. Or those could be bare metal servers or something else. Usually you're migrating to Kubernetes or you're just maintaining different workloads, and so you have this heterogeneous set of workloads running across platforms.

Kubernetes itself provides you with all these nice first-class mechanisms, like service discovery, configuration and so on. But that doesn't translate well outside of the cluster, and those primitives aren't available in non-Kubernetes environments. So one of the benefits you get from Consul is having those primitives available anywhere and across any platform.

The most basic example is service discovery. How do you connect something that's not in Kubernetes to something else in Kubernetes, or two services that are in different Kubernetes clusters? How do they find each other and how do they communicate?

Let's start with another Kubernetes cluster. Say you have this Kubernetes cluster here, it has a pod running here, and this pod wants to talk to another pod over here. Maybe it's a web service to a billing service, or it's a database in one and a backend in another. For some reason these two pods want to communicate with each other.

Finding another pod inside this Kubernetes cluster is very, very easy. Finding it across clusters, you really have no tools to do that unless you use something like Consul. What Consul gives you is a unified service registry. Let's represent this box as Consul over here, even though Consul could run anywhere. Consul could be in this Kube cluster, it could be outside. It could be anywhere. But if this is the Consul cluster, it's taking all the services from this Kube cluster and registering them here. It's taking all the services from here and keeping track, and then it's keeping the services from this Kube cluster also in a single registry.

When this pod wants to talk to something over here, it could ask Consul, "Where is that thing and how do I talk to it?" And Consul's the one that responds with the correct IP address that could be routed to from here so that these two could talk directly to each other.

Consul has the notion and understanding that if you're in the same data center it will return what's called a LAN IP. A locally routable local network IP. And when you're talking across data centers, it'll return a WAN IP in order to communicate across data centers. That's how we view these two things. They're geographically separated logically even if maybe they're in the same physical location.
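The LAN versus WAN choice described above can be sketched as a small lookup rule. This is a simplified model for illustration, not Consul's actual implementation: the `ServiceInstance` class, its field names, and the `address_for` function are hypothetical, and the IP addresses are made-up examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceInstance:
    """One registered instance of a service (hypothetical shape)."""
    name: str
    datacenter: str
    lan_ip: str  # routable only inside the instance's own data center
    wan_ip: str  # routable from other data centers


def address_for(instance: ServiceInstance, caller_datacenter: str) -> str:
    """Pick the address a caller should use to reach the instance.

    Same data center: return the LAN IP. Different data center: WAN IP.
    """
    if instance.datacenter == caller_datacenter:
        return instance.lan_ip
    return instance.wan_ip


billing = ServiceInstance(
    name="billing", datacenter="dc2", lan_ip="10.1.0.7", wan_ip="203.0.113.7"
)
same_dc_address = address_for(billing, caller_datacenter="dc2")
cross_dc_address = address_for(billing, caller_datacenter="dc1")
```

Note that "data center" here is a logical grouping: two clusters can be treated as separate data centers even when they share a physical location.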

Then that's also the same for your bare metal machines or your VMs and so on. These VMs could just run the Consul agent as a process on the machine that participates in the same catalog syncing over here so that when they need to talk to something in Kubernetes they can do the same thing and talk directly in and vice versa. Things could talk directly down.

The best part about this is that, for Kubernetes, the way we expose this is by syncing directly to Kubernetes services. So other Kubernetes services can continue to use the same first-class mechanisms and features that Kubernetes provides you: Kubernetes DNS, environment variables, etc. So these don't know that there's another system helping make this possible. It just all feels native to the platform.

For the non-Kubernetes workloads, Consul uses Consul DNS, which is exposed as a local DNS that has all the IPs of everything in the registry. So in the same way, you're using what you would normally use (just basic DNS that's configured on the local machine) to find something, and you don't need to care or realize that it's in a totally different platform or a totally different networking layout.
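To make the DNS lookup concrete, here is a small sketch of how a service name maps to a Consul DNS name. It assumes the default "consul" DNS domain and the standard form <service>.service[.<datacenter>].<domain>; the helper name `consul_dns_name` is hypothetical, and resolving the name only works on a machine whose resolver forwards that domain to a Consul agent.

```python
import socket


def consul_dns_name(service, datacenter=None, domain="consul"):
    """Build the DNS name for a service in Consul's DNS interface.

    Without a datacenter, the lookup is answered for the local
    data center of the agent that receives the query.
    """
    parts = [service, "service"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)


def resolve(service, datacenter=None):
    """Resolve a service through the machine's normal DNS resolver.

    Requires local DNS to be configured to reach Consul DNS.
    """
    return socket.gethostbyname(consul_dns_name(service, datacenter))


local_name = consul_dns_name("billing")
remote_name = consul_dns_name("billing", datacenter="dc2")
```

The application only ever sees an ordinary hostname, which is why it does not need to know which platform the target runs on.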

Consul manages how these connections happen for you. The other big benefit is when you start looking into features like Connect, which is a feature that does mutual TLS between any two services to ensure authentication and authorization of connections. It's what allows you to define whether a web server can or cannot talk to a database, and so on, and to have that enforced. Because Connect is based on end-to-end TLS, it doesn't matter what's in the middle, as long as the whole connection is properly encrypted with the right TLS certificate.

Consul does this for you. And because we set up the right certificates across all these clusters, when this service down here talks back into Kubernetes here, as long as this is using the right certificate all the way through, this will just work. Consul will be asked if these two services can talk to each other, Consul will say yes or no, and the connection will be established.
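The yes-or-no decision described above can be pictured as a lookup against centrally managed rules. The sketch below is a toy model under stated assumptions: rules match exact service names, and anything without a rule is denied. Consul's real rules are richer than this, and the `is_allowed` function and rule table here are hypothetical.

```python
from typing import Dict, Tuple

# (source service, destination service) -> allowed?
Rules = Dict[Tuple[str, str], bool]


def is_allowed(rules: Rules, source: str, destination: str,
               default: bool = False) -> bool:
    """Decide whether source may open a connection to destination.

    Falls back to `default` (deny, in this sketch) when no rule exists.
    """
    return rules.get((source, destination), default)


rules: Rules = {
    ("web", "db"): True,        # web server may talk to the database
    ("web", "billing"): False,  # explicitly denied
}

web_to_db = is_allowed(rules, "web", "db")
web_to_billing = is_allowed(rules, "web", "billing")
batch_to_db = is_allowed(rules, "batch", "db")  # no rule, so default applies
```

Because the decision depends only on service identity, carried in the certificate, the same rule applies whether the caller is a pod, a VM, or a bare metal process.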

This works through load balancers, across various cross-data-center tunnels, and across pretty much any networking layout, as long as the two endpoints can communicate end-to-end with each other. This is a huge benefit because you can get both local and global secure encrypted connections between any two services, and you can centrally manage with Consul which services can talk to each other, even though they're on totally different platforms.

The last place this really helps out is with workload migration. So let's say you're a company that's planning a migration to Kubernetes. You'd like to in three to five years be 99% Kubernetes, but the reality is today you're realistically less than 10%. Because everything is exposed—from a discovery, connectivity, and security standpoint—as a single interface through Consul, it doesn't really matter where those are, so it gives you a really nice migration story.

You could slowly migrate applications from your legacy environment into your new environment, and both the developers and operators of the systems don't need to realize where they're currently running. It all feels the same from a workflow perspective. That's the benefit Consul's giving you in these cross-cluster environments: it's uniformity.
