The Life of a Packet Through Consul Service Mesh
Aug 05, 2020
Learn about packet handling in a HashiCorp Consul-based service mesh, both in a single datacenter scenario and a multi-datacenter scenario.
- Christoph Puhl, Consul Technology Specialist | Field Technology Office, HashiCorp
For people who already understand how Consul works and what the concepts of a Service Mesh solution are, this session will deep dive into Consul Service Mesh leveraging Envoy as a sidecar proxy solution. This talk will discuss the packet handling within the Consul service mesh in a single datacenter and in a multi-datacenter setup leveraging Mesh Gateways in between Consul datacenters, and differences with Envoy running as a sidecar proxy instance for a service compared to running as a Mesh Gateway.
You can also read a white paper/ebook diving even deeper into this topic with the same title: Life of a Packet Through Consul Service Mesh
Hey, everyone. Welcome to this talk around The Life of a Packet Through Consul Service Mesh. If you have questions, feel free to make use of the Q&A panel. I have two of the most knowledgeable colleagues with me in relation to Consul—Paul Banks, our Consul engineering lead, and Nick Jackson, one of our developer advocates.
Today, we want to discuss the life of a packet through Consul Service Mesh. My name is Christoph Puhl. I'm working as an Overlay SE at HashiCorp, and I'm solely focused on Consul.
We have a federated datacenter. We have two services: a frontend service—which is called Dashboard—located in datacenter one. This service wants to connect to its backend service called Counting in datacenter two on the right hand side.
Today, we're not going to discuss the basics of a service mesh, how you set up the Consul servers and the Consul clients, how you bring together this WAN federation, or how you write service definitions. These are prerequisites for this talk today.
I want to discuss how the packets get handled on the data plane level by sidecar proxies and the mesh gateways in between the service endpoints.
The Life of a Packet
I use this image here on purpose. This is Sarah giving a talk at last year's HashiConf around all the underlying protocols on Consul. If you want to learn more about what's powering Consul—like Serf and Gossip—I encourage you to check out last year's HashiConf talk, which is called Everybody Talks.
Let's get back to the life of a packet. The Consul service mesh is split in two parts. The control plane is made up of the Consul server agents and the Consul client agents. The data plane underneath is made up of the sidecar proxy instances and the mesh gateways and, with 1.8, also ingress gateways and terminating gateways. The data plane in Consul is pluggable.
You can run Envoy as a sidecar proxy. We have our integrated sidecar proxy solution. You can run HAProxy as a sidecar. Today we only want to focus on the Envoy data plane—running Envoy as a sidecar proxy and running Envoy as a mesh gateway. In this environment, the Dashboard service wants to initiate a session to the Counting service. It wants to send its first packet to the counting service.
The Dashboard service itself is not necessarily aware that it's using a service mesh. For the Dashboard service to connect to the Counting service, it's simply connecting to localhost. The service mesh gives the impression to the service that it's like a monolithic service running on the same box, even though it might be a distributed service.
The first thing that happens in this life of a packet is the session initialization by the Dashboard service—by the Dashboard binary towards the sidecar proxy. The sidecar proxy will then accept the session; it will take the packet and do a mapping from the listener—which the dashboard service uses to connect to—to the upstream services.
First, I want to discuss how the Envoy sidecar proxy got all the knowledge it needs to know what to do with this packet. Before the first packet hits the Envoy sidecar proxy, it already has all the knowledge baked in, so it doesn't need to do any callbacks towards the control plane.
Bootstrapping Envoy as a Sidecar
Bootstrapping Envoy as a sidecar proxy in a Consul service mesh is a two-step process. First of all, if a Consul client agent ingests a service definition—which has a sidecar proxy association—the Consul client agent generates key material for a certificate.
It will send a CSR towards the Consul servers, and the Consul servers, or Vault if you use it as an upstream CA, will sign it and return the certificate to the Consul client agent. This certificate will be used as the service identity within the service mesh later on.
Once the Consul client agent has ingested this service definition, it starts generating an Envoy bootstrap configuration. If you know how Envoy works on a high level, Envoy has two parts of configuration: a static part and a dynamic part. The Consul client agent generates the static part of the configuration. You can start an Envoy process manually with this bootstrap configuration, or Consul can invoke Envoy directly with this bootstrap configuration.
This bootstrap configuration instructs Envoy to call back to the local Consul client agent through a gRPC interface, which hosts the xDS API for Envoy. Through this interface, the Envoy sidecar proxy will fetch all the dynamic configuration it needs to do packet handling at the data plane level.
It will receive its listeners: the upstream listeners you have defined in your service definition, like the one the Dashboard service connected to, and a public listener that's available to external services within the service mesh. If you have configured Layer 7 routing like traffic splitting or HTTP routing, those routing rules will be pushed down to the data plane as well. We're not going to discuss Layer 7 traffic management today. The generated certificate will also be installed on the public listener of the Envoy sidecar proxy.
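To make the static/dynamic split a bit more concrete, here is a minimal sketch of what the static bootstrap boils down to: one statically defined cluster pointing at the local Consul client agent, plus an instruction to fetch listeners and clusters over that cluster via xDS. The function name `build_bootstrap`, the proxy ID, and the port 8502 are assumptions for illustration, and the field names follow my reading of Envoy's bootstrap schema, which may differ between Envoy versions. The real file Consul generates contains considerably more (for example, running `consul connect envoy` with the `-bootstrap` flag prints it).

```python
def build_bootstrap(proxy_id, agent_host="127.0.0.1", agent_grpc_port=8502):
    """Return a heavily simplified Envoy bootstrap as a plain dict.

    Illustrative only: the real bootstrap generated by Consul has more fields.
    """
    agent_address = {
        "socket_address": {"address": agent_host, "port_value": agent_grpc_port}
    }
    return {
        "node": {"id": proxy_id, "cluster": proxy_id},
        # Static part: a single cluster that points at the local Consul agent.
        "static_resources": {
            "clusters": [
                {
                    "name": "local_agent",
                    "type": "STATIC",
                    # xDS is served over gRPC, which requires HTTP/2.
                    "http2_protocol_options": {},
                    "load_assignment": {
                        "cluster_name": "local_agent",
                        "endpoints": [
                            {"lb_endpoints": [{"endpoint": {"address": agent_address}}]}
                        ],
                    },
                }
            ]
        },
        # Dynamic part: listeners (LDS) and clusters (CDS) are fetched at
        # runtime over one aggregated stream to the cluster defined above.
        "dynamic_resources": {
            "lds_config": {"ads": {}},
            "cds_config": {"ads": {}},
            "ads_config": {
                "api_type": "GRPC",
                "grpc_services": [{"envoy_grpc": {"cluster_name": "local_agent"}}],
            },
        },
    }
```

Everything the proxy later uses to forward packets (listeners, clusters, endpoints, certificates) arrives through the dynamic part; the static part only tells Envoy where to ask.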
After the session is initialized by the Dashboard service towards the sidecar proxy, the sidecar proxy does a mapping. It maps the connection it just received on the private listener to the cluster associated with this listener.
It will check the cluster and see its cluster name as it is known in the local datacenter. One thing to mention here: the sidecar proxy will also set an SNI header when setting up TLS connections later on. You will see here that the SNI header the Envoy sidecar proxy will set within the service mesh is different from the cluster name itself. This is because Consul already knows that the upstream service is residing in a different datacenter, hence it needs a different SNI header. I will come back to SNI headers later on.
Next, this cluster has associated endpoints; you can see them down here. Because the service is residing in a remote datacenter, the endpoint is an IP address of the mesh gateway in the local datacenter. We have just one IP address here, but there could be multiple mesh gateways.
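The listener, cluster, and endpoint lookup described above can be sketched as a small data structure. This is only a model of the idea, not Envoy's implementation; the listener port 9001, the cluster name, the SNI value, and the gateway address 10.0.1.10:8443 are all hypothetical values chosen for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    address: str
    port: int
    healthy: bool = True


@dataclass
class Cluster:
    name: str
    sni: str  # SNI value the sidecar sets on the outgoing TLS connection
    endpoints: list = field(default_factory=list)


# Upstream (private) listener -> cluster. All values are hypothetical.
LISTENERS = {
    ("127.0.0.1", 9001): Cluster(
        name="counting-dc2",
        sni="counting.default.dc2.internal.example-trust-domain.consul",
        # The service lives in a remote datacenter, so the only endpoint
        # is the mesh gateway in the local datacenter.
        endpoints=[Endpoint("10.0.1.10", 8443)],
    )
}


def route(listener_addr):
    """Map an accepted connection to its cluster and one healthy endpoint."""
    cluster = LISTENERS[listener_addr]
    healthy = [e for e in cluster.endpoints if e.healthy]
    if not healthy:
        raise RuntimeError(f"no healthy endpoint for cluster {cluster.name}")
    return cluster, random.choice(healthy)
```

Calling `route(("127.0.0.1", 9001))` returns the cluster together with the mesh gateway endpoint, which is all the sidecar needs to open the next hop without asking the control plane.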
The certificate that was installed as a service identity looks like this. I highlighted what I think are the most important things. We have the issuer of the certificate, which was the Consul built-in CA functionality. We have a short lifetime on the certificate, which defaults to 72 hours, and we have the service name encoded as a subject. But we also encode a SPIFFE-compatible service ID within the certificate. It includes the service name, the datacenter the service is residing in and, if you're using Consul Enterprise, the namespace that the service resides in. If you are using Consul Open Source, the namespace is also populated, but everything resides in the namespace default, as in my example here.
All the certificate handling and the certificate rotation is managed through the Consul client agent. As an operator, you don't need to take care of creating new certificates or installing them in Envoy; the Consul client agent will manage all of this. As soon as the certificate is about to expire, the Consul client agent will generate new key material. It will request a new certificate and will automatically rotate the certificate within the Envoy sidecar proxy.
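As a small illustration of the SPIFFE-compatible service ID, the sketch below parses a URI of the form `spiffe://<trust-domain>/ns/<namespace>/dc/<datacenter>/svc/<service>` into its parts. The helper name `parse_spiffe_id` and the trust domain in the usage example are made up; only the path layout reflects what the certificate encodes.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class ServiceIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    datacenter: str
    service: str


def parse_spiffe_id(uri):
    """Split a Consul-style SPIFFE service ID into its components."""
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE URI: {uri!r}")
    parts = parsed.path.strip("/").split("/")
    # Expect exactly: ns/<namespace>/dc/<datacenter>/svc/<service>
    if len(parts) != 6 or parts[0::2] != ["ns", "dc", "svc"]:
        raise ValueError(f"unexpected SPIFFE path: {parsed.path!r}")
    return ServiceIdentity(parsed.netloc, parts[1], parts[3], parts[5])
```

For example, `parse_spiffe_id("spiffe://example-trust-domain.consul/ns/default/dc/dc2/svc/counting")` yields the namespace `default`, the datacenter `dc2`, and the service `counting`, which are exactly the fields used later for authorization.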
How Does the Mesh Gateway Know How to Deal with the Packet?
With this knowledge, the sidecar proxy knows how to treat this packet. It has the upstream listener where the connection came in; it mapped it against the associated cluster. It chose an available and healthy endpoint out of the service endpoints for this cluster—and now it starts initiating a TLS connection to the next hop.
It takes whatever came from the dashboard service, puts it into a TLS session, and initiates a TLS session towards the mesh gateway. The mesh gateway will accept this connection. It will have the packet incoming, but again, the question arises: How does the mesh gateway know what to do with this packet?
As the mesh gateway is based on Envoy as well, one could think it's like the same bootstrapping process as it is for a sidecar proxy instance—but this is not the case. There are some huge differences between how Consul bootstraps a sidecar proxy and how Consul bootstraps a mesh gateway.
As it's an Envoy instance as well, we have these two configuration parts. First of all, there is the static part created by the Consul client agent, which instructs the mesh gateway to connect back to its local Consul client agent.
This image might be a little bit misleading. The mesh gateway can run somewhere else in your datacenter; it doesn't necessarily need to be on the same box as the Consul client agent. The Envoy mesh gateway instance would connect back to the Consul client agent and receive the dynamic parts of its configuration. Those consist of public listeners that define how it's reachable in the local datacenter and how it's reachable through the WAN from a remote datacenter. It will get upstream clusters, so it will get clusters for the remote mesh gateways located in different datacenters. It will have clusters for all the services running in its local datacenter. Importantly, and this is quite a big difference, it will get SNI routes installed.
A mesh gateway will not get any certificate installed. It will not get a service identity within the service mesh. It will not have any key material. Hence, the mesh gateway itself is not able to perform things like man-in-the-middle attacks. It will not terminate TLS. It will not be able to connect to a service, and it will not be able to decrypt your service mesh traffic. What the mesh gateway gets installed are the SNI routes: it's pure TCP interception and SNI routing.
Let's have a look at an SNI route. First, we see the mesh gateway listener—and it has filter chains associated. For remote datacenters, we have a wildcard filter chain associated—as you can see here in the upper box. This says every SNI header that’s intended for datacenter two and matches this route will be sent over to the cluster where my datacenter two mesh gateways are located.
For all of you that don't know what an SNI header is, look at the initial TLS client hello, which came from the sidecar proxy instance. The SNI header is a TLS extension, the Server Name Indication. It is an unencrypted part of the TLS handshake. It immediately tells devices like our mesh gateway for whom this packet is intended. This is the same SNI header that was set by the Envoy sidecar proxy. The mesh gateway will only evaluate the SNI header to match it against its filter chains. It will choose one of the available remote mesh gateways within this cluster to start a connection to. As mentioned, the mesh gateway does not react to the TLS session initialization from the sidecar proxy, but it has accepted the TCP connection.
The mesh gateway will initiate a new TCP connection to the remote mesh gateway while simply copying every payload—which was in the original TCP connection coming from the sidecar proxy—into the new TCP connection. It's doing TCP splicing between two TCP sessions—one coming from the sidecar proxy, one going to the remote mesh gateway while still forwarding the original TLS client hello coming from the sidecar proxy instance.
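The TCP splicing described above can be sketched with two plain sockets and one copy loop per direction. This is a minimal model of the idea, not how Envoy implements it; `pump` and `splice` are hypothetical helper names. The point is that the gateway only moves bytes and never looks inside the TLS stream.

```python
import socket
import threading


def pump(src, dst, bufsize=4096):
    """Copy bytes from src to dst until src reports EOF, then half-close dst."""
    while True:
        data = src.recv(bufsize)
        if not data:
            break
        dst.sendall(data)  # payload is forwarded untouched, TLS bytes included
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # peer already gone


def splice(downstream, upstream):
    """Join two established TCP connections by copying in both directions."""
    threads = [
        threading.Thread(target=pump, args=(downstream, upstream), daemon=True),
        threading.Thread(target=pump, args=(upstream, downstream), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads
```

In this model, `downstream` would be the connection accepted from the sidecar proxy and `upstream` the new connection opened to the remote mesh gateway. The original TLS client hello is simply the first chunk of bytes that `pump` forwards.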
The packet then arrives at the remote mesh gateway side, and the remote mesh gateway was bootstrapped in the same way as the mesh gateway in datacenter one was. It has the SNI filter chains installed. It does not have a certificate. It will not perform a man-in-the-middle attack. The mesh gateway in datacenter two, in this case, will check the SNI header in the TLS client hello. The difference is that now it has an exact match, not a wildcard match, for each of the services residing in its own datacenter.
It will check the available service endpoints. In case there is more than one endpoint, it will pick one and initiate a new TCP connection to this Counting service sidecar proxy, while maintaining the TLS stream coming from the original Envoy sidecar proxy at the Dashboard side. Again, this is TCP splicing: copying things coming in on the left-hand side to a new TCP session going to the right-hand side.
The final destination, the Counting service sidecar proxy, will receive the initial TLS client hello. It will see that this packet is intended for the SNI it has, and it will respond to this initial TLS client hello.
I don't have all the details of the TLS session setup in this presentation, but we end up with an encrypted, mutually authenticated channel between the sidecar proxy instances.
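The SNI-based decisions made by the two gateways can be modeled as a lookup that prefers exact matches and falls back to wildcard matches. This is a sketch under assumptions: the SNI naming layout (`<service>.<namespace>.<datacenter>.internal.<trust-domain>`) is my recollection of Consul's convention, and the trust domain and cluster names are hypothetical.

```python
def select_cluster(sni, exact_routes, wildcard_routes):
    """Return the cluster for an SNI value, or None if no filter chain matches."""
    # Exact matches are used for services in the gateway's own datacenter.
    if sni in exact_routes:
        return exact_routes[sni]
    # Wildcard matches such as "*.dc2.internal.<trust-domain>" cover every
    # service in a remote datacenter with a single entry.
    for pattern, cluster in wildcard_routes.items():
        if pattern.startswith("*.") and sni.endswith(pattern[1:]):
            return cluster
    return None


TRUST_DOMAIN = "example-trust-domain.consul"  # hypothetical
COUNTING_SNI = f"counting.default.dc2.internal.{TRUST_DOMAIN}"

# Gateway in datacenter one: anything for dc2 goes to the dc2 gateways.
DC1_WILDCARD = {f"*.dc2.internal.{TRUST_DOMAIN}": "dc2-mesh-gateways"}
# Gateway in datacenter two: one exact entry per local service.
DC2_EXACT = {COUNTING_SNI: "counting-sidecars"}
```

With these tables, `select_cluster(COUNTING_SNI, {}, DC1_WILDCARD)` selects the remote gateway cluster, while `select_cluster(COUNTING_SNI, DC2_EXACT, {})` selects the Counting sidecar proxies. In both cases the gateway only reads the unencrypted SNI value and never needs a certificate.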
Mutual Authentication Achieved
At this point, the Dashboard service can rest assured, as the TLS session is mutually authenticated, that it reached the upstream service it wanted to reach. The Counting service on the right-hand side in datacenter two knows who initiated the connection. Both parties have authenticated each other. We have this encrypted end-to-end channel between the sidecar proxy instances without a man in the middle, even though there are intermediate devices, the mesh gateways, forwarding the traffic between the datacenters. But as mentioned, they only forward the traffic. This is the first step of the session initialization within Consul service mesh: the authentication. But the session is not authorized yet.
To authorize the session, the destination sidecar proxy needs to do a local authZ lookup against the Consul client agent. It will use the existing gRPC channel that it also uses for fetching its dynamic runtime configuration.
Intention Denies Communication
Two things can happen. As you know, we have intentions in Consul that describe which service is allowed to connect to which service. The first option is that the intention denies the communication. Our intentions would look something like this. As soon as you define intentions within Consul service mesh, you define them centrally, and they're going to be replicated across all your federated Consul datacenters. They're also replicated to the relevant Consul client agents. Every Consul client agent will have a local cache with the intentions relevant for its services.
In this case, the Consul client agent does not need to call back to a central instance for the authorization request; it already has all the knowledge. We just need the localhost lookup between the Envoy sidecar proxy and the Consul client agent to get the authZ result.
To do this, the sidecar proxy will extract the SPIFFE IDs we saw earlier on in the certificate. It will present those to the Consul client agent. The Consul client agent will check the service names and the namespaces against its intention cache. In this case, as the connection between the services is denied, it will reply with a deny.
For logging purposes, it will also include the intention ID and the precedence of the intention. But now the Envoy sidecar proxy knows it's not allowed to forward this stream to the final destination service. In this case, the sidecar proxy will send a TCP reset towards the source of the traffic, and after that, the session is torn down, and we are back at where we started.
Intention Allows Communication
The second option is that the intention allows the communication. In this case, the intentions would look something like this, where we have an explicit allow between the Dashboard and the Counting service, and the same process happens here. The sidecar proxy extracts the SPIFFE IDs and presents them through the existing gRPC channel to the local Consul client agent. The Consul client agent can reply immediately that this session is allowed. The sidecar proxy instance is good to forward the packet to the final destination application.
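Both outcomes of the authorization lookup can be modeled as a check against a local intention cache. This is a simplified sketch: it ignores namespaces, and the precedence numbers, the intention IDs, and the `default_allow` parameter are illustrative assumptions rather than Consul's actual values. The only rule it demonstrates is that the matching intention with the highest precedence decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intention:
    id: str
    source: str        # service name or "*"
    destination: str   # service name or "*"
    action: str        # "allow" or "deny"
    precedence: int    # higher wins; values here are illustrative


def _matches(pattern, name):
    return pattern == "*" or pattern == name


def authorize(source, destination, intentions, default_allow=False):
    """Return (allowed, matched_intention) for a source/destination pair."""
    candidates = [
        i for i in intentions
        if _matches(i.source, source) and _matches(i.destination, destination)
    ]
    if not candidates:
        return default_allow, None
    best = max(candidates, key=lambda i: i.precedence)
    return best.action == "allow", best


CACHE = [
    Intention("deny-all", "*", "*", "deny", 5),
    Intention("dashboard-to-counting", "dashboard", "counting", "allow", 9),
]
```

Here `authorize("dashboard", "counting", CACHE)` is allowed by the explicit intention, while any other source falls through to the wildcard deny. The matched intention is returned alongside the decision, mirroring how the reply includes the intention ID and precedence for logging.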
Approaching the Final Destination
But the packet is still not at the final destination; it's still one hop before. Here there needs to be another lookup. The sidecar proxy had a local application cluster installed when it was bootstrapped, so it knows which port the local application is listening on. This is typically happening on localhost: the local app cluster is something like localhost and the listener port. The Envoy sidecar proxy will decrypt all the traffic. It will strip off all the TLS encryption that was added by the remote sidecar proxy, and whatever was in the TLS stream will then be forwarded to the final destination service. The packet has hit the final destination service, and our session is set up between the Dashboard and the Counting service.
To reiterate what we have so far: we now have five separate TCP sessions running between all the involved pieces of the data plane to make this one packet move from the Dashboard to the Counting service. Having that many different TCP sessions enables us to do some interesting things. The sidecar proxy at the source side, the Dashboard sidecar proxy, has no clue what the IP address of the final destination service is.
With mesh gateways, it's easy to interconnect environments that have duplicate IP addresses, like two Kubernetes clusters that were bootstrapped the same way and have the same non-routable overlay network. Or classical VM-based environments which, through mergers and acquisitions, have the same IP address space. Or Kubernetes clusters with non-routable overlay networks connecting to classical bare metal servers, and things like that.
But on top of those separate TCP sessions, we have one end-to-end encrypted channel, which then can be used by the application data to flow through. It depends a little bit on what type of application it is and how it flows through this encrypted channel. If you configure a usual TCP application within Consul—like a database or something—then the application traffic is completely unchanged.
If you have configured Layer 7 traffic management within Consul—for instance, HTTP routing with a path prefix rewrite—then your headers can be rewritten by the Envoy sidecar proxy. For L7 traffic, whether or not traffic is flowing through this channel unchanged depends on your configuration. But as Layer 7 traffic management within Consul is a complete chapter on its own, I'm not going to dive into the details of this today.
We have a white paper around the life of a packet through Consul service mesh on the HashiCorp blog, which goes into much more detail than I could in the given amount of time. It doesn't just cover the case we discussed; you will get more screenshots, packet captures, and config snippets. It also has a complete chapter on Consul's Layer 7 traffic management and how those configurations end up being Envoy data plane constructs. With that, I'm at the end of my talk. Thanks for joining. I hope you enjoyed it and see you next time.