Consul 1.14 GA: Announcing Simplified Service Mesh Deployments

HashiCorp Consul 1.14 introduces the Consul dataplane, service mesh traffic management across cluster peers, and service failover enhancements.

Nov 18, 2022

We’re pleased to announce general availability of HashiCorp Consul 1.14. Consul is a service networking solution that helps users discover and securely connect any application.

Consul 1.14 focuses on helping organizations simplify Consul deployment, improve resiliency, and enhance operational efficiency. Key features and improvements include:

Consul dataplane: a new simplified deployment model on Kubernetes
General availability of the cluster peering model for connecting multiple datacenters, first introduced as a beta feature in Consul 1.13
Service mesh traffic management across cluster peers
Service failover enhancements

Let’s take a closer look at all these enhancements:

»Simplified Deployments of Consul Service Mesh with Consul Dataplane

Consul 1.14 introduces a simplified deployment architecture that eliminates the need to deploy node-level Consul clients on Kubernetes. The new architecture deploys a new Consul dataplane component that is injected as a sidecar in the Kubernetes workload pod. This dataplane container image packages both an Envoy container and Consul dataplane binary.

»Before Consul Dataplane

Consul's original architecture relies on deploying many Consul clients along with Consul servers. In a more traditional environment where a user has VM-centric workloads, a client agent is deployed to each VM and all the services running on that VM are registered with the client agent. The client agent sees itself as the source of truth for the services running on its node and ensures the server's catalog is kept in sync with its registry.

The client agents and servers all join as part of a single gossip pool, which allows quick detection of node failures. This enables the Consul control plane to respond to service discovery requests, or dynamically alter service mesh routing rules to return only the remaining healthy services or to steer traffic away from the services on the unhealthy node.

This architecture worked well in the past, but as new deployment patterns for Consul have emerged, it has become a source of friction for deploying Consul successfully.

A high-level architectural diagram of Consul on Kubernetes prior to Consul 1.14.

On Kubernetes, a Consul client is deployed on each node and is responsible for registering services and pods for each service. During the service registration process, Envoy sidecars are injected into pods during deployment to enable mTLS and traffic management capabilities for the mesh. Although running clients per node on Kubernetes works well for most use cases, users would occasionally encounter friction due to restrictions on Kubernetes clusters that made it difficult to reliably run clients on all nodes, or due to the gossip network requirements that require low latency between members of a gossip pool.

»With Consul Dataplane

Starting with Consul 1.14, Consul on Kubernetes will deploy Consul dataplane as a sidecar container alongside your workload’s pods, removing the need to run Consul clients using a daemonset. The Consul dataplane component will primarily be responsible for discovering and watching the Consul servers available to the pod and managing the initial Envoy bootstrap configuration and execution of the process.

Consul dataplane's design removes the need to run Consul client agents, which brings multiple benefits:

Fewer network requirements: Consul client agents use multiple network protocols and require bi-directional communication between Consul client and server agents for the gossip protocol. Consul dataplane does not use gossip and instead uses a single gRPC connection out to the Consul servers. This means Consul doesn’t need peer networks between the Consul servers and the runtimes running the workloads, improving the overall time to value for users deploying Consul service mesh when Consul servers are managed by HCP Consul.
Simplified setup: Consul dataplane does not need to be configured with a gossip encryption key and operators do not need to distribute separate access control list (ACL) tokens for Consul client agents.
Additional runtime support: Consul dataplane runs as a sidecar alongside your workload, making it easier to support various runtimes. For example, it runs on serverless platforms that do not provide direct access to a host machine for deploying a Consul client. On Kubernetes, this allows you to run Consul on Kubernetes clusters such as Google Kubernetes Engine (GKE) Autopilot and Amazon Elastic Kubernetes Service (Amazon EKS) on AWS Fargate.
Easier upgrades: Deploying new Consul versions no longer requires upgrading Consul client agents. Consul dataplane provides better compatibility across Consul server versions, so you only need to upgrade the Consul servers to take advantage of new Consul features.

This new architecture simplifies overall deployment and removes the Consul client agent from the default deployment of Consul on Kubernetes. Notably, it also removes the requirement to use hostPort on Kubernetes, which was needed to reliably communicate across Consul clients. For deployments outside of Kubernetes, Consul clients will continue to be supported for the foreseeable future for both service discovery and service mesh use cases.

A high-level architectural diagram of Consul on Kubernetes in Consul 1.14.

»Mesh Traffic Management Across Cluster Peers

Consul service mesh allows operators to configure advanced traffic management capabilities such as canary or blue/green deployments, A/B testing, and service failover for high availability. These capabilities are configured using service splitters, routers, and resolvers. Consul supports these traffic management capabilities across WAN-federated datacenters, but many customers desired a more flexible support model.

In Consul 1.13, we introduced cluster peering in beta as an alternative to WAN federation for multi-datacenter deployments. Cluster peering is much more flexible than WAN federation, but it did not support cross-datacenter traffic management in Consul 1.13. As of Consul 1.14, cluster peering now supports both cross-datacenter and cross-partition traffic management.

The image below shows service failover to a cluster-peered partition in another datacenter:

The advantages of cluster peering compared to WAN federation include:

Each cluster (datacenter or enterprise admin partition) is independent, providing operational autonomy
Full control over which services can be accessed from which peered clusters
Support for complex datacenter topologies, including hub-and-spoke
Support for use of enterprise admin partitions in multiple connected datacenters

For more information on cluster peering, please refer to the Consul 1.13 release blog or cluster peering documentation. Visit the cluster peering tutorial to learn how to federate Consul datacenters and connect services across peered service meshes. For more information on configuring cross-datacenter traffic management, check out the service resolver documentation’s failover and redirect capabilities.

»Service Failover Enhancements

Critical services should always be available, even when infrastructure components fail. Achieving high availability involves:

Deploying instances of a service across fault domains, such as multiple infrastructure regions and zones
Configuring service failover; when healthy instances of a service are not available locally, traffic will failover to instances elsewhere

Consul supports configuring service failover, enabling you to create high availability deployments for both service mesh and service discovery use cases. Consul 1.14 includes several failover enhancements.

»General Failover Enhancements

Consul 1.14 further enhances the flexibility of service failover in all Consul deployments, enabling operators to address more-complex failover scenarios in which service failover targets may:

Exist in WAN-federated datacenters, cluster peers, and local admin partitions.
Have a different service name, service subset, or namespace than the unhealthy local service.

»Failover Enhancements Exclusive to Cluster Peering

In Consul 1.14, service mesh failover to cluster peers offers improved resiliency compared to failover to WAN-federated datacenters.

Failover involves sending traffic to a different pool of service instances. With WAN-federation failover, the control plane triggers service failover by reconfiguring Envoy proxies in the mesh to use the next failover pool, meaning failover depends on control plane availability. In Consul 1.14, cluster peer failover is triggered by Envoy proxies themselves, providing increased failover resiliency.

»Next Steps for Consul

We are excited for users to try these new Consul updates and further expand their service mesh implementations. The Consul 1.14 includes enhancements for all types of Consul users leveraging the product for service discovery and service mesh across multiple environments, including serverless functions. Our goal with Consul is to enable a consistent enterprise-ready control plane to discover and securely connect any application.

Learn more in the Consul documentation.
Get started with Consul 1.14 by installing the latest Helm chart provided in our Consul Kubernetes documentation.