HashiCorp Nomad’s addition of native service discovery offers a new option for simple service registration, plus integration with Consul for more complex service mesh use cases.
Many people use HashiCorp Nomad as a scheduler because of its flexibility, performance, and simplicity. In the past, we’ve focused on technical simplicity, where Nomad has functioned as just a scheduler, and that has been very valuable for many deployments. More complex deployments relied on bringing in additional tools and products to manage those operations.
Over the next few Nomad releases, we will be working on further simplifying the user experience and workflows. Nomad 1.3 focused on extending this simplicity to service networking. Prior to 1.3, users had to set up a HashiCorp Consul cluster or roll out their own solution to register and manage services. While the Consul integration is feature-rich and well supported, for simpler Nomad use cases, and to get up and running faster, we wanted to move basic service discovery capabilities into Nomad itself. This keeps the stack lighter, with no need to integrate a full service mesh product just for simple service discovery.
This post reviews the thinking behind this new feature and provides a decision framework for when a service discovery use case is too complex for Nomad's built-in discovery and would be better served by the Consul integration.
Prior to Nomad 1.3, users who needed to connect multiple services could choose from three options:

- Use Nomad's Consul integration
- Use a third-party service discovery tool
- Hardcode IP addresses and ports
For most use cases, it made sense to rely on Nomad’s Consul integration. A common challenge with this option, however, is that you need to separately install and manage a Consul cluster alongside Nomad, or have it hosted in the HashiCorp Cloud Platform (HCP). This choice offers many additional networking features, but it also increases deployment complexity as users have to learn about and maintain another product. For simple use cases or when just getting started with Nomad, this added complexity is less than ideal.
Nomad 1.3 introduced a simpler option: native service discovery. This new feature lets Nomad tasks register services, discover them, and make requests to other tasks, all through template stanzas or the Nomad API. This is possible because Nomad's state already contains all the routing information needed to discover the services of Nomad tasks.
Either the Nomad or Consul provider can be specified in the `service` stanza, and Nomad will manage registering, updating, and deregistering services with the defined service provider. All services within a single task group must use the same provider value. By default, Nomad services use the `consul` provider to ensure backwards compatibility.
So which provider is best for your services — Nomad or Consul? The key tradeoffs between the providers are operational simplicity and functionality.
The Nomad service provider is a great option if:

- You have simple service discovery and load-balancing needs and don't require a full service mesh
- You are just getting started with Nomad and want the fastest path to connecting services
- You want to keep your stack light, without running a separate Consul cluster
Selecting the Nomad provider means choosing operational simplicity over functionality. Services using the `nomad` provider get only a subset of the functionality enabled by the `consul` provider. As of Nomad 1.3.2, you can use the Nomad provider for simple service discovery and load-balancing use cases.
There is no need to set up a separate Consul cluster to run alongside your Nomad cluster, and you only have to be familiar with Nomad documentation and terminology, which simplifies the user experience.
Once the provider is set to `nomad`, you can use the new template functions, `nomadService` and `nomadServices`, to query and interact with services. The function requests are tied to the same namespace as the job that contains the template stanza. New `nomad service` commands are also available in Nomad 1.3 and can be used to interact with the service API.
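For example, here is a quick way to inspect registered Nomad services from the command line (a sketch assuming a service named `backend` has already been registered):

```shell
# List all services registered with the Nomad provider
$ nomad service list

# Show the address and port of each registered instance of "backend"
$ nomad service info backend
```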
You can define services, service providers, and service queries in the Nomad job spec. The optional `address` parameter allows you to configure a public or custom address to advertise in service registration. To query a service's address and port information with Nomad, you can use the `nomadService` function in the template stanza.
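Below is a minimal sketch of this pattern; the job name, Docker image, and service name are illustrative assumptions. The job registers a `backend` service with the Nomad provider and renders the address of every instance into a local file at runtime:

```hcl
job "demo" {
  datacenters = ["dc1"]

  group "app" {
    network {
      port "http" {
        to = 8080 # map a dynamic host port to the container's port 8080
      }
    }

    # Register the service with Nomad's built-in catalog (no Consul needed)
    service {
      name     = "backend"
      port     = "http"
      provider = "nomad"
    }

    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo"
        args  = ["-listen", ":8080", "-text", "hello"]
        ports = ["http"]
      }

      # Render the address and port of every "backend" instance at runtime
      template {
        data = <<EOF
{{ range nomadService "backend" }}
backend {{ .Address }}:{{ .Port }}
{{ end }}
EOF
        destination = "local/backends.conf"
      }
    }
  }
}
```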
As of Nomad 1.3.2, you can use the `nomadService` function to support simple load balancing by selecting instances of a service via rendezvous hashing. To enable simple load balancing, the `nomadService` function requires three arguments:

- The number of service instances to select
- A key that identifies the requester (the allocation ID is a common choice)
- The service name
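As a sketch, this is what the load-balanced form looks like in a template stanza, assuming a hypothetical `redis` service with several running instances:

```hcl
template {
  data = <<EOF
{{ range nomadService 2 (env "NOMAD_ALLOC_ID") "redis" }}
server {{ .Address }}:{{ .Port }}
{{ end }}
EOF
  destination = "local/redis-upstreams.conf"
}
```

Because the allocation ID is used as the hashing key, each allocation consistently selects the same two instances, spreading load across the service without an external load balancer.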
With the release of Traefik Proxy 2.8, Traefik introduced a Nomad provider that lets users integrate natively with Nomad for ingress and reverse proxying.
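Here is a minimal sketch of Traefik's static configuration for the Nomad provider; the endpoint address assumes a Nomad agent running on the local machine:

```yaml
# traefik.yml (static configuration)
providers:
  nomad:
    endpoint:
      address: "http://127.0.0.1:4646" # assumed local Nomad API address
```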
Prometheus 2.37.0 introduced a new plugin to support service discovery for Nomad. Modeled on the existing Consul plugin, it lets you discover scrape targets using Nomad's services API.
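Likewise, a minimal `prometheus.yml` sketch using the Nomad service discovery plugin; the server address is again an assumption for a local agent:

```yaml
scrape_configs:
  - job_name: "nomad-services"
    nomad_sd_configs:
      - server: "http://127.0.0.1:4646"
    relabel_configs:
      # Keep the Nomad service name as a "service" label on scraped targets
      - source_labels: ["__meta_nomad_service"]
        target_label: "service"
```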
For complex or large-scale deployments, integrating with Consul provides the added functionality that those workloads require. Selecting the Consul provider allows you to use Consul for service discovery and offers features like health checks, a key-value store, service mesh, and robust support for multi-datacenter deployments.
To use Consul with Nomad, you need to configure and install Consul on nodes alongside Nomad, or schedule it as a system job. There are a number of configuration options and features that can be implemented only when Consul is the chosen provider.
Consul provides a DNS interface that downstream services can use to find the IP addresses of their upstream dependencies.
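For example, you can resolve the healthy instances of a hypothetical `redis` service against a local Consul agent's DNS interface, which listens on port 8600 by default:

```shell
# SRV records include both the IP address and the port of each instance
$ dig @127.0.0.1 -p 8600 redis.service.consul SRV
```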
Service stanza parameters for Consul service discovery include:

- `enable_tag_override`: Lets users of Consul's Catalog API make changes to the tags of a service without having those changes overwritten by Consul's anti-entropy mechanism.
- `meta` (`Meta`: nil): Specifies a key-value map that annotates the Consul service with user-defined metadata.
- `canary_meta` (`Meta`: nil): Specifies a key-value map that annotates the Consul service with user-defined metadata when the service is part of an allocation that is currently a canary. Once the canary is promoted, the registered meta is updated to the values specified in the `meta` parameter.

Another notable feature of the Consul agent is managing system-level and application-level health checks. You can leverage these health checks to ensure that only healthy services are discoverable, or for general monitoring purposes within your datacenter. To create a health check in Consul, you first define the monitoring scope, then write a check definition before registering the check with Consul.
Service stanza parameters for Consul health checks include:

- [`check`](https://www.nomadproject.io/docs/job-specification/service#check) ([`Check`](https://www.nomadproject.io/docs/job-specification/check): nil): Specifies a health check associated with the service. This can be specified multiple times to define multiple checks for the service. At this time, the Consul integration supports `grpc`, `http`, `script`, and `tcp` checks.
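As a sketch, an HTTP check attached to a hypothetical `web` service might look like this:

```hcl
service {
  name     = "web"
  port     = "http"
  provider = "consul"

  # Consul only routes traffic to instances whose checks are passing
  check {
    type     = "http"
    path     = "/health"
    interval = "10s"
    timeout  = "2s"
  }
}
```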
Nomad can also be used with Consul Connect, the component shipped with Consul that enables service mesh functionality. When the service mesh feature is enabled, sidecar proxies are deployed alongside your application instances and handle all aspects of security, observability, and traffic management.
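Here is a minimal sketch of a Connect-enabled service, assuming a hypothetical `api` service that reaches an upstream `db` service through its local sidecar (Connect also requires bridge networking at the group level, omitted here):

```hcl
service {
  name = "api"
  port = "9090"

  connect {
    sidecar_service {
      proxy {
        # Requests to localhost:5432 are proxied to the "db" service
        upstreams {
          destination_name = "db"
          local_bind_port  = 5432
        }
      }
    }
  }
}
```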
Service stanza parameters for Consul Connect include:

- `connect`: Configures the Consul Connect integration.

Consul KV can be used as a central configuration store so that configuration can be referenced outside of the application. Consul's natively integrated RPC functionality allows clients to forward requests to servers, including key-value reads and writes. Nomad also provides integration with Consul namespaces (a Consul Enterprise feature) for service registrations specified in service blocks and Consul KV reads in template blocks.
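For example, a template stanza can read Consul KV directly with the `key` function; the key path below is hypothetical:

```hcl
template {
  data        = <<EOF
MAX_CONNECTIONS={{ key "config/myapp/max_connections" }}
EOF
  destination = "local/app.env"
  env         = true # expose the rendered key/value pairs as environment variables
}
```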
A recent improvement in Nomad lets you advertise arbitrary IP addresses or domain names for either Consul or Nomad services. Jobs can now advertise services on any address that is routable to the node they are scheduled on, so you can, for example, advertise a node's public IP instead of its private IP.
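In the service stanza, this is just the `address` parameter; the IP below is a placeholder:

```hcl
service {
  name     = "web"
  port     = "http"
  provider = "nomad"
  address  = "203.0.113.10" # placeholder public IP to advertise instead of the node's private IP
}
```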
While the focus of this blog post is to help you understand the functionality and limitations of the Nomad and Consul service providers, it’s important to briefly review the pros and cons of using Nomad with third-party tools vs. hardcoding IPs and ports.
Nomad’s agnosticism about service discovery lets users connect services using other third-party tools. This lets people use the tool they are most familiar with, but also demands a high level of configuration. This option can be challenging for those new to Nomad due to a lack of support and documentation.
Hardcoding IPs and ports is another possible solution, especially if you are just getting started with Nomad and have only one or two services. But it is generally not recommended outside of testing, since the resulting architecture is brittle and difficult to scale.
Service discovery is now a native feature in Nomad, which gives users a lighter-weight option that is easier to get started with for simple use cases. This feature is not meant to replace Consul for most users, especially those with complex deployment requirements around service mesh and service discovery. Both native service discovery and the Nomad-Consul integration will continue to improve and expand going forward.
If you want to learn more about this new addition to Nomad, check out the following resources: