Announcing HashiCorp Nomad 1.1 Beta

Nomad 1.1 delivers more than 10 new features to enable more flexible scheduling and a simplified operator experience.

Mike Nomitch

May 3, 2021

We are excited to announce that the beta release of HashiCorp Nomad 1.1 is now available. Nomad is a simple and flexible orchestrator used to deploy and manage containers and non-containerized applications across on-premises and cloud environments. Nomad is widely adopted and used in production by organizations such as Cloudflare, Roblox, Q2, Pandora, and GitHub.

Nomad 1.1 delivers more than 10 new features to enable more flexible scheduling and a simplified operator experience. The core Nomad scheduler has been upgraded with new resource control mechanisms to improve cluster efficiency, application performance, and volume management. Enhancements across the UI and API support Nomad’s dedication to a simple and intuitive operator experience. In addition to core Nomad improvements, the Nomad Autoscaler now allows for more flexible scaling policies on more cloud providers.

Highlights of the Nomad 1.1 beta include:

Memory oversubscription: Improve cluster efficiency by allowing applications, whether containerized or non-containerized, to use memory in excess of their scheduled amount.
Reserved CPU cores: Improve the performance of your applications by ensuring tasks have exclusive use of client CPUs.
UI improvements: Enjoy a streamlined operator experience with fuzzy search, resource monitoring, and authentication improvements.
CSI enhancements: Run stateful applications with improved volume management and support for Container Storage Interface (CSI) plugins such as Ceph.
Readiness checks: Differentiate between application liveness and readiness with new options for task health checks.
Remote task drivers (Technical Preview): Use Nomad to manage your workloads on more platforms, such as AWS Lambda or Amazon ECS.
Consul namespace support (Enterprise): Run Nomad-defined services in their HashiCorp Consul namespaces more easily using Nomad Enterprise.
License autoloading (Enterprise): Automatically load Nomad licenses when a Nomad server agent starts using Nomad Enterprise.
Autoscaling improvements: Scale your applications more precisely with new strategies.

Let’s look at each of these in more detail.

»Memory Oversubscription

Increase the resilience of your applications and improve resource efficiency by using memory oversubscription. This feature enables Nomad tasks to exceed their allocated memory limit without throwing out-of-memory (OOM) errors, resulting in more efficient bin-packing and smoother handling of applications with variable memory usage.

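task "redis" {
  driver = "exec"
  ...
  # resources is a task-level block, a sibling of config rather than a child of it
  resources {
    cpu        = 500
    memory     = 256 # MB reserved on the client for scheduling
    memory_max = 512 # MB hard limit the task may grow to
  }
}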

Nomad 1.1 introduces an optional memory_max value in addition to the existing memory value in a task’s resources stanza. Nomad uses the memory value to reserve resources on client nodes while the memory_max value is used as a hard limit. Tasks that exceed their hard limit will be restarted and emit an OOM error.

The memory_max attribute is supported by all default Nomad drivers except raw_exec and QEMU.

»Reserved CPU Cores

Improve performance by pinning applications to run in isolation on exclusive CPU cores. This ensures that latency-sensitive or business-critical applications are not blocked by other applications running on the same node.

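task "my-application" {
  driver = "docker"
  ...
  # resources is a task-level block, a sibling of config rather than a child of it
  resources {
    cores  = 2   # number of CPU cores reserved exclusively for this task
    memory = 500
  }
}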

Instead of setting a cpu value in the resources stanza, users can now optionally set cores. The client node reserves that number of CPU cores exclusively for the task, and no other application has access to them.

»UI Improvements

Version 1.1 of Nomad enhances the user interface in several important ways. The UI’s search now delegates to an external fuzzy search API, making it quicker to navigate to allocations, task groups, and CSI plugins, in addition to the jobs and clients that were already returned.

Improved resource monitoring helps Nomad operators better understand client resource constraints and health at a glance. Client CPU and memory charts now expose resource reservations for non-Nomad processes.

Allocation metrics now report resource consumption for individual tasks.

Namespaces are now treated as a filterable property in relevant views and are no longer selected via the sidebar dropdown. You can now view jobs across all namespaces with the “All (*)” option.

Finally, the nomad ui command has a new -authenticate flag, which opens the UI using a one-time token generated from the NOMAD_TOKEN environment variable.

»CSI Improvements

You can run a wider variety of stateful workloads on Nomad with Container Storage Interface (CSI) improvements. Applications using CSI now have access to more functionality in the CSI spec, including volume creation, volume destruction, and volume snapshotting.

This expands the set of CSI plugins that work with Nomad to include popular integrations such as Ceph. Additionally, scheduler improvements allow for easier management of volumes that must be associated with single allocations.
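
As a rough illustration of volume creation, the sketch below shows what a volume specification file passed to nomad volume create might look like. The volume ID, name, plugin ID, and capacity values are hypothetical placeholders, and the example assumes a CSI plugin that supports volume creation is already registered with the cluster.

```hcl
# volume.hcl: hypothetical specification, e.g. for `nomad volume create volume.hcl`
id        = "db-data"  # hypothetical volume ID
name      = "db-data"  # hypothetical display name
type      = "csi"
plugin_id = "ceph-csi" # assumes a CSI plugin registered under this ID

# Requested size range; the plugin decides the actual size within these bounds.
capacity_min = "10GiB"
capacity_max = "20GiB"

# How the volume may be accessed and attached by allocations.
capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
```

Which access and attachment modes are valid depends on the CSI plugin in use, so treat these values as a starting point rather than a recommendation.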

For more information on how to use CSI with Nomad, see these examples in the Nomad repository, or read our storage plugin documentation.

»Readiness Checks

Nomad 1.1 lets you make your deployments more robust by giving you more granular control over the health status of tasks. You can now differentiate between checks used solely by Consul for health and traffic routing (readiness) and checks used by Nomad for general application health (liveness).

This can simplify deploying Nomad tasks that have long-running setups such as cache-warming or database migrations:

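check {
  name      = "database-health"
  type      = "script"
  command   = "/usr/local/bin/db-health"
  interval  = "10s" # illustrative value; script checks require an interval
  timeout   = "2s"  # illustrative value; script checks require a timeout
  on_update = "ignore" # or "ignore_warnings" or "require_healthy"
}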

Check stanzas now take an optional on_update attribute, which determines Nomad’s response to failing checks.

Nomad will fail a deployment if on_update is set to "require_healthy" and the check does not pass. (This is the default, and matches the behavior of releases prior to 1.1.) If on_update is set to "ignore", Nomad will ignore the check’s status when determining deployment health; if it is set to "ignore_warnings", Nomad will treat checks in a warning state as passing. Previously, these checks would have caused the deployment to fail. Consul will still use them to determine application readiness.
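
To make the liveness and readiness split concrete, here is a minimal sketch of a service that declares one check of each kind. The service name, port label, check names, and the /ready path are hypothetical, and the interval and timeout values are illustrative only.

```hcl
service {
  name = "web"  # hypothetical service name
  port = "http" # assumes a port labeled "http" in the group's network block

  # Liveness: if this check does not pass, Nomad fails the deployment
  # (the default behavior).
  check {
    name      = "alive"
    type      = "tcp"
    interval  = "10s"
    timeout   = "2s"
    on_update = "require_healthy"
  }

  # Readiness: Consul still uses this check for health and routing, but
  # Nomad does not count its status against the deployment.
  check {
    name      = "ready"
    type      = "http"
    path      = "/ready" # hypothetical endpoint
    interval  = "10s"
    timeout   = "2s"
    on_update = "ignore"
  }
}
```

With this layout, a long cache warm-up can keep the readiness check failing without causing Nomad to mark the deployment as unhealthy.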

»Remote Task Drivers

In Nomad 1.1, you can deploy and manage workloads on a wider variety of environments using remote task drivers. Nomad can now manage the lifecycles of applications running on nodes where a Nomad agent is not deployed.

Nomad task driver capabilities now include a RemoteTask Boolean value. Remote tasks have their state propagated to replacement allocations when nodes are drained or down (lost). The remote task will remain running throughout a node drain or rescheduling of lost allocations. This allows Nomad to manage tasks running on serverless container managers, such as Amazon ECS, or function-as-a-service providers, such as AWS Lambda, without unnecessarily restarting remote services if the Nomad agent managing them crashes or is drained.

To learn more about how to write a remote task driver, see the Elastic Container Service (ECS) Task Driver example on GitHub.

This feature is currently an experimental tech preview and feedback is welcome!

»Consul Namespace Support

This feature improves the interoperability between Nomad and Consul while simplifying the adoption of hierarchical network models powered by Consul Enterprise:

group "billing" {
 
  consul {
    namespace = "finance"
  }
 
  task "api" {
    # ...
  }
}

Operators can specify a namespace value in a consul stanza at the job, task group, or task level. Services defined within the same block will be registered in the given Consul namespace.

Additionally, if a Consul namespace is not explicitly defined in the job configuration file, users can pass in a Consul ACL token via the nomad job run command. This registers any Consul services in the token’s associated namespace:

nomad job run -consul-token <token-of-namespace> job.nomad

»License Autoloading

Nomad licenses are now automatically read from the file system when a Nomad server agent starts:

server {
  enabled      = true
  license_path = "/opt/nomad/license.hclic"
}

Nomad licenses can be set by adding a license_path to the server configuration, or by using NOMAD_LICENSE or NOMAD_LICENSE_PATH environment variables when launching a server agent. See the Nomad licensing documentation for more details.

The nomad license put command and the PUT /v1/operator/license API endpoint have been removed in favor of autoloading licenses. This is a breaking change for Nomad Enterprise users. See the 1.1 upgrade guide for more details.

If you would like to try Nomad Enterprise, get started with a 30-day trial license.

»Autoscaling Improvements

In Nomad 1.1, you can tune application and cluster autoscaling more precisely using three new strategies:

The pass-through strategy allows users to defer scaling logic to their APM of choice.
The fixed-value strategy maintains a fixed number of nodes.
The threshold strategy lets you toggle different scaling strategies based on whether a tracked metric is within a defined range.

See Nomad’s plugin documentation for more details.
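
As a sketch of how one of these strategies might be used, the scaling block below adds one instance whenever a tracked metric falls inside a defined range. It assumes a Prometheus APM plugin is configured for the Nomad Autoscaler; the check name, the query, and all numeric values are hypothetical.

```hcl
# Placed inside a task group to scale that group's count.
scaling {
  enabled = true
  min     = 1
  max     = 10

  policy {
    cooldown            = "1m"
    evaluation_interval = "30s"

    check "cpu_high" {
      source = "prometheus"           # assumes the Prometheus APM plugin is enabled
      query  = "avg(app_cpu_percent)" # hypothetical query

      # Add one instance while the metric is between 80 and 100.
      strategy "threshold" {
        lower_bound = 80
        upper_bound = 100
        delta       = 1
      }
    }
  }
}
```

Several checks with different bounds can be combined in one policy to express tiered scaling, for example a larger delta for a higher range.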

The Nomad Autoscaler now officially supports horizontal cluster autoscaling for AWS Auto Scaling groups, Google Cloud managed instance groups, and Microsoft Azure virtual machine scale sets. Additional targets such as DigitalOcean, OpenStack, and Hetzner Cloud are supported via community plugins.

»Ecosystem Integration Update

The Nomad team plans to continue its significant investments in the ecosystem. In the past year, we have added CSI and CNI support and built our application and cluster autoscaler for three major cloud providers.

To better facilitate collaboration and contribution with ecosystem partners, we have launched the Nomad integration program, a self-service process with links and guidance to information sources, defined steps, and checkpoints. Check the integration program page to learn more about the program and check our dedicated ecosystem page to explore more integration solutions.

»What’s Next for Nomad 1.1

We encourage you to experiment with the new features in Nomad 1.1, but we recommend against using this beta build in a production environment. We are eager to see how these new features enhance your Nomad experience. If you encounter an issue, please file a new bug report on GitHub and we'll take a look.

To watch the new features in action, register for the webinar here.

Finally, on behalf of the Nomad team, I’d like to conclude with a big “thank you” to our amazing community! Your dedication and bug reports help us make Nomad better. We are deeply grateful for your time, passion, and support.