HashiCorp Nomad Dynamic Application Sizing
With the release of HashiCorp Nomad 1.0, Nomad Enterprise now includes Dynamic Application Sizing, an expansion of Nomad's existing Autoscaler capabilities.
HashiCorp Nomad 1.0 introduces a new Nomad Enterprise feature, Dynamic Application Sizing (DAS). DAS enables organizations to optimize application resource consumption intelligently and non-disruptively at scale, without the manual trial and error of hardcoding resource requirements.
» Overview
Dynamic Application Sizing was designed with the following goals in mind:
- Reduce toil: running a Nomad job requires knowing how much CPU and memory to allocate. That value is often unknown, which leads to a frustrating loop of trial-and-error “guesstimations”. Nor are these values set-and-forget: as your user base grows or the job is updated with new code, the job's resource usage profile changes as well, which means even more work to monitor usage and track new limits. DAS monitors your jobs and automatically recommends new limit values.
- Maximize infrastructure usage: overestimating job limits wastes resources, because servers sit idle while Nomad is unable to schedule other jobs on them due to resource constraints. DAS detects overprovisioned jobs and recommends lower limits based on actual resource usage.
- Improve reliability: underprovisioned jobs can suffer from problems such as Out of Memory (OOM) errors and CPU throttling, causing reliability issues and requiring additional SRE attention. DAS bases its recommendations on actual usage and can detect when an application starts to require more resources.
Nomad's Dynamic Application Sizing feature consists of three new components:
- A new Recommendations API in Nomad
- Nomad vertical autoscaling policies in the job specification
- DAS-specific plugins in the Nomad Autoscaler
Dynamic Application Sizing builds on top of Nomad's existing Autoscaler capabilities.
Using the new DAS plugins, the Nomad Autoscaler pulls a list of vertical scaling policies from Nomad. These policies indicate which jobs should be monitored by DAS and specify the strategy for making resource recommendations, including customizations based on the tolerance of the job to out-of-memory errors or CPU throttling. The Autoscaler pulls historical information about resource utilization from the configured APM and then proceeds to collect point-in-time resource utilization.
The Nomad UI provides statistics about DAS recommendations to build operator confidence.
The configured DAS strategy plugin consumes these metrics, determines an optimal resource value, and submits that value to the Recommendations API in Nomad. The UI displays these recommendations, along with the statistics that the Autoscaler computed about the tasks. After reviewing a recommendation in the Nomad UI, users can either dismiss it or apply it; applying the recommendation updates the job. An operator can also tune the scaling strategy after reviewing the recommendation.
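To make the "determines an optimal resource value" step more concrete, here is a minimal sketch of how a strategy could turn usage samples into a recommendation. It is not Nomad's implementation: the nearest-rank percentile approach, the function names `percentile` and `recommend`, and the sample data are all assumptions chosen for illustration. The real strategy plugins and their options are described in the DAS documentation.

```python
import math


def percentile(samples, pct):
    """Return the nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank method: 1-based rank of the smallest value that covers
    # at least pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def recommend(samples, current_limit, pct=95):
    """Compare a percentile of observed usage with the current limit.

    Hypothetical helper: returns a plain dict, not a Nomad API object.
    """
    target = percentile(samples, pct)
    if target > current_limit:
        direction = "increase"
    elif target < current_limit:
        direction = "decrease"
    else:
        direction = "keep"
    return {
        "current": current_limit,
        "recommended": target,
        "direction": direction,
    }


# Made-up memory usage samples in MB for a task currently limited to 512 MB.
memory_mb = [210, 225, 198, 240, 232, 251, 219, 244, 238, 229]
print(recommend(memory_mb, current_limit=512))
```

In this sketch, a lower percentile yields a tighter recommendation that tolerates occasional spikes above the limit, while using the maximum is the conservative choice for a job that cannot tolerate out-of-memory errors.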
» Getting Started
Here is a list of resources with more information to help you get started:
- Watch the presentation from Armon explaining DAS
- Watch the video demonstrating how DAS works
- Read the Learn tutorial on DAS concepts
- Test it out yourself with a Vagrant-based demo on GitHub
- Read the documentation on configuring DAS
You can start a free 30-day trial of Nomad Enterprise or reach out to our sales team for more information.