Get a preview of Nomad 0.11, its new autoscaling feature, and the Roblox & Nomad case study in our first Nomad Virtual Day.
To watch demos of all the major new features in Nomad 0.11, check out this demo session: Nomad Technical Demo: Autoscaling, CSI Plugins and More.
On March 11, 2020, HashiCorp held its first-ever HashiCorp Nomad Virtual Day livestream. The 2+ hour stream included a case study from a Technical Director at Roblox and demos from the Nomad engineers at HashiCorp, and concluded with an all-speaker panel. One of the main topics was the updates coming in Nomad 0.11.
Attendees had a chance to put questions directly to the HashiCorp Nomad team and get answers on functionality, use cases, and roadmap.
The agenda, in the following order:
Roblox Customer Presentation: Learn how a gaming platform at global scale serves over 100 million monthly active users with a lean operation team and HashiCorp Nomad. Rob Cameron will share his Nomad journey at Roblox.
Hands-On Demo - The New Autoscaling Feature in Nomad 0.11: Autoscaling enables users to dynamically scale up or scale down their infrastructure based on the true load of their applications. It has gradually become a critical capability for any workload orchestrator. In this talk, we will walk you through the upcoming Autoscaling in Nomad 0.11.
Hands-On Demo - The New Task Dependencies Feature in Nomad 0.11: Have you been writing wrapper scripts to orchestrate the ordering of your tasks? Do you have long-running containers sitting idle, taking up resources while waiting for other jobs? The upcoming Nomad 0.11 supports task dependencies through first-class features and ecosystem integrations. Through demos, we will show you how the Nomad scheduler can simplify these operations.
Panel Discussion and AMA: See Nomad experts, engineers, solution architects, and real users answer attendees' Nomad questions.
Q: Do you spin up servers with Nomad as needed on cloud providers to minimize lag/jitter? How do they manage transferring data?
We do. We use direct links to cloud providers to our own infrastructure. As the second question is slightly vague, for network data we use ACLs and security groups based on predictive ranges we cut out for our direct cloud links. For storage, we actually run Portworx nodes in clouds as well as on-prem, and data transfers between them automatically.
Q: Can you run .NET Core apps on Windows Server directly without containers?
Yep! raw_exec handles this, and the artifact stanza even makes deploying new versions of code a snap. That said, our .NET Core apps are actually all running on our Linux hosts in Docker.
Q: How do you run Traefik?
We run Traefik in a container, as a service job, with host networking (to expose it to the Consul agent directly), using the Consul catalog provider. Clients can use a Nomad template to discover the containers and do direct routing with host headers, or we provide CNAMEs which point to traefik.service.(dc).consul. If anyone else is going to do this, make sure to use 2.1+, as there are some architecture changes in Traefik that make it much gentler on large Consul catalogs.
Q: Two political issues I face, my organization doesn't want containers yet and has a phobia for open source. How has anyone overcome these?
With Nomad, being in containers or not doesn’t matter as much. Nomad provides a lot of the orchestration features you want and can get with containers, but can do it without the need for containers. We have support for a few different drivers beyond Docker, such as raw executables and the Java JAR driver.
As far as open source is concerned, it depends on what your company’s phobia is specifically about. Is it a fear of having no support? Uncertainty about the code? While Nomad is an open source product, your company can get the backing of support and the Nomad engineering team by purchasing the enterprise edition of Nomad. With Enterprise you get some pretty interesting features (quotas, namespaces) too.
Q: How do you "sell" the idea of Nomad versus Kubernetes? We have a pretty aggressive Kubernetes fanboy that runs the current k8s cluster setup. How do you broach the topic that it might be better to consider Nomad? Our Kubernetes setup is fairly new, and relatively stable, but still in its infancy. Any ideas?
When approaching Nomad vs. Kubernetes internally, especially as a practitioner showing others, the focus should be on ease of use. It’s more of a show than a tell story. Standing up Nomad on your laptop, deploying a job quickly, and building simple Terraform code to launch and manage a job can go a long way, and this is all stuff you can do extremely fast. Doing a similar thing in Kubernetes, while absolutely possible, requires a lot more steps and third-party tools built to make it possible. What these examples show is how quickly a production environment can be built and made stable, again, from an easy-to-manage single binary.
There is also a really good getting started track located at https://play.instruqt.com/hashicorp/tracks/nomad-basics
Q: Is the built in connect proxy recommended for production?
Nomad’s Connect integration only supports Envoy for proxying today and is ready for production use in Nomad 0.10.
Q: Are there any plans for more control over the scheduling of queued jobs waiting on available resources?
We haven’t planned it, so please feel free to reach out (file an issue, start a discussion on the forums, etc)! Batch jobs are a popular use case for Nomad, and we’d love to hear what would make them even better.
Q: What is the easiest way to run a stateful application requiring some persistent storage on Nomad on AWS using EBS, without using any other third-party tool like Portworx?
The Stateful Workload track on Learn is the best place to start. As of Nomad 0.10, host volumes are the preferred way to run stateful workloads. The ephemeral_disk stanza is also useful for stateful applications which can tolerate the loss of a node (such as distributed systems like Elasticsearch which always have more than one copy of data). The upcoming Nomad 0.11 will include improved storage support with CSI plugins.
Q: Are there any plans to support allocating multiple jobs wanting a fraction of a GPU to a single gpu resource?
Unfortunately this request isn’t scheduled yet on our roadmap. Please vote on the issue. Comments containing your use case are often helpful as well. Enterprise customers should make sure their technical account manager is aware of this need.
Q: So this can orchestrate docker containers (replaces swarm)?
Yes! Running Docker containers is probably the most popular way to run services with Nomad. Nomad also supports a variety of other workloads either through official task drivers (e.g. executable binaries via “raw_exec”, or virtual machines via “qemu”) or community drivers (e.g. Podman or FreeBSD jails).
Q: For Erik’s demo: how are you running nomad.. just locally? inside docker?
Just locally. We will provide a repo for the demo after the release and share it with the community.
Q: Just curious, what is the expected behavior if using prometheus plugin, things are running smoothly and suddenly prometheus is unavailable (like it goes down for some reason)?
In the situation described in the question the autoscaler would do nothing since it's missing data to make any decision.
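The "do nothing without data" behavior described above can be pictured with a minimal sketch. This is not the Nomad autoscaler's code: `evaluate_policy`, `query_metric`, and `decide` are hypothetical names chosen for illustration, and the only assumption is that a failed or empty metrics query leaves the current count untouched.

```python
from typing import Callable, Optional


def evaluate_policy(
    query_metric: Callable[[], Optional[float]],
    current_count: int,
    decide: Callable[[float, int], int],
) -> Optional[int]:
    """Return the desired task group count, or None when no decision is possible.

    Hypothetical sketch: query_metric stands in for a call to an APM such as
    Prometheus, and decide stands in for whatever scaling strategy is in use.
    """
    try:
        value = query_metric()
    except OSError:
        # The metrics backend is unreachable (connection refused, timeout, ...).
        # Without data there is nothing to base a decision on, so do nothing.
        return None
    if value is None:
        # The backend answered but had no data points for this query.
        return None
    return decide(value, current_count)
```

A caller would treat `None` as "keep the job at its current count and try again on the next evaluation".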
Q: Will there be any integration between APM and job update strategies? It would be nice to be able to have an update only happen when load was within certain parameters.
Nomad’s horizontal autoscaling that was demoed will integrate with APMs to make job task group count scaling decisions.
There are not currently any plans to use this APM integration in other parts of Nomad (such as the existing update block), but it’s an interesting idea! Please feel free to file an issue or post to our discussion forum. Enterprise customers should make sure their technical account manager is aware of this feature request.
Q: I do not really get the example with scaling by avg_cpu load. I expect many different jobs will run on one nomad client node, wouldn't these rules clash between themselves?
The avg_cpu metric used in the demo represents the average CPU load of the task group in question, not of the entire Nomad client node that an allocation is running on. In the demo, Erik had set the cpu resources of that task to only 50, which very quickly made it go over its 100% (1.0) and scale up.
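To make the per-task-group nature of the metric concrete, here is a rough sketch of a target-value style calculation. It is an assumption for illustration, not necessarily the formula the Nomad autoscaler uses: `group_avg_cpu` and `desired_count` are hypothetical helpers, and the numbers (two allocations each using 90 MHz against a 50 MHz allocation) are made up.

```python
import math
from typing import Sequence


def group_avg_cpu(used_mhz: Sequence[float], allocated_mhz: float) -> float:
    """Average CPU usage of one task group's allocations, where 1.0 means 100%
    of the CPU allocated to the task (not of the whole client node)."""
    if not used_mhz or allocated_mhz <= 0:
        raise ValueError("need at least one allocation and a positive CPU allocation")
    return sum(used / allocated_mhz for used in used_mhz) / len(used_mhz)


def desired_count(current: int, avg_cpu: float, target: float = 1.0,
                  min_count: int = 1, max_count: int = 10) -> int:
    """Scale the count by how far the metric is from the target, then clamp."""
    if target <= 0:
        raise ValueError("target must be positive")
    proposed = math.ceil(current * (avg_cpu / target))
    return max(min_count, min(max_count, proposed))


if __name__ == "__main__":
    # Two allocations of the same group, each allocated only 50 MHz.
    load = group_avg_cpu([90.0, 90.0], allocated_mhz=50.0)
    print(load, desired_count(current=2, avg_cpu=load))
```

Because the metric is computed only over the allocations of one task group, policies for different jobs sharing a client node are evaluated independently of each other.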
Q: What happens if your cluster doesn’t have the capacity to scale? Guess cluster autoscaler kicks in, but are you able to configure buffer tasks that can be ejected in the event capacity isn’t there?
Yes. Nomad's usual "no capacity" logic will apply (preemption and pending evals) so that if an infrastructure autoscaler is also in play it could use the number of unplaced allocations as a metric to scale by. This is a natural target for horizontal scaling and we have it in our roadmap to support soon.
Q: Are memory soft limits coming soon with Docker containers?
We know that this is an important feature for users and plan on implementing it soon! Unfortunately it is highly unlikely to be in Nomad v0.11.0, but it is definitely on our short term roadmap.
Q: Is there a plan to have an equivalent of a StatefulSet at the job level? Thought: The job would be "stateful" in addition to being a "service" type job, and coupled with "volume" persistence, this could give equivalent semantics of a StatefulSet. Use case: zookeeper, kafka, (possibly redis as master-slave), etc.
While we do not plan to directly implement a StatefulSet job type, Nomad v0.11 should meet most if not all of the needs that StatefulSets address. StatefulSets guarantee uniqueness and ordering. While Nomad has always provided allocations a unique index, Nomad does not have inter-job dependencies which could imply ordering. A blocking prestart task dependency could provide ordering by blocking on a Consul key or service before running a task.
More complex workflows will be left up to ecosystem integrations with tools like Airflow and Spinnaker.
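As an illustration of the blocking prestart idea mentioned in this answer, the following is a minimal sketch of a script that a prestart task could run. It polls Consul's health endpoint (`/v1/health/service/<name>?passing`) until at least one passing instance is registered. The function names, the default agent address, and the timeout values are assumptions for the example, not something prescribed by Nomad.

```python
import json
import time
import urllib.request
from typing import Callable, List


def passing_instances(service: str, consul_addr: str = "http://127.0.0.1:8500") -> List[dict]:
    """Ask the local Consul agent for instances of a service with passing checks."""
    url = f"{consul_addr}/v1/health/service/{service}?passing"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_for_service(
    service: str,
    fetch: Callable[[str], List[dict]] = passing_instances,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Block until the service has a passing instance; False if the timeout expires."""
    deadline = clock() + timeout
    while True:
        try:
            if fetch(service):
                return True
        except OSError:
            # Consul itself is not reachable yet; treat it like "not ready".
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

A prestart task wrapping this would exit with status 0 when `wait_for_service` returns `True` and non-zero otherwise, for example `sys.exit(0 if wait_for_service("postgres") else 1)`, where `postgres` is a placeholder service name.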
Q: We wanted to replace all crontabs on a server with Nomad scheduling. We wanted to trigger the scripts via a Docker ssh container (managed by Nomad & Terraform) instead of directly triggering the scripts via the Nomad agent. Please help me understand if this is a good design and how to implement it?
Nomad supports periodic jobs with crontab-compatible time specifications. These periodic jobs would run your scripts directly with either a Docker container or directly via the exec or raw_exec drivers if Docker containerization is not needed or possible.
The job could be specified using Terraform’s Nomad provider. There should not be any need to use ssh, although Nomad does not restrict a Docker container, exec task, or raw_exec task from running commands like ssh.
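To show how a crontab entry maps onto a periodic job, here is a small sketch that turns one crontab line into a job payload in the JSON shape accepted by Nomad's HTTP jobs API (`/v1/jobs`). It uses the API's JSON form rather than Terraform purely to keep the example self-contained. The helper name, job ID, datacenter, script path, and resource numbers are all illustrative assumptions, and the sketch only handles plain five-field schedules.

```python
import json
import shlex


def crontab_line_to_job(line: str, job_id: str, datacenter: str = "dc1") -> dict:
    """Convert '<min> <hour> <dom> <mon> <dow> <command...>' into a periodic batch job."""
    fields = line.split(None, 5)
    if len(fields) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    spec = " ".join(fields[:5])
    argv = shlex.split(fields[5])
    return {
        "Job": {
            "ID": job_id,
            "Name": job_id,
            "Type": "batch",
            "Datacenters": [datacenter],
            "Periodic": {
                "Enabled": True,
                "SpecType": "cron",
                "Spec": spec,
                # Skip a run if the previous one is still in progress.
                "ProhibitOverlap": True,
            },
            "TaskGroups": [{
                "Name": "cron",
                "Tasks": [{
                    "Name": "script",
                    # raw_exec runs the script directly on the host, no ssh hop.
                    "Driver": "raw_exec",
                    "Config": {"command": argv[0], "args": argv[1:]},
                    "Resources": {"CPU": 100, "MemoryMB": 64},
                }],
            }],
        }
    }


if __name__ == "__main__":
    job = crontab_line_to_job("0 2 * * * /opt/scripts/backup.sh --full", "nightly-backup")
    print(json.dumps(job, indent=2))
```

The same fields (`periodic`, `cron`, `prohibit_overlap`, and a `raw_exec` or `docker` task) exist in the HCL job specification, which is the form you would hand to Terraform's Nomad provider.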
Q: Hi, regarding task dependencies: can the new feature run several dependent tasks and, on failure, restart from the point of failure, or will we need the Airflow integration? Example: run containerupdatedb -> runbackupdb -> containerdeletedb; if the backup fails, restart from the backup.
Task Lifecycle Hooks do not support explicit directed dependencies (such as a directed acyclic graph), but similar to Linux Runlevels, they support several levels of dependencies (e.g. PreStart -> Main -> PostStart -> PreStop -> PostStop). Lifecycle Hooks could be extended to support a PostFail hook. For a deployment pipeline with complex failure modes, an Airflow or Spinnaker integration will be needed.
Q: Any plans to add deployment groups? (e.g. 5 jobs working as a single deployment unit with shared rollback if one of the jobs fails)
No plans currently. Nomad has traditionally treated all jobs independently. That being said, we have started exploring features like multi-region jobs which may entail new ways of orchestrating multiple jobs. Please feel free to file an issue or post to our discussion forum. We love getting detailed use cases from users and integrating them into our internal design documents and discussions! Enterprise customers should make sure their account manager is aware of this feature request.
Q: In a large cluster, how do you handle performance and avoid split brain when consistency is required? Is it configurable, so you can pick the trade-off you want? And if split brain does happen, how do you gracefully recover?
Nomad runs 2 types of agents (sometimes called “daemons”): servers and clients. Servers perform scheduling operations, and clients execute workloads.
Nomad servers use the Raft consensus protocol to ensure they all maintain the same cluster state (which jobs are running, which nodes are live, etc). It is not possible to relax their consistency guarantees, but thanks to a number of optimizations (opportunistic concurrent scheduling, operation batching, keeping all state resident in memory), scheduling latency should be low enough to sustain a very high level of job throughput.
Nomad clients receive their work from servers and heartbeat to maintain their liveness. The period of this heartbeat is configurable. A lower heartbeat interval means work on a crashed client will be rescheduled more quickly than with a higher interval. However, many users prefer a higher heartbeat interval to avoid a flaky or slow network causing unnecessary rescheduling. The default should work for common deployments on common on-premises and cloud infrastructure, but is easily tunable.
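The heartbeat trade-off can be illustrated with a toy model. This is not how Nomad tracks client liveness internally: `missed_ttl_count` is a hypothetical helper that, given the times at which heartbeats actually arrived, counts how often a given timeout would have expired and triggered rescheduling.

```python
from typing import Sequence


def missed_ttl_count(arrival_times: Sequence[float], ttl: float) -> int:
    """Count gaps between consecutive heartbeats that exceed the timeout."""
    return sum(
        1
        for prev, cur in zip(arrival_times, arrival_times[1:])
        if cur - prev > ttl
    )


if __name__ == "__main__":
    # Heartbeats sent every 10s, with one delayed by a slow network (25s gap).
    arrivals = [0.0, 10.0, 20.0, 45.0, 55.0]
    for ttl in (15.0, 30.0):
        print(f"ttl={ttl}s -> {missed_ttl_count(arrivals, ttl)} spurious expiry(ies)")
```

With the shorter timeout the healthy but slow client is declared down once; with the longer one it never is, at the cost of taking up to that long to notice a client that has really crashed.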
Q: We are building greenfield microservices with nomad running docker containers and qemu virtual machines. Are there any tips or best practices for running containers and VMs under the same nomad control plane? Any known issues or things to look out for?
Nomad should handle this well and will schedule both workload types without problem. Roblox had some great advice! Experiment. Bias toward giving your users more control rather than less (if your compliance needs allow it). Learn is our best resource for getting started and best practices.
Q: Is there a plan to have an equivalent of a StatefulSet at the job level? Thought: The job would be "stateful" in addition to being a "service" type job, and coupled with "volume" persistence, this could give equivalent semantics of a StatefulSet. Use case: zookeeper, kafka, (possibly redis as master-slave), etc.
Stateful sets have 2 guarantees: ordering and uniqueness.
Nomad itself currently has no plans to implement job ordering. We're leaving this up to external (ecosystem) tooling since there are so many more lifecycle management concerns with job ordering and dependencies.
Allocation indexes should provide uniqueness if used carefully. Between CSI and task dependencies, Nomad v0.11 will cover many of the use cases of stateful sets.