Encourage Data Scientists to Smile: Hybrid ML Ops with Nomad
Jul 24, 2020
See how the field of ML Ops is evolving and why HashiCorp Nomad is a great tool for scheduling and deploying the resources in a machine learning pipeline on hybrid infrastructure.
- Josh Jordan, Sr. Solutions Engineer, HashiCorp
Google estimates only 1% of data is being used effectively by organizations. Industry trends in data team spending indicate that organizations want to turn this around. However, to be effective in hybrid infrastructure environments (the reality for most large enterprises), teams need tooling that doesn't complicate an already complicated process.
HashiCorp Nomad's Big Data Use Case
HashiCorp’s Nomad is a federated workload scheduler. Its orchestration capabilities include parameterization and dependency specifications, as well as plugin integration with popular DAG (Directed Acyclic Graph) tools like Apache Spark. This makes Nomad an excellent choice for building machine learning (ML) pipelines and ML Ops practices, especially in hybrid infrastructure where interoperability across on-premises and cloud environments is desired.
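To make "parameterization" concrete, here is a minimal Python sketch of the request body that Nomad's HTTP API accepts when dispatching a parameterized job (POST /v1/job/<job_id>/dispatch). It is an illustration under stated assumptions, not the setup used in the demo: the meta keys (model_version, learning_rate) and the payload contents are hypothetical, and the target job would need a parameterized block that declares those meta keys. The sketch only builds the body and does not contact a Nomad server.

```python
import base64
import json


def build_dispatch_request(meta, payload=b""):
    """Build a JSON-serializable body for Nomad's job dispatch endpoint.

    meta:    dict of string keys to string values, passed to the job as metadata.
    payload: optional raw bytes; the API expects them base64-encoded.
    """
    body = {"Meta": dict(meta)}
    if payload:
        body["Payload"] = base64.b64encode(payload).decode("ascii")
    return body


# Hypothetical parameters for one training run of an ML pipeline.
request_body = build_dispatch_request(
    {"model_version": "v2", "learning_rate": "0.01"},
    payload=b"training-config",
)

# This JSON is what would be sent as the POST body.
print(json.dumps(request_body, indent=2, sort_keys=True))
```

Each dispatch creates a separate instance of the job, so a pipeline could launch one run per model version by changing only the metadata.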
Building an ML Ops Pipeline with Nomad
In this demo project, HashiCorp solutions engineer Josh Jordan demonstrates how HashiCorp Nomad, an open source tool, is capable of bringing smiles to the faces of data scientists everywhere. He explores Nomad in this role and as a component of a fully automated and integrated ML Ops practice.
You'll see where Nomad can fit amidst the moving parts of a fully automated, end-to-end ML pipeline. To do this, we use an example project: a pipeline for training and deploying ML model versions.
Read the companion blog post in addition to watching this demo webinar.
0:00 — What is machine learning operations (ML Ops)?
9:58 — Why Nomad is a good choice for ML Ops
13:21 — Demo: Using Nomad for deploying ML model versions in a pipeline
33:18 — A look at the emerging trends in ML Ops
36:27 — Live Q&A
- Is the “lifecycle hook” scoped to a single group? That is, if it's a “prestart” hook, is it a task that runs before the other tasks in the group?
- How were the “parameters” sent down into the Nomad run?
- Does HashiCorp have published examples of Sentinel policies for Nomad?
- Is everything you showed today available in open source or are any of the features enterprise only?
- Is it possible to integrate with Datadog to track activity and changes with Nomad?
- If you do have a preference for which clients the Nomad jobs run on, how would you specify that within the job?