Inspecting Nomad Events Using Grafana Loki

Watch this demo workflow that uses Vector to process the Nomad event stream, stores the events in a backend such as ClickHouse, S3, or a database (in this demo, Grafana Loki), and adds alerts with LogQL, Loki's PromQL-inspired query language.

Nomad provides an event stream that is helpful for debugging and analyzing the state of a Nomad cluster. Using the Nomad events sink, events can be collected from the stream, processed and enriched with metadata, and stored in Loki, ClickHouse, S3, or other compatible storage providers. The analytical capabilities of these backends give you a bird's-eye view of the cluster. Events can help debug the cluster state and alert operators about new deployments, failing allocations, node updates, and more.

What You'll Learn

In this talk, Karan Sharma will show how the Nomad events sink helps with this and cover some of the program's internals, including how it uses the Nomad Go client SDK to connect to the event stream. He'll explain how to configure Vector to process these events and ship them to ClickHouse, Loki, PostgreSQL, S3, and other destinations in a pipeline workflow. He'll also show how to query the events in Grafana Loki and set alerts with LogQL, Loki's PromQL-inspired query language, using real-world scenarios in which alerts fire when a deployment fails or a node becomes unhealthy, all driven by Nomad events.
