Learn how to run a 3-node HashiCorp Vault cluster as a HashiCorp Nomad Job and automate the cluster initialization.
Part 1 of this blog series demonstrated how to deploy the infrastructure for a single Nomad server and four Nomad clients. In this installment, you will deploy and configure HashiCorp Vault as a Nomad job.
The diagram above gives an overview of how the Nomad jobs will be deployed and some of the Nomad features you will use to enable this. In total, three Nomad jobs will be deployed to a Vault cluster namespace, but this installment focuses on the Vault server cluster. Everything will be deployed using Terraform. In this installment, you will be working in the 2-nomad-configuration directory of the Vault on Nomad demo GitHub repository.
The infrastructure deployed using Terraform in part 1 of this blog series included some outputs that you need to deploy Vault as a Nomad job. These outputs are:
nomad_clients_private_ips
nomad_clients_public_ips
nomad_server_public_ip
terraform_management_token
To read the values of these outputs, you can use the Terraform remote state data source and then use them to configure the Terraform provider for Nomad.
data "terraform_remote_state" "tfc" { backend = "remote" config = { organization = "org name" workspaces = { name = "1-nomad-infrastructure" } }}
The code above points to the Terraform workspace from part 1 of the series, 1-nomad-infrastructure. This means you can access outputs from that workspace within this post's workspace, 2-nomad-configuration. The full code example of how you use this to configure the provider can be found here.
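As a minimal sketch of what that provider configuration might look like (the full example in the repository is authoritative), assuming Nomad's default API port of 4646 and using the outputs listed above:

provider "nomad" {
  # Nomad API address, built from the server IP exported by the part 1 workspace
  address = "http://${data.terraform_remote_state.tfc.outputs.nomad_server_public_ip}:4646"

  # Management ACL token exported by the part 1 workspace
  secret_id = data.terraform_remote_state.tfc.outputs.terraform_management_token
}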
Nomad has the concept of namespaces, which is a way to isolate jobs from one another. This allows you to create granular ACL policies specific to a namespace.
The first thing you need to do is create a namespace for all of your jobs related to Vault management. The code below creates a namespace named vault-cluster:
resource "nomad_namespace" "vault" { name = "vault-cluster" description = "Vault servers namespace"}
The next step is to write a Nomad job that deploys a 3-node Vault cluster running in the vault-servers node pool. Below is the complete Nomad jobspec:
job "vault-cluster" { namespace = "vault-cluster" datacenters = ["dc1"] type = "service" node_pool = "vault-servers" group "vault" { count = 3 constraint { attribute = "${node.class}" value = "vault-servers" } volume "vault_data" { type = "host" source = "vault_vol" read_only = false } network { mode = "host" port "api" { to = "8200" static = "8200" } port "cluster" { to = "8201" static = "8201" } } task "vault" { driver = "docker" volume_mount { volume = "vault_data" destination = "/vault/file" read_only = false } config { image = "hashicorp/vault:1.15" cap_add = ["ipc_lock"] ports = [ "api", "cluster" ] volumes = [ "local/config:/vault/config" ] command = "/bin/sh" args = [ "-c", "vault operator init -status; if [ $? -eq 2 ]; then echo 'Vault is not initialized, starting in server mode...'; vault server -config=/vault/config; else echo 'Vault is already initialized, starting in server mode...'; vault server -config=/vault/config; fi" ] } template { data = <<EOHui = true listener "tcp" { address = "[::]:8200" cluster_address = "[::]:8201" tls_disable = "true"} storage "raft" { path = "/vault/file" {{- range nomadService "vault" }} retry_join { leader_api_addr = "http://{{ .Address }}:{{ .Port }}" } {{- end }}} cluster_addr = "http://{{ env "NOMAD_IP_cluster" }}:8201"api_addr = "http://{{ env "NOMAD_IP_api" }}:8200" EOH destination = "local/config/config.hcl" change_mode = "noop" } service { name = "vault" port = "api" provider = "nomad" check { name = "vault-api-health-check" type = "http" path = "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204" interval = "10s" timeout = "2s" } } resources { cpu = 500 memory = 1024 } affinity { attribute = "${meta.node_id}" value = "${NOMAD_ALLOC_ID}" weight = 100 } } }}
There are three layers to the jobspec: the job, the groups, and the tasks. A job can contain one or more groups, and each group is a collection of tasks that will all run on the same client. A task is the individual unit of work within a group.
Let's break down the jobspec above:
The main specifications at the job layer are:
- datacenters: the job runs in the dc1 datacenter. To read more about regions and datacenters, see the Nomad architecture overview.
- type: Nomad has four job types: service, batch, sysbatch, and system. In this case you are using the service type because Vault should be a long-lived job that never goes down.

At the group layer you have the following specifications:
- count: the number of instances of the group to run. To deploy a 3-node cluster, this uses 3 as the value.
- constraint: restricts the job to clients in the vault-servers node class. You also created a vault-servers node pool and you can also constrain the job to that.
- network: Nomad supports four network modes: none, bridge, host, and Container Network Interface (CNI). In this case, you are running it using the host network mode, which means it will join the host's network namespace.
- ports: Vault's API runs on port 8200 and the cluster communications run on port 8201. The job configuration above names these ports, so you can reference them later.

The task level specifications include:
- volume_mount: mounts the vault_data host volume into the task at /vault/file. The read_only permission is set to false.
- cap_add: Vault needs the ipc_lock capability set to work. This capability is not allowed in Nomad by default and must be explicitly enabled in the client configuration file.
- ports: the api and cluster ports are attached to the task.
- template: iterates over the instances registered to the vault Nomad service and renders a retry_join stanza for each instance, using the .Address and .Port values it retrieves from the service to populate the stanza. More on the Nomad service registration in the next section.
- runtime environment variables: the cluster_addr and api_addr configuration values are partially set using these variables (NOMAD_IP_cluster and NOMAD_IP_api) to populate the IP address of the node.
- change_mode: defines what Nomad does when the rendered template changes. Here the noop option has been chosen. Other configuration options are signal, restart, and script.
- service: registers the task as a Nomad service named vault, using the api port specified at the group layer.
- check: health checks can be of type http and tcp. In this case, the health check is an API call to a Vault endpoint so http has been specified.
- health check path: by default, Vault returns a 501 for an uninitialized node, and a 503 for a sealed node. This would mark the service as unhealthy, so to get around that issue, there are a series of parameters that can be added to the path that ensure Vault responds with a 204 code for sealed and uninitialized nodes, and a 200 code for standby nodes, as shown in the example below.
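To see the effect of these parameters, you can query the health endpoint directly against any node (replace the placeholder with one of your Nomad client IPs); even a sealed or uninitialized node will answer with a non-error status code:

curl -i "http://<nomad-client-ip>:8200/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"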
In the task specifications above, I went through the Nomad templating configuration used to render the Vault configuration file. Here is what the rendered file looks like:
ui = true

listener "tcp" {
  address         = "[::]:8200"
  cluster_address = "[::]:8201"
  tls_disable     = "true"
}

storage "raft" {
  path = "/vault/file"

  retry_join {
    leader_api_addr = "http://10.0.101.190:8200"
  }

  retry_join {
    leader_api_addr = "http://10.0.101.233:8200"
  }

  retry_join {
    leader_api_addr = "http://10.0.101.162:8200"
  }
}

cluster_addr = "http://10.0.101.162:8201"
api_addr     = "http://10.0.101.162:8200"
The above snippet is the rendered Vault configuration file from the Nomad template. I'll break down the different configurations:
- listener: Vault supports two listener types, tcp and unix. The purpose of the listener is to tell Vault what address and ports to listen on for cluster operations and API call requests. TLS specifics for the listener can also be configured here; however, as this is a demo, I have not enabled TLS.
- retry_join: these stanzas tell Vault which addresses it should contact to attempt to join the cluster.

For a full set of Vault configuration options, see the server configuration documentation.
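As a side note, if you want to sanity-check a rendered configuration file, Vault ships an operator diagnose command. A quick sketch, assuming you run it from inside the container where the config is mounted:

# Run a series of checks against the rendered configuration
vault operator diagnose -config=/vault/config/config.hcl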
With the Nomad jobspec complete, Vault can now be deployed to Nomad. The Terraform code below will read the jobspec file and run it on Nomad:
resource "nomad_job" "vault" { jobspec = file("vault.nomad") depends_on = [ nomad_namespace.vault ]}
There is an explicit dependency on the namespace resource declared here to ensure Terraform does not try to deploy this job until the namespace is in place.
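Once Terraform has applied this resource, you can confirm the job and its three allocations from the Nomad CLI (assuming your Nomad address and token are set in the environment):

nomad job status -namespace=vault-cluster vault-cluster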
So far, the steps taken provide three Vault servers running on different VMs, but the cluster is not yet formed because Vault has not been initialized and unsealed. The first server to be initialized and unsealed will become the cluster leader, and the other two servers will join that cluster. The Terraform provider for Vault deliberately does not have the ability to perform these actions, so TerraCurl can be used instead. Now to initialize Vault:
resource "terracurl_request" "init" { method = "POST" name = "init" response_codes = [200] url = "http://${data.terraform_remote_state.tfc.outputs.nomad_clients_public_ips[0]}:8200/v1/sys/init" request_body = <<EOF{ "secret_shares": 3, "secret_threshold": 2}EOF max_retry = 7 retry_interval = 10 depends_on = [ nomad_job.vault ]}
This will make an API call to the first node in the pool to initialize it. A few things to note:
- TerraCurl will retry until it receives a 200 response to the call.
- The request is sent to Vault's /v1/sys/init endpoint.
- The request body asks for 3 secret shares with an unseal threshold of 2.

Vault will respond by returning 3 unseal keys and a root token.
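For reference, the response body from /v1/sys/init looks roughly like this (values shortened and purely illustrative):

{
  "keys": ["8f3c1d...", "27ab9e...", "c41f70..."],
  "keys_base64": ["jzwd...", "J6ue...", "xB9w..."],
  "root_token": "hvs.XXXXXXXXXXXX"
}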
Now that Vault has been initialized, the unseal keys and root token need to be stored as Nomad variables. Nomad variables are encrypted key/value pairs that can be used by Nomad jobs. They have full ACL support meaning that granular policies can be written to ensure that only authorized entities are able to access them.
resource "nomad_variable" "unseal" { path = "nomad/jobs/vault-unsealer" namespace = "vault-cluster" items = { key1 = jsondecode(terracurl_request.init.response).keys[0] key2 = jsondecode(terracurl_request.init.response).keys[1] key3 = jsondecode(terracurl_request.init.response).keys[2] }}
Because the variable is written to the path nomad/jobs/vault-unsealer, it is only accessible to the vault-unsealer job, and that job must be running within the vault-cluster namespace.
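As a preview of part 3, a job named vault-unsealer running in that namespace could read the variable from a template using the nomadVar function. The stanza below is a minimal sketch, not the actual unsealer implementation:

template {
  data = <<EOH
{{- with nomadVar "nomad/jobs/vault-unsealer" }}
UNSEAL_KEY_1={{ .key1 }}
UNSEAL_KEY_2={{ .key2 }}
UNSEAL_KEY_3={{ .key3 }}
{{- end }}
EOH

  # Render into the allocation's secrets directory and expose as environment variables
  destination = "secrets/unseal.env"
  env         = true
}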
This blog post has shown how to deploy a 3-node Vault cluster as a Nomad job. It has also taken a deep dive into Nomad, different aspects of the jobspec file including templates and runtime variables, and also Vault server configurations. Part 3 of this blog series will explore deploying some automation to assist in the day-to-day operations of Vault on Nomad.