This is a guest post by Grant Joy, Senior DevOps Engineer at Distil Networks.
Distil Networks blocks bots. We protect websites from attacks like site scraping, ticket sniping, and click fraud. Founded in 2011 by three good friends, the company has an atmosphere of friendship and hard work. Distil has over 150 employees and five offices worldwide. Our U.S. offices are in California, North Carolina, and Virginia; our international offices are in London, England, and Stockholm, Sweden. We recently partnered with Verizon to rapidly expand our content delivery network.
We have a lot of secrets: database passwords, certificates, and private keys. We take the job of protecting them seriously. Shipping our code with speed and reliability in mind is essential, so we needed a secret storage system with high availability built in.
HashiCorp released Vault in 2015, around the time we were looking for a secrets solution. Given how new the product was, we relied on help from the Vault Google group. Later that year, our operations team put a Vault cluster in place with HashiCorp Consul as the backend.
We run the cluster on OpenStack, with one internal API handling the majority of transactions. Our Ops team has direct access to the backend through the Vault CLI, while other teams have access only to specific Vault secrets. The internal API sits behind our firewall, and our public-facing API is allowed to connect to it. This separation keeps our more sensitive procedures away from the public-facing API, which helps with both security and organization. Vault itself runs completely locked down to our private network, with the internal API holding the majority of the logical code, such as generating certificates.
» write a secret
vault write secret/file @file.txt
» read a secret back
vault read secret/file
Vault Behind HAProxy
Write traffic directed to the Vault cluster must reach the leader node. One way to handle this is Consul's DNS interface, which lets Consul manage all traffic routing to the leader: any request that lands on a non-leader node is redirected back through Consul, and the connection retries against the cluster's advertise address until it reaches the leader. In our case, Distil wanted to keep using our existing DNS service rather than take on Consul DNS. Instead, we put HAProxy, a lightweight load balancer, in front of Vault and rely on its built-in HTTP health checks.
These health checks query each Vault node for /sys/leader and route traffic to whichever node responds as the leader. They are also useful for notifying our operations team if something is wrong with a server. We show a sample health check below.
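As an illustration, an HAProxy backend along these lines routes traffic only to the active node. The server names, addresses, and port here are placeholders rather than our production config, and the match pattern assumes the JSON shape Vault's /v1/sys/leader endpoint returns, where only the leader reports "is_self":true:

```haproxy
backend vault
    mode http
    # Ask each node whether it is the leader; only the active node's
    # /v1/sys/leader response contains "is_self":true.
    option httpchk GET /v1/sys/leader
    http-check expect rstring \"is_self\":true
    # Nodes failing the check are marked down, so only the leader receives traffic.
    server vault1 10.0.0.11:8200 check
    server vault2 10.0.0.12:8200 check
    server vault3 10.0.0.13:8200 check
```

Standby nodes fail the check and are taken out of rotation; when leadership moves, the checks flip and HAProxy follows automatically.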
Distil has had this setup running in production for over a year without any issues. We often restart individual machines for system updates; the leader switches over automatically, and every node rejoins the cluster once its restart completes.
While this method worked best for our use case, you might look at Consul DNS before traveling down this path: the load balancer itself is a potential single point of failure.
Vault Behind an API
» later on...
secret_returned = Vault.logical.read secret_path
puts secret_returned # prints key
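The read call hands back the stored data as a key/value hash. As a standalone sketch of unpacking such a response, here is the JSON shape Vault's HTTP API returns for a KV read, using a stubbed body instead of a live server (the path and field names are hypothetical, and we assume a KV version 1 mount):

```ruby
require "json"

# Stubbed response body for GET /v1/secret/myapp (hypothetical path, KV v1 assumed).
response_body = '{"data":{"password":"hunter2"}}'

# The Ruby vault gem exposes this same data as a hash on the returned secret,
# so pulling a field out looks much like this:
data = JSON.parse(response_body, symbolize_names: true)[:data]
puts data[:password]
```

Against a real server, `Vault.logical.read` does the HTTP call and JSON parsing for you; the point here is only the shape of what comes back.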
In Distil’s setup, we ended up writing a Ruby command-line application using the Commander gem. We bundled it and deployed it to our internal Gem in a Box server, which makes it easy for developers to update the tool and ship new versions to users.
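A stripped-down sketch of such a wrapper, using the stdlib OptionParser here instead of Commander so it runs standalone (the command name and flag are made up for illustration):

```ruby
require "optparse"

# Parse a hypothetical `--path` flag for a secret-reading subcommand.
options = { path: "secret/file" }
OptionParser.new do |opts|
  opts.banner = "Usage: distil-vault read [--path PATH]"
  opts.on("--path PATH", "Vault path to read") { |p| options[:path] = p }
end.parse!(["--path", "secret/demo"])

# In the real tool, this path would be handed to Vault.logical.read.
puts options[:path]
```

Packaging the wrapper as a gem, as we did, means a developer fixing a flag or adding a subcommand only has to push a new version to the internal gem server.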
When we started using Vault, it was as a tool to hold a very specific set of secrets. As time went on, we found that it was really easy to integrate it further into other areas we hadn’t expected. We now use it for storing environment variables for applications, storing Let’s Encrypt keys and certificates, and even for random passwords around the office.
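For the environment-variable case, the idea is simply to render a secret's key/value pairs into assignments an application can consume (the data here is made up):

```ruby
# Hypothetical secret data holding an application's environment.
secret_data = { "DB_HOST" => "db.internal", "DB_PASSWORD" => "s3cret" }

# Render each pair as a shell-style export line.
exports = secret_data.map { |k, v| "export #{k}=#{v}" }
puts exports
```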