Announcing HashiCorp Vault Resource Quotas

With Vault 1.5, we added a new feature called Resource Quotas, which protects your Vault environment's stability and keeps resource consumption predictable in the face of runaway applications, through the use of request rate limiting and counters.

Justin Weissig

Vault

Jul 21, 2020

A common request we have had with HashiCorp Vault is how to better protect against distributed denial of service (DDoS) attacks. With Vault 1.5, we added a new feature called Resource Quotas, which protects your Vault environment's stability and keeps resource consumption predictable in the face of runaway applications, through the use of request rate limiting and counters.

Vault operators can now use Resource Quotas to control how applications request resources from Vault, through the use of:

Rate Limit Quotas (All versions): Allow operators to specify requests-per-second quotas. Rate limit quotas apply to every node in the Vault cluster, meaning each node maintains its own counters to enforce rate limits. If the rate limit quota is hit on any of the nodes in the Vault cluster, additional requests will be rejected for all clients with an HTTP status code of “429 Too Many Requests”.
Lease Count Quotas (Enterprise Only): Allow operators to specify lease count quotas. If the number of leases in the cluster hits the configured quota limit, additional lease creations will be forbidden for all clients until a lease has been revoked or has expired.
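To make the "separate counters per node" point concrete, here is a minimal sketch in Python. It is an illustrative model only: the `NodeRateLimiter` class, the fixed one-second window, and the node names are assumptions made for this example, and this post does not describe the algorithm Vault uses internally.

```python
from dataclasses import dataclass


@dataclass
class NodeRateLimiter:
    """Illustrative per-node limiter (assumed model): a fixed one-second window counter."""
    rate: int          # allowed requests per second on this node
    window: int = -1   # the one-second window currently being counted
    count: int = 0     # requests accepted in that window

    def allow(self, now: float) -> int:
        """Return the HTTP status a request arriving at time `now` would get."""
        current = int(now)
        if current != self.window:   # a new second started, reset the counter
            self.window, self.count = current, 0
        if self.count >= self.rate:
            return 429               # Too Many Requests
        self.count += 1
        return 200


# Hypothetical two-node cluster with the same quota on each node (rate=3 for brevity).
cluster = {"node-a": NodeRateLimiter(rate=3), "node-b": NodeRateLimiter(rate=3)}

# Five requests hit node-a within the same second; node-b sees only one.
node_a_statuses = [cluster["node-a"].allow(now=10.0 + i * 0.1) for i in range(5)]
node_b_statuses = [cluster["node-b"].allow(now=10.2)]

print(node_a_statuses)  # [200, 200, 200, 429, 429]
print(node_b_statuses)  # [200]
```

Because each node counts on its own, node-b keeps serving requests while node-a is already rejecting them.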

Resource Quotas allow you to protect your Vault environment from misbehaving applications that might inadvertently saturate resources in the Vault cluster through high request rates. By rejecting requests over a set rate, we can maintain the overall health of Vault.

»How Resource Quotas Work

To learn more about how this works, let's look at an example of setting a global rate limit quota via the sys/quotas/rate-limit/<name> endpoint. We can write the desired request rate using the following command:

$ vault write sys/quotas/rate-limit/global-rate rate=500

With the rate set to 500, a client may send requests at up to 500 per second. To learn more about these options, please see our documentation.

To verify that things are working as expected, let’s read back the “global-rate” quota with the following command:

$ vault read sys/quotas/rate-limit/global-rate

Key      Value
---      -----
name     global-rate
path     n/a
rate     500
type     rate-limit
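The same write can also be expressed as a plain HTTP request against the sys/quotas/rate-limit/<name> endpoint. The sketch below only builds the request and does not send it, so it runs without a Vault server. The address, the placeholder token, and the `build_rate_limit_request` helper are assumptions made for this example.

```python
import json
import urllib.request

VAULT_ADDR = "http://127.0.0.1:8200"  # assumed local address, adjust for your cluster
VAULT_TOKEN = "example-token"         # placeholder, not a real token


def build_rate_limit_request(name: str, rate: float) -> urllib.request.Request:
    """Build (but do not send) the HTTP form of `vault write sys/quotas/rate-limit/<name>`."""
    return urllib.request.Request(
        url=f"{VAULT_ADDR}/v1/sys/quotas/rate-limit/{name}",
        data=json.dumps({"rate": rate}).encode("utf-8"),
        headers={"X-Vault-Token": VAULT_TOKEN, "Content-Type": "application/json"},
        method="PUT",
    )


req = build_rate_limit_request("global-rate", 500)
print(req.get_method(), req.full_url)
print(req.data.decode())
# To apply it against a running Vault, pass it to urllib.request.urlopen(req).
```

Keeping request construction separate from sending makes it easy to inspect exactly what would go over the wire before touching a real cluster.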

Now, let's say you have a web application that fetches an API key from Vault every so often (not even close to our rate limit). However, the application runs into an error, gets into a strange state, and starts requesting the secret at a rate above our rate limit. Vault rejects the requests that exceed the limit. Ultimately, this protects Vault and ensures that we have a healthy cluster for everyone else, even though this application is misbehaving.

On the application side, you will see an error message that looks something like the following, where we return an HTTP status code of “429 Too Many Requests” when the rate limit is hit.

Error writing data to kv/webapp/apikey: Error making API request.

URL: PUT http://127.0.0.1:8200/v1/kv/webapp/apikey
Code: 429. Errors:

 request path "kv/webapp/apikey": rate limit quota exceeded
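A well-behaved client can treat a 429 as a signal to slow down rather than retry immediately. The following is a minimal sketch of exponential backoff, not an official client pattern: `request_with_backoff` and its parameters are hypothetical, and the `send` callable stands in for whatever function your application uses to call Vault.

```python
import time
from typing import Callable, Tuple


def request_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns something other than 429 or attempts run out."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        sleep(base_delay * 2 ** (attempt - 1))  # wait 0.5s, 1s, 2s, ...
        status, body = send()
    return status, body


# Stand-in for a real Vault call: rejected twice, then accepted.
responses = iter([
    (429, "rate limit quota exceeded"),
    (429, "rate limit quota exceeded"),
    (200, "ok"),
])
delays = []  # record the waits instead of actually sleeping
result = request_with_backoff(lambda: next(responses), sleep=delays.append)

print(result)  # (200, 'ok')
print(delays)  # [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the example fast to run; in an application you would leave the default `time.sleep` in place.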

We have many more example use cases, spanning both Open-Source and Enterprise, in our detailed Learn guide, as well as our documentation.

»Monitoring Resource Quotas

Requests that are rejected due to rate limit quota rule violations can be surfaced in a few different places. A client that makes a request and bumps up against the quota will receive an error message as demonstrated above. However, operations staff likely also want to know a client request was rejected due to rate limiting, as this might lead to service interruptions and further debugging of the situation.

Operations staff have several options when it comes to monitoring Resource Quotas. If audit logging of requests is enabled, you can detect when requests were rejected due to a rate limit quota rule violation. Please note that requests rejected due to a rate limit violation are not logged when audit logging is turned off. The following is an example of what an audit logged event looks like on the server side.

{
  "time": "2020-07-17T05:40:54.733026Z",
  "type": "request",
  "auth": {
    "token_type": "default"
  },
  "request": {
    "id": "f15a1a00-c4cb-d479-ed74-f91a2ec233ac",
    "operation": "update",
    "namespace": {
      "id": "root"
    },
    "path": "kv/webapp/apikey",
    "data": {
      "data": {
        "pasword": "hmac-sha256:35a720a99d99595899663838b5d2d6d9039f78ea7d7bbef2a2cfd11717c083cc"
      },
      "options": {}
    },
    "remote_address": "127.0.0.1"
  },
  "error": "request path \"kv/webapp/apikey\": rate limit quota exceeded"
}
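Once events like this are in the audit log, finding the affected paths is a matter of filtering on the error text. The sketch below assumes the audit device writes one JSON object per line and that rejected requests carry the "rate limit quota exceeded" message shown above. The `count_rate_limit_rejections` helper and the sample entries are made up for this example.

```python
import json
from collections import Counter


def count_rate_limit_rejections(lines):
    """Count audit entries per request path whose error mentions a rate limit quota."""
    rejected = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        if "rate limit quota exceeded" in entry.get("error", ""):
            rejected[entry.get("request", {}).get("path", "unknown")] += 1
    return rejected


# Hypothetical audit entries, trimmed to the fields this sketch reads.
rejected_entry = {
    "type": "request",
    "request": {"path": "kv/webapp/apikey"},
    "error": 'request path "kv/webapp/apikey": rate limit quota exceeded',
}
ok_entry = {"type": "request", "request": {"path": "kv/webapp/apikey"}}
sample_log = [json.dumps(rejected_entry), json.dumps(ok_entry), json.dumps(rejected_entry)]

rejections = count_rate_limit_rejections(sample_log)
print(dict(rejections))  # {'kv/webapp/apikey': 2}
```

In practice you would pass an open file handle for the audit log instead of `sample_log`, since the function only needs an iterable of lines.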

Another option is to use our enhanced telemetry Resource Quota Metrics to monitor, visualize, and potentially alert on these types of events. The table below outlines the newly added telemetry metrics which can be useful for monitoring.

Metric	Description	Unit	Type
quota.rate_limit.violation	Total number of rate limit quota violations	quota	counter
quota.lease_count.violation	Total number of lease count quota violations	quota	counter
quota.lease_count.max	Total maximum amount of leases allowed by the lease count quota	lease	gauge
quota.lease_count.counter	Total current amount of leases generated by the lease count quota	lease	gauge
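As a rough idea of how these metrics could feed an alert, the sketch below derives two signals: how much of the lease count quota is in use, and whether new rate limit violations appeared between two samples. The sample values, the 80% threshold, and both helper functions are assumptions made for this example. How your metrics backend reports counters (cumulative or per interval) may differ.

```python
def lease_quota_utilization(counter: float, maximum: float) -> float:
    """Fraction of the lease count quota in use (0.0 when no maximum is reported)."""
    return counter / maximum if maximum > 0 else 0.0


def new_violations(previous: int, current: int) -> int:
    """Violations since the last sample; a drop is treated as a counter reset."""
    return current - previous if current >= previous else current


# Hypothetical values read from a metrics backend.
samples = {"quota.lease_count.counter": 450, "quota.lease_count.max": 500}

utilization = lease_quota_utilization(
    samples["quota.lease_count.counter"], samples["quota.lease_count.max"]
)
# Two hypothetical readings of quota.rate_limit.violation, taken one scrape apart.
violations = new_violations(previous=12, current=15)
should_alert = utilization >= 0.8 or violations > 0

print(utilization, violations, should_alert)  # 0.9 3 True
```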

For more information on Telemetry, please see our documentation.

»Next Steps

For more information on Resource Quotas, see our Learn Guide or the documentation. Also, if you enjoy playing around with this type of stuff, maybe you’d be interested in working at HashiCorp too since we’re hiring!