Cloud Compliance & Management with Terraform - Cost Estimation & Cost Policy

See Terraform Enterprise's cost estimation feature in action along with cost control policies implemented with Sentinel policy as code.

Speakers

  • Corrigan Neralich, Solutions Engineer, HashiCorp

Transcript

Hi, I'm Corrigan Neralich, a solutions engineer at HashiCorp. I'm here to talk about Sentinel, HashiCorp's policy-as-code tool that you can use to establish guardrails around your deployments within Terraform.

We've talked about some of the key uses of Sentinel, from requiring that all modules come from your private module registry, so that everyone adheres to best practice, to calling out to external systems, such as your own registry, to enforce use of the most recent version of a module.

These issues oftentimes are rooted in security. But another key use case is cost: How do I use this to effectively understand the cost of a given provisioning or set of resources that a given user is attempting to deploy? And how do I enforce behaviors accordingly?

What I will demonstrate today is how Sentinel can be used to do this, to control costs and be used as a means of establishing guardrails to prevent over-provisioning of resources.

What cost control policies look like

Here, in my repository shown on the screen, I have a set of Sentinel policies. I want to click into this one to show you how to restrict spend. And here's one that I want to drill a little bit deeper into, which is, How do I effectively control monthly costs?

This could just as easily be percentage of cost or cost over a different time period. In my case, I just want to limit the amount that any individual is allowed to spend in a given workspace. What I've done is set the limit at $500. I don't want anybody to be able to deploy a set of infrastructure that costs more than $500.
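
As a rough illustration of the logic behind a policy like this, here is a minimal Python sketch. It is not Sentinel syntax and not the policy shown in the demo. It assumes the cost estimate is available as a decimal string, and the names `MONTHLY_LIMIT`, `within_monthly_limit`, and `within_percent_increase` are hypothetical, chosen only for this example. The second function sketches the percentage-of-cost variation mentioned above.

```python
from decimal import Decimal

# The $500 limit comes from the talk; everything else here is illustrative.
MONTHLY_LIMIT = Decimal("500")


def within_monthly_limit(proposed_monthly_cost: str,
                         limit: Decimal = MONTHLY_LIMIT) -> bool:
    """Pass when the estimated monthly cost does not exceed the limit."""
    return Decimal(proposed_monthly_cost) <= limit


def within_percent_increase(prior_monthly_cost: str,
                            proposed_monthly_cost: str,
                            max_percent: Decimal) -> bool:
    """Pass when the cost grows by no more than max_percent over the prior cost."""
    prior = Decimal(prior_monthly_cost)
    proposed = Decimal(proposed_monthly_cost)
    if prior == 0:
        # No baseline to compare against: only a zero-cost change passes.
        return proposed == 0
    increase = (proposed - prior) / prior * 100
    return increase <= max_percent
```

Using `Decimal` rather than floating point avoids rounding surprises when comparing currency amounts against a hard limit.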

How does this work? What this is doing is utilizing 2 components within the Terraform Enterprise solution.

The first is a neat new feature called "cost estimation." What this represents is a callout, through the APIs, to any of the various cloud platforms, AWS, GCP, Azure. What this is doing is, when your plan is running, it's calling out, and based on the resources contained within your configuration files, it's producing an estimated monthly cost for those resources.

At a glance, you can see, before you've ever provisioned, exactly what that might cost. And you can start to craft policies around that.

How cost estimation works

For this example, I've attached a workspace to a source repository that invokes the same module that I've shown previously, which deploys a large cluster in my test environment. To give you a sense of what this might look like, here is that same repository that I've shown you before.

Here is my cost-check repository. It is invoking a module from my private module registry. And it's also invoking the latest version. Each of those is its own Sentinel check, which will be evaluated at the time I do the deployment.

Now, if I hop back into Terraform Enterprise and I access the workspace that's tied to this specific module in my repository, and I attempt to run a test, you'll see Sentinel in action.

As always when I queue a run, you'll see the plan. And you'll start to see those familiar outputs that come with using Terraform.

In this case the plan is presenting what amounts to a dry run. It outputs what Terraform intends to build based on my configuration file definitions. And it gives you an opportunity to confirm that this is indeed what you intended to build, before it's ever built in your environment.

Demonstrating cost estimation

To demonstrate this, I'm going to show you what a run looks like in a workspace where cost estimation is enabled and where policies are evaluated that put limits on the amount of spend allowed for this workspace.

The first thing that I will do within my organization is enable cost estimation. It's as simple as checking this box to enable this feature within your organization.

Now that I've updated and enabled this feature, I'm going to go into this workspace, which is connected to my repository, and I'm going to trigger a new run.

When I'm queuing this up, you'll notice some familiar components. First is the plan. You'll notice the plan outputs here. This is essentially a dry run, where Terraform lets you know what it intends to build based on what resources you've defined in the corresponding and connected configuration file.

This is the same module I've shown before, which is going to go build a Vault, Consul, and Nomad cluster in my environment.

As it's running through this, it's just spitting out all of the intended builds. This is the resource graph that it intends to build.

But you'll notice a new tab here, cost estimation. What this represents is a live callout through the APIs to this cloud platform to provide a real-time estimate, during this plan, of what these resources are going to cost.

If I drill in there, you'll notice it provides an itemized list of cost by resource. You can see the overall cost in my case is going to be over $600. And if you recall, my Sentinel policy that I created and applied to this workspace limited that cost to $500. And so, because it exceeded that amount, it's failed a Sentinel policy check.
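
The check described here, an itemized estimate summed and compared against the limit, can be sketched in Python as follows. This is an illustration under stated assumptions, not how Terraform Enterprise implements it: the resource names and per-resource figures are invented (the talk only says the total was over $600), and `check_cost_policy` is a hypothetical helper.

```python
from decimal import Decimal

# Hypothetical itemized estimate: resource name -> estimated monthly cost.
# These names and figures are invented for illustration, not taken from the demo.
ITEMIZED_ESTIMATE = {
    "cluster_server_1": Decimal("204.00"),
    "cluster_server_2": Decimal("204.00"),
    "cluster_server_3": Decimal("204.00"),
}

# The $500 limit is the one stated in the talk.
LIMIT = Decimal("500")


def check_cost_policy(itemized, limit):
    """Sum the per-resource estimates and compare the total to the limit.

    Returns (total, passed, message) so the caller can show the user why
    the check passed or failed.
    """
    total = sum(itemized.values(), Decimal("0"))
    passed = total <= limit
    if passed:
        message = f"Estimated monthly cost ${total} is within the ${limit} limit."
    else:
        message = f"Estimated monthly cost ${total} exceeds the ${limit} limit."
    return total, passed, message


if __name__ == "__main__":
    total, passed, message = check_cost_policy(ITEMIZED_ESTIMATE, LIMIT)
    print(message)
```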

In this case, it provides a useful message to the user to let that user know that that policy has failed and that it is not in fact approved.

Making a soft policy

However, this touches on another component of Sentinel. In many cases there are legitimate reasons why a developer or a user or a given project might need to exceed this cost.

So this can be used as a soft check. In this case, the run has passed the requirement to use only pre-approved modules from my registry, and it has passed the Sentinel check that makes sure this is the latest version of that module.

And even though it's failed the proposed cost check, you'll see that there's an "Override and continue" option available. So in this case, an administrator could override this failure and allow this deployment to continue, being aware of the fact that this exceeds that limit that they had imposed.

Alternatively, if this is not approved, a message can be left for a historical record as to why it's not approved. This just provides a very robust historical audit log of exactly what change was intended, what policy failed, and your administrator's comments as to whether or not they allowed it to continue and why.

In my case, I'm going to go ahead and discard this, because I, as an administrator, am not approving this cost. It just makes it very easy to surface this to my organization. I can communicate with the developers or the individual that attempted to deploy this. And we can make sure that they update their configuration or leverage a module that's going to stay within the bounds of their approved budget.
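
The soft-check behavior and the audit trail described in this section can be modeled with a short Python sketch. The three enforcement level names (advisory, soft-mandatory, hard-mandatory) are the ones Sentinel policies use in Terraform. Everything else is an assumption made for illustration: the policy names, the outcome strings, and the `AuditEntry` record are hypothetical and do not reflect the product's internal design.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Enforcement(Enum):
    ADVISORY = "advisory"              # failure is reported, run continues
    SOFT_MANDATORY = "soft-mandatory"  # failure can be overridden
    HARD_MANDATORY = "hard-mandatory"  # failure cannot be overridden


@dataclass
class PolicyResult:
    name: str
    passed: bool
    enforcement: Enforcement


def run_outcome(results):
    """Return 'proceed', 'needs-override', or 'blocked' for a set of results."""
    failed = [r for r in results if not r.passed]
    if any(r.enforcement is Enforcement.HARD_MANDATORY for r in failed):
        return "blocked"
    if any(r.enforcement is Enforcement.SOFT_MANDATORY for r in failed):
        return "needs-override"
    return "proceed"


@dataclass
class AuditEntry:
    policy: str
    decision: str  # "override" or "discard"
    comment: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


def record_decision(log, policy, decision, comment):
    """Append the administrator's decision and comment to the audit log."""
    if decision not in ("override", "discard"):
        raise ValueError(f"unknown decision: {decision}")
    entry = AuditEntry(policy=policy, decision=decision, comment=comment)
    log.append(entry)
    return entry


if __name__ == "__main__":
    # Hypothetical policy names mirroring the three checks in the demo.
    results = [
        PolicyResult("private-registry-only", True, Enforcement.HARD_MANDATORY),
        PolicyResult("latest-module-version", True, Enforcement.HARD_MANDATORY),
        PolicyResult("limit-monthly-cost", False, Enforcement.SOFT_MANDATORY),
    ]
    audit_log = []
    if run_outcome(results) == "needs-override":
        record_decision(audit_log, "limit-monthly-cost", "discard",
                        "Exceeds the approved monthly budget.")
    print(audit_log)
```

Keeping the decision and the comment together in one record is what makes the history useful later: it captures which policy failed, what the administrator chose, and why.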
