Testing HashiCorp Terraform

Learn testing strategies for HashiCorp Terraform modules and configuration, and learn how to run tests against infrastructure.

How do you know if you can run terraform apply to your infrastructure without affecting critical business applications? You can terraform validate and terraform plan to check your configuration, but will that be enough? Whether you’ve updated some HashiCorp Terraform configuration or a new version of a module, you want to catch errors quickly before you apply any changes to production infrastructure. In this post, I’ll discuss some testing strategies for HashiCorp Terraform configuration and modules so that you can terraform apply with greater confidence. You’ll learn how infrastructure tests fit into your organization’s development practices, the differences in testing modules versus configuration, and approaches to manage the cost of testing.

I included a few testing examples with HashiCorp Sentinel. No matter which tool you use, you can generalize the approaches outlined in this post to your overall infrastructure testing strategy. In addition to the testing tools and approaches in this post, you can find other perspectives and examples in the references at the conclusion.

Ideally, your infrastructure testing strategy should align with the test pyramid, which groups tests by type, scope, and granularity. The higher up the pyramid you go, the fewer tests you should have at that level. Higher-level tests take more time and cost more to run because they create or configure resources.

Test pyramid for infrastructure testing

In reality, your tests may not perfectly align with the pyramid shape. The pyramid offers a common language to describe what area a test can cover to verify configuration and infrastructure resources. I’ll start at the bottom of the pyramid with unit tests and work my way up the pyramid to end-to-end tests. Manual testing involves spot-checking infrastructure for functionality and has a high cost in time and effort.

»Unit Tests

At the bottom of the pyramid, unit tests verify individual resources and configurations for expected values. They should answer the question, “Does my configuration or plan contain the correct metadata?” Traditionally, unit tests should run independently, without external resources or API calls.

You can use terraform fmt -check and terraform validate as rudimentary unit tests. For additional test coverage, you can use any programming language or testing tool to parse the Terraform configuration in HCL or JSON and check for statically defined parameters, such as provider attributes with defaults or hard-coded values. However, none of these tests verify correct variable interpolation, list iteration, or other configuration logic. As a result, I usually write additional unit tests to parse the plan representation instead of the Terraform configuration.
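A static check of this kind might look like the following sketch. It assumes the configuration is available in Terraform's JSON syntax (a .tf.json file) and looks for resources that lack a hard-coded tag. The resource names, the tag key, and the find_missing_tags helper are hypothetical and only illustrate the idea.

```python
import json

# Hypothetical configuration in Terraform's JSON syntax (.tf.json).
CONFIG = json.loads("""
{
  "resource": {
    "aws_lb_target_group": {
      "web": {"port": 80, "protocol": "HTTP", "tags": {"team": "platform"}},
      "api": {"port": 8080, "protocol": "HTTP"}
    }
  }
}
""")


def find_missing_tags(config, resource_type, required_tag):
    """Return sorted names of resources of resource_type without a literal required_tag."""
    resources = config.get("resource", {}).get(resource_type, {})
    missing = []
    for name, body in resources.items():
        tags = body.get("tags")
        # Only literal maps can be checked statically. An expression such as
        # "${var.tags}" is a plain string here, so it is reported as missing.
        if not isinstance(tags, dict) or required_tag not in tags:
            missing.append(name)
    return sorted(missing)
```

In practice you would load each .tf.json file with json.load instead of the inline string. Note the limitation mentioned above: a check like this only sees literal values, not the result of interpolation or iteration.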

Configuration parsing, terraform fmt -check, and terraform validate do not require active infrastructure resources or authentication to an infrastructure provider. Unit tests for the plan representation require Terraform to authenticate to your infrastructure provider and make comparisons. These types of tests overlap with security testing done as part of policy as code because you check attributes in Terraform configuration for the correct values.

For example, your Terraform configuration parses the IP address from an AWS instance’s DNS name and passes it to a target group for a load balancer. At a glance, you don’t know if it correctly replaces the hyphens and retrieves the IP address information.

locals {
 ip_addresses = toset([
   for service, service_data in var.services :
   replace(replace(split(".", service_data.node)[0], "ip-", ""), "-", ".") if service_data.kind == var.service_kind
 ])
}
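One way to pin down what this expression should produce is to mirror it in a few lines of Python and try it on sample data. The sketch below is only an illustration: the function names and the sample services are hypothetical, and the node names assume AWS-style private DNS names such as ip-10-0-1-15.ec2.internal.

```python
def node_to_ip(node):
    """Mirror the HCL expression: first DNS label, drop "ip-", hyphens to dots."""
    first_label = node.split(".")[0]
    return first_label.replace("ip-", "").replace("-", ".")


def ip_addresses(services, service_kind):
    """Mirror the locals block: the set of IPs for services of one kind."""
    return {
        node_to_ip(data["node"])
        for data in services.values()
        if data["kind"] == service_kind
    }


# Hypothetical sample input, shaped like a small subset of var.services.
SERVICES = {
    "web-1": {"node": "ip-10-0-1-15.ec2.internal", "kind": "web"},
    "web-2": {"node": "ip-10-0-2-7.us-west-2.compute.internal", "kind": "web"},
    "db-1": {"node": "ip-10-0-3-9.ec2.internal", "kind": "database"},
}
```

With this sample, only the two services whose kind matches contribute an address, and each address has dots where the node name had hyphens. A node name that does not follow the ip-a-b-c-d pattern would pass through in a mangled form, which is exactly the kind of mistake the unit tests in this section are meant to catch.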

Suppose you run terraform plan to check the IP address manually and continue to add more configuration to the module over time. Eventually, it takes too long to scroll through the planned changes to find your IP address. To solve this problem, write two unit tests with HashiCorp Sentinel to check parameters in the configuration’s plan and automate the IP address verification. One test checks that the target group does not use the default node address, and the other verifies that the target_id matches the format of an IP address.

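# aws_lb_target_group_attachments and consul_terraform_sync_service_node_addresses
# are collections defined elsewhere in the policy (not shown here).

# Rule 1: no attachment uses a raw node address as its target.
aws_lb_target_group_attachment_does_not_use_node_address = rule {
   all aws_lb_target_group_attachments as target_group_attachment {
       target_group_attachment.values.target_id not in
           consul_terraform_sync_service_node_addresses
   }
}
 
# Rule 2: every target_id has the dotted-quad format of an IP address.
aws_lb_target_group_attachment_has_ip_address = rule {
   all aws_lb_target_group_attachments as target_group_attachment {
       target_group_attachment.values.target_id matches
           "^\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}$"
   }
}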

If you do not use HashiCorp Sentinel, you can use your programming language or configuration testing tool of choice to parse the plan representation in JSON and verify your Terraform logic. Additionally, unit tests can validate:

  • Number of resources or attributes generated by for_each or count
  • Values generated by for expressions
  • Values generated by built-in functions
  • Dependencies between modules
  • Values associated with interpolated values
  • Expected variables or outputs marked as sensitive

Overall, unit tests run very quickly and provide rapid feedback. They also communicate the expected values of configuration across your team and organization. Since they run independently of infrastructure resources, unit tests have a virtually zero cost to run frequently.

»Contract Tests

At the next level from the bottom of the pyramid, contract tests check that a configuration using a Terraform module passes properly formatted inputs. Contract tests answer the question, “Does the expected input to the module match what I think I should pass to it?” Contract tests ensure that the contract between a Terraform configuration’s expected inputs to a module and the module’s actual inputs has not been broken. You can use the same testing framework as your unit tests to check that a Terraform configuration passes the right inputs to a module.

Instead of using a separate testing framework for contract tests, use a custom validation rule. For example, use a custom validation rule to ensure that an AWS load balancer’s listener rule receives a valid integer range for its priority.

variable "listener_rule_priority" {
 type        = number
 default     = 1
 description = "Priority of listener rule between 1 and 50000"
 validation {
   condition     = var.listener_rule_priority >= 1 && var.listener_rule_priority <= 50000
   error_message = "The priority of listener_rule must be between 1 and 50000."
 }
}

In addition to custom validation rules, you can use Terraform’s rich language syntax to validate variables with an object structure and check that the module receives the expected input. In the AWS load balancer case, add a map representing service objects and their expected attributes and type.

variable "services" {
 description = "Consul services monitored by Consul-Terraform-Sync"
 type = map(
   object({
     id        = string
     name      = string
     address   = string
     port      = number
     kind      = string
     meta      = map(string)
     tags      = list(string)
     namespace = string
     status    = string
 
     node                  = string
     node_id               = string
     node_address          = string
     node_datacenter       = string
     node_tagged_addresses = map(string)
     node_meta             = map(string)
   })
 )
}

Note: Consul Terraform Sync generates the services object outlined in the example.

Contract tests quickly catch misconfigurations to modules before applying them to live infrastructure resources. You can use them to check for correct identifier formats, naming standards, attribute types, and value constraints such as character limits or password requirements.
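If you prefer to keep contract tests in the same framework as your unit tests, one possible approach is to inspect the module call recorded in the configuration section of the JSON plan. The sketch below assumes a module call named listener_rule and reuses the 1 to 50000 priority range from the example above. The module name, source path, and helper functions are hypothetical.

```python
# Trimmed, hypothetical excerpt of the "configuration" section of a JSON plan.
PLAN = {
    "configuration": {
        "root_module": {
            "module_calls": {
                "listener_rule": {
                    "source": "./modules/listener-rule",
                    "expressions": {
                        "listener_rule_priority": {"constant_value": 100},
                        "services": {"references": ["var.services"]},
                    },
                }
            }
        }
    }
}


def module_inputs(plan, module_name):
    """Return the input expressions a configuration passes to one module call."""
    calls = plan["configuration"]["root_module"].get("module_calls", {})
    return calls[module_name].get("expressions", {})


def check_priority(inputs):
    """Return a list of contract violations for the listener_rule_priority input."""
    expr = inputs.get("listener_rule_priority")
    if expr is None or "constant_value" not in expr:
        # Either the module default applies or the value comes from references,
        # which this static check cannot evaluate.
        return []
    value = expr["constant_value"]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return [f"listener_rule_priority must be a number, got {value!r}"]
    if not 1 <= value <= 50000:
        return [f"listener_rule_priority must be between 1 and 50000, got {value}"]
    return []
```

A check like this only covers literal inputs. Inputs built from variables or other resources still rely on the module's own validation rules at plan time.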

Unit and contract tests may require extra time and effort to build, but they allow you to catch configuration errors before running terraform apply. For larger, more complex configurations with many resources, you should not manually check individual parameters. Instead, unit and contract tests quickly automate the verification of important configurations and set a foundation for collaboration across teams and organizations. Lower-level tests communicate system knowledge and expectations to teams that need to maintain and update Terraform configuration.

»Integration Tests

With lower-level tests, you do not need to create external resources to run them. The top half of the pyramid includes tests that require active infrastructure resources to run properly. Integration tests check that Terraform can create and configure the resources that a module or configuration defines. They answer the question, “Does this module or configuration create the resources successfully?” A terraform apply offers limited integration testing because it creates and configures resources while managing dependencies. You should write additional tests to check for configuration parameters on the active resource.

Should you verify every parameter that Terraform configures on a resource? You could, but it may not be the best use of your time and effort! Terraform providers include acceptance tests that verify resources properly create, update, and delete with the right configuration values. Instead, use integration tests to verify that Terraform outputs include the correct values or number of resources. They also test infrastructure configuration that can only be verified after a terraform apply, such as invalid configurations, nonconformant passwords, or results of for_each iteration.

Depending on your integration testing framework, you may need to write scripts or automation to terraform apply for test resources, run the tests, and terraform destroy the resources. Some frameworks, such as Terratest or kitchen-terraform, orchestrate this sequence for you. When choosing a framework, consider the existing integrations and languages within your organization. Integration tests help you determine whether or not to update your module version and ensure that the module runs without errors. Since you have to set up and tear down the resources, you will find that integration tests can take 15 minutes or more to complete depending on the resource! As a result, implement as much unit and contract testing as possible to fail quickly on wrong configurations instead of waiting for resources to create and delete.
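If your framework does not orchestrate the sequence for you, the apply, test, and destroy steps could be scripted roughly as follows. This is a minimal sketch, assuming the terraform CLI is on the PATH and that the configuration defines outputs to check. The function names and the injectable run parameter are my own choices for illustration, not part of any framework.

```python
import json
import subprocess


def terraform(args, workdir, run=subprocess.run):
    """Run one Terraform CLI command in workdir and return its stdout."""
    result = run(
        ["terraform", *args],
        cwd=workdir,
        check=True,  # raise if Terraform exits with a non-zero status
        capture_output=True,
        text=True,
    )
    return result.stdout


def run_integration_test(workdir, check_outputs, run=subprocess.run):
    """Apply the configuration, check its outputs, and always destroy afterwards."""
    terraform(["init", "-input=false"], workdir, run)
    try:
        terraform(["apply", "-auto-approve", "-input=false"], workdir, run)
        raw = terraform(["output", "-json"], workdir, run)
        # `terraform output -json` wraps each output in an object with a "value" key.
        outputs = {name: item["value"] for name, item in json.loads(raw).items()}
        check_outputs(outputs)
    finally:
        # Tear down even when apply or a check fails, to avoid orphaned resources.
        terraform(["destroy", "-auto-approve", "-input=false"], workdir, run)
```

A caller passes a function that asserts on the outputs, for example one that checks the number of target groups created. The run parameter exists only so that the sequence can be exercised with a stand-in for subprocess.run, without creating real resources.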

»End-to-End Tests

After you apply your Terraform changes to production, you need to know whether or not you’ve affected end-user functionality. End-to-end tests answer the question, “Can someone use the infrastructure system successfully?” For example, application developers should still be able to deploy to HashiCorp Nomad after you upgrade the version. Operations team members should still be able to examine system metrics in their monitoring tools. End-to-end tests can verify that changes did not break expected functionality. To check that you’ve upgraded Nomad properly, you can deploy a sample application, test the endpoint, and delete it from the cluster. To check that the monitoring tool has system metrics, you can check if it contains data from your system in the last five minutes.
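The two checks described above could be sketched in Python as follows. This is an illustration under assumptions: the endpoint URL and the timestamp of the newest metric sample would come from your own sample application and monitoring tool, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.request import urlopen


def endpoint_is_healthy(url, timeout=5.0):
    """Return True if an HTTP GET to url answers with a 2xx status."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def metrics_are_fresh(last_sample_at, now=None, max_age=timedelta(minutes=5)):
    """Return True if the newest metric sample is no older than max_age."""
    now = now or datetime.now(timezone.utc)
    return timedelta(0) <= now - last_sample_at <= max_age
```

Both functions return a boolean instead of raising, so a test runner or pipeline step can decide whether a failure should block a release or only raise an alert.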

You can write end-to-end tests in any programming language or framework. Frameworks like Terratest and kitchen-terraform can also be used for end-to-end tests. You can add an API call in kitchen-terraform to check an endpoint after creating infrastructure. I have also used both frameworks to provision virtual machines on AWS VPC networks and verify their connectivity as end-to-end tests for network configuration. End-to-end tests usually depend on an entire system, including networks, compute clusters, load balancers, and more. As a result, these tests usually run against long-lived development or production environments.

»Testing Terraform Modules

When you test Terraform modules, you want enough verification to ensure a new, stable release of the module for use across your organization. To ensure sufficient test coverage, write unit, contract, and integration tests for modules. A module delivery pipeline starts with a terraform plan and then runs unit tests (and if applicable, contract tests) to verify the expected Terraform resources and configurations. Then, run terraform apply and the integration tests to check that the module can still run without errors. After running integration tests, destroy the resources and release a new module version.

Pipeline for Terraform module testing

For a full example of testing a module in Terraform Cloud, refer to a module built for Consul Terraform Sync. The module uses a dedicated Terraform Cloud workspace with an attached Sentinel policy of its unit tests. The workspace uses a CLI-driven workflow since its integration tests have external dependencies. I manage the module’s release through a GitHub Actions workflow.

The workflow runs unit tests written in Sentinel against a Terraform Cloud workspace. The module contains contract tests in the form of variable validation, which will verify valid inputs for any configurations that depend on the module.

Passing unit tests with Sentinel policy applied to workspace

Upon merging the changes, my GitHub Actions workflow runs integration tests written in Terratest. They create a load balancer, listener rule, and target group to verify that the module configures additional listener rules and target groups. After the integration tests pass, I can tag and release a new version of the module.

Passing integration tests to create and delete module resources

Note: We have ongoing research for terraform test, which supports module acceptance testing. Check out the prototype.

When testing modules, consider the cost and test coverage of module tests. Conduct module tests in a different project or account so that you can independently track the cost of your module testing and ensure module resources do not overwrite environments. On occasion, you can omit integration tests because of their high financial and time cost. Spinning up databases and clusters can take half an hour or more. When you’re constantly pushing changes, you might even create multiple test instances! To manage the cost, run integration tests after merging feature branches and select the minimum number of resources you need to test the module. If possible, avoid creating entire systems.

Module testing applies mostly to immutable resources because of its create and delete sequence. The tests cannot accurately represent the end state of brownfield (existing) resources because they do not test updates. As a result, module testing provides confidence in the module’s successful usage but not necessarily in applying module updates to live infrastructure environments.

»Testing Terraform Configuration

Compared to modules, Terraform configuration applied to environments should include end-to-end tests to check for end-user functionality of infrastructure resources. Write unit, integration, and end-to-end tests for configuration of active environments. The unit tests do not need to cover the configuration in modules. Instead, focus on unit testing any configuration not associated with modules. Integration tests can check that changes successfully run in a long-lived development environment, and end-to-end tests verify the environment’s initial functionality. If you use feature branching, merge your changes and apply them to a production environment. In production, run end-to-end tests against the system to confirm system availability.

Pipeline for Terraform configuration testing

Failed changes to active environments will affect critical business systems. In its ideal form, a long-running development environment that accurately mimics production can help you catch potential problems. From a practical standpoint, you may not always have a development environment that fully replicates a production environment because of cost concerns and the difficulty of replicating user traffic. As a result, you usually run a scaled-down version of production to save cost. The differences between development and production will affect the outcome of your tests, so be aware of which tests matter most for flagging errors and which may be disruptive to run. Even if configuration tests have less accuracy in development, they can still catch a number of errors and help you practice applying and rolling back changes before production.

»Conclusion

Depending on your system’s cost and complexity, you can apply a variety of testing strategies to Terraform modules and configuration. I explained the different types of tests and how you can apply them to catching errors in Terraform configuration before production, and how to incorporate them into pipelines. Your Terraform testing strategy does not need to be a perfect test pyramid. At the very least, automate some tests to reduce the time you need to manually verify changes and check for errors before they reach production.

For a more comprehensive list of Terraform testing tools, check out this repository with a list of infrastructure testing tools. In addition to existing community tools, we’d love your feedback on our prototype for terraform test, which offers module acceptance testing.

Additional resources on practices and patterns for testing Terraform include:

To learn about using Sentinel on Terraform Cloud, review our tutorial on Learn.

Questions about this post? Add them to the community forum!

