Demo

Automating Chaos Engineering GameDays with Terraform

Watch this demo of a chaos engineering toolbox implemented with Go and Terraform.

Hope is definitely not a strategy. Designing systems and provisioning infrastructure for failure is essential when building distributed systems. However, designing for failure without verifying that the systems actually handle it is just another form of hope. Chaos engineering is the discipline of experimenting on a software system in production to build confidence in its resilience. The discipline is still at the innovator stage of adoption, and many developers are not familiar with its foundations, motivations, practices, and principles.

What Are Chaos Gamedays?

Chaos Gamedays are an ideal way to introduce engineering teams to chaos engineering. They are a useful tool for building confidence in a system's resilience. They involve forcing certain failure scenarios in production to verify that assumptions about fault tolerance match reality. Although Gamedays offer companies a valuable return on investment, many of the activities involved have to be run manually. Examples include installing the chaos tools in the infrastructure, creating communication channels, and documenting observations during the experiments.

Terraform and Chaos Engineering

HashiCorp Terraform is an open source tool that enables teams to define infrastructure as code. It codifies APIs into declarative configuration files that can be treated as code and shared amongst team members. Custom Terraform providers are an excellent option to automate the steps involved in the execution of Chaos Gamedays.

What You'll Learn

In this talk, Yury Yineth Niño Roa will explain chaos engineering and Chaos Gamedays. Learn why Chaos Gamedays are essential to adopting a chaos maturity model. This talk will demo a chaos engineering toolbox implemented with Go and Terraform that automates the steps involved in running Chaos Gamedays. Finally, there will be a review of the results, conclusions, and challenges identified while using the toolbox.

Don't forget to check out this classic HashiConf talk on Production ChaosMonkey with Terraform as well.

Slides

You can find the slides for this presentation on Speaker Deck.
