Learn about the pros and cons of using mono repositories and multi repositories along with the most logical use case for each.
A commonly asked question on HashiCorp forums, posts, blogs, and even at conferences or webinars is: Should my organization use a monolithic source repository (mono repo) or multiple source repositories (multi repo)? The short answer is this: It will depend on your organization and has everything to do with organizational patterns. This post will discuss the nuances of using each approach and when you should eventually break your mono repo into a multi repo.
For the purposes of this blog post, a mono repo keeps many Terraform configurations as separate directories in a single repository. By comparison, a multi repo approach organizes each Terraform configuration in a separate repository. What happens when you run terraform init? It uses go-getter to download all needed modules and, in essence, behaves like a mono repo. While Terraform supports local module references, it also handles the sourcing of remote modules, which lends itself well to a multi repo structure.
When we refer to mono repos, do we include application code and its infrastructure? In this blog, we focus on a mono repo for infrastructure components, such as networking, compute, or software as a service (SaaS) resources.
Mono repos work if you have a personal project or a smaller team, and you need visibility into all of the infrastructure you're creating and uniform access to your configurations. What kind of mono repo structure will help you get the most out of your collaboration efforts and infrastructure as code? In a mono repo, divide your modules into separate folders, each containing the smallest grouping of resources and their dependencies. For example, you create individual module folders for AWS Lambda (function), queue, and virtual network (vpc):
> tree my-company-functions
└── modules
    ├── function
    │   ├── main.tf // contains aws_iam_role, aws_lambda_function
    │   ├── outputs.tf
    │   └── variables.tf
    ├── queue
    │   ├── main.tf // contains aws_sqs_queue
    │   ├── outputs.tf
    │   └── variables.tf
    └── vpc
        ├── main.tf // contains aws_vpc, aws_subnet
        ├── outputs.tf
        └── variables.tf
To version modules, you can copy the module folder and append a version number to it. Otherwise, you might need to use some complex repository tagging to achieve versioning.
Then, separate environment configurations into individual folders per business domain, product, or team. The following example represents two business domains, one related to collecting document metadata and the other to translating documents, and two environments, production and staging.
> tree my-company-functions
├── modules
├── production
│   ├── document-metadata
│   │   └── main.tf
│   └── document-translate
│       └── main.tf
└── staging
    ├── document-metadata
    │   └── main.tf
    └── document-translate
        └── main.tf
The configurations for production and staging reference the modules directory to create the function, queue, and network. You can group these configurations per business domain, product, or team.
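As a rough sketch of what one of these environment configurations might look like, the following main.tf references the shared modules by relative path. The input names (cidr_block, name, queue_arn), the arn output on the queue module, and the function-v2 folder are all assumptions for illustration; the post does not show the modules' actual variables or outputs.

```hcl
# production/document-metadata/main.tf (illustrative sketch, not the post's actual code)

module "vpc" {
  # Relative path from production/document-metadata to the shared modules folder.
  source = "../../modules/vpc"

  # Hypothetical input; the real variables.tf is not shown in this post.
  cidr_block = "10.0.0.0/16"
}

module "queue" {
  source = "../../modules/queue"

  # Hypothetical input.
  name = "document-metadata-production"
}

module "function" {
  # Hypothetical versioned copy of the function module folder, following the
  # "copy the folder and append a version number" approach.
  source = "../../modules/function-v2"

  # Hypothetical inputs; queue_arn assumes the queue module exposes an "arn" output.
  name      = "document-metadata"
  queue_arn = module.queue.arn
}
```

The staging configuration would look almost identical, which is part of why a mono repo makes differences between environments easy to spot.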
To apply configuration changes, you must develop a continuous integration pipeline that detects differences in each subdirectory, changes into each affected directory, and applies the changes there individually.
If you find yourself spending more and more time maintaining build system logic to accommodate your infrastructure mono repo, you may want to break down your mono repo into multiple repositories.
A multi repo can better support granular access control and configuration changes. If you have a large team that collaborates on a complex infrastructure system, multiple source repositories allow you to localize changes and lessen the blast radius of failed infrastructure updates across the system. You can scope changes to the teams responsible for the infrastructure.
There are many approaches to organizing your multi repo. For example, you can divide each module into its own repository. In the case of the serverless function, queue, and network, you would create individual repositories for AWS Lambda (function), queue, and virtual network (network). Individual business units or products would reference these remote modules. Environments would be captured by subdirectories in each product or business repository.
Use release tagging to handle the versioning of modules. By separating the modules into their own source repositories, you can test them independently, allow dependent configurations to reference the module version, and update the provider version with the module version.
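One way this could look in practice, assuming the modules live in Git repositories: a product configuration pins a module to a release tag with the ref argument, and the module repository declares its own provider requirement so the provider version moves with the module release. The repository URL, tag, input name, and version constraint below are placeholders, not values from this post.

```hcl
# --- In a product repository, e.g. production/main.tf (illustrative sketch) ---

module "queue" {
  # Hypothetical repository URL. The ?ref= query selects a Git tag, branch, or commit.
  source = "git::https://example.com/my-company/terraform-aws-queue.git?ref=v1.2.0"

  # Hypothetical input.
  name = "document-metadata-production"
}

# --- In the queue module's own repository, e.g. versions.tf ---

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Assumed constraint; update it alongside each tagged module release.
      version = ">= 4.0"
    }
  }
}
```

Dependent configurations then upgrade by changing only the ref value, which keeps each upgrade to a small, reviewable change.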
You can further structure your multi repo with separate repositories for each business domain, product, or team. Use subdirectories within these repositories to separate environments, which offers visibility into configuration differences between environments. Since you use a repository for each business domain or product, continuous integration pipelines have fewer subdirectories to recursively check for changes. You can optimize your pipelines for each configuration or module.
There is no right or wrong answer when discussing the use of mono repos and multi repos. By taking a step back and observing different organizational patterns, we can determine which environment structure works best for us.
For more information on separating configurations for environments, take a look at the Terraform Recommended Practices documentation. For best practices on structuring code for Terraform Cloud workspaces, review our documentation on code organization and workspace structure. To restructure your Terraform for production, review our blog on refactoring configuration. For best practices and pitfalls in a large Terraform mono repo, check out lessons learned from Terraform at Google. To try a hands-on example of breaking up a mono repo into separate dev and prod environments with a module shared between them, follow the Learn tutorial, Separate Development and Production Environments.