Learn the Day 0, Day 1 and Day 2 activities around infrastructure provisioning and get strategies for dynamically updating infrastructure using HashiCorp Terraform and Consul-Terraform-Sync.
Speakers: Devarshi Shah and Sabeen Syed
Hi, all. Welcome to HashiConf EU. I hope that you all are having a good time so far.
My name is Sabeen Syed. I am a senior engineering manager here at HashiCorp, working on Consul.
I am joined today by my colleague, Devarshi Shah, who is a senior product manager, also working on Consul.
We're going to talk to you about infrastructure automation.
We've all been asked, either directly or indirectly, to release faster. We all have customers who have high expectations, and we want to meet those expectations or beat those expectations.
A major bottleneck is the ability to quickly provision infrastructure and update the infrastructure that our solutions use.
Let me walk you through some ideal scenarios where we think that infrastructure automation can benefit.
The first is around multi-tiered applications.
Suppose you are using a multi-tiered application and run into a bug or an issue in production. Wouldn't it be great to be able to take your production infrastructure and recreate that in a test environment?
And, after you figured out that bug, to then be able to destroy and clean out that test environment.
Another scenario is, if you are using a network device, something like a load balancer or a firewall, wouldn't it be great if you could dynamically update those devices without having to go through a ticketing system?
A ticketing system means you have to wait for someone to check your ticket and then to manually update the device.
This is where we feel that infrastructure automation can benefit the most.
Let's take it from the top.
This is the current landscape. A developer comes and builds their application. After that, when they want to deploy their application, this is where the bottlenecks surface.
Manually having to provision and update your infrastructure doesn't work at the pace of developers. Neither does it help us to release quickly.
So how do we tackle this?
Before I get into the day to day, I'd like to walk you through how HashiCorp approaches infrastructure automation.
The first is around flexibility and a declarative style. This is where infrastructure as code comes in. Infrastructure as code basically means that you are able to manage your infrastructure using configuration files.
The second is reducing time to value. HashiCorp does this by using automation by default. This helps our customers to deploy faster and therefore iterate faster and innovate faster.
The third is consistency. We like to use the same tools and workflows to make it seamless for our customers.
The slide on screen shows the first 2 phases of the infrastructure lifecycle: planning and delivery.
In these 2 phases, we define what the requirements are, we design what our infrastructure is going to look like, and then we build that out.
These map to the Day 0 and Day 1 activities. HashiCorp addresses this using Terraform.
The third phase of the infrastructure life cycle is operation. This is where we focus on maintaining, optimizing, and updating our infrastructure.
This is done in the Day 2 activities. We apply the network infrastructure automation use case here, and the Consul team has developed a tool called Consul-Terraform-Sync.
As you can tell by the name, it uses Consul and Terraform in conjunction to solve for the use case of network infrastructure automation.
Let's go a little deeper into the Day 0 and Day 1 activities.
This is where we're using Terraform to provision our infrastructure. Terraform allows you to provision servers, databases, cloud resources, network devices, and much more.
This diagram on screen shows all the Terraform providers that we have for some of our network devices. Examples are A10, F5, Palo Alto, Checkpoint, Cisco. And we have more in the pipeline.
Terraform carries out immutable deployments. This means that whenever there's a change, there will be a new deployment.
This makes it easier to figure out what software is running on our infrastructure. It also makes it trivial for us to be able to deploy a previous version of that software.
Terraform is declarative in nature, which means that you can specify your desired end state of the infrastructure.
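To make the declarative idea concrete, here is a minimal Python sketch of comparing a current state with a desired end state and listing the actions needed to get there. It is only a conceptual illustration, not Terraform's actual planning algorithm; the `plan` function and the resource names are hypothetical.

```python
def plan(current, desired):
    """Compare two {resource_name: settings} mappings and list the needed actions.

    Hypothetical helper for illustration only; Terraform's real planner is far richer.
    """
    current_names, desired_names = set(current), set(desired)
    return {
        # Declared in the desired state but not present yet.
        "create": sorted(desired_names - current_names),
        # Present in both, but the settings differ.
        "update": sorted(
            name for name in desired_names & current_names
            if current[name] != desired[name]
        ),
        # Present today but no longer declared.
        "delete": sorted(current_names - desired_names),
    }


# Made-up example resources.
current = {"web-1": {"size": "small"}, "web-2": {"size": "small"}}
desired = {"web-1": {"size": "large"}, "db-1": {"size": "small"}}
actions = plan(current, desired)
```

The operator only writes down `desired`; the steps to reach it are derived rather than scripted by hand.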
Terraform also has a large community. We have over 1,000 providers, and 150 of them are Terraform-verified.
Also, big congrats to the Terraform team on their release. They just released 1.0, and that was announced today. That just shows the maturity of the product.
Now, Devarshi Shah is going to continue into the Day 2 details around network infrastructure automation using Consul-Terraform-Sync.
Devarshi, go ahead and take it away.
Thank you so much, Sabeen.
You heard Sabeen speak about the challenges with Day 0 and Day 1 application delivery and infrastructure provisioning and how Terraform helps address those.
But if you look at it, application delivery teams and infrastructure operators spend most of their cycles in Day 2, when the infrastructure and application have already been deployed.
This is where your day-to-day tasks come in. Most of these tasks are still manual and driven through ticketing-based workflows.
Let's take a look at an example. When it comes to Day 2 application scaling workflows, it may be due to business needs.
For example, say I want to scale up an application to support a new initiative that's coming up. The application developer essentially asks the server team, "I want to scale up my application. Can you provide some servers or deployment resources for that?"
You probably need a ticket. Once the server admin has provisioned the servers and workloads, it goes to the network teams. And it's across different personas within the network team.
For example, once you deploy the server, you need IP addresses for it, and there's another team that handles it.
You probably need a ticket for that. The workflow for provisioning those is manual.
Once you have those IP addresses, you need to update the pool members on load balancers.
You again need a ticket for that.
After updating the pool members, you need to update switch configuration and make sure that the application workloads are receiving the traffic.
After that, you probably want to upgrade the security policies and the rules that are associated with this new workload that you've created.
As you can see, this is a manual, ticketing-based workflow, which takes days to weeks for a lot of organizations.
And one thing about these workflows is that scale-up is easy, but scaling down and making sure your resources are cleared out correctly is challenging.
Not a lot of organizations do that efficiently.
It's essentially a problem where the workflow starts with service deployment and has issues with automation.
We have in the HashiCorp portfolio 2 products which help out with these challenges.
First is Consul, which is essentially the service networking platform from HashiCorp that has a catalog of all the services that are running in your infrastructure. The second is Terraform, which is the industry-leading tool for infrastructure as code.
Why not have them work together? You have Consul, which is maintaining the catalog of all your services.
Through Consul-Terraform-Sync, we enable a publisher-subscriber paradigm where you, as a user, can say, "I want to subscribe to services within the Consul catalog, and depending on the changes to those services, I want to trigger some automation on my downstream infrastructure using Terraform."
You have a mechanism where service updates are driving your infrastructure updates as well.
The thing about Consul-Terraform-Sync is it is very helpful to automate those manual, ticket-driven workflows.
The installation is pretty lightweight, just like every other HashiCorp product and tool.
Lastly, it can work across any provider in the Terraform ecosystem.
The fundamental concept in Consul-Terraform-Sync is a task. A task essentially allows the user or the operator of Consul-Terraform-Sync to say, "I want to subscribe to a certain set of services from the Consul catalog and, based on any changes to those services, I want you to get the right set of variables and trigger the right Terraform workflow for the module that's identified in the source."
In this example, the services we want are web and API, and the source is a module for Cisco ACI, a software-defined networking solution.
Consul-Terraform-Sync (CTS) allows a very simple publisher-subscriber paradigm. You can have multiple such definitions for a task, and you can have multiple providers within a task as well. It's a very flexible way of subscribing to the service catalog.
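The publisher-subscriber idea behind a task can be sketched in a few lines of Python. This is a conceptual model only, not the Consul-Terraform-Sync configuration format or API; the `Task` and `Catalog` classes, the task name, and the addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass
class Task:
    """A subscription: which services to watch and what to run when they change."""
    name: str
    services: FrozenSet[str]
    run: Callable[[Dict[str, List[str]]], None]


class Catalog:
    """A toy service catalog that notifies subscribed tasks on changes."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[str]] = {}
        self._tasks: List[Task] = []

    def subscribe(self, task: Task) -> None:
        self._tasks.append(task)

    def update(self, service: str, addresses: List[str]) -> None:
        new = sorted(addresses)
        if self._instances.get(service, []) == new:
            return  # Nothing changed, so no automation is triggered.
        self._instances[service] = new
        for task in self._tasks:
            if service in task.services:
                # Hand the task the current addresses of every service it watches.
                variables = {
                    name: self._instances.get(name, [])
                    for name in sorted(task.services)
                }
                task.run(variables)


runs = []  # Stands in for "trigger a Terraform run with these variables".
catalog = Catalog()
catalog.subscribe(Task("update-network", frozenset({"web", "api"}), runs.append))
catalog.update("web", ["10.0.0.1"])              # triggers the task
catalog.update("db", ["10.0.0.9"])               # not subscribed, ignored
catalog.update("web", ["10.0.0.1"])              # no change, ignored
catalog.update("api", ["10.0.0.5", "10.0.0.4"])  # triggers the task
```

Only changes to subscribed services cause a run, which is the property that lets service updates drive infrastructure updates.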
When you are dealing with infrastructure, it's always useful to have information about the changes that have been made to the infrastructure and the status of the automation that's working on that infrastructure.
So we've enabled you to have updates about the task status, status of CTS, as well as the historic status of all the tasks that have been executed by CTS.
Another important aspect is making sure that the credentials that you pass to Consul-Terraform-Sync are essentially secure and you avoid exposing credentials or secrets in plaintext.
Which is why we've also built an integration with HashiCorp Vault, where you can get credentials for your infrastructure devices through the Vault KV credential integration in Consul-Terraform-Sync.
A lot of our enterprise customers rely on Consul namespaces as a model for multi-tenancy. And we've incorporated that into Consul-Terraform-Sync as well.
You as a user of Consul-Terraform-Sync can subscribe to services, not just in the default namespace, which is available in the open source version, but also in other namespaces that you may create through Consul Enterprise.
Lastly, a lot of times users who use Consul and users who use Consul-Terraform-Sync to manage that infrastructure may be different.
The metadata that the users of Consul might have added to the registry for the services may not be enough for users of CTS to make changes on their downstream infrastructure.
So we've also given users the capability to add their own metadata into the automation for Consul-Terraform-Sync. One example for that could be that you can add the virtual routing and forwarding for a networking device through this sort of workflow.
We launched Consul-Terraform-Sync in March, and we worked together with leaders of the industry to have patterns for their devices and their technologies available as well for users.
I'd like to thank A10 Networks, AWS, Checkpoint, Cisco, F5, NS1, Palo Alto Networks, and VMware Avi networks for creating those patterns to address the challenges that Consul-Terraform-Sync is working towards.
Users and technology partners can build their own patterns for their own devices using the model that Consul-Terraform-Sync has, and it's pretty easy for you to build your own custom workflows through this mechanism.
We spoke about the challenges that Consul-Terraform-Sync helps address as well as how it works. But I think one of the key things for us to realize is the quantitative value-add of Consul-Terraform-Sync.
We've been working with a customer who's deployed Consul-Terraform-Sync in production. This customer uses Consul for service discovery as well as service mesh use cases.
I want to compare and contrast what that environment looked like before and after using Consul-Terraform-Sync in terms of resources.
Prior to Consul-Terraform-Sync being used in their workflows to automate their ingress as well as firewall address groups, they had multiple engineers involved in the process, and it was a very ticket-driven workflow.
Post Consul-Terraform-Sync, no humans were harmed. In terms of time, before Consul-Terraform-Sync the entire process of deploying that ingress and updating the firewall address groups took about 3 to 5 days.
But with CTS and Consul being the source and Terraform being the automation engine, it took them about 150 seconds to achieve that workflow. And in terms of resources, prior to Consul-Terraform-Sync, they had a lot of resource idling and wastage.
But because Consul-Terraform-Sync relies on Consul as the source of truth and tracks services going up and down and being cleared out, there was an automatic cleanup in that process.
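The automatic cleanup described here amounts to treating the catalog as the source of truth and removing anything on the device that the catalog no longer lists. A minimal Python sketch of that comparison follows; the `reconcile` function and the addresses are hypothetical and do not represent how Consul-Terraform-Sync or any provider implements it.

```python
def reconcile(device_members, catalog_addresses):
    """Return (to_add, to_remove) so the device matches the catalog.

    Hypothetical helper: the catalog is treated as the source of truth.
    """
    device, catalog = set(device_members), set(catalog_addresses)
    to_add = sorted(catalog - device)     # registered services missing on the device
    to_remove = sorted(device - catalog)  # stale entries for services that are gone
    return to_add, to_remove


# Made-up firewall address group and catalog contents.
to_add, to_remove = reconcile(
    ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    ["10.0.0.1", "10.0.0.4"],
)
```

The `to_remove` list is the part that manual, ticket-driven scale-down tends to miss.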
For Consul-Terraform-Sync, there are 3 core areas of investment that we are working on.
First is we are working with our existing launch partners to have deeper workflows across their technology stacks, as well as working with newer partners in terms of the application delivery controller (ADC) space, the SDN space, and the DNS management solution space.
I'm really excited about that.
Consul has a lot more capabilities than just the service discovery aspect of it. So the second area we are investing in is newer triggers for tasks in Consul-Terraform-Sync.
For example, say I see a change in Consul KV. How can I enable network flow?
Or say a Consul intention got updated. How does CTS work with that?
Lastly, Terraform is consumed in a lot of different ways. Terraform is primarily used as an open source tool, but we have Terraform Enterprise as well as Terraform Cloud, which derive enterprise value for a lot of our customers.
We are looking at providing more enterprise value by integrating CTS with Terraform Enterprise and Terraform Cloud to support remote operations and approval workflows.
Really exciting times for Consul-Terraform-Sync and infrastructure automation in general from HashiCorp.
This is it for me, and I hope you enjoy the rest of the HashiConf EU sessions. Thank you.