How Consul Eliminates The Need For East-West Load Balancers
Jun 11, 2019
When compared to load balancers directing east-west network traffic, Consul can provide an alternative solution for each feature that load balancers provide with less financial expense and fewer strains on the network.
Founder & Co-CTO, HashiCorp
A common practice when deploying new applications to the cloud is to place load balancers in front of them. Architecturally, you have a number of instances of the same service (let's call it service A) and you put a load balancer directly in front of them. You might have many services here (A, B, C, D, and so on), but let's visualize one.
» Why use load balancers?
There are 4 main reasons why people tend to put load balancers in front of their services. One is service discovery itself: when you need to find how to talk to A, instead of knowing the IP addresses of every instance, you only need to know the address of the load balancer, which you can keep relatively static. That makes discovery easy.
Number 2 is fault tolerance or failure tolerance. If one of these instances goes bad and dies for whatever reason, or becomes unavailable, the load balancer helps because it will send traffic to a healthy instance rather than an unhealthy one.
Again, this makes it easier than talking directly to these.
Number 3 is actually balancing load. With a load balancer you can be confident that requests go to the first instance, then the next, then the next, and then around in a circle, or follow some other load balancing scheme. This helps when you have a lot of traffic coming in. You don't need to worry as much about the load on any individual service.
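The "around in a circle" scheme is commonly called round-robin. Here is a minimal Python sketch of the idea. The `RoundRobinBalancer` class and the backend addresses are hypothetical and only for illustration; a real load balancer also deals with health, weights, and connection state.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer: hands out backends in a fixed rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(list(backends))

    def next_backend(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._rotation)


# Hypothetical addresses for three instances of service A.
balancer = RoundRobinBalancer(["10.0.0.2", "10.0.0.3", "10.0.0.4"])
picks = [balancer.next_backend() for _ in range(6)]
print(picks)
```

Six requests against three backends visit each backend twice, in the same order both times.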
The last reason that you want to use this is some level of security or access control. A load balancer gives you a central point to enforce some sort of access control—usually IP-based access control. You're saying over here 10.0.0.1 can access 10.0.0.2—and these are maybe on some other IPs over here. But this access control is here so you can lock it down and say who could access what.
» The downsides of east-west load balancing in the cloud
This is the traditional approach, but it breaks down in a cloud environment, or in any environment where instead of having four services (A, B, C, D) you have 40 or 400. It starts to get prohibitively expensive and complicated in a number of ways.
One of the ways that this gets difficult is if you have a number of services all over the place. For each group of services you need to have a load balancer—so you go from needing maybe four load balancers to needing 40. From a hardware and software perspective this could get very financially expensive.
Another complicated issue is coordinating the updates to these load balancers. As you deploy new services behind it—if you add a new service A—somehow this load balancer needs to be configured to know about its IP so it can route traffic here.
When you have four services or some single digit number of services, that's not too hard to do. But as the number of services grows and, as they come up and come down more frequently like they do in modern cloud environments, this becomes—from a process perspective—very expensive.
The last complication is the security aspect. The security in this case is often still IP-based, and IPs work well in traditional on-prem, four-wall environments because you carefully control all the IPs and you're not moving servers around very often. But in a modern scheduled world, where the IP of a single machine (let's say this is 0.40) might represent a number of different applications, it's hard to enforce this access control unless you end up splitting those applications across multiple servers, in which case you're not making the most out of your data center.
Your cost is 3x when you could have fit them all onto one machine, but you have no way to represent the security at the IP level here. This becomes very expensive again, both in cost and security, because any time you deploy a new service:
- It needs its own machine
- You need to update the rules in this load balancer so that access is allowed
This updating of rules could take a very long time. Ideally, you could do it automatically—and that's exactly the world that we're shifting to in a modern cloud environment.
» Eliminating east-west load balancers with HashiCorp Consul
With a tool like Consul, you could solve all 4 of these issues by eliminating the east-west load balancers altogether.
In a world with Consul, you still have your instances of service A, like before, and you have a client accessing them. Let's call the client C.
When the client wants to make a request, it asks Consul for the addresses of A, and Consul responds with all 3 addresses. It does that via DNS: the client queries a.service.consul, and the DNS response contains all three. We lean on DNS as the method for service discovery. That solves problem number one that we talked about before. Discovery is now DNS.
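The lookup can be sketched in Python as an ordinary DNS resolution of `a.service.consul`. This is a sketch under assumptions, not Consul client code: `consul_dns_name`, `resolve_service`, and `fake_resolver` are hypothetical helpers, and the fake resolver stands in for a host whose DNS forwards `.consul` queries to Consul, so the example runs without a cluster.

```python
import socket


def consul_dns_name(service, domain="consul"):
    # Services are addressed as <name>.service.<domain>, as in the talk.
    return f"{service}.service.{domain}"


def resolve_service(service, port, resolver=socket.getaddrinfo):
    """Return the distinct IPs the resolver reports for a service."""
    name = consul_dns_name(service)
    results = resolver(name, port, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    addresses = []
    for *_, sockaddr in results:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def fake_resolver(name, port, type=0):
    # Hypothetical stand-in for DNS answers that would come from Consul.
    ips = {"a.service.consul": ["10.0.0.2", "10.0.0.3", "10.0.0.4"]}[name]
    return [(socket.AF_INET, type, socket.IPPROTO_TCP, "", (ip, port)) for ip in ips]


print(resolve_service("a", 8080, resolver=fake_resolver))
```

With the default `resolver`, the same function would go through the operating system's resolver, which only works if that resolver is configured to reach Consul's DNS interface.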
» Health checks (Circuit breaker)
The second reason you use load balancers is fault tolerance. Consul handles this by running health checks on all of these services and the machines they run on. It's constantly checking whether they are healthy.
If at any moment one becomes unhealthy for any reason whatsoever (CPU load is too high, the network becomes unavailable, anything), Consul quickly removes it from the responses for service discovery.
So we gain fault tolerance by running health checks and exposing their results through the discovery process. This all happens pretty much instantly.
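To make the coupling between health checks and discovery concrete, here is a small in-memory Python sketch. The `Registry` class and its methods are hypothetical and are not Consul's API; the point is only that a failing check removes an instance from discovery answers.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    address: str
    healthy: bool = True


class Registry:
    """Toy service registry: discovery only returns healthy instances."""

    def __init__(self):
        self._services = {}

    def register(self, service, address):
        self._services.setdefault(service, []).append(Instance(address))

    def report_check(self, service, address, healthy):
        # Record the latest health check result for one instance.
        for instance in self._services.get(service, []):
            if instance.address == address:
                instance.healthy = healthy

    def discover(self, service):
        return [i.address for i in self._services.get(service, []) if i.healthy]


registry = Registry()
for ip in ["10.0.0.2", "10.0.0.3", "10.0.0.4"]:
    registry.register("a", ip)

# One instance fails its check and drops out of discovery results.
registry.report_check("a", "10.0.0.3", healthy=False)
print(registry.discover("a"))
```

If the instance later passes its check again, calling `report_check` with `healthy=True` puts it back into the answers.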
The third reason load balancers are used is actually balancing load. For balancing load, again, we can rely on the properties of DNS. When Consul returns the addresses it randomizes them. Sometimes the order is 3-1-2, sometimes it's 3-2-1, sometimes it's 1-2-3. Every time you ask Consul for a service it randomizes the results, because clients will usually pick the first one.
This way different requests get a different first address. If that instance is down for whatever reason, the client will automatically use the second one. That's a way we can do load balancing. What we find in practice, even with thousands of services, is that this behaves very well in terms of balancing load.
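A short Python simulation shows why shuffled answers combined with "pick the first one" spread load. The function names, the addresses, and the fixed seed are assumptions for illustration; the sketch models the behavior described here, not Consul's implementation.

```python
import random
from collections import Counter


def dns_answer(addresses, rng):
    # Return the addresses in a fresh random order for each query.
    shuffled = list(addresses)
    rng.shuffle(shuffled)
    return shuffled


def pick(answer, is_up=lambda address: True):
    # Naive client: take the first address that accepts a connection.
    for address in answer:
        if is_up(address):
            return address
    raise ConnectionError("no reachable instance")


rng = random.Random(42)  # fixed seed so the run is repeatable
addresses = ["10.0.0.2", "10.0.0.3", "10.0.0.4"]
counts = Counter(pick(dns_answer(addresses, rng)) for _ in range(3000))
print(counts)
```

Over 3000 simulated requests each instance ends up with roughly a third of the traffic, and passing an `is_up` predicate to `pick` models the client falling through to the second address when the first is down.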
The last reason is access control. For access control, there are multiple options. First, you could still use IP-based security if you wanted to. Consul has a way to notify any client whenever the services change. If you add a service and register it with Consul, then Consul can immediately notify any interested software that this happened, so you can use that notification mechanism to update iptables or on-server firewalls to do IP-based protection. It's easy to automate and very scalable, so IP-based security remains an option.
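The notification-driven approach can be sketched in Python as a callback that regenerates firewall rules whenever the set of client instances changes. `ServiceWatcher`, `render_rules`, the port, and the addresses are hypothetical; the sketch only builds iptables command strings and does not run them or talk to Consul.

```python
def render_rules(allowed_sources, port):
    """Build iptables commands allowing only the given sources on a port."""
    rules = [
        f"iptables -A INPUT -p tcp -s {ip} --dport {port} -j ACCEPT"
        for ip in sorted(allowed_sources)
    ]
    # Everything else aimed at this port is dropped.
    rules.append(f"iptables -A INPUT -p tcp --dport {port} -j DROP")
    return rules


class ServiceWatcher:
    """Stand-in for a registry that notifies a callback on every change."""

    def __init__(self, on_change):
        self._instances = set()
        self._on_change = on_change

    def register(self, ip):
        self._instances.add(ip)
        self._on_change(set(self._instances))

    def deregister(self, ip):
        self._instances.discard(ip)
        self._on_change(set(self._instances))


current_rules = []


def rebuild_firewall(client_ips):
    # Regenerate the full rule set for service A whenever clients change.
    current_rules[:] = render_rules(client_ips, port=8080)


watcher = ServiceWatcher(rebuild_firewall)
watcher.register("10.0.0.1")
watcher.register("10.0.0.5")
print("\n".join(current_rules))
```

Regenerating the whole rule set on each notification keeps the logic simple; applying the rules to a real host is a separate step that this sketch leaves out.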
» Service to service authorization
Then, as another option, there's a feature in Consul called Connect, which allows you to do service to service authorization. You could represent rules like: “C can talk to A” or “C cannot talk to A.” You represent them at this high level so it doesn't matter what number of servers you're introducing.
Here you could have 4, you could have 400. You could have any number of clients but it's always one rule. This makes it easy and powerful to scale because Consul will make sure that any access is represented by these rules.
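The "always one rule" property can be illustrated with a toy rule table keyed by service name instead of IP. The `Intentions` class below is a hypothetical Python sketch, not Consul's API or data model, and the default-deny behavior and the addresses are assumptions for the example.

```python
class Intentions:
    """Toy service-to-service rule table keyed by service name, not IP."""

    def __init__(self, default_allow=False):
        self._rules = {}
        self._default_allow = default_allow

    def allow(self, source, destination):
        self._rules[(source, destination)] = True

    def deny(self, source, destination):
        self._rules[(source, destination)] = False

    def is_allowed(self, source, destination):
        # Fall back to the default when no rule names this pair.
        return self._rules.get((source, destination), self._default_allow)


rules = Intentions()
rules.allow("c", "a")  # the single rule: "C can talk to A"

# Instances carry a service identity; the rule never mentions their IPs.
clients = [("c", f"10.0.1.{n}") for n in range(1, 401)]
allowed = sum(rules.is_allowed(service, "a") for service, _ip in clients)
print(allowed, "of", len(clients), "client instances may call a")
```

Going from 4 client instances to 400 changes nothing in the rule table, which is the scaling property described above.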
So you have both IP-based and you have service-based security—and you could see how using Consul and using DNS can solve all the same challenges that load balancers were used for. But when you do it this way you eliminate at least one piece of hardware and software for every single cluster of services.
In a more traditional, previous-generation environment you might only have 4 or 5 or 6 services, so it's not that much being replaced. But in a modern microservices, cloud-oriented world, you probably have an order of magnitude more services, so the costs and complexity begin adding up.
This also works well if you want to move to a more scheduled, dynamic environment. Even with services and servers coming and going multiple times a minute or hour, you can still coordinate the changes necessary to preserve all of these properties. In this way, you could eliminate the need for east-west load balancers.