Microservices and service-oriented architectures require a change in your load-balancing strategy. Learn why you should stop using load balancers to route traffic between services in this more modern architecture, and why you should use service discovery instead.
When we talk about what's changed in modern application architecture, there's this bigger shift from monolithic applications—where a single application has many sub-components and capabilities delivered as one application—to many discrete services that are being deployed independently.
This is a microservice or service-oriented architecture, where we have many, many more services that are communicating with each other, and composing to form a larger bit of functionality.
As we make this change, our requirements change in two ways. First, we have a much higher scale: there are many more instances that we need to route to. Second, these instances are much more ephemeral: there's a lot more elasticity in terms of scaling up and down, and also more dynamism, because instances are relatively short-lived compared to their monolithic counterparts.
In the monolithic world, we traditionally used hardware load balancers, and these were manually managed. If we deployed new versions of our monolith, we would file a ticket and someone would update the load balancer to add a new instance, and only then, after days or weeks, would it get traffic. This was okay because the churn was relatively low, and even when we upgraded our application we were upgrading it on the same machine. So it was already part of our load balancer and didn't require a change.
Now, as we've embraced microservices architectures and cloud-based platforms, both of those assumptions change. There's much more change in terms of scaling up, scaling down, and deploying applications, but there's also churn in the underlying VMs. We're much less likely to reuse a machine and deploy the new application on the same hardware. Rather, we'll destroy that machine and boot a new machine running the new version of our application. So now, all of a sudden, all of these changes require us to touch our load balancers constantly.
A more modern approach is to use a service registry, a tool like Consul. Now, as new instances boot, they get automatically registered in this catalog, the central registry, which knows what all the services are, where they are running, and what their current status is. And then we drive our load balancing against this registry.
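To make the idea of a registry concrete, here is a minimal in-memory sketch of such a catalog. It is not Consul's API: the `ServiceRegistry` and `Instance` names, the method names, and the "passing"/"critical" status strings are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    # Hypothetical record for one running copy of a service.
    service: str
    address: str
    port: int


class ServiceRegistry:
    """Toy catalog: which services exist, where they run, and their status."""

    def __init__(self):
        # Maps each registered instance to its health ("passing" or "critical").
        self._status = {}

    def register(self, instance, status="passing"):
        # Called by (or on behalf of) an instance as it boots.
        self._status[instance] = status

    def deregister(self, instance):
        # Called when an instance is destroyed; unknown instances are ignored.
        self._status.pop(instance, None)

    def set_status(self, instance, status):
        # Called by a health checker; only known instances can be updated.
        if instance not in self._status:
            raise KeyError(f"unknown instance: {instance}")
        self._status[instance] = status

    def healthy(self, service):
        # Return the passing instances of one service, sorted for stable output.
        return sorted(
            (i for i, s in self._status.items()
             if i.service == service and s == "passing"),
            key=lambda i: (i.address, i.port),
        )
```

In this sketch, an instance that boots calls `register`, a health checker calls `set_status`, and anything that routes traffic only ever asks for `healthy("web")`. Nobody files a ticket when an instance comes or goes.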
Applications can either query the registry natively, using something like DNS, to discover and connect to their upstreams without going through a load balancer, or we can use the registry to automatically populate and drive the configuration of the load balancers. In the second case, as soon as a new instance boots it gets put into the registry, and the registry then repopulates and reconfigures our load balancer so that it sends traffic to the new instance right away.
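The two options can be sketched side by side. This is a rough illustration only: the `catalog` dictionary stands in for a snapshot of the registry, the function names are hypothetical, and the generated text merely imitates an nginx-style `upstream` block rather than the configuration of any particular load balancer.

```python
import itertools


def healthy_addresses(catalog, service):
    """Return "host:port" strings for the passing instances of a service."""
    return [
        f"{inst['address']}:{inst['port']}"
        for inst in catalog.get(service, [])
        if inst["status"] == "passing"
    ]


def make_picker(catalog, service):
    """Option 1: the client picks an upstream itself, round-robin."""
    addresses = healthy_addresses(catalog, service)
    if not addresses:
        raise LookupError(f"no healthy instances of {service}")
    cycle = itertools.cycle(addresses)
    return lambda: next(cycle)


def render_upstream(catalog, service):
    """Option 2: regenerate a load balancer's backend list from the registry."""
    lines = [f"upstream {service} {{"]
    lines += [f"    server {addr};" for addr in healthy_addresses(catalog, service)]
    lines.append("}")
    return "\n".join(lines)
```

Both functions work from a single snapshot. A real setup would re-query the registry, or watch it for changes, and rebuild the picker or the rendered configuration whenever instances come and go.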
This avoids the manual management that we traditionally would have done, and allows us to accelerate how quickly we can get traffic to these applications and deal with the dynamic, elastic nature of infrastructure.