RDS performance self-tuning with Consul at Bleacher Report
Sep 26, 2018
When web traffic shifts to mobile app traffic, Bleacher Report's infrastructure automatically moves Amazon RDS resources to the API / app side with no downtime. See how this is possible with the help of HashiCorp Consul.
Bleacher Report is the second-largest sports website, says Benson Wu, a DevOps manager for the company. Its site has to work as both a blogging platform and a personalized content stream for millions of readers.
While architecting their site on Amazon Web Services, they ran into service limits on RDS read replicas. How did they overcome the limits and come up with some great strategies for adjusting RDS performance? HashiCorp Consul played a big role.
» Overcoming RDS limitations
There's a limit of five read replicas for each RDS master. After creating the master and its five replicas, the team would query AWS CloudFormation for that replica list and create another five from those replicas, cascading to a second tier.
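The cascade described above can be sketched as a small planning function. This is only an illustration: the instance names, the `-rN` naming scheme, and the `second_tier_sources` parameter are hypothetical, and the article does not say how many first-tier replicas Bleacher Report uses as sources. No AWS calls are made here.

```python
from typing import Dict, List

MAX_REPLICAS_PER_SOURCE = 5  # per-source limit cited in the article


def replicas_for(source: str) -> List[str]:
    """Hypothetical naming scheme: <source>-r1 .. <source>-r5."""
    return [f"{source}-r{n}" for n in range(1, MAX_REPLICAS_PER_SOURCE + 1)]


def plan_cascade(master: str, second_tier_sources: int = 1) -> Dict[str, List[str]]:
    """Map each source instance to the replicas that would be created from it."""
    if not 0 <= second_tier_sources <= MAX_REPLICAS_PER_SOURCE:
        raise ValueError("second_tier_sources must be between 0 and 5")
    first_tier = replicas_for(master)
    plan = {master: first_tier}
    # Each chosen first-tier replica becomes the source for five more replicas.
    for source in first_tier[:second_tier_sources]:
        plan[source] = replicas_for(source)
    return plan


plan = plan_cascade("db-master")
total = sum(len(replicas) for replicas in plan.values())
print(total)  # 10: five first-tier replicas plus five second-tier replicas
```

Raising `second_tier_sources` widens the second tier, five replicas at a time.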
At that point, they had plenty of read replicas, but they needed a way to keep track of them all.
» Using Consul to track read replicas
Every time a new cascade of RDS read replicas is created at Bleacher Report, their identifying information is added to a YAML file. That YAML file is kept in Consul, a service discovery tool that provides a service registry and acts as a source of truth for Bleacher Report.
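As a rough sketch of what keeping that YAML file in Consul could look like, the snippet below renders a replica list as YAML and builds a `PUT` request for Consul's key/value HTTP endpoint (`/v1/kv/<key>`). The key name, the group names, and the hostnames are assumptions made for illustration; the article does not show the actual file layout.

```python
import urllib.request
from typing import Dict, List

CONSUL_ADDR = "http://127.0.0.1:8500"  # default address of a local Consul agent
KV_KEY = "rds/replicas.yml"            # hypothetical key name


def render_yaml(groups: Dict[str, List[str]]) -> str:
    """Render {group: [hosts]} as a small YAML mapping of lists."""
    lines: List[str] = []
    for group, hosts in groups.items():
        if not hosts:
            lines.append(f"{group}: []")
            continue
        lines.append(f"{group}:")
        lines.extend(f"  - {host}" for host in hosts)
    return "\n".join(lines) + "\n"


def build_kv_put(groups: Dict[str, List[str]]) -> urllib.request.Request:
    """Build (but do not send) the request that stores the YAML in Consul KV."""
    url = f"{CONSUL_ADDR}/v1/kv/{KV_KEY}"
    body = render_yaml(groups).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


request = build_kv_put({"frontend": ["replica-1", "replica-2"], "api": ["replica-3"]})
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would write the key, which requires a reachable Consul agent.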
» Dynamic performance tuning for web vs. app
The Bleacher Report site is a monolith with two components: a frontend for the website view, and an API for viewing content through the mobile app. Over time, traffic has been steadily shifting from the website experience to the mobile app, which means more strain is being put on the API.
With Consul, Wu's team can shift RDS replicas from the frontend list to the API list, so that when user load shifts from web to mobile, the website doesn't expend more resources than it needs, and the mobile end gets the extra horsepower it needs to run smoothly with more users.
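Moving replicas between the two lists amounts to a small transformation of the stored assignment. The sketch below assumes the assignment is a mapping with `frontend` and `api` lists; the group names, the replica names, and the `keep_min` safeguard are hypothetical rather than taken from Bleacher Report's setup.

```python
from typing import Dict, List


def shift_replicas(
    groups: Dict[str, List[str]],
    count: int,
    source: str = "frontend",
    target: str = "api",
    keep_min: int = 1,
) -> Dict[str, List[str]]:
    """Return a new assignment with up to `count` replicas moved to `target`.

    At least `keep_min` replicas always stay with `source`, so the side that
    is losing traffic is never left without a read replica.
    """
    src = list(groups[source])
    movable = max(len(src) - keep_min, 0)
    n = max(min(count, movable), 0)
    split = len(src) - n
    return {**groups, source: src[:split], target: list(groups[target]) + src[split:]}


before = {"frontend": ["r1", "r2", "r3"], "api": ["r4", "r5"]}
after = shift_replicas(before, count=2)
print(after)  # {'frontend': ['r1'], 'api': ['r4', 'r5', 'r2', 'r3']}
```

The function returns a new mapping instead of mutating its input, so the old and new assignments can be compared before anything is written back.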
» Performance self-tuning with no downtime
Their system senses these shifts automatically, and the change to RDS replica assignment is made without any operator intervention. On the frontend, HAProxy and Consul Template detect these changes by long-polling Consul. The changes are checked into the YAML file, and the continuous integration workflow commits them to their production site.
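The long-poll mentioned above corresponds to Consul's blocking queries: a client passes the last `X-Consul-Index` it saw as the `index` query parameter, and the agent holds the request open until the key changes or the `wait` period elapses. Consul Template handles this internally; the loop below is only a minimal sketch of the mechanism, with a hypothetical key name and callback, and it is not Bleacher Report's code.

```python
import base64
import json
import urllib.request
from typing import Callable, Optional, Tuple

CONSUL_ADDR = "http://127.0.0.1:8500"  # default address of a local Consul agent

Fetch = Callable[[str], Tuple[int, bytes]]


def http_fetch(url: str) -> Tuple[int, bytes]:
    """Perform one (possibly blocking) GET; a missing key raises HTTPError."""
    with urllib.request.urlopen(url) as resp:
        return int(resp.headers["X-Consul-Index"]), resp.read()


def blocking_url(key: str, index: int, wait: str = "5m") -> str:
    return f"{CONSUL_ADDR}/v1/kv/{key}?index={index}&wait={wait}"


def watch_key(
    key: str,
    on_change: Callable[[str], None],
    fetch: Fetch = http_fetch,
    max_polls: Optional[int] = None,
) -> None:
    """Call `on_change` with the decoded value each time the key's index moves."""
    index = 0  # index=0 makes the first query return immediately
    polls = 0
    while max_polls is None or polls < max_polls:
        new_index, body = fetch(blocking_url(key, index))
        polls += 1
        if new_index < index:
            index = 0  # the index went backwards, so start over
            continue
        if new_index != index:
            index = new_index
            entry = json.loads(body)[0]  # the KV endpoint returns a JSON list
            # Values are base64-encoded; an empty value comes back as null.
            on_change(base64.b64decode(entry["Value"] or "").decode("utf-8"))
```

Calling `watch_key("rds/replicas.yml", print)` against a running agent would print the stored document each time it changes. The `fetch` parameter exists so the loop can be exercised without a Consul agent.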
The configuration change would be difficult to do without Consul. You might need to restart the site to propagate that config change to multiple services. But Consul handles this nicely with a key/value store that allows quick service configuration changes without downtime.
» Treating read replicas like cattle, not pets
The same quick configuration changes also happen at Bleacher Report when an RDS replica is lagging or not performing properly. The team doesn't ask why; they simply use another automated process to let Consul know, so it can take that replica out of the configuration file and spin up a new one.
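The removal step can be pictured as a filter over the replica assignment. In the sketch below, the lag threshold, the shape of the lag readings, and the replica names are all assumptions; the article does not describe how lag is measured or what counts as too slow.

```python
from typing import Dict, List, Optional, Tuple

MAX_LAG_SECONDS = 30.0  # hypothetical threshold


def cull_lagging(
    groups: Dict[str, List[str]],
    lag_seconds: Dict[str, Optional[float]],
    max_lag: float = MAX_LAG_SECONDS,
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split replicas into those kept in the config and those to replace."""
    healthy: Dict[str, List[str]] = {}
    removed: List[str] = []
    for group, hosts in groups.items():
        healthy[group] = []
        for host in hosts:
            lag = lag_seconds.get(host)
            # A replica with no lag reading is treated as unhealthy.
            if lag is not None and lag <= max_lag:
                healthy[group].append(host)
            else:
                removed.append(host)
    return healthy, removed


groups = {"frontend": ["r1", "r2"], "api": ["r3", "r4"]}
lag = {"r1": 0.4, "r2": 95.0, "r3": 2.1}  # r4 reports nothing
healthy, removed = cull_lagging(groups, lag)
print(healthy)  # {'frontend': ['r1'], 'api': ['r3']}
print(removed)  # ['r2', 'r4']
```

The `removed` list is what a replacement process would act on, spinning up a fresh replica for each entry instead of diagnosing the old one.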
Spinning up and destroying predictable templates of infrastructure is the essence of immutable infrastructure: a philosophy of operations that is quickly becoming a best practice in IT.
With Consul, Bleacher Report is taking full advantage of immutable infrastructure while also automating the performance and configuration management of their website, with zero downtime.