See how content moderation service Chatsight uses HashiCorp Nomad and Consul to speed deployment and maximize uptime.
This guest post was written by Marcus Naughton, Founder of Chatsight, a provider of modern and inclusive content-moderation infrastructure for social platforms.
Chatsight provides low-latency and highly available content moderation services powered by artificial intelligence and delivered via an API. We’ve served 3.4 million requests since January 2021, with deployments across multiple cloud vendors and backed by services like Cloudflare. During that time, we’ve used HashiCorp Nomad and Consul to support our service and maximize uptime for our customers.
Developing moderation tools takes an enormous amount of our time, but we can’t afford to neglect the need to deliver our application to customers quickly. We change our code frequently, sometimes pushing a dozen updates live per day. This presents a problem: content moderation is a sensitive task that requires continuous fine-tuning, so how can we deploy updates rapidly without the constant change disrupting our customers?
Like many entrepreneurs and developers, we are confident in implementing business ideas with code, but have little experience in building and running a reliable application platform to deliver our solutions to end users. HashiCorp Nomad and Consul gave us a simple architecture to build and operate as well as an easy experience to update applications. That lets us ship and iterate our products faster.
On my earlier projects, I relied heavily on approaches like systemd that became increasingly difficult to manage at scale. Updates had to be pushed by restarting the service, which created blackouts while the runtime stopped and started. I knew I had to find a better approach and eventually I settled on using containers. But containers seemed really alien to me: “How do you even run these things?”
After coming to grips with creating container-based applications, my next step was to learn how to deliver and deploy these containers. I tried using Kubernetes, but while this approach was definitely viable, I felt out of my depth. The overhead of learning and maintaining Kubernetes was too much of a time sink. In addition, high availability was not a default Kubernetes offering.
Although I had a decent knowledge of natural language processing (NLP) and machine learning (ML) development when I started building Chatsight, I knew absolutely nothing about how production teams develop and deploy changes and updates. The Chatsight platform API responds in about two-tenths of a second (roughly 200 milliseconds) from the moment it receives a request, so it has to be constantly available to serve our customers’ live stream chats, as well as the other customers who hit our API every second.
So I was left with a real challenge: How do I start orchestrating containers without having to learn Kubernetes from scratch, when I really had no understanding of how productionized application development worked?
With all of this in mind, HashiCorp Nomad turned out to be an amazing way to offload this responsibility without incurring much technical debt. We rely on Nomad to safely deploy new containers to our clusters, ensuring they are healthy and eliminating downtime with native rolling updates.
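To make the idea of a native rolling update concrete, here is a minimal sketch of a job payload in the JSON form accepted by Nomad's HTTP API, built in Python. It is an illustration under assumptions, not Chatsight's actual job definition: the job name `moderation-api`, the image tag, the datacenter `dc1`, and all timing values are hypothetical. The field names (`MaxParallel`, `HealthCheck`, `MinHealthyTime`, `HealthyDeadline`, `AutoRevert`) follow Nomad's JSON job format, where durations are given in nanoseconds.

```python
import json

NANOSECONDS_PER_SECOND = 1_000_000_000


def build_rolling_update_job(job_id, image, count=3):
    """Return a Nomad JSON job payload with a rolling-update policy.

    All names and values passed in are illustrative assumptions.
    """
    update_policy = {
        # Replace one container at a time.
        "MaxParallel": 1,
        # Wait for tasks to be running and any registered service checks to pass.
        "HealthCheck": "checks",
        # A new container must stay healthy this long before the next is replaced.
        "MinHealthyTime": 10 * NANOSECONDS_PER_SECOND,
        # Give up on a container that is not healthy within this window.
        "HealthyDeadline": 300 * NANOSECONDS_PER_SECOND,
        # Roll back to the last stable version if the deployment fails.
        "AutoRevert": True,
    }
    task = {"Name": job_id, "Driver": "docker", "Config": {"image": image}}
    group = {
        "Name": job_id,
        "Count": count,
        "Update": update_policy,
        "Tasks": [task],
    }
    return {
        "Job": {
            "ID": job_id,
            "Name": job_id,
            "Type": "service",
            "Datacenters": ["dc1"],  # hypothetical datacenter name
            "TaskGroups": [group],
        }
    }


# Hypothetical job name and image, for illustration only.
payload = build_rolling_update_job("moderation-api", "example/moderation-api:1.2.3")
print(json.dumps(payload, indent=2))
```

Submitting a payload like this with a new image tag would let Nomad swap containers one at a time and only continue while the replacements report healthy. A fuller job would also declare network ports and service health checks.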
Similarly, because we had been relying on load balancers from our cloud service providers, we didn’t initially realize that another tool, HashiCorp Consul, could solve our growing service discovery headaches.
Essentially, Nomad and Consul work together during an update: Nomad provisions new containers and registers them with Consul, Consul’s health checks confirm the new containers are ready, and traffic shifts to them without ever interrupting service to our customers.
Having service discovery automatically managed by Consul and Nomad was a massive help in reducing our reliance on cloud load balancers. We now use a dedicated HAProxy instance to route our users’ API requests, with DNS-based service discovery provided by Consul.
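As a rough sketch of how HAProxy can be pointed at Consul's DNS interface, the Python helper below renders a configuration fragment that resolves backend servers through SRV lookups of the form `_<service>._tcp.service.consul`. The service name `moderation-api`, the number of server slots, and the Consul DNS address `127.0.0.1:8600` (Consul's default DNS port) are assumptions for illustration; the source does not describe Chatsight's actual HAProxy configuration.

```python
def render_haproxy_backend(service, slots=5, consul_dns="127.0.0.1:8600"):
    """Render an HAProxy fragment that discovers backends via Consul DNS.

    The service name, slot count, and DNS address are illustrative assumptions.
    """
    lines = [
        # Tell HAProxy to resolve names through the local Consul agent.
        "resolvers consul",
        f"    nameserver consul {consul_dns}",
        "    hold valid 5s",
        "",
        f"backend {service}",
        "    balance roundrobin",
        # server-template pre-allocates server slots that HAProxy fills
        # from the SRV records Consul returns for the service.
        f"    server-template {service} {slots} _{service}._tcp.service.consul "
        "resolvers consul resolve-prefer ipv4 check",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical service name, for illustration only.
print(render_haproxy_backend("moderation-api"))
```

With a fragment like this, containers that Nomad registers in Consul appear as HAProxy backends without a reload, and instances that fail their health checks drop out of the DNS answers.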
When we began deploying Nomad, it was amazing to see that nodes could discover and connect to each other automatically in our cloud environments, so we could join new servers to the cluster without manually making them aware of each other first. At the same time, Nomad simplified our VM architecture by reducing the number of wasteful standby nodes. Now, when a machine goes down, the lost containers are replaced immediately on another suitable node in the cluster, with minimal downtime.
When we needed greater control over Nomad, we found its highly detailed API gave us exact control over what was running. This let us create tools to customize Nomad into exactly what we needed it to be.
For example, we use an internal tool, Sarif, as a broker for Nomad. Sarif exposes Nomad to application-level code, triggered by events from runtimes like Node.js. This is how we offer audio moderation on demand: a container is created in reaction to an API request. Additionally, we rely on Nomad to routinely manage intensive natural-language batch processing tasks when signaled by Sarif.
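Sarif itself is internal and not described in detail here, so the following is only a sketch of one way a broker could start a container on demand: by calling the dispatch endpoint of Nomad's HTTP API (`POST /v1/job/:job_id/dispatch`) for a parameterized job. The job name `audio-moderation`, the metadata key `request_id`, the payload contents, and the Nomad address are all hypothetical.

```python
import base64
import json
import urllib.request


def build_dispatch_request(nomad_addr, job_id, meta, payload=b""):
    """Build (but do not send) a request that dispatches a parameterized job.

    The address, job ID, and metadata are illustrative assumptions.
    """
    url = f"{nomad_addr}/v1/job/{job_id}/dispatch"
    body = {
        # Key/value metadata made available to the dispatched job.
        "Meta": meta,
        # Nomad expects the optional payload to be base64-encoded.
        "Payload": base64.b64encode(payload).decode("ascii"),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_dispatch_request(
    "http://127.0.0.1:4646",
    "audio-moderation",
    {"request_id": "abc123"},
    b"audio clip reference",
)
print(request.get_method(), request.full_url)
# Passing the request to urllib.request.urlopen would submit it to a running Nomad agent.
```

This approach assumes the target job was registered with a `parameterized` block, so that each dispatch creates a fresh instance of the job that runs to completion.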
We’ve been empowered by having access to an application orchestrator like Nomad that natively supports anything we’ve asked of it, even allowing us to build onto it where and when we needed to. This frees up time spent on application deployment and lets us invest that effort into product development to help drive the business.
Clearly, Chatsight has benefited greatly from the simplified experience offered by Nomad and Consul. Together, they allowed us to create the event-based architecture that powers our entire platform, and to maintain 100% uptime since January 2021 while processing millions of comments and pushing updates live on a daily basis.
Editor’s note: If you’ve never deployed apps with HashiCorp Nomad, watch this demo showing you how to deploy your first application with Nomad in 20 minutes.