Seeking stability in a volatile environment
Digital banking has reshaped the world for private consumers and big businesses alike. Today, users can make deposits, issue payments, open accounts, and apply for loans from virtually any device at any time — and the Q2 Platform powers those solutions.
Q2 provides the backend infrastructure and systems to more than 800 financial institutions and fintech companies that enable exceptional financial experiences for their customers. Today, the company is the driving force behind 10% of digital banking experiences and processes financial transactions that equate to roughly 5% of the national gross domestic product (GDP).
But as the company's customer roster grew and the range of use cases evolved, so did the need to replace its time- and resource-intensive manual infrastructure provisioning practices. Keeping those systems online and up to date is a challenge.
"In the digital banking and financial services world, high availability and rapid delivery of new solutions is imperative for remaining competitive and relevant," says Jordan Hager, vice president of hosting architecture (including DevSecOps, site reliability, & architecture). "We needed a way to simplify our software deployments and build a continuous delivery model that would help us stay ahead of evolving market and financial services trends. In our industry, we look at technology as the great equalizer that allows community financial institutions to compete."
Time is money, and manual approaches take both
Since its founding in 2004, Q2 has served the financial technology needs of more than 18 million consumers. Firms ranging from well-known international banks to innovative fintech startups have relied on Q2 to help them deliver the features, functionality, and engaging experiences their customers demanded in exchange for adopting the new services and creating additional revenue streams.
For years, Hager's team, including Site Reliability Engineers Bijan Rahnamai and Daniel Dreier and Team Lead Cody Jarrett, manually provisioned and deployed the infrastructure required to deliver Q2's products. Any new feature or service request from customers or the internal support team required the IT team to selectively assign server space based on available resources, priority, and other factors. But the growing intricacy and velocity of the requests made keeping up with demand increasingly challenging.
More importantly, the existing processes didn't allow developers to prototype and deploy new applications or update existing ones without risking costly downtime. "On a peak day, our platform processes as many as 1.2 million logins per hour. COVID-19 has accelerated the adoption of digital platforms, and our solutions are as critical to consumers and small businesses now more than ever. Downtime is very costly, and our customers expect anytime, anywhere access to our systems," Hager says. "Manually orchestrating the moving pieces not only makes quickly bringing new features to market more difficult, but it also creates a host of problems with monitoring the health of different services to make sure they're actually working."
Q2's first-generation orchestration platform was a product developed in-house, and the team quickly realized it wouldn't meet the company's growing need to scale. "Building a new environment with the tool still took up to a week just to balance network loads and provision applications and didn't address any of the health monitoring and performance management features we wanted," Hager says. "It was pretty clear we needed to reset our expectations and find a system that could simplify our deployment operations and automate health checks on all our applications."
Enabling application modernization and faster upgrades
Improve service availability and resilience
Eliminate error-prone manual orchestration processes
Provision new applications faster across multiple clouds and vendors
Beating market expectations with faster deployment and higher availability
Eager to adopt a simple and efficient way to provision applications, the team began the search for a new orchestrator. In particular, Q2 wanted an agnostic platform that could support both containerized and non-containerized workflows and allow the team to work in any cloud or on-premises environment.
After evaluating a range of options, including Kubernetes and DC/OS, Q2 chose HashiCorp Nomad because of its intuitive operations and administration, product-agnostic capabilities, and hybrid multi-cloud support.
With Nomad, the Q2 team gets a pool of computing resources accessible across its on-premises and cloud environments and a tool that automates where each workload is deployed for simple, efficient orchestration. Nomad's ability to orchestrate legacy and new workloads enables the team to run mixed Linux and Windows workloads side by side in Nomad clusters for greater visibility and control.
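To make the idea of automated placement concrete, here is a toy sketch in Python of choosing a machine for each workload from a mixed pool. It is not Nomad's scheduler or API. The `Node` and `Job` classes, the node names, the resource figures, and the tightest-fit rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    os: str          # "linux" or "windows"
    location: str    # label only, e.g. "on-prem", "aws", "azure"
    free_cpu: int    # hypothetical CPU units still available
    free_mem: int    # hypothetical memory units still available
    jobs: List[str] = field(default_factory=list)


@dataclass
class Job:
    name: str
    os: str
    cpu: int
    mem: int


def place(job: Job, nodes: List[Node]) -> Optional[Node]:
    """Pick a node with a matching OS and enough room, preferring the
    tightest fit. Returns None when no node can take the job."""
    candidates = [
        n for n in nodes
        if n.os == job.os and n.free_cpu >= job.cpu and n.free_mem >= job.mem
    ]
    if not candidates:
        return None
    # Tightest fit: least CPU left over, then least memory left over.
    best = min(candidates, key=lambda n: (n.free_cpu - job.cpu, n.free_mem - job.mem))
    best.free_cpu -= job.cpu
    best.free_mem -= job.mem
    best.jobs.append(job.name)
    return best


# A made-up pool spanning a private datacenter and two clouds.
nodes = [
    Node("dc-linux-1", "linux", "on-prem", 4000, 8192),
    Node("aws-linux-1", "linux", "aws", 2000, 4096),
    Node("azure-win-1", "windows", "azure", 4000, 8192),
]
jobs = [
    Job("api", "linux", 1500, 2048),
    Job("legacy-svc", "windows", 1000, 4096),
    Job("batch", "linux", 3000, 4096),
]

placements = {}
for job in jobs:
    node = place(job, nodes)
    placements[job.name] = node.name if node else None
```

In this sketch the operator only declares what each job needs. The placement function decides which machine receives it, which is the kind of decision the team previously made by hand.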
"Nomad saved us dozens of hours a week just on basic things like setting up new configurations or adding to servers," Hager notes. "What used to take a week now takes a few minutes and eliminates the barriers that used to prevent us from shipping new features and services as quickly as our customers had hoped."
Bullish on the future
In addition to replacing extensive manual processes with automated ones, Q2 engineers and developers were also able to enhance app health monitoring. Using Nomad's native integration with HashiCorp Consul, the team leveraged automated health checks to create a de facto app performance monitoring tool with self-healing capabilities.
"The system's health checks allow our developers to prototype and deploy software in a few minutes and simply roll it back if they see any errors. We don't have to wait for something to break in order to fix it," says Hager. "More importantly, it dramatically simplifies the task of keeping everything up and running because we can automatically restart particular microservices in the background if needed, and our team can move on to other priorities."
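The check-and-restart behavior Hager describes can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the Consul or Nomad API. The `Service` class, the `reconcile` function, the restart budget, and the service names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Service:
    name: str
    check: Callable[[], bool]     # returns True when the service is healthy
    restart: Callable[[], None]   # brings the service back up
    restarts: int = 0


def reconcile(services: List[Service], max_restarts: int = 3) -> List[str]:
    """Run each health check once and restart any service that fails.
    Returns the names of services that have used up their restart budget
    and need a person to look at them."""
    needs_attention = []
    for svc in services:
        if svc.check():
            continue
        if svc.restarts >= max_restarts:
            needs_attention.append(svc.name)
            continue
        svc.restart()
        svc.restarts += 1
    return needs_attention


# Fake in-memory health state standing in for real services.
state: Dict[str, bool] = {"payments": False, "logins": True}


def make_service(name: str) -> Service:
    return Service(
        name=name,
        check=lambda: state[name],
        restart=lambda: state.__setitem__(name, True),
    )


services = [make_service("payments"), make_service("logins")]
escalated = reconcile(services)
```

Here the unhealthy `payments` service is restarted in the background and nothing is escalated, so the team only gets involved when repeated restarts fail to help.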
According to Hager, Nomad's resilience and reliability in the face of an unexpected threat to Q2's core business give him and the team the confidence to aggressively pursue modernization and scaling initiatives.
"Nomad proved to be the missing piece to our technology puzzle. Now we're expecting to double our upgrade velocity in the next 12-18 months and continue lowering downtime risk beyond the 65% reduction we've achieved over the past three years," says Hager. "We're already growing at 25% year-over-year, so having a scalable solution like Nomad that we know will work just how we want and expect it to will be a pivotal part of sustaining our growth and continuing to evolve our market-leading platform."
Reduced build time for new environments from up to a week to minutes
Enabled a projected 2x increase in upgrade velocity
Helped reduce unplanned downtime by 65% in three years
Automated provisioning and deployment to save dozens of work hours per week
Achieved scale to support emergency workloads
Q2 is using HashiCorp Nomad's workflow orchestration for higher availability and to deploy applications and upgrades in an efficient, scalable way.
Jordan Hager, Vice President of Hosting Architecture, Q2 Software
Jordan Hager is Q2's vice president of hosting architecture, responsible for DevSecOps, site reliability, and systems architecture. Prior to joining the company in 2011, Jordan served as the vice president of IT, among other roles, for a Texas-based utility and environmental services firm for nearly a decade. Jordan is highly skilled in scaling operations in high-growth businesses, site reliability, high-availability architecture, open source technology, and various database technologies.
- AWS, Microsoft Azure, private datacenter
- Workload type: Linux, Windows