Sr. Software Engineer - Backend - Nomad
HashiCorp is a fast-growing startup that solves development, operations, and security challenges in infrastructure so organizations can focus on business-critical tasks. We build products to give organizations a consistent way to manage their move to cloud-based IT infrastructures for running their applications. Our products enable companies large and small to mix and match AWS, Microsoft Azure, Google Cloud, and other clouds as well as on-premises environments, easing their ability to deliver new applications for their business.
About the Role
Our team builds and maintains Nomad: a highly scalable, flexible distributed cluster orchestrator. Nomad helps teams run varied workloads including containerized applications, VMs, and batch jobs such as CI builds and ML model training. PagerDuty, Cloudflare, Roblox, Pandora, and many other large organizations run Nomad in production today.
Our customers run Nomad on tens of thousands of nodes and rely on our tools to operate their critical infrastructure and applications. They care deeply about reliability and performance, and so do we.
Alongside Nomad’s core product, we develop and maintain client and plugin APIs that support everything from device capability detection to pluggable networking backends for Nomad-scheduled jobs.
Some of the challenges for our team include:
- Ship and support new Nomad scheduler features for cluster operators, developers, and product integration
- Extend Nomad’s core platform to support emerging architectures (ARM64) and workloads (FaaS, WASM, IoT), particularly those at the network edge
- Develop high-level tools for Nomad operators, including observability hooks, debugging and introspection tools, and job definition and deployment aids
- Support a diverse community of users, from expert OSS practitioners and infrastructure teams at Fortune 500 companies to first-time systems operators working on their homelab or first test deployment
- Build an extensible plugin architecture to support and grow an ecosystem of plugins for shared concerns like runtime drivers, devices, and logging
- Help both internal and external users understand, apply, and contribute to a shared body of knowledge, including reference architectures, best practices, and example workloads
Much of our work, including our libraries, is open source. Nomad and its supporting libraries are written in Go. Our API is used by a broad ecosystem of clients including our own CLI and web interfaces, other HashiCorp products, and community and enterprise tools built on Nomad.
In this role, you can expect to:
- Help define and build the next generation of Nomad features and architectural decisions
- Apply your knowledge of modern Linux system internals, networking, storage, and infrastructure patterns to Nomad and its sibling products at HashiCorp
- Program predominantly in Go, learning from and contributing to a team committed to continually improving their skills
- Work closely with Cloud Platform and SRE teams -- both inside and outside HashiCorp -- to help them build efficient systems and processes
- Contribute to Nomad’s open-core product and community
- Own the full lifecycle of feature development from design through testing, release, and support
- Work with various cloud partners (AWS, GCP, Azure…)
- Participate in our periodic, low-volume community support and on-call rotations
You may be a good fit for our team if:
- You have 4+ years of experience in a systems language like Go
- You have prior experience working in high-performance or distributed systems; while we strive to hire at a variety of experience levels, this particular opening may not be well-suited for candidates who are very new to the field
- You design with efficiency, empathy, and instrumentation in mind (performance tuning, monitoring, capacity planning, root cause analysis)
- You’ve worked with open source and/or “open core” products, including locally-hosted and managed products
- You have awareness of the broader orchestration ecosystem and the Platform-as-a-Service space
- You are curious about academic computer science research, particularly distributed systems papers such as Raft and Paxos variants, and enjoy learning more about the challenges of consistency at global scale
- You enjoy working with operations, security, and application teams both internal and external to HashiCorp
- You’re comfortable navigating ambiguity and embracing change
What is our hiring process like?
The following serves as a basic outline; we may choose to add or remove steps based on the information that we gather during the process.
- Introductory Call with someone from our recruiting team
- First Interview with an Engineering Manager
- Interview Loop with additional team members, with the following panel:
  - Technical Code Pairing interview
  - Code Review interview
  - Communication and Collaboration interview
  - Systems and Architecture interview
- If applicable, a final conversation with the Engineering Manager for the team you would be joining
You should expect to do some amount of programming live during your interviews. We do our best to accommodate your programming language of choice, but you should ideally have your development environment set up and ready to start from a fresh copy of an upstream repository.
About your Application:
As a remote team, we treat collaboration and communication as critical aspects of how we work. A cover letter is a great way to provide a sample of how you communicate.
In your cover letter, please describe why you're interested in working at HashiCorp, and on the Nomad team in particular. Specifics of your past experience that are relevant to this role are great to include too.
HashiCorp embraces diversity and equal opportunity. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills. We believe the more inclusive we are, the better our company will be.