"Hashinetes" - Push the boundaries by combining Kubernetes and the HashiCorp suite
In this talk, Google’s Kelsey Hightower freely admits that some of the ideas he demonstrates are “irresponsible,” yet fun. But other parts could be extremely valuable in your environment.
In his own unique style, Kelsey Hightower shows you:
- How to do service discovery across multiple Kubernetes clusters, using Consul (without hearing, “You can’t”)
- How to extend Kubernetes’ Secrets implementation, using Vault to dynamically provision service accounts (and avoid needing a spreadsheet to keep track)
- How to run Kubernetes and Nomad side-by-side (and why it’s OK to have two schedulers—if you’re a responsible adult)
Senior Developer Advocate, Google Cloud
Kelsey H: This is gonna be a very irresponsible talk. Some of the things I'm going to do today are irresponsible. Don't go back to work and say, "Kelsey said..." okay? It's gonna be irresponsible but we're gonna have fun, we're gonna push the boundaries of a few things. So if you see something odd, straighten back up, you smile, okay? That's what we're doing today, alright? Are we cool with that?
Mmmm maybe I've gotta go back to the bullet points, I have some PowerPoint. Do you want PowerPoint? Someone's like, "Hell yeah, bullet points!" No.
So today we're gonna talk about two of my favorite communities, Kubernetes, and this whole HashiCorp community, the whole HashiStack thing. I use a lot of these tools, sometimes together, sometimes in some weird unapproved ways. It's my computer though, I can do whatever I want. But today, we're gonna make a new term, you guys ready? The new term is called Hashinetes. Now you can't buy it, I will totally take your money, but this is not a new HashiCorp product so do not look at the product page, there will be no press release.
So Hashinetes, does this sound silly? What is this Hashinetes business? Well it starts off pretty sane [00:01:42]. Kubernetes, my first experience with Kubernetes was it felt like this cloud operating system. If you've used Terraform you feel that Terraform abstracts away all the cloud providers and you express yourself in Terraform and it makes things happen. Kubernetes is like that for me, but at a much higher level. If my app needs a load balancer, Kubernetes hides the whole implementation detail and reconciles it and keeps it in place actively. I can't remember the last time I actually used the native tools to spin up infrastructure at all because I usually start with Kubernetes.
But Kubernetes doesn't solve all problems. What happens when you have multiple Kubernetes clusters? Anyone ever try to do service discovery across two or three Kubernetes clusters? You can't! It's not designed for that, it's designed for intra-cluster service discovery and it does a fantastic job. But if you add something like Consul, you can configure it in a way, Consul works natively to kind of bridge that world. And I'm not talking about the IPs of each container. If you tried to bridge the IPs of your containers you're going to run out of IP space and it's going to be embarrassing as you attempt to re-IP production. Doesn't work out well, I've tried it before.
Then secrets, Kubernetes has a very basic secrets implementation. It's great for bootstrapping things, we use it to bootstrap Kubernetes itself. We've added features around encryption at rest, we added features around limiting what nodes can see. But it doesn't do everything. Now, for a lot of people in the Kubernetes world they're asking, "Why is all of this attention on Vault? Why did Vault on Kubernetes make it to the front page of Hacker News? We already have secrets. Thou shall not bring in another secret implementation.” It gets real tribal. Anyone have any product tattoos? That was a thing in the ‘90s you would meet people with FreeBSD tattoos. You're like, people are not gonna use that all the time, you don't know man. That ain't coming off. Shout out to all the Solaris tattoos in the audience. There's some Solaris tattoos, all the people with long sleeves, it's hot as hell, but you got a long sleeve shirt on because of Solaris.
This is a great stack, this is the Hashinetes. The other day I was experimenting, if you're a manager in the audience, if your people aren't experimenting it's probably your fault. You don't have to clap if you report to your manager. Just kind of do this for me, thank you. Let them play a little. So I was experimenting I was like, "Hey I have all this compute-" and sometimes I write these little statically linked binaries and having to put it in the container, which is the price of admission for Kubernetes, even I get pissed off sometimes. And I can do it in a single step, but sometimes I want to avoid it. And I cheat, no one’s looking, I reach for Nomad [00:05:03].
So in my world, I have a little bit of this going on. And people are like, "Is that two schedulers?" And I'm like, "Yeah, I got two." And people say you're crazy, does this seem pretty odd? But you do it all the time. Most people have some hypervisor underneath. You use some cloud provider and when you click to create that VM or make an API call or use Terraform there's a scheduler placing your VM somewhere in the infrastructure. And you're totally okay with this. And then you do what? You get another one, on top. And do the same thing. But you think you're sane. So, in this world, I want people to say, "Hey these are just interfaces to give us compute in the way we want to consume."
Some schedulers give us a raw machine that we want to SSH into. Some of these schedulers are like PaaSes that just take our app and find the right place to run it and we don't care about the machine abstraction. But there are some cases, and I think CircleCI recently put out a really good blog post about why we use both. I'm not telling you to use both, I'm just saying be irresponsible sometimes. You're an adult, you get to do that.
So I wanna take us through this, ready for some terminal? They told me it was okay to use the command line. All you PowerPoint people, I'll get back to you with some slides. We're gonna start with this cluster, look at the name of it. Hashtag Hashinetes. So we have these seven nodes in this cluster and I have a few things running there. So in this cluster I have Consul and Vault. Now, these are two stateful services, and most people say you shouldn't run stateful stuff in containers. I'm like, "But why?" Containers themselves aren't the problem for stateful, it's usually that the underlying platform may not be able to pair up your storage. So how does Kubernetes make that any different?
I mean, some databases you will probably lose your data. So don't go and run everything in there. But some things do make it easier, Consul being one of them. The fact that Consul does replication and has the ability to heal under certain conditions, its part is to make Kubernetes' life much easier. So if I come here and I say kubectl get pods we'll see, if WiFi actually works ... You guys should clap cause WiFi works. The rules are the more you clap the more favor I get from the demo gods, cause they wanna see how this goes.
So we have this three node Consul cluster and like one does you have to run the release that they did yesterday, that's what we got going down. So that means if anything breaks it's not my fault. If you work on Consul I will look at you. That means I need help, don't leave me here stranded. Alright so we have this three node Consul thing but the other thing that we get is this thing called a persistent volume claim. So this is where all I do in my manifest is say, "Hey, I need storage of this kind." Based on where I'm running, on-prem it could be [inaudible 00:08:32] or NFS. On Amazon it would be their Elastic Block Store, on Google Cloud it would be their Persistent Disk. Don't need to think about that.
Kubernetes' job is to actually provision the storage, give it the ID, and make sure that the storage can be mounted in the zone where the workload needs to live. And give me a stable name to make sure that they are always paired back together. So all the stuff that you used to do around orchestrating stateful things, Kubernetes will meet you probably more than halfway. So, given that, I can actually rely on Kubernetes to do the right thing. If these apps die it will ensure they're paired with the right storage even if they move to a different machine, great.
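As a rough illustration of the pairing described here, the sketch below builds a StatefulSet manifest with a volume claim template and shows the stable pod and claim names Kubernetes derives from it. This is not the manifest used in the talk: the image tag, mount path, storage size, and the `apps/v1` API version are all assumptions made for the example.

```python
import json


def consul_statefulset(replicas=3, storage="10Gi"):
    """Build a StatefulSet manifest (as a dict) for a small Consul cluster.

    Image, mount path, and storage size are hypothetical placeholders.
    """
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": "consul"},
        "spec": {
            "serviceName": "consul",
            "replicas": replicas,
            "selector": {"matchLabels": {"app": "consul"}},
            "template": {
                "metadata": {"labels": {"app": "consul"}},
                "spec": {
                    "containers": [{
                        "name": "consul",
                        "image": "consul:placeholder",  # hypothetical tag
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/consul/data"}
                        ],
                    }],
                },
            },
            # One PersistentVolumeClaim is created per replica from this
            # template; the cluster's storage class decides what backs it.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


def stable_pod_names(manifest):
    """StatefulSet pods are named <statefulset>-<ordinal>."""
    name = manifest["metadata"]["name"]
    return [f"{name}-{i}" for i in range(manifest["spec"]["replicas"])]


def claim_names(manifest):
    """Each claim is named <template>-<pod>, so a pod finds its disk again."""
    template = manifest["spec"]["volumeClaimTemplates"][0]["metadata"]["name"]
    return [f"{template}-{pod}" for pod in stable_pod_names(manifest)]


manifest = consul_statefulset()
print(json.dumps(manifest, indent=2))  # JSON manifests are accepted by kubectl
print(claim_names(manifest))
```

The point of the naming scheme is that a rescheduled pod keeps its name, and the name maps back to the same claim, which is what keeps the workload and its storage "paired back together."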
Again, I'm gonna skip one of the cases, but if I had a VM off to the side you can actually delegate DNS to Consul for the Consul domains. So what that looks like is, we have this thing called kube-dns, and what I can do with kube-dns is say, hey, Kubernetes is responsible for all of the service discovery that happens in the cluster. But if I'm running Consul side by side I can also do something like this with this config and tell kube-dns that if anyone looks up .consul, delegate that to the Consul service running in cluster. So that allows me to actually go over a broader landscape than what I currently do on Kubernetes.
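The delegation described above can be sketched with kube-dns stub domains. The example below builds the `kube-dns` ConfigMap with a `stubDomains` entry for the `consul` domain. The IP address of the Consul DNS service is a made-up placeholder, and the actual config shown in the talk is not reproduced in the transcript, so treat this as one plausible shape rather than what was on screen.

```python
import json


def kube_dns_stub_config(consul_dns_ip):
    """Build a kube-dns ConfigMap that forwards *.consul lookups to Consul.

    consul_dns_ip is the address of a DNS endpoint served by Consul
    (hypothetical here; in a real cluster it would be a Service IP).
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "kube-dns", "namespace": "kube-system"},
        "data": {
            # kube-dns expects the stub domain map as a JSON string.
            "stubDomains": json.dumps({"consul": [consul_dns_ip]}),
        },
    }


config = kube_dns_stub_config("10.0.0.10")  # placeholder address
print(json.dumps(config, indent=2))
```

With a mapping like this in place, names inside the cluster domain are still answered by Kubernetes, while anything ending in `.consul` is forwarded to Consul.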
The next thing is Vault. All my Kubernetes people are like, why are you using Vault? What's the use case? The first thing that got me hooked on Vault was the dynamic provisioning of secrets. But I want people to be clear on what we're doing with Vault. It took me a long time to understand why we have all of these systems, why do we even need Vault? You ever thought about that? Why does Vault even exist? Now some people say, you just need it, we've got all these secrets, it's better than putting it on disk. And the more you think about it, and I talked to colleagues and they say, if you think about it, Vault is an identity translator. You have something that you know and trust, a TLS certificate, a JWT token, and the problem is your database or Redis or some other system has no idea how to use that identity or trust the thing that gave it to you. But it does understand things like usernames and passwords. So you trade with Vault for one of those things that it trusts, and you go back to it.
So you can imagine a world if everything understood X.509 certificates. But that's not where we are, so Vault is required. So here's the magic sauce, if you're a DBA, I've seen DBAs do this before. Hey, I need a username and password for my app, I got you covered. Then they do this, what are you doing? They're like, "Hold on I got you." And they keep on clicking, I'm like, "Dude your mouse battery is running low, probably from all of that clicking." There's better ways of doing this. And it's like, I have the GUI. So this isn't a modern thing for some people, you can click into here. How many usernames do you need? I need three. They're like, "No problem, got you covered." Have you ever seen their database? Anyone know what a pivot table is? Yeah, on their databases you can build pivot tables, it's a real programming language. So here we have users and then what you see people do is click in there and say, "What tables do you want to access?" I'm like, "Oh stop. You're supposed to know this, just give me credentials." So the magic I show my friends is I scroll all the way down, that's what I'm gonna show you, magic.
I'm gonna bring in Vault, so Vault's installed, I unseal it too. Are you guys like me? Do you do this? How many people, just make noise if this is you, just be honest though don't be trying to be fake, be real honest. Is that you? Hell yeah! Defeat the whole security mechanism. You got the air force like dude I have all the keys on this laptop. You are just not doing this correctly. Okay, so we're all on the same page at least.
So Vault is ready, so let's do a vault status here. Great, we're unsealed. So this is what I do. I have all these configs in place so that Vault is going to dynamically create secrets for you, but yesterday they announced the ability to use service accounts in Kubernetes to skip one of these bootstrapping steps. How do you automate giving out secrets end to end? It's super hard if it's not built into the platform. So in Kubernetes we have what we call service accounts. So service accounts are a way for us to give identity to these tokens and give them a set of permissions. We have rich RBAC inside of Kubernetes.
So if you have this you can actually assign, and listen to me here, do not give Vault the default service account. Do not give Vault the admin service account. Give it a service account that can do nothing but log into Vault. I don't want you to be on the news. This is serious. Okay back to being irresponsible.
You got service accounts and what you do on Kubernetes is you tell your workloads what service accounts they should use. So one of the workloads I have is this very simple job. So this job is a lot of config, I'll skip most of it, but I am doing things very securely. That's one thing I won't compromise on, so I'm using TLS end-to-end [00:14:18]. If you look here I'm saying I want this app to use this particular service account. The nice thing is this is built into Kubernetes, Kubernetes will inject the service account at run time. Vault only has to trust Kubernetes and the holder of that service account to identify who you are.
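To make the "use this particular service account" step concrete, here is a minimal sketch of a Job manifest that names its service account. The job name `worker` matches the job deleted later in the demo; the service account name and container image are hypothetical, and the TLS configuration mentioned in the talk is omitted.

```python
import json


def worker_job(service_account="vault-login-only"):
    """Build a Job manifest that runs under a specific service account.

    The service account name and image are hypothetical placeholders.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "worker"},
        "spec": {
            "template": {
                "spec": {
                    # Kubernetes mounts this account's token into the pod at
                    # /var/run/secrets/kubernetes.io/serviceaccount/token.
                    "serviceAccountName": service_account,
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "worker",
                        "image": "example/worker:placeholder",
                    }],
                }
            }
        },
    }


job = worker_job()
print(json.dumps(job, indent=2))
```

The workload never handles a long-lived secret itself: the only credential it starts with is the token Kubernetes injects for the named account.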
And in Vault you attach all these [00:14:40] to policies and so forth, so here's the trick. We run this now. kubectl get pods, we have our control plane. And now we're going to run this thing and observe what happens, it's going to go fast though. We're gonna run it, kubectl get pods. It's a job that goes so fast that it finished. We're gonna do it in slo-mo now. We're gonna type a little slower as if this is gonna change anything but it won't. We'll just look at the exit work. And they're like, "What did you do? You just ran a job, whoopdidoo." So we do kubectl logs, and then what we wanna do is see what actually happened here.
This particular app grabs the service account token, reads it, presents it to Vault at login, and Vault trades it for a real token that it can use. And it goes out and says, "Hey give me database credentials." And you see here that it got this username. They're like, "Dude your spreadsheet's gonna get big if you're doing that." I was like, "But there's a better way." So we come over here and we look, and they're like, "Where's the user?" Let's do it again, I'll slow it down.
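The flow just described (read the injected token, log in to Vault, ask for database credentials) can be sketched against Vault's HTTP API. This is not the app from the demo. It assumes the Kubernetes auth method is mounted at its default `kubernetes` path and the database secrets engine at `database`; the role names and Vault address are supplied by the caller, and a real in-cluster client would also need to trust Vault's CA certificate.

```python
import json
import urllib.request

# Where Kubernetes mounts the pod's service account token.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def login_request(vault_addr, role, jwt):
    """Build the login call: trade a service account JWT for a Vault token."""
    body = json.dumps({"role": role, "jwt": jwt}).encode()
    return urllib.request.Request(
        f"{vault_addr}/v1/auth/kubernetes/login",  # assumes default mount path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def creds_request(vault_addr, db_role, client_token):
    """Build the call that asks Vault for dynamic database credentials."""
    return urllib.request.Request(
        f"{vault_addr}/v1/database/creds/{db_role}",  # assumes default mount path
        headers={"X-Vault-Token": client_token},
        method="GET",
    )


def fetch_db_credentials(vault_addr, role, db_role, token_path=TOKEN_PATH):
    """Run the whole exchange; needs a reachable Vault and a mounted token."""
    with open(token_path) as f:
        jwt = f.read().strip()
    with urllib.request.urlopen(login_request(vault_addr, role, jwt)) as resp:
        client_token = json.load(resp)["auth"]["client_token"]
    with urllib.request.urlopen(
        creds_request(vault_addr, db_role, client_token)
    ) as resp:
        secret = json.load(resp)
    # The username and password are generated on demand and tied to a lease.
    data = secret["data"]
    return data["username"], data["password"], secret["lease_duration"]
```

Because the returned credentials are tied to a lease, Vault can revoke the database user once the lease ends, which is one way to explain why the username later vanishes without anyone cleaning it up by hand.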
One more time, we're gonna delete the worker. So it's kubectl delete jobs worker. Alright now we're gonna run it again really fast and then we're gonna switch really quick. You saw me tab, the keyboard shortcuts. The username's there. That was my fellow Kubernetes user, he was like, "Sweet! I'm deleting all my spreadsheets." So now it's there. And they're like, "Whose job is it to clean that up?" So you look at it and you scratch your head. We could put a search engine in the spreadsheets, there's APIs for spreadsheets these days. And you look at the workload, and it's not there anymore. It stopped running.
So what we really want this to do is, let's see, tick tock tick. Their minds are blown. Who took it away? Magic! You wait three days before you tell them it was Vault. So Vault and Kubernetes to me make a really good pairing because I think this kind of capability really matches this idea of these dynamic workloads. So we have this nice stack, Consul, Vault, what else?
Well, sometimes I wanna run things in Nomad, cause oh man, managing two schedulers. These big tools, you need to think about them, how do you deploy them? You need Terraform, you need all of these things. I'm like, I can just use the Kubernetes API, I can represent it. So, the beta came out yesterday, I was like, yeah, that's like production ready? Less support but still production ready, that's the way I think about it. So I was like, I'm gonna run it in my cluster. How many people think it's hard to run two schedulers, or even one scheduler, by itself? You guys are all badasses, you're lying, pretty badass.
One way I thought about experimenting with this is what if you could just say, give me a Nomad cluster. Let's just try it and see. So someone calls me, just ignore it. I'm gonna see if I can make this work. I see WiFi is not on my phone so this is when you do LTE for the win. I don't know if this is smart, let's just go with it. Talk to ... What is this thing called again, anyone wanna remind me of the name? You guys don't know Hashinetes? I worked really hard on that name too.
Talk to Hashinetes.
OK Google: Sure, getting the test version of Hashinetes.
Hashinetes: Hello Kelsey.
Kelsey H: Hello.
Hashinetes: I see you're tempting the demo gods again over conference Wi-Fi. That's extra bold. How can I be of service?
Kelsey H: Alright so we need Nomad. We're just gonna watch and see what this thing can do. Let's start with a simple question. Deploy Nomad.
Hashinetes: Creating a three-node Nomad cluster.
Kelsey H: Awesome.
Hashinetes: I hear you like schedulers so I deployed Nomad using Kubernetes so you can have a scheduler in your scheduler.
Kelsey H: Irresponsible. So we're gonna allow this to bootstrap. And what's happening underneath the covers is we're using the same mechanism of this is a stateful application that's also clustered, we need to do them in a specific order. We need to omit the storage and make sure the storage is ready, mount it to the machine, configure Consul to have Consul join the larger cluster so we can do the service discovery to fully bootstrap this thing. There's an entire config behind this. So as we're bringing this up we wanna see the cluster form, and then once the cluster forms you're gonna see if that fancy dashboard is any good. Anyone play with the dashboard yet? You haven't played with the dashboard? I showed my wife these things she's like, "Why are you showing me ... What am I supposed to get out of this?"
We have three nodes here, is this actually a working cluster? Truth be told this didn't work this morning so I'm a little nervous right now. One thing we can do is I'm gonna source this Nomad environment variable because we're doing this over the web, and we're gonna say, "Nomad server." Like I know what I'm doing, gotta do it with confidence man, you just type harder. Let's see if this dashboard's available.
Do you know how long it took me to figure out how to do client TLS auth with the browser? Anyone ever got that to work? No, it doesn't work easy but I got it to work and I'm very proud of myself. So, what we're gonna do now is try to hit that fancy dashboard of theirs. So we grab this and it's on 464 ... If you don't think I'm using TLS mutual auth go ahead and try it, I will laugh at you on the inside. So click this, and we got some servers.
Now you can drill around here and this is pretty nice, I'm like okay GUIs are good. Now you ask yourself where did the workers go? If you're thinking that I would deploy the workers inside the same cluster and run them on Docker that's ridiculous. I'm irresponsible, that's ridiculous. You should run them in their own node pool, right? In the cloud or even in your own environment -- you usually have the ability to have some elastic pools, so use them to your advantage. If we have a certain set of jobs that aren't optimized for what Nomad wants to do, just create another node pool.
We're gonna come over here like a badass with the mouse and we're gonna click on what you already know, instance groups. Grab one of these and we'll add a few -- how many should we do? Throw out a number.
Audience: A thousand!
Kelsey H: A thousand divided by a hundred cut in half that's five. You're calling the shots. So, we're gonna spin up these nodes -- and I'm not trying to promote a cloud provider right now, this is my on-prem. I work at Google, this is just my normal infrastructure. Legit? Do you see how fast it came up? [Aside] Marketing, I've done my part.
So we have nodes. Now the nodes are doing a bunch of stuff so this is what happens when a speaker is trying to buy time for things to bootstrap in the background. This is what we do, we wave our hands and then we may do this and say, "Did you see that?" Knowing you didn't see anything. But we have these five nodes. Now the goal with these five nodes -- if it works -- is that they will show up here, at some point. I'm gonna hard-refresh. So you ask yourself, "Did you push it too far?" Cause there's a limit to this thing, did we go too far?
We got five nodes. I'm feeling good about myself. So you click around -- the GUI is actually really nice by the way. Maybe you shouldn't see some of my secrets here, I'm a little worried about this one, but overall it's good. This is why I turned on the TLS auth cause I didn't want you guys to do weird things to my thing. So all the clients are now fully bootstrapped, joined the cluster, I can even do cross cluster service discovery. They both have Vault integration, so what next? Run some jobs.
So we go to jobs. The jobs integration is actually super nice so let's see what we can do with that. Maybe we can talk to Nomad now, we talked to Kubernetes let's see if we can talk to Nomad. Run the ... I don't even know the name of the job. Run the Ping Nomad job.
OK Google: Here's some results from the web.
Kelsey H: Run the Ping Nomad job.
OK Google: Here are the top search results.
Kelsey H: It's trolling me. Run the Ping Nomad job.
OK Google: These are the top results.
Kelsey H: I'm not abandoning ship I'm gonna go through this one. Talk to Hashinetes.
OK Google: Alright, let's get the test version of Hashinetes.
Hashinetes: Hello Kelsey.
Kelsey H: Run the Ping Nomad job.
Hashinetes: How many?
Kelsey H: 12.
Hashinetes: Creating 12 Ping tasks in the Nomad cluster.
Kelsey H: Thank you.
Hashinetes: When Seth said pick a scheduler I'm not sure this is what he had in mind but I've got to admit, this is extra dope.
Kelsey H: 12 running.
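What the assistant does behind the voice command is not shown in the talk. As one hedged guess at the shape of it, the sketch below builds a Nomad job with a task group count of 12 and the HTTP request that would register it through Nomad's jobs endpoint. The job name, datacenter, driver, and command path are all hypothetical, and the mutual TLS setup from the demo is left out.

```python
import json
import urllib.request


def ping_job(count):
    """Build a Nomad job payload that runs `count` copies of one task.

    Name, datacenter, driver, and command are hypothetical placeholders.
    """
    return {
        "Job": {
            "ID": "ping",
            "Name": "ping",
            "Type": "service",
            "Datacenters": ["dc1"],
            "TaskGroups": [{
                "Name": "ping",
                "Count": count,  # number of task instances to schedule
                "Tasks": [{
                    "Name": "ping",
                    # A non-container driver suits a statically linked binary.
                    "Driver": "exec",
                    "Config": {"command": "/usr/local/bin/ping-app"},
                }],
            }],
        }
    }


def register_request(nomad_addr, job):
    """Build the HTTP request that registers the job with a Nomad server."""
    return urllib.request.Request(
        f"{nomad_addr}/v1/jobs",
        data=json.dumps(job).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = register_request("https://nomad.example:4646", ping_job(12))
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, since it needs a reachable Nomad server and, in a setup like the one in the demo, client certificates.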