Hear the journey of how Capgemini built an Azure Kubernetes microservices app for the Norwegian Maritime Authority and managed all the infrastructure and deployment with Terraform.
"Automate All The Things" is the new creed by which all applications (should) live nowadays. But, what does this mean in reality?
Eugene Romero from Capgemini showcases a real-life application built around all the buzzwords: cloud-native, Kubernetes, microservices, Docker, and many more. He discusses how he automated the building, testing, and deploying of the app with Terraform.
Real-Life End-to-End Building, Testing, and Deploying of a Buzzword-Heavy Application. I was looking at the list of the different hallway talks, and I have the longest title. I don't know if that means anything, but it probably means I should make this a little shorter.
But I was looking at it — trying to think — is there something I should cut? And I decided against it, even though it is a mouthful. I decided against it because I think the example I'm going to be showing does touch on all these different aspects.
It is real life. It's a real client. It's a real application we are building. We do end-to-end. There is no manual work involved. Everything is automated. We build, we test, and we deploy. And it has all the buzzwords. We'll talk about that in a minute.
But yes, my name is Eugene Romero. I am a senior cloud and DevOps engineer, and I work for Capgemini — that's how we say it in Norway; in English it usually sounds a bit different. But anyway, you might have heard of us: a big consulting company originally from France.
A little more about myself: I've been in the infrastructure and software development world for about 15 years — a little over that — getting old there. Also a Linux nerd myself. That's why this entire world of DevOps and automation and all that just fits like a glove — it's exactly the stuff I like doing.
My education is in electronics. When I'm not working with software or with DevOps, I am usually restoring and modifying old gaming systems, especially Game Boys. If you like to talk Game Boys, come look for me after. We can discuss.
Let's talk a bit about the project that I will be discussing. First, the client is the Norwegian Maritime Authority. This is a public sector client from Norway. And already, when you say public sector, sometimes people start to lose interest. But in this case, in Norway, in the public sector, they have a very big push for making everything modern, for going cloud native.
They're actually throwing money at this. And if there's one thing that Norway has, it's money. They're throwing money at this problem, and they're actually coming up with some good solutions. This is the client.
The application is an online solution for sailors to obtain their licenses. Turns out that when you are a sailor — a professional sailor, I should say — you need to have a license depending on what you do on the ship you work on.
Because of that, there is a system; there is a set of rules that you have to follow. There are maybe some courses you have to take, some education, maybe some experience — a certain number of hours on a ship. All of these things are very structured.
But up until now, this was a manual process where, when someone wanted to get whatever the next license was for their career, they would have to send in a bunch of paperwork. Someone would have to do this manually. We've been automating this whole process for them.
And the reason is, yes, that it was tedious and it was manual. And in the end, it is just a bunch of rules. It's if this, then that: if someone has so many hours, then they might be eligible for this, or if they have taken this education.
I did talk about buzzwords. We only have 15 minutes. Sadly, I cannot do an entire build and a demo of our application, but we do use all the main ones in this application. We have microservices — very big thing.
Everything runs in Kubernetes inside of Azure — or "Azure," as I've heard the Brits pronounce it. We use Azure Kubernetes Service there. Everything with the infrastructure is set up as infrastructure as code. We follow only DevOps methodologies. Everything is cloud native; there is nothing on-premises. Quite buzzword-heavy, we could say.
And, of course, the one that brings us here today: Terraform. Everything that we do, we do with Terraform. And it's been quite a challenge, because all these bits are modern DevOps things, but they are also different things — they don't necessarily fall into the same category. It has been interesting trying to use a single platform to deploy, build, and test all of these different bits.
So why did we choose Terraform? First, simplicity. And when I say simplicity, one thing I mean is we wanted to have a single tool to perform all our DevOps tasks. We didn't want to have a bunch of different tools that we had to learn and maintain.
Also, we were thinking about the ones coming after us — maybe we're onboarding new people, or if I'm moving on to the next project, will the new person be able to pick up these systems? As we all know, the more tools you have, the more complexity there is in the system.
Also, we enjoyed the provider availability that there is in Terraform. I think I saw earlier today 2,000+ providers on the Terraform Registry. And it doesn't mean that they are all necessarily of the highest quality — a lot of them are custom-made by people. But we did find that there were enough providers for basically all the things that we wanted to do within our application.
Terraform is also fairly easy to read and understand. The syntax is quite clear once you have an idea of what you're looking at. I don't have to get into this too much, but it is simple to know, more or less, what we are building.
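To give a rough idea of that readability — this is a minimal, hypothetical sketch, not taken from the actual codebase — a couple of azurerm resources read almost like a plain description of what gets created:

```hcl
# Hypothetical example: even without knowing the provider in detail, it is
# fairly obvious that a resource group and a storage account are being
# created in Norway East, and that the storage account lives in that group.
resource "azurerm_resource_group" "app" {
  name     = "rg-sailor-licenses-dev"
  location = "norwayeast"
}

resource "azurerm_storage_account" "documents" {
  name                     = "stsailordocsdev"
  resource_group_name      = azurerm_resource_group.app.name
  location                 = azurerm_resource_group.app.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}
```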
Finally, the documentation — we have enjoyed this very much. The different providers — especially the official ones — have very clear documentation that tells you what things are. We use it quite extensively: almost on a day-to-day basis, when we are developing with Terraform, we are on the Terraform documentation website, comparing it against the different things that we use with Azure, Docker, or any of the others.
First, we have multiple identical environments. This is something you have probably heard a lot is an advantage, although companies don't always want to do it — because of cost, because of other things.
But by leveraging Terraform, we have development, test, and production environments that are identical — while still, of course, saving costs. For example, something might be a smaller or development-tier SKU in our dev environment, while it is a production-tier SKU in our production environment. But the beauty of it is that we can use the same Terraform code for all our environments.
This is very important to us. We don't want to have different code that deploys to every environment. Because, as you know, once you start having different code, something that happens in one place is not going to happen in the other place — and you're not going to be able to figure out why. This is important to us.
Then, we have different variables per environment. So, as our pipelines deploy — first to dev, then to test, and then to production — we just feed in variables that tell the pipeline, for this environment: I want you to create this and this, but I want this many of them, or I want this SKU, whichever.
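As a sketch of what that looks like in practice (the variable names and values here are illustrative, not the actual ones), the configuration declares the variables once and each environment supplies its own values:

```hcl
# variables.tf -- declared once, shared by every environment
variable "environment" {
  description = "Short environment name: dev, test, or prod"
  type        = string
}

variable "sql_sku_name" {
  description = "SKU of the SQL elastic pool in this environment"
  type        = string
}

variable "aks_node_count" {
  description = "Number of worker nodes in the AKS cluster"
  type        = number
}
```

```hcl
# dev.tfvars -- smaller and cheaper than production
environment    = "dev"
sql_sku_name   = "BasicPool"
aks_node_count = 1
```

The test and production variable files then set the same names to larger values, and the pipeline passes the matching file with `-var-file` when it plans and applies each environment.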
We then do our compliance and integration testing in our dev environment. This works for us because we are a small team. We have noticed that if the team grows a lot, we will have to change this a bit. Once you have different people developing in the same environment, sometimes their changes are going to get in each other's way. But for now, we are a small team. This has been working for us.
So, we take our Terraform code — or first, we build it — and we do some compliance testing. You might have heard of a tool called terraform-compliance. If you haven't, it's not an official HashiCorp tool, but it is available online, it is well maintained, and we have found it quite useful because our client has some regulations around compliance: the way things need to be named and where things must be deployed.
We do our compliance testing in our dev environment. And if that passes, then we do a deploy — or a terraform apply — of our code in that dev environment and run a smoke test, make sure that nothing blows up, make sure that nothing explodes.
If we are satisfied there, then we perform our system and acceptance testing in the test environment. This is a little bit more stable environment — this is where the client can go in and see what it is that we are developing.
We use other things such as FeatureGates so that we only show what we want to show. This means that we have an environment that is very similar to production. Our client can go in, they can make sure that it looks the way they want it to look. And also, we can make sure that we are building what we want to build before moving this code over to production.
We also need to separate our infrastructure and application pipelines to make this work. What exactly do I mean by that? The infrastructure — let's call it the things that the application runs on: the Kubernetes cluster, storage accounts, and so on — is deployed in one pipeline, on its own. Then the things that actually deploy our application — our Docker microservices into Kubernetes, our APIs into an API management solution — are deployed in their own pipeline.
There are a few reasons for this. First, we wanted to reduce complexity. If there is one thing that is true about Terraform, it's that it can get very verbose, very fast. It can get very noisy, and you end up with lots and lots of files. We wanted to reduce that as much as possible, and we figured that separating these two things would mean we didn't have as much code to maintain in each of these codebases.
Also, we wanted to avoid dependency issues. It does happen at times — especially when it comes to that application layer — that the things we are going to deploy depend on things that already exist. They might need to get some data or some information about an existing resource, and we have found that creating that resource in the same run as the thing that depends on it does not work. Therefore, we separate these: one set of pipelines for the infrastructure, and another set of pipelines for the application itself.
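One common way to handle that split — this is a sketch of the general pattern, not necessarily exactly how our pipelines are wired, and the backend names are placeholders — is to have the application configuration read what the infrastructure pipeline has already created, for example through its remote state outputs, instead of trying to create and consume the same resource in a single run:

```hcl
# In the application configuration: read outputs that the infrastructure
# pipeline has already created, rather than creating them here.
data "terraform_remote_state" "infra" {
  backend = "azurerm"
  config = {
    resource_group_name  = "rg-terraform-state"   # placeholder names
    storage_account_name = "sttfstateexample"
    container_name       = "tfstate"
    key                  = "infrastructure.tfstate"
  }
}

# Look up the AKS cluster created by the infrastructure pipeline and use it
# to configure the Kubernetes provider for the application deploy.
data "azurerm_kubernetes_cluster" "main" {
  name                = data.terraform_remote_state.infra.outputs.aks_cluster_name
  resource_group_name = data.terraform_remote_state.infra.outputs.resource_group_name
}

provider "kubernetes" {
  host                   = data.azurerm_kubernetes_cluster.main.kube_config[0].host
  client_certificate     = base64decode(data.azurerm_kubernetes_cluster.main.kube_config[0].client_certificate)
  client_key             = base64decode(data.azurerm_kubernetes_cluster.main.kube_config[0].client_key)
  cluster_ca_certificate = base64decode(data.azurerm_kubernetes_cluster.main.kube_config[0].cluster_ca_certificate)
}
```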
Let's get into that in a little bit more detail. The infrastructure code and its connected pipeline build and deploy our Azure infrastructure. These are just some of the components that are in there: an API management gateway, an elastic pool of SQL databases, an Azure Key Vault, an Elasticsearch instance.
These we can say are the things that the application runs on. We don't necessarily configure them too much. We might configure them some, of course, but only what makes sense so that we can have an empty skeleton that our application will now be deployed into.
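A condensed, hypothetical slice of that infrastructure configuration might look like this — resource names, variables, and sizes are placeholders, and the resource group and remaining variables are assumed to be defined elsewhere in the configuration:

```hcl
# The "empty skeleton" the application is later deployed into.
data "azurerm_client_config" "current" {}

resource "azurerm_kubernetes_cluster" "main" {
  name                = "aks-licenses-${var.environment}"
  location            = var.location
  resource_group_name = var.resource_group_name
  dns_prefix          = "licenses-${var.environment}"

  default_node_pool {
    name       = "default"
    node_count = var.aks_node_count
    vm_size    = "Standard_D2s_v3"
  }

  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_key_vault" "main" {
  name                = "kv-licenses-${var.environment}"
  location            = var.location
  resource_group_name = var.resource_group_name
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"
}
```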
Then the other side of it is the application code and pipeline, which builds, tests, and deploys the application components. As mentioned, this can be the APIs, the things that run inside of Kubernetes — for example, we use Linkerd, or it can be Prometheus — all the different bits that actually make up the application.
Our microservices as well — these are built as Docker images. We publish them to a Docker registry in Azure, and then, by means of Terraform, we pull them into Kubernetes and set them up. Our database migrations as well — we have those scripted, and by means of Terraform we are also able to run them against our databases.
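As a rough, hypothetical sketch of that application side — chart versions, image names, and namespaces are placeholders, and the Helm and Kubernetes providers are assumed to be configured against the AKS cluster, for example via the data sources shown earlier:

```hcl
# A cluster component installed as a Helm release.
resource "helm_release" "prometheus" {
  name             = "prometheus"
  repository       = "https://prometheus-community.github.io/helm-charts"
  chart            = "prometheus"
  namespace        = "monitoring"
  create_namespace = true
}

# A microservice deployment pulling its image from the Azure container
# registry (registry name and tag are illustrative only).
resource "kubernetes_deployment" "licenses_api" {
  metadata {
    name      = "licenses-api"
    namespace = "licenses"
  }

  spec {
    replicas = 2

    selector {
      match_labels = {
        app = "licenses-api"
      }
    }

    template {
      metadata {
        labels = {
          app = "licenses-api"
        }
      }

      spec {
        container {
          name  = "licenses-api"
          image = "acrlicenses.azurecr.io/licenses-api:1.4.2"

          port {
            container_port = 8080
          }
        }
      }
    }
  }
}
```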
So, what are some of the providers that we use for this? As I mentioned, we did enjoy the fact that there are many different providers you can use inside of Terraform. For example, we use Azure AD and Azure RM.
Both are official Terraform providers, and we use them for all the main Azure things: anything we need to do with Active Directory users or service principals, we can use Azure AD; and Azure RM for creating anything inside of Azure. We also use Helm, Kubernetes, and a custom one by a developer named Gavin Bunney — I'm sorry if you are here and I am destroying your name.
We use these for Kubernetes itself. As you know, Kubernetes is its own little microcosmos. It does require a lot of tinkering and massaging, so we use those three providers to create anything and everything inside of our Kubernetes clusters.
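Pulled together, the provider requirements look roughly like this — the version constraints are placeholders, and the Gavin Bunney provider referred to is presumably his community kubectl provider:

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    azuread = {
      source  = "hashicorp/azuread"
      version = "~> 2.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0"
    }
    # Community provider, useful for applying raw Kubernetes manifests.
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = "~> 1.14"
    }
  }
}
```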
We also use a few helper tools, such as Random, TLS, or another custom one called PKCS #12. We use these especially for things such as secrets and certificates — all the things that we don't want to be generating ourselves.
An important note here: do remember that anything and everything you create in Terraform will be reflected in the Terraform state. I'm not going to get into that now because of time, but of course you should always make sure that you are keeping your Terraform state secure. We use this especially for things such as internal certificates, so we know we can protect them and don't need to think too much about renewing them manually.
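A hypothetical example of those helper providers at work — and of why the state matters, since the generated password and private key both end up stored in it:

```hcl
# Note: the generated password and private key are both written to the
# Terraform state, which is one more reason the state must be kept secure.
resource "random_password" "db_admin" {
  length  = 32
  special = true
}

resource "tls_private_key" "internal" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "tls_self_signed_cert" "internal" {
  private_key_pem = tls_private_key.internal.private_key_pem

  subject {
    common_name = "internal.example.local" # placeholder name
  }

  validity_period_hours = 8760 # one year
  early_renewal_hours   = 168  # plan a replacement a week before expiry

  allowed_uses = [
    "key_encipherment",
    "digital_signature",
    "server_auth",
  ]
}
```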
Finally, we use the external and null providers for things such as database migrations and custom scripts that there might not be an existing provider for.
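For instance — and this is only a sketch; the script names and the database variable are hypothetical, not actual tooling — a null_resource with a local-exec provisioner can run migrations whenever the migration scripts change, and the external data source can pull information back from a script:

```hcl
# Run database migrations via a wrapper script when the migrations change.
resource "null_resource" "db_migrations" {
  triggers = {
    migrations_hash = filesha256("${path.module}/migrations/all-migrations.sql")
  }

  provisioner "local-exec" {
    command = "${path.module}/scripts/run-migrations.sh"

    environment = {
      DATABASE_SERVER = var.database_server_fqdn
    }
  }
}

# The external data source expects the program to print a JSON object to
# stdout, which then becomes available as data.external.<name>.result.
data "external" "current_schema_version" {
  program = ["bash", "${path.module}/scripts/get-schema-version.sh"]
}
```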
It is not a perfect world. It has all been running very well, but, of course, there is always room for improvement. For example, whenever there are changes in the Azure API, it might be that something in the provider that has been working no longer works because Microsoft has released something new.
Usually, these things get resolved fairly quickly, but we have found it is important to keep our providers up to date. Something similar happens with the Kubernetes API — as you know, it is still evolving, and there are a lot of beta APIs. This is something we constantly have to stay on top of.
Sometimes we find that a custom provider works for us in the moment, but after a year or two it is no longer maintained. That can become a challenge.
This also goes hand-in-hand with some specific edge cases where we might need to use a script here and there, for which we might have to use, again, the external provider, or maybe a null resource.
Finally, doing local development against a remote state can be complicated, especially for new developers, because we want a central state that everyone works against, and that gets complex when you're bringing new people in. This has been an ongoing challenge as well.
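For context, the shared state lives behind a standard azurerm backend along these lines (the names are placeholders); anyone running Terraform locally reads and writes this same central state, which is exactly why it needs care:

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"  # placeholder names
    storage_account_name = "sttfstateexample"
    container_name       = "tfstate"
    key                  = "infrastructure.tfstate"
  }
}
```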
As for what we have gotten out of this: first, speed has been one of the biggest wins. We have reduced deployment times from days to minutes. And when I say days, I'm talking about some of the older applications this client had, where a new release meant a four- or five-page document of things you had to do manually before performing a release. Now we just press a release-to-production button and it is done. This is something they really value.
There's also a lot more trust in the environment — and also in our disaster recovery. Every six to 12 months, we even get a new subscription and actually deploy everything into it to see where it breaks. Usually, things will break, because this is like herding cats — there are a lot of moving parts all the time. But at least we know that in case of emergency, it will work for us.
Finally, visibility. We have an increased understanding of the workings of our application and infrastructure. We have found sometimes with Azure, if you create things through the portal, Azure will create things in the background and not tell you about them because it will simplify this for you.
But then you come and start wondering how does this work, or what is connecting this bit to this bit? By writing our infrastructure as code, we have found we understand a lot more about what all these different moving parts are — so that we, too, can stay on top of them.
That's it. My Twitter, @theEugeneRomero, my website damn.engineer. I usually write in there about Terraform, cloud, Linux, whatever catches my fancy at the moment. I do have some stickers. Come find me later, and I can give you one of those. That's it.