Securing Kubernetes
Hi, y'all! It's been way too long since I last wrote anything. I could list a bunch of reasons for the long silence (among other things, I've found myself painting miniatures again, which is awesome!), but that's probably not interesting to anybody, so let's get down to business, shall we?
On this occasion, we'll be looking into a pretty juicy topic: marrying HashiCorp's Vault with a Kubernetes cluster for the purpose of serving (dynamic) secrets from Vault. You can follow along with my templates if you wish. Source code here.
Disclaimer
I'm not going to teach you how to use the different tools I've used to set up the experiment below. That would be tedious to read and to write.
So what is Vault?
So, in short, it's a one-stop shop for all your secrets-handling needs. For example, just imagine the hassle of renewing SSL certificates for an internal load balancer in AWS. The private CA service that ACM offers is great, but it comes at a cost of $400 a month (last I checked), which is quite expensive.
With Vault, you get a single API to renew, fetch, and revoke your certificates, so no more openssl <try to remember the commands you use once a year>. Now you can easily write a small program (read: a Lambda handler) to renew your certificates for a fraction of the cost of an ACM-provisioned internal CA.
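As a rough illustration of how simple that renewal can be, here's a minimal sketch of issuing a certificate with the Vault CLI. The mount path, role name, and domain are my own assumptions, not part of the original setup:

```bash
# Issue a short-lived certificate from a (hypothetical) PKI secrets engine
# mounted at "pki", using a role named "internal-lb".
vault write pki/issue/internal-lb \
    common_name="lb.internal.example.com" \
    ttl="720h"
```

The response contains the certificate, private key, and issuing CA, so a scheduled Lambda could push those straight to the load balancer.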
Today, though, we're interested in injecting key-value pairs into containers being spun up on a Kubernetes cluster. Normally, people use the Kubernetes Secrets API to store and serve secrets as environment variables. This approach is completely fine, but if you take this route (and most people do), you really should take care of encrypting etcd's disks and limiting access to the Secrets API.
Using Vault to provision and store your secrets, however, is more operator-friendly and especially more developer-friendly, IMO. "Why?" you might ask. Well, firstly, secrets are only ever stored encrypted, even if the underlying storage is not. Vault makes sure that nothing leaves its security boundary unencrypted, and it takes care of decryption when secrets are read back out. Another thing is easy access management: setting up fine-grained access policies is really simple. For example, here's the Kubernetes service account policy I made for this exercise:
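The exact policy isn't reproduced here, but a minimal sketch of what such a Vault policy might look like is below. The secret path and capabilities are assumptions based on the demo app, not the literal policy from the repo:

```hcl
# Allow the Kubernetes service account to read the demo app's secrets only.
# Assumes a KV v2 engine mounted at "secret/" and an app called "hyperspace".
path "secret/data/hyperspace/*" {
  capabilities = ["read", "list"]
}
```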
Easy!
For developers, Vault offers a single API endpoint for all of their secrets needs, including PKI certificates, cloud credentials, etc. And the best part? Everything is auditable, TTL-stamped, and operator-controlled.
So let’s get cracking to see Vault in action.
Setting the environment
Now we'll set up highly available Kubernetes and Vault clusters in AWS. You can follow along, but note that I'm using a registered public domain for ease of use. You'll need to either fork the repo and adjust some parameters or get yourself a public domain to follow along.
Prerequisites
Kubectl.
Helm.
Ansible.
Terraform.
Packer.
Kops.
Vault CLI (optional).
AWS CLI (optional).
Pre-create SSH keys and output them to the appropriate directories, for example:
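Something along these lines should do; the key name and target directory are guesses, since the exact paths depend on the repo layout:

```bash
# Generate a key pair for the Vault nodes into a (hypothetical) keys/ directory
mkdir -p keys
ssh-keygen -t rsa -b 4096 -f keys/vault-nodes -N ""
```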
Do the same for the other key pair the templates expect.
Topology
Step-by-step
1. Bootstrap the Terraform state
This will provision an S3 bucket and a DynamoDB table for Terraform remote state handling.
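A minimal sketch of the bootstrap run, assuming the state-bootstrapping templates live in their own directory (the directory name is a guess):

```bash
cd bootstrap
terraform init
terraform apply
```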
2. Set up basic networking
For this step, you'd best comment out vault.tf and kubernetes.tf for now; the latter will be rewritten soon.
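Then apply the networking templates; a plain run from the repo's Terraform directory should be enough:

```bash
terraform init
terraform apply
```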
If everything went okay, you should see subnet IDs and a VPC ID in the output, something like:
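The exact names depend on the templates, and the values below are illustrative only, but the shape of the output should be roughly:

```
Outputs:

utility_subnet_ids = [
  "subnet-0aaa...",
  "subnet-0bbb...",
  "subnet-0ccc...",
]
vpc_id = "vpc-0123..."
```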
3. Packer build the Vault AMI
Next, let's create the AMI that will be used for the Vault instances. Copy-paste one of the utility subnet IDs into the Packer template:
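That is, into the builder's subnet_id field. Here's a trimmed fragment of what that part of a Packer JSON template looks like; the region, instance type, and other values are placeholders, and only subnet_id is the point:

```json
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "eu-west-1",
      "subnet_id": "subnet-0aaa...",
      "instance_type": "t3.micro",
      "ami_name": "vault-{{timestamp}}"
    }
  ]
}
```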
And then, run:
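Assuming the template file is called vault.json (the name is a guess):

```bash
packer build vault.json
```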
Take note of the built AMI ID, as we will need it soon. Now, let's provision the Kubernetes cluster. I'll be using Kops and a ready-made bash script I wrote that incorporates output variables from Terraform.
4. Provision stuff
The script automatically exports a kubernetes.tf template and a data directory with ASG launch configurations and AWS roles and policies. You'll want to comment out at least the provider block and the vpc_id output block, as they overlap with what we already have; personally, I also commented out all the network-related outputs because they're redundant and already being output elsewhere.
Now, if you run:
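That is, another apply against the now-complete set of templates (treat this as a sketch; the exact invocation depends on the repo):

```bash
terraform plan
terraform apply
```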
You should have a functional Kubernetes cluster with three masters and three minions divided into three AZs.
Just as a side note: you don't need to provision your environment the way I have. Using a managed service like EKS is probably the more practical choice, since we're not really interested in setting up the environment, only in using Vault with Kubernetes.
That said, I personally like to emulate a production-ready setup as closely as I can. You don't step into (all) the pitfalls if you play it safe, and "sweeping the minefield" at this point saves a lot of trouble down the road.
Just a couple more steps to get everything up and running. Next, we provision Vault. Uncomment the vault.tf template if you haven't already, input the AMI ID from the previous step, and modify the allowed SSH CIDR to the correct address of the bastion node. Then run:
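Again, a plain apply should do it:

```bash
terraform apply
```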
5. Configure Vault and Kubernetes
First, provision a couple of "helper" resources to Kubernetes:
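Assuming the helper manifests live in a directory of their own (the path here is a guess), applying them is a one-liner:

```bash
kubectl apply -f k8s/
```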
The vault-handler service account is of particular interest here: we need it to authenticate with Vault. Now, to actually connect to the Vault UI, which is accessible only internally inside our VPC, we will install an OpenVPN server into our cluster. For this, you'll need Helm and the OpenVPN chart. Remember to go through the default settings; I personally used the custom-value.yaml.
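The install itself is a standard chart install. With Helm 3 syntax and the (since deprecated) stable/openvpn chart it would look roughly like this; the release name is a guess:

```bash
helm repo add stable https://charts.helm.sh/stable
helm install openvpn stable/openvpn -f custom-value.yaml
```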
Then fetch the SA token and cert like this:
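On pre-1.24 Kubernetes, the service account's token secret is created automatically, so something along these lines should work; the output file names are my own choice:

```bash
# Name of the secret that holds the vault-handler SA's token
SECRET_NAME=$(kubectl get sa vault-handler -o jsonpath='{.secrets[0].name}')

# The JWT that Vault's Kubernetes auth method will validate
kubectl get secret "$SECRET_NAME" -o jsonpath='{.data.token}' | base64 --decode > sa-token

# The cluster CA certificate Vault needs in order to trust the API server
kubectl get secret "$SECRET_NAME" -o jsonpath='{.data.ca\.crt}' | base64 --decode > ca.crt
```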
Modify the correct IPs for the Vault nodes in ssh.config and Ansible's inventory file. SSH into a Vault node and initialize your Vault cluster with vault operator init. This is the only time Vault outputs the unseal keys, so take good care of them! Ideally, you should distribute the keys among five different people, but for the purposes of this exercise, I'll copy-paste the output into Ansible Vault. As you probably figured out already, we will be using Ansible for the SSH work (this is what I generally like to do), because all manual labor is evil.
Add the unseal keys, service account token, and root CA certificate to vault-nodes.yml as follows:
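The variable names below are my own guesses (the real ones depend on the playbook), but the shape of the file would be something like:

```yaml
# vault-nodes.yml -- values are redacted / illustrative
unseal_keys:
  - "unseal-key-1"
  - "unseal-key-2"
  - "unseal-key-3"
k8s_sa_token: "<vault-handler service account JWT>"
k8s_ca_cert: |
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
```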
And then run the playbook:
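Roughly like this; the inventory path and playbook name are assumptions, and --ask-vault-pass assumes the vars file is encrypted with Ansible Vault:

```bash
ansible-playbook -i inventory site.yml --ask-vault-pass
```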
The final step is to add a DNS record to Route 53 for Vault and voilà, we've got a live (production-ready) Vault up and running.
The absolute last thing to do is to open access for Vault to talk with Kubernetes. For that, I'm adding an AWS Security Group rule block like the one below to the kubernetes.tf template (anywhere in the template is fine).
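A sketch of such a rule; the security group references are assumptions (Kops names the master security group after the cluster), and port 443 assumes the API server listens there:

```hcl
resource "aws_security_group_rule" "vault_to_kube_api" {
  description              = "Allow Vault nodes to reach the Kubernetes API"
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.masters.id # hypothetical reference
  source_security_group_id = aws_security_group.vault.id   # hypothetical reference
}
```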
Using secrets from Vault
So now we're finally at a point where we can see some Vault magic. I've got a demo application ready-made if you don't feel like using your own. There are a couple of points to note. Firstly, I'm adding an init container to the Kubernetes deployment manifest; it fetches the application secrets from Vault and writes them to a volume that the main container also mounts.
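A trimmed-down sketch of what that looks like in the manifest; the names, images, and mount path are assumptions rather than the demo app's exact values:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hyperspace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hyperspace
  template:
    metadata:
      labels:
        app: hyperspace
    spec:
      serviceAccountName: vault-handler
      initContainers:
        - name: vault-init
          image: vault-init:latest   # hypothetical image with curl/jq
          volumeMounts:
            - name: secrets
              mountPath: /secrets
      containers:
        - name: app
          image: hyperspace:latest   # the demo app
          volumeMounts:
            - name: secrets
              mountPath: /secrets
      volumes:
        - name: secrets
          emptyDir: {}
```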
So I fetch the secrets as a JSON dump and parse it into an environment variable file that I then source in the Dockerfile:
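The exact script isn't reproduced here, but the idea maps to something like the sketch below. The Vault address, role name, and secret path are assumptions, and the flow assumes the Kubernetes auth method plus a KV v2 engine:

```bash
#!/bin/sh
set -e

# Log in to Vault with the pod's service account token (Kubernetes auth method)
SA_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
VAULT_TOKEN=$(curl -s --request POST \
  --data "{\"jwt\": \"${SA_TOKEN}\", \"role\": \"hyperspace\"}" \
  https://vault.example.com:8200/v1/auth/kubernetes/login | jq -r '.auth.client_token')

# Dump the secret as JSON and turn it into a sourceable env file
curl -s --header "X-Vault-Token: ${VAULT_TOKEN}" \
  https://vault.example.com:8200/v1/secret/data/hyperspace \
  | jq -r '.data.data | to_entries[] | "export \(.key)=\"\(.value)\""' > /secrets/app.env
```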
Logs should show the current values:
Now go and change the secret values of said environment variables in Vault.
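For a KV v2 engine that's a simple kv put; the path and key below are placeholders from my sketch above, not the demo app's actual values:

```bash
vault kv put secret/hyperspace GREETING="hello again, but different"
```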
And kick the pod with kubectl delete pods hyperspace-56fd89dbbd. Wait for the new pod to come to life and check the logs.
Yay! It worked!
Before you go, please remember to delete everything with terraform destroy -auto-approve :)
TL;DR
Kubernetes Secrets is a mediocre solution at best; Vault is better, but considerably more complex to set up. Choosing the best option is always a case-by-case call.
If you need help with AWS, Kubernetes, or Vault, or are just interested in anything CNCF has to offer, don't hesitate to contact your friendly neighborhood DevOps company!