Challenges of manual SSL certificate renewal in Kubernetes with Let's Encrypt
One of our partners runs a huge system on Kubernetes. We issued its SSL certificates with Let's Encrypt. These certificates are valid for 90 days, after which they must be renewed. They were attached to the frontend applications as Kubernetes Secrets. Because the certificates were issued manually, they could expire whenever the person responsible forgot to issue a new one.
Our Kubernetes Secrets are stored in GitLab, encrypted with SOPS. The repository contained the actual certificates, so if someone issued a new certificate and replaced the old one in the cluster Secret but forgot to do the same in the GitLab repository, the old certificate overwrote the new one the next time the Secrets were deployed from the repository. The websites then became unreliable and started throwing SSLHandshakeErrors. Obviously, this isn't what we want.
Simplifying certificate operations: achieving automation and flexibility in Kubernetes
We wanted to find a way to issue, manage, and renew these certificates automatically to avoid the problems mentioned above. Besides automation, we also wanted it to be easy to add new domains and have their certificates handled the same way.
The solution
The solution’s main component is Lego. It’s a Let’s Encrypt client and ACME library written in Go.
We run Lego as a Job (to create certificates) and as a CronJob (to renew them) in our Google GKE cluster, and we use Persistent Volumes to mount the certificates into the Pods that need them. To obtain certificates we go through our DNS provider; Lego supports a wide range of DNS providers. The workflow is as follows: when we need to manage a new certificate (regular or wildcard), we first define a Kubernetes Job that executes Lego with the run command. Lego solves a DNS-01 challenge using the specified provider and, if the challenge succeeds, creates the certificate. After that, we only need to define a Kubernetes CronJob with the renew command to check the certificate's expiry periodically.
The implementation
The implementation went as follows:
- First, we create a Persistent Volume and a Persistent Volume Claim so that Lego has a place to store the certificates.
- Next, we apply the PV and PVC with kubectl.
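The article does not show the volume manifests, so the following is only a minimal sketch of what the claim could look like. The claim name `lego-certificate-claim` comes from the Job in this article; the size, the access mode, and the file name in the comment are assumptions. The sketch also assumes the cluster's default StorageClass provisions the Persistent Volume dynamically; with static provisioning, a matching PersistentVolume manifest would be needed as well.

```yaml
# Sketch only: apply with something like `kubectl apply -f lego-pvc.yaml`
# (the file name is hypothetical).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lego-certificate-claim   # referenced by the Job and CronJob
spec:
  accessModes:
    - ReadWriteOnce              # assumption; see the note after this block
  resources:
    requests:
      storage: 1Gi               # assumption; certificates need very little space
```

With `ReadWriteOnce`, the volume can be mounted read-write by a single node only, so the Lego Pods and the Pods that read the certificates would have to run on the same node. If they are spread across nodes, a storage backend that supports `ReadWriteMany` is likely required.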
- Once these are in place, we can write our Job:
apiVersion: batch/v1
kind: Job
metadata:
  name: lego-cert-creation
spec:
  ttlSecondsAfterFinished: 120
  template:
    spec:
      containers:
        - name: lego-cert-creation
          image: goacme/lego:<LEGO_VERSION>
          env:
            - name: DNSMADEEASY_API_KEY
              valueFrom:
                secretKeyRef:
                  key: apiKey
                  name: dnsmadeeasy-secret
            - name: DNSMADEEASY_API_SECRET
              valueFrom:
                secretKeyRef:
                  key: secretKey
                  name: dnsmadeeasy-secret
          args:
            - --accept-tos
            - --domains=<DOMAIN>
            - --email=<EMAIL>
            - --path=/lego
            - --dns=dnsmadeeasy
            - run
          volumeMounts:
            - mountPath: /lego
              name: lego-certificate-claim-volume-reference
      restartPolicy: OnFailure
      volumes:
        - name: lego-certificate-claim-volume-reference
          persistentVolumeClaim:
            claimName: lego-certificate-claim
- ttlSecondsAfterFinished: The Job is deleted 120 seconds after it finishes.
- <LEGO_VERSION>: The Lego image version (tag) you want to use.
- env: In the env section we reference a Secret we deployed to the cluster. It contains the data Lego needs to communicate with the DNS provider. We use DNSMadeEasy; other providers may need different data to access their API.
- args:
  - --accept-tos: By setting this flag you indicate that you accept the current Let's Encrypt terms of service. (default: false)
  - --domains: The domain you want to issue the certificate for. To issue a certificate for multiple domains, add one --domains argument per domain. You can also request a wildcard certificate here.
  - --email: Email used for registration and as the recovery contact for Let's Encrypt.
  - --path: Directory to use for storing the data. (default: "./.lego") We set it to /lego because that is where the PV is mounted inside the container.
  - --dns: The DNS provider you want to use.
  - run: Register an account, then create and install a certificate.
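The Secret referenced in the env section is not shown in the article. A minimal sketch could look like this, assuming only the names the Job already uses (`dnsmadeeasy-secret`, `apiKey`, `secretKey`); the placeholder values are hypothetical and must be replaced with real credentials.

```yaml
# Sketch only: the Secret that the Job's secretKeyRef entries point to.
apiVersion: v1
kind: Secret
metadata:
  name: dnsmadeeasy-secret
type: Opaque
stringData:                                  # plain text here; Kubernetes stores it base64-encoded
  apiKey: "<DNSMADEEASY_API_KEY>"            # placeholder, not a real value
  secretKey: "<DNSMADEEASY_API_SECRET>"      # placeholder, not a real value
```

Since the Secrets in this setup live in a Git repository, a manifest like this would be committed only in encrypted form (for example with SOPS, as described earlier), never in plain text.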
- The CronJob that renews the certificate looks very similar to the Job:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: lego-cert-renewal-cronjob
spec:
  schedule: "1 7 * * 1"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: lego-renewal
              image: goacme/lego:<LEGO_VERSION>
              env:
                - name: DNSMADEEASY_API_KEY
                  valueFrom:
                    secretKeyRef:
                      key: apiKey
                      name: dnsmadeeasy-secret
                - name: DNSMADEEASY_API_SECRET
                  valueFrom:
                    secretKeyRef:
                      key: secretKey
                      name: dnsmadeeasy-secret
              args:
                - --accept-tos
                - --domains=<DOMAIN>
                - --email=<EMAIL>
                - --path=/lego
                - --dns=dnsmadeeasy
                - renew
              volumeMounts:
                - mountPath: /lego
                  name: lego-certificate-claim-volume-reference
          restartPolicy: OnFailure
          volumes:
            - name: lego-certificate-claim-volume-reference
              persistentVolumeClaim:
                claimName: lego-certificate-claim
- schedule: A cron schedule. We set the job to run every Monday at 7:01 AM.
- renew (in args): Renews a certificate. Let's Encrypt certificates are valid for 90 days; Lego renews a certificate only if it expires in less than 30 days.
- After applying the CronJob as well, we must make sure that our services use these certificates. One note from testing: when I used the sandbox API key and secret, it did not work, so if you are implementing this solution, you should use the real credentials.
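How the services pick up the certificates is not shown in the article. One possible approach, sketched below, is to mount the same claim read-only into the frontend Pods. The Deployment name, labels, replica count, and `<FRONTEND_IMAGE>` are hypothetical; only the claim name and volume name come from the manifests above.

```yaml
# Sketch only: a hypothetical frontend Deployment reading the certificates Lego wrote.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                      # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: <FRONTEND_IMAGE>     # placeholder
          volumeMounts:
            - mountPath: /lego        # same path the Lego Job writes to
              name: lego-certificate-claim-volume-reference
              readOnly: true          # the frontend only needs to read the files
      volumes:
        - name: lego-certificate-claim-volume-reference
          persistentVolumeClaim:
            claimName: lego-certificate-claim
            readOnly: true
```

With `--path=/lego`, Lego should place the issued files in a `certificates` subdirectory of `/lego`, so the web server configuration would point there. A server that is already running will probably need a reload or restart to pick up a renewed certificate.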
The conclusion
With this solution, we can create and maintain SSL certificates automatically, so expired certificates should no longer cause problems. Lego is very easy to use and supports a wide range of DNS providers.