After upgrading an EKS cluster from Kubernetes 1.19 to 1.20, we found that the creation and renewal of SSL certificates had silently stopped working with cert-manager v0.9.1 (because Kubernetes 1.20 stopped propagating SelfLink). This situation is well described here.
We hesitated between upgrading cert-manager one version at a time up to the latest one, or uninstalling it completely and reinstalling the latest version. The second option worked well in the end, without any downtime for HTTPS traffic, but it took a lot of testing and trial and error on our end.
I decided to share the outcome here to help you if you are in this exact situation. Here is the recipe (it assumes that you used kubectl to set up cert-manager in the first place, not Helm):
- Make sure the Secrets used by your secured Ingresses are not in the cert-manager namespace:
kubectl get secrets -A
- Back up your existing Let's Encrypt private key, which is stored in a Secret (normally in the cert-manager namespace)
- Delete the existing cert-manager manifest:
kubectl delete -f cert-manager.yaml
(if you used Helm for the setup, then use Helm to uninstall it)
- Delete the existing cert-manager namespace:
kubectl delete ns cert-manager
(if it gets stuck in the "Terminating" state, then check this)
Make sure you do not have any remaining cert-manager resources by running this command; the output should be
(NotFound): Unable to list
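The first two steps of the recipe (checking where your Ingress TLS Secrets live and backing up the Let's Encrypt key) can be scripted. This is a minimal sketch rather than the exact commands we ran: the function names are hypothetical, and it checks only the Secrets referenced in the `tls` section of your Ingresses, which is narrower than reading the full `kubectl get secrets -A` output.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical helpers, not part of cert-manager.

# Print "namespace<TAB>secretName(s)" for every Ingress in the cluster.
# The second field is empty when the Ingress has no TLS section.
list_ingress_tls_secrets() {
  kubectl get ingress -A \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.spec.tls[*].secretName}{"\n"}{end}'
}

# Keep only the rows whose Ingress (and therefore its TLS Secret) lives in
# the cert-manager namespace. An empty result means no Ingress TLS Secret
# will disappear when that namespace is deleted.
tls_secrets_in_cert_manager_ns() {
  list_ingress_tls_secrets | awk -F '\t' '$1 == "cert-manager" && $2 != ""'
}

# Dump one Secret to a local YAML file.
# $1 = Secret name, $2 = namespace, $3 = output file
backup_secret() {
  kubectl get secret "$1" -n "$2" -o yaml > "$3"
}
```

For example, `backup_secret your-letsencrypt-private-key cert-manager letsencrypt-key-backup.yaml` (the file name is an assumption). Depending on your kubectl version, you may need to remove server-set fields such as `resourceVersion` and `uid` from the dump before applying it again.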
Prepare your existing Ingresses by adding an annotation with the right value to each of them. You can either use
acme.cert-manager.io/http01-ingress-class (if that is enough for your Ingress Controller to pick up and expose the solver Ingress that will be created with this IngressClass) or
In our case the first one was not enough: our Ingress Controller only picks up Ingress resources that carry the annotation
kubernetes.io/ingress.class containing the value it was given at startup.
So we created one ClusterIssuer per Ingress Controller in our cluster: with the ingressTemplate spec you can easily add annotations to the solver Ingress that cert-manager generates. That way your Ingress Controller will pick it up and expose it, the ACME challenge will be solved and your SSL certificate will be issued.
It goes like this, in your Ingress:
annotations:
  kubernetes.io/ingress.class: "your-class"
  cert-manager.io/cluster-issuer: "letsencrypt-your-class"
  kubernetes.io/tls-acme: "true"
And your ClusterIssuer:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-your-class
  namespace: "cert-manager"
spec:
  acme:
    email: firstname.lastname@example.org
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: your-letsencrypt-private-key
    solvers:
    - http01:
        ingress:
          ingressTemplate:
            metadata:
              annotations:
                kubernetes.io/ingress.class: "your-class"
Apply the change to all your existing Ingress resources.
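If you run several Ingress Controllers, writing one ClusterIssuer per class by hand gets repetitive. Here is a hedged sketch of how the manifest and the Ingress annotations shown above could be generated from the class name. The function names are hypothetical, and `kubectl annotate` is used as an alternative to editing each Ingress YAML by hand, which is what we actually did.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Print a ClusterIssuer manifest for one Ingress class.
# ClusterIssuer is cluster-scoped, so metadata.namespace is left out here.
# $1 = Ingress class, $2 = contact e-mail, $3 = name of the account key Secret
render_cluster_issuer() {
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-$1
spec:
  acme:
    email: $2
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: $3
    solvers:
    - http01:
        ingress:
          ingressTemplate:
            metadata:
              annotations:
                kubernetes.io/ingress.class: "$1"
EOF
}

# Point one existing Ingress at the issuer that matches its class.
# $1 = namespace, $2 = Ingress name, $3 = Ingress class
annotate_ingress() {
  kubectl annotate ingress "$2" -n "$1" --overwrite \
    "cert-manager.io/cluster-issuer=letsencrypt-$3" \
    "kubernetes.io/tls-acme=true"
}
```

You can save the output of `render_cluster_issuer your-class firstname.lastname@example.org your-letsencrypt-private-key` to a file and keep it for the ClusterIssuer step later in the recipe, once cert-manager is reinstalled.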
- Get the cert-manager manifest following this link
- Edit the file and replace every mention of "kube-system" with "cert-manager" if you are sure (99% yes) that you will only run one cert-manager configuration across your cluster; that will spare you this error message:
Internal error occurred: failed calling webhook "webhook.cert-manager.io": ... x509: certificate signed by unknown authority
- Apply this modified file
- Apply the Secret holding the Let's Encrypt private key you backed up above, in the cert-manager namespace
- Apply the ClusterIssuer(s) you prepared above, in the cert-manager namespace
- Check the logs of the 3 pods in the cert-manager namespace to be sure that they all started correctly. If you see this message for the cert-manager pod, it is pretty much harmless
cert-manager will now sync your annotated Ingresses and existing Secrets by creating new Certificates linking them.
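The manifest edit and the startup check from the installation steps above can also be scripted. This is a sketch under assumptions: the helper names are hypothetical, and waiting on the `Available` condition of the Deployments is an extra check, not a replacement for reading the pod logs.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Write a copy of the manifest with every "kube-system" replaced by "cert-manager".
# $1 = downloaded manifest, $2 = patched output file
patch_manifest_namespace() {
  sed 's/kube-system/cert-manager/g' "$1" > "$2"
}

# Block until every Deployment in the cert-manager namespace reports Available,
# or fail after the timeout.
wait_for_cert_manager() {
  kubectl wait --for=condition=Available deployment --all \
    -n cert-manager --timeout=180s
}
```

A possible sequence is `patch_manifest_namespace cert-manager.yaml cert-manager.patched.yaml`, then `kubectl apply -f cert-manager.patched.yaml`, then `wait_for_cert_manager`. Review the patched file before applying it, since the replacement is purely textual.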
Follow the related Challenges that get created:
kubectl get challenges -A
There should be a lot of them at the beginning, but the number should decrease; otherwise the setup above is not correct. If you fixed something, you can force a cleanup of the Challenges by running
kubectl delete --all challenges --namespace=your-namespace
At the end you should have no remaining Challenges:
kubectl get challenges -A
and all Certificates should be in the Ready state:
kubectl get certificates -A | grep "alse"
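Instead of re-running these two commands by hand, the check can be polled. This is a minimal sketch with hypothetical function names. It assumes the default column layout of `kubectl get certificates -A` (NAMESPACE, NAME, READY, ...), which may differ between cert-manager versions.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Number of Challenge resources still present across all namespaces.
count_challenges() {
  kubectl get challenges -A --no-headers 2>/dev/null |
    awk 'NF { n++ } END { print n + 0 }'
}

# Certificates whose READY column (assumed to be the 3rd) is not "True",
# printed as namespace/name.
not_ready_certificates() {
  kubectl get certificates -A --no-headers 2>/dev/null |
    awk '$3 != "True" { print $1 "/" $2 }'
}

# Poll until no Challenge is left or the attempts run out.
# $1 = max attempts (default 30), $2 = seconds between attempts (default 10)
wait_for_challenges() {
  attempts=${1:-30}
  delay=${2:-10}
  while [ "$attempts" -gt 0 ]; do
    remaining=$(count_challenges)
    [ "$remaining" -eq 0 ] && return 0
    echo "$remaining challenge(s) still pending"
    sleep "$delay"
    attempts=$((attempts - 1))
  done
  return 1
}
```

Running `wait_for_challenges && not_ready_certificates` should print nothing once everything has been issued.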
You should now be all set and up to date with cert-manager! By the way, your previously generated SSL certificates are mostly still in use: they just carry more annotations if you describe them (you will see the legacy certmanager.k8s.io/* annotations and the new cert-manager.io/* ones), but that is OK.
I hope this article helps you in some way. If you have any remarks or questions, please do not hesitate to leave a comment. I wish you a great day!