How do you deal with peaks of traffic in Kubernetes?
To autoscale the Ingress controller based on incoming requests, you need the following:
- Metrics (e.g. the requests per second).
- A metrics collector (to store the metrics).
- An autoscaler (to act on the data).
Let's start with metrics.
The NGINX Ingress controller can be configured to expose Prometheus metrics.
You can use the nginx_connections_active metric
to count the number of active connections, a good proxy for in-flight requests.
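As a minimal sketch, assuming you installed the controller with the community ingress-nginx Helm chart, a values snippet like this enables the metrics endpoint (10254 is the controller's default metrics port; adapt the names to your install):

# values.yaml for the ingress-nginx chart (names assumed)
controller:
  metrics:
    enabled: true                  # expose /metrics on port 10254
  podAnnotations:
    prometheus.io/scrape: "true"   # let annotation-based discovery find the Pod
    prometheus.io/port: "10254"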
Next, you need a way to scrape the metrics.
As you've already guessed, you can install Prometheus to do so.
Since the NGINX Ingress controller advertises its metrics endpoint with prometheus.io annotations, I installed the plain Prometheus server rather than the Prometheus Operator (the Operator discovers targets via ServiceMonitors, not annotations).
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
$ helm install prometheus prometheus-community/prometheus
NAME: prometheus
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
I used Locust to generate some traffic to the Ingress to check that everything was running smoothly.
With the Prometheus dashboard open, I checked that the metrics increased as more traffic hit the controller.
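For example, a query like this in the Prometheus UI should climb while Locust runs (metric name as exposed by the controller):

sum(nginx_connections_active)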
The last piece of the puzzle was the autoscaler.
I decided to go with KEDA because:
- It's an autoscaler with a metrics server (so I don't need to install 2 different tools).
- It's easier to configure than the Prometheus adapter.
- I can use the Horizontal Pod Autoscaler with PromQL.
Once I installed KEDA, I only had to create a ScaledObject, configure the source of the metrics (Prometheus), and scale the Pods (with a PromQL query).
KEDA automatically creates the HPA for me.
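Here's a minimal ScaledObject sketch; the Deployment name is hypothetical, and the Prometheus address matches the in-cluster Service created by the Helm install above (tune the threshold to your workload):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-ingress-scaler
spec:
  scaleTargetRef:
    name: nginx-ingress-controller             # hypothetical Deployment name
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.default.svc.cluster.local
        query: sum(nginx_connections_active)   # scale on total active connections
        threshold: "100"                       # target value per replica

KEDA divides the query result by the threshold to work out the desired replica count.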
I repeated the tests with Locust and watched the replicas increase as more traffic hit the Nginx Ingress controller!
Can this pattern be extended to any other app?
Can you autoscale all microservices on the number of requests received?
Unless the apps expose request metrics themselves, the answer is no.
However, there's a workaround.
KEDA ships with an HTTP add-on to enable HTTP scaling.
How does it work?
KEDA injects a sidecar proxy into your Pod so that all HTTP traffic is routed through it first.
The proxy measures the number of requests and exposes the metric.
With that data at hand, KEDA can finally trigger the autoscaler.
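As an illustration, with the add-on installed you create an HTTPScaledObject instead of a ScaledObject; all names here are hypothetical, and the exact fields vary between add-on versions:

kind: HTTPScaledObject
apiVersion: http.keda.sh/v1alpha1
metadata:
  name: myapp
spec:
  host: myapp.example.com    # hypothetical host whose traffic the proxy intercepts
  scaleTargetRef:
    deployment: myapp        # hypothetical Deployment behind the proxy
    service: myapp-service
    port: 8080
  replicas:
    min: 0                   # the add-on can scale to zero
    max: 10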
KEDA is not the only option, though.
You could install the Prometheus Adapter.
The metrics flow from Nginx to Prometheus, and the Adapter then makes them available to Kubernetes.
From there, the Horizontal Pod Autoscaler consumes them.
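As a rough sketch, the Adapter is configured with rules that translate Prometheus series into Kubernetes metrics, and an HPA then references the resulting metric (all names illustrative; the rule lives in the Adapter's config, while the HPA is a regular Kubernetes resource):

# Rule in the Prometheus Adapter's configuration
rules:
  - seriesQuery: 'nginx_connections_active'
    resources:
      overrides:
        namespace: { resource: "namespace" }
        pod: { resource: "pod" }
    name:
      matches: "nginx_connections_active"
      as: "nginx_connections_active"
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'

# HPA consuming the metric exposed by the Adapter
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-ingress
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-ingress-controller   # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: nginx_connections_active
        target:
          type: AverageValue
          averageValue: "100"        # target active connections per replica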
Is this better than KEDA?
They are similar, as both have to query and buffer metrics from Prometheus.
However, KEDA is pluggable, whereas the Adapter works exclusively with Prometheus.
Is there a competitor to KEDA?
A promising project called the Custom Pod Autoscaler aims to make the Pod autoscaler pluggable.
However, it focuses more on how Pods should be scaled (i.e. the scaling algorithm) than on metrics collection.
During my research, I found these links helpful:
- https://keda.sh/docs/2.10/scalers/prometheus/
- https://sysdig.com/blog/kubernetes-hpa-prometheus/
- https://github.com/nginxinc/nginx-prometheus-exporter#exported-metrics
- https://learnk8s.io/scaling-celery-rabbitmq-kubernetes