The Ops Community ⚙️

Daniele Polencic


Autoscaling Ingress controllers in Kubernetes

How do you deal with peaks of traffic in Kubernetes?

To autoscale the Ingress controller based on incoming requests, you need the following:

  1. Metrics (e.g. the requests per second).
  2. A metrics collector (to store the metrics).
  3. An autoscaler (to act on the data).

What you need to scale the ingress controller

Let's start with metrics.

The nginx-ingress controller can be configured to expose Prometheus metrics.

You can use the nginx_connections_active gauge to track the number of active client connections, which is a reasonable proxy for in-flight requests.

nginx_connections_active tracks the number of active connections
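To spot-check the value before Prometheus is in the picture, you can read the controller's metrics endpoint directly. Here is a minimal sketch of a helper that pulls one metric out of the Prometheus text format. The function name metric_value and the sample input are made up for illustration, and the helper assumes sample lines carry no trailing timestamp.

```shell
# metric_value NAME: read Prometheus text-format metrics on stdin and
# print the value of the first sample whose metric name equals NAME.
# Hypothetical helper; assumes sample lines carry no trailing timestamp.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                       # skip HELP and TYPE comments
    {
      i = index($1, "{")                # labels, if present, begin at the brace
      metric = (i > 0) ? substr($1, 1, i - 1) : $1
      if (metric == name) { print $NF; exit }
    }
  '
}

# Illustrative input only, not captured from a real controller.
sample='# TYPE nginx_connections_active gauge
nginx_connections_active{class="nginx"} 42'

printf '%s\n' "$sample" | metric_value nginx_connections_active   # prints 42
```

Against a live Pod, you would pipe the output of curl on the controller's metrics endpoint into the same function. Depending on the controller version, the exported name may carry an extra prefix, so check the raw output first.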

Next, you need a way to scrape the metrics.

As you've already guessed, you can install Prometheus to do so.

Since nginx-ingress advertises its metrics endpoint through Prometheus annotations on its Pods, I installed the plain Prometheus server rather than the Prometheus Operator.

```shell
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
$ helm install prometheus prometheus-community/prometheus
NAME: prometheus
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
```
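For reference, the annotations that the chart's default scrape configuration looks for are prometheus.io/scrape and prometheus.io/port (plus prometheus.io/path if the endpoint isn't /metrics). If your controller Pods don't carry them already, a patch along these lines would add them. This is only a sketch: the file name is arbitrary, and port 9113 is an assumption to check against your controller's metrics port.

```shell
# Write a strategic-merge patch that adds Prometheus scrape annotations
# to a Deployment's Pod template. Port 9113 is an assumption.
cat > prometheus-annotations.yaml <<'EOF'
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9113"
EOF

cat prometheus-annotations.yaml
```

You could then apply it with kubectl patch deployment NAME --patch-file prometheus-annotations.yaml, where NAME stands for your ingress controller's Deployment.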

I used Locust to generate some traffic to the Ingress to check that everything was running smoothly.

With the Prometheus dashboard open, I checked that the metrics increased as more traffic hit the controller.

Testing active connections
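If you'd rather check from a terminal than from the dashboard, the Prometheus HTTP API answers the same PromQL queries. Here is a minimal sketch, assuming the server is reachable on localhost:9090 (for example through kubectl port-forward svc/prometheus-server 9090:80, since the chart names the Service prometheus-server for a release called prometheus). The function name prom_query is made up for this example.

```shell
# prom_query BASE_URL PROMQL: run an instant query against the
# Prometheus HTTP API (/api/v1/query) and print the JSON response.
# Hypothetical helper name; curl URL-encodes the query string.
prom_query() {
  curl -sG "$1/api/v1/query" --data-urlencode "query=$2"
}

# Example call (needs a reachable Prometheus, so it is left commented out):
# prom_query http://localhost:9090 'sum(nginx_connections_active)'
```

Running the call a few times while Locust ramps up should show the value in the JSON response climbing.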

The last piece of the puzzle was the autoscaler.

I decided to go with KEDA because:

  1. It's an autoscaler with a metrics server (so I don't need to install two different tools).
  2. It's easier to configure than the Prometheus adapter.
  3. I can use the Horizontal Pod Autoscaler with PromQL.

How KEDA works

Once I installed KEDA, I only had to create a ScaledObject that points at the source of the metrics (Prometheus) and defines when to scale the Pods (with a PromQL query and a threshold).

Example of a ScaledObject for Prometheus scaler in KEDA
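In text form, a ScaledObject for this setup could look like the sketch below. Several values are assumptions: the Deployment name nginx-ingress, the replica bounds, the PromQL query, and the threshold of 100 are placeholders to adapt. The server address follows from installing the chart as prometheus in the default namespace.

```shell
# Write a KEDA ScaledObject that scales a Deployment on a PromQL query.
# Deployment name, replica bounds, query and threshold are placeholders.
cat > scaledobject.yaml <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scale
spec:
  scaleTargetRef:
    kind: Deployment
    name: nginx-ingress            # assumed Deployment name
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.default.svc.cluster.local
        query: sum(avg_over_time(nginx_connections_active[1m]))
        threshold: "100"           # target connections per replica
EOF

cat scaledobject.yaml
```

Apply it with kubectl apply -f scaledobject.yaml in the namespace where the controller runs.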

KEDA automatically creates the HPA for me.
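It helps to know roughly what that HPA does with the number. With KEDA's default AverageValue target, the replica count works out to the query result divided by the threshold, rounded up and clamped between the minimum and maximum. The sketch below reproduces that arithmetic only. It ignores the HPA's tolerance and stabilization behaviour, and the function name and sample numbers are made up.

```shell
# desired_replicas TOTAL THRESHOLD MIN MAX
# Computes ceil(TOTAL / THRESHOLD), clamped to [MIN, MAX]. Integers only.
desired_replicas() {
  total=$1; threshold=$2; min=$3; max=$4
  n=$(( (total + threshold - 1) / threshold ))   # integer ceiling
  if [ "$n" -lt "$min" ]; then n=$min; fi
  if [ "$n" -gt "$max" ]; then n=$max; fi
  echo "$n"
}

desired_replicas 250 100 1 20   # 250 connections at 100 per replica: prints 3
```

So with a threshold of 100, the fourth replica appears once the query returns more than 300.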

I repeated the tests with Locust and watched the replicas increase as more traffic hit the Nginx Ingress controller!

Scaling the Ingress Controllers with KEDA

Can this pattern be extended to any other app?

Can you autoscale all microservices on the number of requests received?

Unless they expose the metrics, the answer is no.

However, there's a workaround.

KEDA ships with an HTTP add-on to enable HTTP scaling.

How does it work?

The add-on runs an interceptor proxy in front of your app so that all the HTTP traffic is routed through it first.

Then it measures the number of requests and exposes the metrics.

With that data at hand, you can finally trigger the autoscaler.

KEDA HTTP add-on architecture
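To give an idea of the configuration, the add-on uses its own HTTPScaledObject resource. The sketch below leans heavily on assumptions: the field names follow recent add-on releases and have changed between versions, and the host, Deployment, Service, and port are all placeholders. Check the add-on's documentation for the version you install.

```shell
# Write an HTTPScaledObject for the KEDA HTTP add-on.
# Field names vary between add-on versions; all names are placeholders.
cat > httpscaledobject.yaml <<'EOF'
apiVersion: http.keda.sh/v1alpha1
kind: HTTPScaledObject
metadata:
  name: my-app
spec:
  hosts:
    - my-app.example.com           # placeholder host routed via the proxy
  scaleTargetRef:
    name: my-app                   # placeholder Deployment
    kind: Deployment
    apiVersion: apps/v1
    service: my-app                # placeholder Service in front of it
    port: 8080
  replicas:
    min: 1
    max: 10
EOF

cat httpscaledobject.yaml
```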

KEDA is not the only option, though.

You could install the Prometheus Adapter.

The metrics will flow from Nginx to Prometheus, and then the Adapter will make them available to Kubernetes.

From there, they are consumed by the Horizontal Pod Autoscaler.

Prometheus adapter architecture
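With the Adapter, the PromQL lives in a rule instead of a ScaledObject. Here is a sketch of Helm values for the prometheus-community/prometheus-adapter chart. Treat the details as assumptions: the rule presumes the scraped series carry namespace and pod labels, the one-minute window is arbitrary, and the Prometheus URL matches a release called prometheus in the default namespace.

```shell
# Write Helm values for the Prometheus Adapter with one custom rule that
# exposes nginx_connections_active as a per-Pod custom metric.
cat > adapter-values.yaml <<'EOF'
prometheus:
  url: http://prometheus-server.default.svc
  port: 80
rules:
  custom:
    - seriesQuery: 'nginx_connections_active{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      metricsQuery: 'sum(avg_over_time(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)'
EOF

cat adapter-values.yaml
```

You would pass the file to helm install with -f adapter-values.yaml and then point a Horizontal Pod Autoscaler at the resulting Pods metric.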

Is this better than KEDA?

They are similar, as both have to query and buffer metrics from Prometheus.

However, KEDA is pluggable and can read from many other event sources, whereas the Adapter works exclusively with Prometheus.

Similarity between KEDA & the Prometheus Adapter

Is there a competitor to KEDA?

A promising project called the Custom Pod Autoscaler aims to make the pod autoscaler pluggable.

However, the project focuses more on how those pods should be scaled (i.e. the algorithm) than on how the metrics are collected.

