The Ops Community ⚙️

Arseny Zinchenko


Kubernetes: tracing requests with AWS X-Ray, and Grafana data source

Tracing lets you follow a request as it passes between components. With AWS and Kubernetes, for example, we can trace the entire path of a request from the AWS Load Balancer to a Kubernetes Pod and on to DynamoDB or RDS.

This helps us both to track performance issues — where and which requests are taking a long time to execute — and to have additional information when problems arise, for example, when our API returns 500 errors to clients, and we need to find out which component of the system is causing the problem.

AWS has a tracing service called X-Ray, to which we can send data using the AWS X-Ray SDK for Python or the AWS Distro for OpenTelemetry Python (other languages are supported too, but we’ll stick to Python here).

AWS X-Ray adds a unique X-Ray ID to each request and allows you to build a picture of the full “route” of the request.

Also, in Kubernetes we can trace with tools like Jaeger or Zipkin, and then build the picture in Grafana Tempo.

Another way is to use the X-Ray Daemon, which we can run in Kubernetes, and add the X-Ray plugin to Grafana. See Introducing the AWS X-Ray integration with Grafana for examples.

AWS Distro for OpenTelemetry also works with AWS X-Ray-compliant Trace IDs: see AWS Distro for OpenTelemetry and AWS X-Ray and Collecting traces from EKS with ADOT.

Today, however, we will deploy the X-Ray daemon as a Kubernetes DaemonSet, together with a Kubernetes Service to which Kubernetes Pods can send trace data. We can then view that data either in the AWS X-Ray Console or in Grafana.


IAM Policy

To access the AWS API from the X-Ray daemon Pods, we need to create an IAM Role, which we will then use in the ServiceAccount for X-Ray.

We still use the old way of adding IAM Role via ServiceAccounts, see Kubernetes: ServiceAccount from AWS IAM Role for Kubernetes Pod, although AWS recently announced the Amazon EKS Pod Identity Agent add-on — see AWS: EKS Pod Identities — a replacement for IRSA? Simplifying IAM access management.

So, create an IAM Policy with permissions to write to X-Ray:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [...],
            "Resource": [...]
        }
    ]
}
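As a rough sketch of what such a write-only policy document could look like, the snippet below builds one in Python and prints it as JSON. The action list is an assumption: it mirrors what the AWS managed policy AWSXRayDaemonWriteAccess grants, and may differ from the list used in this setup. The function name `build_xray_write_policy` is hypothetical.

```python
import json

# Assumption: these actions mirror the AWS managed policy
# AWSXRayDaemonWriteAccess; they are not taken from the article itself.
XRAY_WRITE_ACTIONS = [
    "xray:PutTraceSegments",
    "xray:PutTelemetryRecords",
    "xray:GetSamplingRules",
    "xray:GetSamplingTargets",
    "xray:GetSamplingStatisticSummaries",
]


def build_xray_write_policy(actions=XRAY_WRITE_ACTIONS, resources=("*",)):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": list(resources),
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be pasted into the IAM console.
    print(json.dumps(build_xray_write_policy(), indent=4))
```

The printed JSON can be pasted into the policy editor; narrow the actions or resources if your setup needs less.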

Save it:

IAM Role

Next, add an IAM Role that the Kubernetes ServiceAccount can use.

Find the Identity provider of our EKS cluster:

Go to the IAM Roles, add a new role.

In the Trusted entity type, select Web Identity, and in Web identity select the Identity provider of our EKS, and in the Audience field — set the AWS STS endpoint:
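For reference, the trust relationship produced by these console choices can be sketched roughly as follows. The account ID and OIDC provider in the usage lines are placeholders, and `build_irsa_trust_policy` is a hypothetical helper; only the `sts:AssumeRoleWithWebIdentity` action and the `sts.amazonaws.com` audience correspond to the settings described above.

```python
import json


def build_irsa_trust_policy(account_id, oidc_provider, audience="sts.amazonaws.com"):
    """Build a trust policy for a role assumed through an EKS OIDC provider.

    oidc_provider is the issuer host and path without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    # The audience selected in the console ends up in this condition.
                    "StringEquals": {f"{oidc_provider}:aud": audience}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values, not the real account or cluster.
    example = build_irsa_trust_policy(
        "111122223333", "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"
    )
    print(json.dumps(example, indent=4))
```

A condition on the `:sub` claim could additionally restrict the role to one ServiceAccount; that is left out here to match the steps above.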

Attach the IAM Policy created above:

Save it:

Running X-Ray Daemon in Kubernetes

Let’s use the okgolove/aws-xray Helm chart.

Create an x-ray-values.yaml file (see the chart’s values.yaml for the defaults):

serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::492***148:role/XRayAccessRole-test
xray:
  region: us-east-1
  loglevel: prod

Add a repository:

$ helm repo add okgolove

Install the chart into the cluster. This will create a DaemonSet and a Service:

$ helm -n ops-monitoring-ns install aws-xray okgolove/aws-xray -f x-ray-values.yaml

Check the Pods:

$ kk get pod -l
aws-xray-5n2kt 0/1 Pending 0 41s
aws-xray-6cwwf 1/1 Running 0 41s
aws-xray-7dk67 1/1 Running 0 41s
aws-xray-cq7xc 1/1 Running 0 41s
aws-xray-cs54v 1/1 Running 0 41s
aws-xray-mjxlm 0/1 Pending 0 41s
aws-xray-rzcsz 1/1 Running 0 41s
aws-xray-x5kb4 1/1 Running 0 41s
aws-xray-xm9fk 1/1 Running 0 41s

And Kubernetes Service:

$ kk get svc -l
aws-xray ClusterIP None <none> 2000/UDP,2000/TCP 77s
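The Service exposes port 2000 over UDP because the daemon receives segments as UDP datagrams: a small JSON header line followed by the segment document. The sketch below, using only the standard library, shows roughly what the SDK does under the hood. The helper names (`new_trace_id`, `build_datagram`, `send_segment`) are hypothetical, and the segment contains only a minimal set of fields.

```python
import json
import os
import socket
import time

# Header line the X-Ray daemon expects before each segment document.
DAEMON_HEADER = '{"format": "json", "version": 1}'


def new_trace_id(now=None):
    """Build a trace ID: version 1, epoch seconds as 8 hex digits, 24 random hex digits."""
    epoch = int(now if now is not None else time.time())
    return f"1-{epoch:08x}-{os.urandom(12).hex()}"


def build_datagram(name, trace_id, start_time, end_time):
    """Serialize a minimal segment into the datagram format used by the daemon."""
    segment = {
        "name": name,
        "id": os.urandom(8).hex(),  # segment ID: 16 hex digits
        "trace_id": trace_id,
        "start_time": start_time,
        "end_time": end_time,
    }
    return (DAEMON_HEADER + "\n" + json.dumps(segment)).encode("utf-8")


def send_segment(datagram, host, port=2000):
    """Send one datagram to the daemon over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (host, port))
```

From inside the cluster, `send_segment(build_datagram("test", new_trace_id(), time.time() - 1, time.time()), "aws-xray.ops-monitoring-ns.svc.cluster.local")` would be one way to smoke-test the daemon without the SDK. In real applications the SDK should do this work.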

Checking and working with X-Ray

Create a Python Flask HTTP App with X-Ray

Let’s create a Python Flask service that responds to HTTP requests and logs X-Ray IDs (ChatGPT prompt: “Create a simple Python App with AWS X-Ray SDK for Python to run in Kubernetes. Add X-Ray ID output to requests”):

from flask import Flask
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.ext.flask.middleware import XRayMiddleware
import logging

app = Flask(__name__)

# Configure AWS X-Ray; the service name here is illustrative
xray_recorder.configure(service='flask-app')
XRayMiddleware(app, xray_recorder)

# Set up basic logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@app.route('/')
def hello():
    # Retrieve the current X-Ray segment
    segment = xray_recorder.current_segment()
    # Get the trace ID from the current segment
    trace_id = segment.trace_id if segment else 'No segment'
    # Log the trace ID
    logger.info(f"Responding to request with X-Ray trace ID: {trace_id}")

    return f"Hello, X-Ray! Trace ID: {trace_id}\n"

if __name__ == '__main__':
    # Bind to all interfaces so the container port is reachable
    app.run(host='0.0.0.0', port=5000)

Create requirements.txt with the app’s dependencies (the flask and aws-xray-sdk packages):


Add Dockerfile:

FROM python:3.8-slim

COPY requirements.txt .
RUN pip install --force-reinstall -r requirements.txt


CMD ["python", ""]

Build a Docker image — here we use a repository in AWS ECR:

$ docker build -t 492*** .

Log in to the ECR:

$ aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 492***

Push the image:

$ docker push 492***

Run Flask App in Kubernetes

Create a manifest with Kubernetes Deployment, Service, and Ingress.

For Ingress, enable logging into an AWS S3 bucket — logs will be collected from it to Grafana Loki, see Grafana Loki: collecting AWS LoadBalancer logs from S3 with Promtail Lambda.

For Deployment, set the AWS_XRAY_DAEMON_ADDRESS environment variable to the address (host and port) of the Kubernetes Service of our X-Ray Daemon:

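apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: flask-app
  template:
    metadata:
      labels:
        app: flask-app
    spec:
      containers:
      - name: flask-app
        image: 492***
        ports:
        - containerPort: 5000
        env:
          - name: AWS_XRAY_DAEMON_ADDRESS
            value: "aws-xray.ops-monitoring-ns.svc.cluster.local:2000"
          - name: AWS_REGION
            value: "us-east-1"
---
apiVersion: v1
kind: Service
metadata:
  name: flask-app-service
spec:
  selector:
    app: flask-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: flask-app-ingress
  annotations:
    alb.ingress.kubernetes.io/scheme: "internet-facing"
    alb.ingress.kubernetes.io/target-type: "ip"
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
    alb.ingress.kubernetes.io/load-balancer-attributes: access_logs.s3.enabled=true,access_logs.s3.bucket=ops-1-28-devops-monitoring-ops-alb-logs
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: flask-app-service
            port:
              number: 80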
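The X-Ray SDK reads AWS_XRAY_DAEMON_ADDRESS on its own, so the app needs no extra code for it. Purely as an illustration of how the value from the manifest is interpreted, here is a small stdlib-only sketch; `daemon_endpoint` is a hypothetical helper that handles only the plain `host:port` form, and the fallback shown is the daemon’s default local address.

```python
import os

# Default address used when the variable is not set.
DEFAULT_DAEMON_ADDRESS = "127.0.0.1:2000"


def daemon_endpoint(env=None):
    """Return (host, port) parsed from AWS_XRAY_DAEMON_ADDRESS in plain host:port form."""
    env = os.environ if env is None else env
    address = env.get("AWS_XRAY_DAEMON_ADDRESS", DEFAULT_DAEMON_ADDRESS)
    # Split on the last colon so that dots in the hostname are left alone.
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"unexpected daemon address: {address!r}")
    return host, int(port)


if __name__ == "__main__":
    host, port = daemon_endpoint()
    print(f"X-Ray daemon expected at {host}:{port}")
```

Running this inside a Pod is a quick way to confirm that the variable from the Deployment actually reached the container.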

Deploy it and check Ingress/ALB:

$ kk get ingress
flask-app-ingress alb * 80 10m

Make a request to the endpoint:

$ curl
Hello, X-Ray! Trace ID: 1-65e1d287-5fc6f0f34b4fb2120da8bbec

And we see the X-Ray ID.
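The ID itself carries some information: a version number, the request start time as eight hex digits of epoch seconds, and a 24-digit unique part. A minimal sketch that decodes the ID returned above (the function name `parse_trace_id` is hypothetical):

```python
import re
from datetime import datetime, timezone

# version - 8 hex digits of epoch seconds - 24 hex digits of unique identifier
TRACE_ID_RE = re.compile(r"^(\d+)-([0-9a-f]{8})-([0-9a-f]{24})$")


def parse_trace_id(trace_id):
    """Split an X-Ray trace ID into its version, start time (UTC) and unique part."""
    match = TRACE_ID_RE.match(trace_id)
    if match is None:
        raise ValueError(f"not an X-Ray trace ID: {trace_id!r}")
    version, epoch_hex, unique = match.groups()
    started = datetime.fromtimestamp(int(epoch_hex, 16), tz=timezone.utc)
    return {"version": int(version), "started": started, "unique": unique}


if __name__ == "__main__":
    info = parse_trace_id("1-65e1d287-5fc6f0f34b4fb2120da8bbec")
    print(info["started"].isoformat())  # 2024-03-01T13:05:11+00:00
```

This can be handy when correlating a trace with logs, because the timestamp narrows down where to look.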

We can also see it in the Load Balancer Access Logs:
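The Load Balancer carries the ID in the X-Amzn-Trace-Id header, which is a semicolon-separated list of key=value fields such as Root, Parent, and Sampled. A small sketch for pulling the trace ID out of such a value, for example when searching the access logs. The helper name `parse_trace_header` is hypothetical, and the Parent value in the usage lines is made up for illustration.

```python
def parse_trace_header(value):
    """Parse an X-Amzn-Trace-Id value such as 'Root=1-...;Sampled=1' into a dict."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:  # skip empty or malformed fragments
            fields[key] = val
    return fields


if __name__ == "__main__":
    # Only the Root value comes from the request above; Parent is made up.
    header = "Root=1-65e1d287-5fc6f0f34b4fb2120da8bbec;Parent=53995c3f42cd8ad8;Sampled=1"
    print(parse_trace_header(header)["Root"])
```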

And in the X-Ray itself:

Although I expected the Load Balancer to appear in the request map too, it did not.

Grafana X-Ray data source

Add a new Data source:

Configure access to AWS — here it’s simple with ACCESS and SECRET keys (see X-Ray documentation):

And now we have a new data source in Explore:

And a new type of visualization — Traces:

And somewhere in another post I will probably describe the creation of a real dashboard with X-Ray.

Originally published at RTFM: Linux, DevOps, and system administration.
