The Ops Community ⚙️: Lucy Linder

AWS S3 multipart uploads from unauthenticated users? presigned URLs (😕) vs federation tokens (😃)

Lucy Linder — Tue, 27 Jun 2023 12:00:00 +0000

I had a very interesting use case lately: being able to upload a file to S3 without being signed in to AWS and taking advantage of multipart uploads (large files) in Python. This made me dig deeper into AWS presigned URLs, and multipart uploads. And the fun part is, the final solution doesn't use any of it! Curious? Read on!

The use case
Attempt 1: REST + presigned URLs 😕
- A simple PUT with a presigned URL
- Multipart uploads with presigned URLs
- The implementation
- The problems
Attempt 2: temporary credentials 😃
- About federation tokens
- (Multipart) uploads with federation tokens
- Are federation tokens safe?
Conclusion

🔖 I created this Table of Contents using BitDownToc. If you are curious, read my article: Finally a clean and easy way to add Table of Contents to dev.to articles 🤩

The use case

I have an API with its own authentication mechanism that uses AWS S3 as a file storage and provides a CLI to simplify the user experience. Through the CLI, users can do cool stuff such as uploading files (used later to do other cool stuff, but the details do not matter). In the current implementation, the CLI sends the files (that can be multiple GBs!) to the API (using a POST 😬), which subsequently handles the upload to S3.

To make it more efficient, I want the CLI to upload files directly to S3 and leverage AWS multipart uploads. Multipart upload means splitting a large file into chunks that can be uploaded in parallel (faster) and retried separately (more reliable).

In summary, I need the ability to:

upload a file to S3 without "real" AWS credentials (or at least with limited temporary permissions provided by the API), and
use the S3 multipart upload mechanism.

Attempt 1: REST + presigned URLs 😕

From the AWS documentation:

You can use presigned URLs to grant time-limited access to objects in Amazon S3 without updating your bucket policy. [...] The credentials used by the presigned URL are those of the AWS user who generated the URL.

You can use presigned URLs to allow someone to upload a specific object to your Amazon S3 bucket. This allows an upload without requiring another party to have AWS security credentials or permissions.

Presigned URLs for upload contain a bucket, a path, and an expiration date. You can use the link multiple times (it will replace the object) until the expiration.
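
To see those three ingredients in practice, here is a minimal sketch that pulls them out of a presigned URL with the standard library. The URL is made up (fake signature, shortened), and the helper name describe_presigned_url is mine. It assumes a virtual-hosted-style URL (bucket in the host name) and handles both the SigV4 query parameter (X-Amz-Expires) and the older SigV2 one (Expires), since which one you get depends on the client configuration.

```python
from urllib.parse import parse_qs, urlsplit

def describe_presigned_url(url: str) -> dict:
    """Extract host, object key and lifetime from a presigned URL."""
    parts = urlsplit(url)
    query = {k: v[0] for k, v in parse_qs(parts.query).items()}
    info = {"host": parts.netloc, "key": parts.path.lstrip("/")}
    if "X-Amz-Expires" in query:
        # Signature Version 4: lifetime in seconds, counted from X-Amz-Date
        info["expires_in_seconds"] = int(query["X-Amz-Expires"])
    elif "Expires" in query:
        # Signature Version 2: absolute Unix timestamp
        info["expires_at_epoch"] = int(query["Expires"])
    return info

# Hypothetical URL: the signature is fake and most parameters are omitted
example = (
    "https://my-bucket.s3.amazonaws.com/foo/data.dump"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230627T120000Z"
    "&X-Amz-Expires=3600&X-Amz-Signature=deadbeef"
)
print(describe_presigned_url(example))
```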

A simple PUT with a presigned URL

To generate a presigned URL for upload with the boto3 s3 client, I can use either generate_presigned_post or generate_presigned_url. I prefer the latter, as it returns a single URL ready for use instead of a URL plus some fields that need to be passed with the POST.

Given the required environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) are present, here is the API side:

import boto3

def gen_presigned_url_for_post(
  bucket_name, object_name, expiration=3600
):
    # boto3 will read credentials from the environment
    s3_client = boto3.client("s3")
    return s3_client.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": bucket_name, "Key": object_name},
        ExpiresIn=expiration,
        HttpMethod="PUT",
    )

And the user (CLI) side:

import requests
from pathlib import Path

def use_presigned_url(url, local_file):
    # Same as curl -X PUT --upload-file <local-file> <url>
    # Send the raw bytes with data=: files= would wrap them in a
    # multipart/form-data body, which S3 would store as-is
    with Path(local_file).open("rb") as f:
        res = requests.put(url, data=f)
    res.raise_for_status()

Putting them together:

url = gen_presigned_url_for_post("my-bucket", "foo/data.dump")
use_presigned_url(url, "local.dump")

This works, but a single PUT is limited to 5 GB! For large files, AWS recommends using multipart uploads, which have many advantages, including improved throughput (parallel upload) and quick recovery from network issues (retry only the failed part).

Multipart uploads with presigned URLs

Multipart upload is a three-step process:

Initialization - you tell AWS your intent to upload a file in parts. It returns a unique id (UploadId) that you need to upload parts and finish/cancel the upload.
Part upload - with each upload, you pass the UploadId plus a unique PartNumber of your choice; AWS returns an ETag.

Part numbers can be any number from 1 to 10,000, inclusive. A part number uniquely identifies a part and also defines its position within the object being created. If you upload a new part using the same part number, the previously uploaded part is overwritten. The part size should be between 5 MiB to 5 GiB. There is no minimum size limit on the last part of your multipart upload (see Multipart upload limit).
Completion - you finish the upload by sending to AWS both the UploadId and the list of (PartNumber + Etag) for each part. AWS assembles the parts into a single file (following the PartNumbers order) and deletes the individual parts.

(Note that parts stay around in S3 until you finish/cancel the upload - that incurs charges! Don't forget to set some cleanup policy for dangling parts if you use this solution.)
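
The limits quoted above (part numbers from 1 to 10,000, at least 5 MiB per part except the last) can be turned into a small planning helper. This is only a sketch: plan_parts is a name I made up, and it ignores the 5 GiB upper bound per part.

```python
from math import ceil

MIN_PART_SIZE = 5 * 1024 * 1024   # 5 MiB, minimum for every part but the last
MAX_PARTS = 10_000                # highest allowed part number

def plan_parts(file_size: int, part_size: int = MIN_PART_SIZE):
    """Return (part_number, offset, length) tuples covering the whole file."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    # Grow the part size if the file would otherwise need too many parts
    part_size = max(part_size, MIN_PART_SIZE, ceil(file_size / MAX_PARTS))
    num_parts = ceil(file_size / part_size)
    return [
        (n, (n - 1) * part_size, min(part_size, file_size - (n - 1) * part_size))
        for n in range(1, num_parts + 1)
    ]

# A 12 MiB file yields two full 5 MiB parts and a final 2 MiB part
for part in plan_parts(12 * 1024 * 1024):
    print(part)
```

The offsets can be fed to f.seek() so each part is read independently, which is what makes parallel uploads and per-part retries possible.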

So far, so good. Now, what about presigned URLs? Well, the unauthenticated client only performs part uploads (step 2). However, each part upload has different parameters (because of the PartNumber), hence requiring a different presigned URL. In other words, you need as many presigned URLs as the client will have parts to upload!

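This process is discussed in the boto3 issue entitled How to use Pre-signed URLs for multipart upload. To make it clearer: the API initializes the upload and generates one presigned URL per part, the client PUTs each part to its URL and collects the returned ETags, and the API completes the upload with the list of parts.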

The implementation

How does this translate in Python code? First, the API side:

import logging
import boto3

logging.basicConfig(level=logging.INFO)

class SomeAPIWithAWSAccess:
    def __init__(self, bucket: str):
        # ↓ AWS client. Requires environment variables!
        self.s3 = boto3.client("s3")
        self.bucket = bucket
        self.logger = logging.getLogger("API")

    def upload_multipart_request(self, key: str, num_parts: int):
        self.logger.info(f"API: starting multipart for {num_parts} URLs.")
        # Initialize the multipart upload
        res = self.s3.create_multipart_upload(
            Bucket=self.bucket,
            Key=key,
        )
        upload_id = res["UploadId"]

        # Generate the presigned URL for each part
        urls = []
        for part_number in range(1, num_parts + 1): # parts start at 1
            url = self.s3.generate_presigned_url(
                # The s3 operation is "upload_part"
                ClientMethod="upload_part",
                Params={
                    "Bucket": self.bucket,
                    "Key": key,
                    "UploadId": upload_id,
                    "PartNumber": part_number,
                },
            )
            urls.append((part_number, url))

        # Create a callback that can be called when
        # the upload is finished on the user side
        def finish_callback(parts):
            self.logger.info("API: finishing multipart upload.")
            self.s3.complete_multipart_upload(
                Bucket=self.bucket,
                Key=key,
                MultipartUpload={"Parts": parts},
                UploadId=upload_id,
            )

        # Return the URLs and the callback
        return urls, finish_callback

And now the user (CLI) side:

import logging
from math import ceil
from pathlib import Path
import requests

logger = logging.getLogger("user")

bucket = "my-bucket" # CHANGE_ME: bucket name
remote_location = "foo/data.dump" # CHANGE_ME: path in the bucket
local_file = "data.dump" # CHANGE_ME: file to upload

chunk_size = 5 * 1024 * 1024  # 5 MiB (minimal part size)

def get_num_parts(file_path):
    # TODO: make part sizes even
    filesize = Path(file_path).stat().st_size
    return ceil(filesize / chunk_size)

num_parts = get_num_parts(local_file)

# Start a multipart upload
logger.info("asking for presigned URLs.")
api = SomeAPIWithAWSAccess(bucket)
urls, i_am_done = api.upload_multipart_request(
    remote_location,
    num_parts,
)

# Upload each part
parts = []
with Path(local_file).open("rb") as f:
    for part_number, url in urls:
        logger.info(f"uploading part {part_number}.")
        chunk = f.read(chunk_size)
        res = requests.put(url, data=chunk)
        if res.status_code != 200:
            print(f"{res.status_code} {res.reason} {res.text}")
            exit(1)
        # we have to append etag and partnumber of each parts
        parts.append({"ETag": res.headers["ETag"], "PartNumber": part_number})

logger.info("calling finish.")
i_am_done(parts)

To test it, create a random large file of 12MB, export the AWS credentials, and call the program. Don't forget to change the bucket name in the code above!

# AWS
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_DEFAULT_REGION=us-east-1

# Create a "large" file to upload
dd if=/dev/urandom of=data.dump bs=12m count=1  # with GNU dd (Linux), use bs=12M
# Call the program
python aws-multipart-upload.py

The problems

I am now able to do multipart uploads with presigned URLs! However:

the user needs to know the number of parts in advance and understand how multipart uploads work (at least ETag and PartNumber),
the user receives URLs... that can't be used with the boto3 client! It is thus their responsibility to implement parallel uploads, retries, etc. The work is huge!
using multipart uploads is very inefficient for small files (< 5 MB).

In other words, using multipart uploads with presigned URLs doesn't bring any advantages, except if you are willing to spend days implementing your own upload logic on the user side...
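
To give an idea of what "your own upload logic" means, here is a bare-bones sketch of parallel part uploads with per-part retries. Everything in it is hypothetical: upload_part stands for "PUT the chunk to its presigned URL and return the ETag header", and the fake implementation only simulates one network failure. A real version would also need backoff, progress reporting, streaming chunks from disk instead of holding them in memory, and aborting the upload on failure.

```python
from concurrent.futures import ThreadPoolExecutor

def upload_with_retries(upload_part, part_number, data, attempts=3):
    """Call upload_part until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            etag = upload_part(part_number, data)
            return {"ETag": etag, "PartNumber": part_number}
        except Exception:
            if attempt == attempts:
                raise

def upload_all(upload_part, chunks, max_workers=4):
    """Upload chunks in parallel; chunks maps part numbers to bytes."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(upload_with_retries, upload_part, n, data)
            for n, data in chunks.items()
        ]
        parts = [f.result() for f in futures]
    # The completion call expects the parts in ascending part-number order
    return sorted(parts, key=lambda p: p["PartNumber"])

# Stand-in for the real HTTP call: part 2 fails once, then succeeds
failed_once = set()

def fake_upload_part(part_number, data):
    if part_number == 2 and part_number not in failed_once:
        failed_once.add(part_number)
        raise ConnectionError("simulated network hiccup")
    return f'"etag-{part_number}"'

print(upload_all(fake_upload_part, {1: b"a", 2: b"b", 3: b"c"}))
```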

Attempt 2: temporary credentials 😃

What I would like is to be able to use boto3 features on the user side (to get parallel uploads and retries for free) without giving them any permission other than uploading a specific file to a specific s3 bucket location.

How can I do that? Enter federation tokens!

About federation tokens

As explained at length in the docs Comparing the AWS STS API operations, AWS Security Token Service (STS) provides multiple ways of creating temporary credentials: assuming roles, session tokens, and federation tokens. For my use case, federation tokens are perfect, as they support:

credentials lifetime (i.e. expiry), and
custom inline policies.

Custom inline policies mean I can create a throw-away policy on the fly and attach it to the federation token upon creation. Here is an example that only allows file uploads to path {PATH} in bucket {BUCKET}:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowFileUpload",
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::{BUCKET}/{PATH}"
        }
    ]
}

Note that the s3:PutObject action allowed by "AllowFileUpload" covers both regular and multipart uploads.

(Multipart) uploads with federation tokens

Using federation tokens is so easy I don't even need a UML diagram this time 😉. From the API point of view, I just have to call STS' get_federation_token endpoint with the right parameters:

import json
from uuid import uuid1
import boto3

EXPIRE_SECONDS = 3600 # 1h validity

class SomeAPIWithAWSAccess:
    def __init__(self, bucket: str):
        self.sts = boto3.client("sts")
        self.bucket = bucket

    def generate_federated_token(self, key: str):
        name = f"upload-{uuid1()}"[:32]
        bucket = self.bucket
        # The magic is here: the policy only allows file
        # uploads to bucket/key.
        # (json.dumps avoids str.format choking on the JSON braces)
        policy = json.dumps({
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "AllowFileUpload",
                    "Effect": "Allow",
                    "Action": "s3:PutObject",
                    "Resource": f"arn:aws:s3:::{bucket}/{key}",
                }
            ],
        })

        # Get the federation token
        res = self.sts.get_federation_token(
            Name=name,
            Policy=policy,
            DurationSeconds=EXPIRE_SECONDS,
        )

        # Return only the relevant information
        return {
            "access_key_id": res["Credentials"]["AccessKeyId"],
            "access_key_secret": res["Credentials"]["SecretAccessKey"],
            "session_token": res["Credentials"]["SessionToken"],
            "expiration": res["Credentials"]["Expiration"].isoformat(),
            "bucket": bucket,
            "key": key,
        }

The user (CLI) side can thus use boto3 to upload files:

import boto3

bucket = "my-bucket" # CHANGE_ME: bucket name
remote_location = "foo/data.dump" # CHANGE_ME: path in the bucket
local_file = "data.dump" # CHANGE_ME: file to upload

api = SomeAPIWithAWSAccess(bucket)
creds = api.generate_federated_token(remote_location)

boto3.client(
    "s3",
    # Use the creds to login to AWS
    aws_access_key_id=creds["access_key_id"],
    aws_secret_access_key=creds["access_key_secret"],
    aws_session_token=creds["session_token"],
).upload_file(
    # Upload the file
    local_file,
    creds["bucket"],
    creds["key"],
)

Way easier. But what about multipart uploads? The beauty of this solution is that the boto3 s3 client takes care of everything. We can see this by looking at the upload_file TransferConfig options:

# Default options used by upload_file
TransferConfig(
    # Automatically use multipart uploads for files >= 8M
    multipart_threshold=8388608,
    # Do uploads in parallel
    use_threads=True,
    # Use at most 10 threads
    max_concurrency=10,
    # Other transfer options
    multipart_chunksize=8388608,
    num_download_attempts=5,
    max_io_queue=100,
    io_chunksize=262144,
    max_bandwidth=None,
)

And of course, AWS clients exist for other programming languages, so a user that does not use the CLI is not stuck with Python 😊.
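
To make the defaults listed above more concrete, here is a rough sketch of the decision upload_file takes for a given file size. It is a simplification written for this article (describe_transfer is not a boto3 function) and only models multipart_threshold and multipart_chunksize.

```python
from math import ceil

MB = 1024 * 1024

def describe_transfer(file_size, multipart_threshold=8 * MB, multipart_chunksize=8 * MB):
    """Approximate how a file is sent: one PUT, or several parts."""
    if file_size < multipart_threshold:
        return {"multipart": False, "parts": 1}
    return {"multipart": True, "parts": ceil(file_size / multipart_chunksize)}

print(describe_transfer(5 * MB))    # below the threshold: a single PUT
print(describe_transfer(12 * MB))   # above it: 8 MiB + 4 MiB parts
```

So small files are not penalized, and nothing has to be decided by the user.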

Are federation tokens safe?

Contrary to a presigned URL, a user can log in to the AWS console with an access + secret key. However, since I attached a very restrictive policy to the federation token, it won't let them do or see anything (except menus). In other words, as long as the policy is sane, there is no security risk involved in returning a federation token to an untrusted user.
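
Since everything hinges on the policy being sane, it may be worth checking it before handing out a token. Below is a minimal sketch of such a check. The function name and the rules are my own assumptions about what "sane" means here: only Allow statements, only s3:PutObject, and exactly one object ARN without wildcards.

```python
import json

def policy_is_upload_only(policy_json: str, bucket: str, key: str) -> bool:
    """True if every statement only allows s3:PutObject on exactly bucket/key."""
    if "*" in key or "?" in key:
        return False  # wildcards would widen the resource
    expected_resource = f"arn:aws:s3:::{bucket}/{key}"
    statements = json.loads(policy_json)["Statement"]
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for stmt in statements:
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        if stmt.get("Effect") != "Allow":
            return False
        if actions != ["s3:PutObject"] or resources != [expected_resource]:
            return False
    return bool(statements)

def make_policy(resource):
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:PutObject", "Resource": resource}
        ],
    })

good = make_policy("arn:aws:s3:::my-bucket/foo/data.dump")
too_wide = make_policy("arn:aws:s3:::my-bucket/*")
print(policy_is_upload_only(good, "my-bucket", "foo/data.dump"))
print(policy_is_upload_only(too_wide, "my-bucket", "foo/data.dump"))
```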

Conclusion

In this article, we looked at AWS presigned URLs, and how to make them work with multipart uploads. This is however complex and fails to deliver the desired advantages: parallel uploads and separate retries need to be coded on the client side.

We then looked at federation tokens, and how they make the whole process easier: the user can upload files to S3 using the AWS client, which takes care of all the heavy lifting. Moreover, the federation token has very limited permissions and expires after a while, making it as secure as presigned URLs.

With love, @derlin

Installing HashiCorp Vault + ExternalSecrets Operator on Kubernetes: the easy way

Lucy Linder — Wed, 08 Mar 2023 11:33:07 +0000

Want to play with Vault and ExternalSecrets, but don't want to spend a day setting them up? Here is the perfect repo for you.

I recently had to test the new ExternalSecrets operator and its capabilities when using HashiCorp Vault as a backend. I spent some time figuring out how to install them on a local K3D cluster and wanted to share it so you won't have to.

⮕ ✨✨ https://github.com/derlin/externalsecrets-with-hashicorp-vault-kubernetes-easy-install ✨✨

IMPORTANT: this is for test purposes only, it is not suitable for production!

About Vault and ExternalSecrets
Installing Vault and ExternalSecrets
- Prerequisites
- Procedure
Accessing the vault

About Vault and ExternalSecrets

Kubernetes' Secrets resources are a way to store sensitive information. Those Secret resources may be created directly (YAML files), from a Helm Chart, from Kustomize, etc. If you follow a gitops approach (you should!) those YAML files, Helm Charts, etc. will live in a git repo somewhere. But you don't want to commit sensitive information in git repos, so how to proceed?

A good approach instead is to use a secret management system (there are plenty to choose from: AWS Secrets Manager, HashiCorp Vault, Google Secrets Manager, Azure Key Vault, IBM Cloud Secrets Manager, etc.), and to have a way to retrieve those secrets dynamically from your Kubernetes cluster. This is where ExternalSecrets shines.

ExternalSecrets is a cluster-wide operator that you install once. Then, instead of creating a Secret directly, you create an ExternalSecret (a custom resource) that defines what secrets to retrieve, and from which backend. The operator then creates the Secret for you.

Backends are configured by creating SecretStore or ClusterSecretStore resources, which hold the connection information to a given secret management system. Each ExternalSecret must reference one of those secret stores, so the operator knows from which backend it should retrieve secrets.

The documentation at https://external-secrets.io/main/ is quite good, so I will stop here.

Installing Vault and ExternalSecrets

To simplify the installation, I use helmfile, which uses helm under the hood.

Prerequisites

Hard requirements

helm (brew install helm)
helmfile (brew install helmfile)

Soft requirements:

The helm diff plugin (helm plugin install https://github.com/databus23/helm-diff). This is necessary if you plan to use helmfile apply and helmfile diff
k3d to be able to spawn a local Kubernetes cluster on Docker (brew install k3d)

Procedure

First, clone the following repo: https://github.com/derlin/externalsecrets-with-hashicorp-vault-kubernetes-easy-install.

Start a k3d cluster:

k3d cluster create test --api-port 6550 -p "80:80@loadbalancer"

Install Vault, the ExternalSecret operator, and a ClusterSecretStore by running the following at the root of the repository:

helmfile sync

The above command will:

Launch a Vault instance in the vault namespace, and configure it with a token root,
Install the ExternalSecrets operator in the es namespace,

Create a ClusterSecretStore resource named vault-backend in the default namespace, which connects to the Vault. You can reference it in an ExternalSecret resource using:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
# ...
spec:
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
# ...

Create a secret in the vault under the path secret/foo with one property, hello.

Done! Now, you can test the operator by creating an ExternalSecret resource and waiting for the Secret test to be created:

kubectl apply -f extsecret-example.yaml
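
The repo ships its own extsecret-example.yaml. To give a rough idea of what such a resource contains, here is a sketch that builds one as JSON (which kubectl also accepts). The refresh interval and the remote key foo (relative to a KV mount called secret) are assumptions on my part and may differ from the file in the repo.

```python
import json

def external_secret(name, store, remote_key, prop, secret_key=None):
    """Build an ExternalSecret manifest as a plain dict."""
    return {
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {"name": name},
        "spec": {
            "refreshInterval": "15s",  # assumed value
            "secretStoreRef": {"name": store, "kind": "ClusterSecretStore"},
            # Name of the Kubernetes Secret the operator will create
            "target": {"name": name},
            "data": [
                {
                    # Key inside the resulting Secret
                    "secretKey": secret_key or prop,
                    # Where to read the value from in the backend
                    "remoteRef": {"key": remote_key, "property": prop},
                }
            ],
        },
    }

manifest = external_secret("test", "vault-backend", "foo", "hello")
print(json.dumps(manifest, indent=2))
```

The output could be piped to kubectl apply -f - instead of being saved to a file.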

Accessing the vault

The setting above automatically creates the secret/foo for you. To access the vault interface and add more secrets, create a port forward to access the vault:

kubectl port-forward -n vault vault-0 8200

You can now go to http://localhost:8200 and log in with the default token root.

To access the vault using the command line (and assuming the port-forwarding is still on):

export VAULT_ADDR=http://127.0.0.1:8200
export VAULT_TOKEN=root

vault kv get secret/foo

You can also set secrets programmatically using kubectl exec:

kubectl exec vault-0 -n vault -- vault kv put secret/foo app-secret-key=123

one Docker image to rule them all

Lucy Linder — Mon, 27 Jun 2022 12:21:34 +0000

I just found out about Nixery!

Nixery is a Docker-compatible container registry that is capable of transparently building and serving container images using Nix.

Images are built on-demand based on the image name. Every package that the user intends to include in the image is specified as a path component of the image name.

The path components refer to top-level keys in nixpkgs and are used to build a container image using a layering strategy that optimises for caching popular and/or large dependencies.

In other words, you start with the base image, nixery.dev/, and then list the packages and tools you want available. Usually, you start with the shell metapackage, followed by any NixOS package(s).

This is very handy when working with Kubernetes.
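
Since the image name is just a path of package names, it is easy to build programmatically. A tiny sketch (the helper nixery_image is hypothetical, not part of any tool):

```python
def nixery_image(*packages: str, registry: str = "nixery.dev", shell: bool = True) -> str:
    """Build a Nixery image name from a list of nixpkgs package names."""
    for package in packages:
        if not package or "/" in package:
            raise ValueError(f"invalid package name: {package!r}")
    # The shell metapackage goes first, then every requested package
    parts = [registry] + (["shell"] if shell else []) + list(packages)
    return "/".join(parts)

print(nixery_image("curl", "gnugrep", "netcat"))
```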

Examples

Command format to run an ephemeral pod on Kubernetes

kubectl run -it --rm --restart=Never \
   --image=nixery.dev/<PACKAGES> \
   <NAME> -- <CMD>

Connect to a database using psql, assuming the service is called my-db:

kubectl run -it --rm --restart=Never \
  --image=nixery.dev/postgresql \
  --env PGPASSWORD=some-password \
   psql -- psql -h my-db -U some-username

Test the connectivity to a pod:

kubectl run -it --rm --restart=Never \
  --image=nixery.dev/shell/unixtools.ping \
  ping -- ping keycloak.cluster.local

Get a shell with the curl, grep, ping and nc commands:

kubectl run -it --rm --restart=Never \
  --image=nixery.dev/shell/curl/gnugrep/unixtools.ping/netcat \
  shell -- bash

Limitations

For those not familiar with NixOS, it may be troublesome to find the package name that will bring you the executable you need. Here are some:

psql → package postgresql
ping → package unixtools.ping
grep → package gnugrep
nc → package netcat

Also, I wasn't able to run with root permissions, meaning I could not run iptables -L (with the package iptables). Maybe I missed something? Let me know in the comments!

helmfile: a simple trick to handle values intuitively

Lucy Linder — Mon, 20 Jun 2022 15:18:30 +0000

helmfile is a very nice and powerful tool to manage multiple Helm charts declaratively. However, there is one area in which I find it suboptimal: the handling of values / environment values.

Let's go over how it works, and see how we can make it better. If you don't like to read, skip to My tip on using values in helmfile (or read the TL;DR in the repo linked below).

For a full example, check out this code !
👉 ✨ https://github.com/derlin/helmfile-intuitive-values-handling ✨ 👈

Values in umbrella charts (pure Helm)

Coming from the Helm world, I am used to using umbrella charts, where all the default values for my charts are defined in one single values.yaml:

# globals are available to all sub-charts 
# using .Values.global.*
global:
  domain: dev.example.com

foo:
  # default values passed to the sub-chart called 'foo'
  image:
    repository: nginx
    tag: latest

bar:
  # default values passed to the sub-chart called 'bar'
  mode: local
...

When some values need to be overridden per environment, I simply create a file <env>.yaml and pass it to helm using --values. For example:

# in environments/prod.yaml
global:
  domain: prod.example.com # override the domain for all

foo:
  image:
    tag: "1.19" # use a stable docker image (quoted so YAML keeps it a string)
...

To deploy to prod:

helm install my-umbrella-name . \
   --values environments/prod.yaml

With helmfile though, there is no easy way to reproduce this behavior (well, there is actually, keep reading 😉).

Values in helmfile

In helmfile, one defines default values for a chart using releases.<name>.values:

releases:
  - name: foo
    ...
    values:
      - image:
          repository: nginx
          tag: latest

To add global values, there is an equivalent environments.default.values, but this only makes values available to the templates... It doesn't attach those values automatically. In other words, the following does nothing:

releases:
  - name: foo
    ...
environments:
  prod:
    values:
      - prod: true

To make it work, we need to add some values template to release foo (and all other releases), for example:

releases:
  - name: foo
    ...
    values: 
      - {{ toYaml .Values | nindent 8 }}

Now, prod: true will be passed to foo upon helmfile -e prod ...

The environment values are passed to all release templates, not releases! That is, they can be used inside gotmpl templates/files listed under releases.<name>.values, but are not attached directly ...

This is already too complex to follow.

My tip on using values in helmfile

Instead of trying to understand how all those values work (and creating specific .gotmpl files for each release), here is how I managed to mimic the umbrella chart behavior regarding values with helmfile (one default value file + one file per environment, with global section and <release-name> sections).

First, create a folder called environments. In it, create a default.yaml file, and specify the default values for each release and the globals using the "umbrella chart syntax":

global:
  # ... values passed to all releases
foo:
  # ... values passed to release foo
bar:
  # ... values passed to release bar

Next, create as many files as you have environments (environment prod → environments/prod.yaml) and override only what needs to be overridden (compared to default).

In the helmfile, configure each environment to read from default.yaml and the specific environment values:

# in helmfile.yaml
environments:
  default:
    values:
      - environments/default.yaml
  prod:
    values: # apply default first, then prod
      - environments/default.yaml 
      - environments/prod.yaml

Now, here is the trick.
Create a magic gotmpl file that will extract both the global section and the release-specific section of the values:

{{/* in env-magic.gotmpl */}}

{{/* 
extract both global and <release-name> sections from
.Values, and merge them (giving precedence to release
specific values).
Note: missing entries are fine.
*/}}
{{ merge (.Values | get .Release.Name  dict) (.Values | get "global"  dict) | toYaml }}

And attach this magic file to all releases in the helmfile:

# in helmfile.yaml
releases:
  - name: foo
    ...
    values:
      - &env env-magic.gotmpl # use a YAML anchor for DRYness
  - name: bar
    ...
    values:
      - *env # reference the anchor
...

That's it! Now, you can simply edit the files in environments/, and don't have to think about (or touch) values in helmfile anymore.
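
If the template is hard to read, here is the same logic as a Python sketch: layer the environment files on top of each other, then merge the global section with the release section, the latter taking precedence. It assumes helmfile merges environment value files deeply (nested maps are merged key by key); the dictionaries stand in for the parsed YAML files.

```python
from copy import deepcopy

def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override layered on top, merging nested dicts."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result

def values_for_release(release: str, *env_files: dict) -> dict:
    """Layer the environment files, then merge global with the release section."""
    values = {}
    for env_file in env_files:  # e.g. default.yaml first, then prod.yaml
        values = deep_merge(values, env_file)
    # Release-specific keys win over global ones, as in env-magic.gotmpl
    return deep_merge(values.get("global", {}), values.get(release, {}))

default = {
    "global": {"domain": "dev.example.com"},
    "foo": {"image": {"repository": "nginx", "tag": "latest"}},
}
prod = {
    "global": {"domain": "prod.example.com"},
    "foo": {"image": {"tag": "1.19"}},
}

print(values_for_release("foo", default, prod))
print(values_for_release("bar", default, prod))  # missing entries are fine
```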

Example

A complete example (and a different explanation) is available here:

derlin / helmfile-intuitive-values-handling

How to manage values (globals, environment, release-specific) intuitively within helmfile

How to handle helmfile values nicely

helmfile is a very powerful tool, but its way of handling release values is daunting for beginners (and experts). I propose here a simple pattern to handle values that is completely generic, intuitive, and works in all situations.

Read the article !
👉 ✨ helmfile: a simple trick to handle values intuitively ✨ 👈

TL;DR

This repo reproduces the way values are handled in umbrella charts (Helm Charts with sub-charts).

If you don't like to read but want to experiment instead:

clone this repo,
customize the different release values by editing environments/default.yaml,
override values per environment by editing environments/<envName>.yaml (available environments are local and prod),
see for yourself how your changes work by running: helmfile -e <env> write-values and see the output.

To apply this in your helmfile:

ensure you reference env-magic.gotmpl under all release values (releases.<releaseName>.values) and,
ensure…


Written with ❤ by derlin

Helm templates: do not use tpl in vain

Lucy Linder — Mon, 30 May 2022 17:56:36 +0000

From the documentation:

The tpl function allows developers to evaluate strings as templates inside a template. This is useful to pass a template string as a value to a chart or render external configuration files. Syntax: {{ tpl TEMPLATE_STRING VALUES }}

When writing generic Helm charts or libraries, calls to tpl are often overused, and for a good reason: they give lots of flexibility, and allow chart users to avoid repetition in the values.yaml file.

The tpl function is however costly, and slow, and this is why it should be called only when necessary.

`tpl` function not performant #8002

himmakam posted on Apr 27, 2020

I have a scenario of having one umbrella chart with many sub-charts as dependencies. When I have the templates folder in umbrella chart, helm lint takes more time like almost 30 to 40 minutes. When this folder is not there, helm lint returns very fast. Can I know the reason.running in debug mode not giving any logs. It is helm v3.


How to limit calls while keeping the flexibility ? My solution is to wrap all calls to tpl using the following helper function:

{{- define "bettertpl" -}}
  {{- $tpl := .value -}}
  {{- /* handle cases where .value is a yaml object */ -}}
  {{- if not (typeIs "string" $tpl) -}}
    {{- $tpl = toYaml $tpl -}}
  {{- end -}}
  {{- /* only call tpl if there is at least one template expression */ -}}
  {{- if contains "{{" $tpl -}}
    {{- tpl $tpl .context }}
  {{- else -}}
    {{- $tpl -}}
  {{- end -}}
{{- end -}}

As the code should make it clear, bettertpl adds two features to the regular tpl:

it can template anything (not just strings): dicts, lists, etc. are converted to a string first, and
it only calls the slow tpl when needed: if the string doesn't contain at least one {{ ... }}, we know we can just print it as is.
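
The short-circuit idea is not specific to Helm. As an illustration only, here it is in Python, with a deliberately naive slow_render standing in for tpl and json.dumps standing in for toYaml (both are my substitutions, not what Helm does internally):

```python
import json

def better_tpl(value, context, render):
    """Render value with `render` only if it contains a template expression."""
    # Non-string values are serialized first, like toYaml in the helper
    text = value if isinstance(value, str) else json.dumps(value)
    if "{{" not in text:
        return text  # nothing to evaluate: skip the expensive call
    return render(text, context)

calls = []

def slow_render(text, context):
    """Toy renderer: records each call and substitutes '{{ .name }}' markers."""
    calls.append(text)
    for name, val in context.items():
        text = text.replace("{{ ." + name + " }}", str(val))
    return text

print(better_tpl("plain string", {"env": "prod"}, slow_render))
print(better_tpl("host-{{ .env }}", {"env": "prod"}, slow_render))
print(len(calls))  # the renderer ran for the second value only
```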

Usage:

before: {{ tpl .Values.foo . }}
after: {{ include "bettertpl" (dict "value" .Values.foo "context" .) }}

You find it too verbose ?

Note that I use named arguments for better readability. You can get rid of them and use lists instead:

{{ include "bettertpl" (list .Values.foo .) }}

In the template above, change .value → first . and .context → index . 1 to read from list arguments instead.

Is it worth it ? As an example, I recently migrated a gitops repository with an umbrella chart of around 20 sub-charts. After a refactoring allowing me to use only one base chart for all, my helm template went from <1s to more than 15 seconds...

I was able to take it down to 4 seconds by limiting the calls to tpl with this simple trick, which is quite an improvement.

kubectl run: spawn temporary docker containers on Kubernetes

Lucy Linder — Thu, 26 May 2022 15:02:43 +0000

When working with Kubernetes, there are times when you wish you had a specific tool to help debug a problem, visualise some data, or take some actions. Well, there is actually an easy way: deploy a docker container with the necessary tool directly to your cluster with one command, use it, and let it be destroyed as soon as you do not need it anymore!

`kubectl` run command

The run command creates and runs a particular image in a pod.

run will start running 1 or more instances of a container image on your cluster
kubectl run NAME --image=image [--env="key=value"] [--port=port]
[--dry-run=server|client] [--overrides=inline-json]
[--command] -- [COMMAND] [args...]

The image can be anything, as long as it can be pulled from the cluster. With the two options --attach/-it (wait for the Pod to start running, and then attach to the Pod / open a shell) and --rm (delete the pod after it exits), it is the perfect way to get the right tools into the cluster for a short while.

Example: interact with a database using psql

Let's say your project runs in the namespace myproject and the micro-services use postgres on RDS (managed relational database on AWS). RDS doesn't come with a nice postgres admin tool, and your infra colleagues only gave you adminer to interact with it. You want psql.

From your terminal, ensure you are connected to the right Kubernetes context (kubectl config current-context) and run:

# run bash in a container with psql installed 
# on your namespace
kubectl --namespace myproject \
  run -it --rm psql --image=postgres:13 -- bash;

Running this command, you suddenly have a shell in your cluster with psql installed. You can now run the following to have the psql prompt:

PGPASSWORD='my-user-pwd' psql postgres -U my-user \
  -h hostname-of-the-db-cluster.rds.amazonaws.com

In case you have network policies in place, you can easily add the needed labels to your psql pod using the -l option:

kubectl run ... -l "db-access=true,role=alice" ...

Once you exit the shell, the pod should disappear.
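
If you find yourself typing these commands often, a small wrapper can assemble them. The sketch below only builds the argument list (the function name and defaults are mine, and --restart=Never is an extra I added so that a bare Pod is created); pass the result to subprocess.run to actually execute it.

```python
import shlex

def kubectl_run_argv(name, image, command, namespace=None, env=None, labels=None):
    """Build the argv for an interactive, self-deleting kubectl run."""
    argv = ["kubectl"]
    if namespace:
        argv += ["--namespace", namespace]
    argv += ["run", "-it", "--rm", "--restart=Never", name, f"--image={image}"]
    for key, value in (env or {}).items():
        argv.append(f"--env={key}={value}")
    if labels:
        # Labels are passed as a single comma-separated key=value list
        argv.append("--labels=" + ",".join(f"{k}={v}" for k, v in labels.items()))
    return argv + ["--", *command]

argv = kubectl_run_argv(
    "psql",
    "postgres:13",
    ["bash"],
    namespace="myproject",
    labels={"db-access": "true", "role": "alice"},
)
print(shlex.join(argv))
```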

Example: Kafka UI

Let's say you have a Kafka cluster running, and you need to debug the messages going through. There is no Kafka visual interface on your cluster.

Why not just use port forwarding?
Why not just set up port forwarding (kubectl port-forward my-kafka 9092) and run some tool locally, you ask? Well, the tool would be able to connect, but the advertised hostname would point to a host only available from within Kubernetes. Getting messages would thus fail (unless you manually edit /etc/hosts, which is not possible if the tool runs in a Docker container).

There is a plethora of Kafka visualisation tools: Kafka UI, Kafka Magic and kafdrop, to cite a few.

Let's take kafka-ui as an example:

kubectl -n myproject run -it --rm kui \
  --port 8080 \
  --image=provectuslabs/kafka-ui:latest \
  --env "KAFKA_CLUSTERS_0_NAME=main" \
  --env "KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=kafka:9092"

The KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS must match the name of the pod/service of the kafka broker in the cluster. If you use strimzi, it would look something like <cluster-name>-kafka-bootstrap:9092.

Now, you just need to port-forward the kui port to access it from your local machine:

# on another terminal !
kubectl -n myproject port-forward kui 8080

kafka-ui is now available on http://localhost:8080. Once you are finished, close the terminal running kubectl run and the kui pod will be deleted.

The same logic applies to other tools: run the docker image in the cluster, then port-forward.

Wrap-up

kubectl run is one heck of a magic tool to add to your debugging toolbox. As the kafka example shows, it may also be a way to avoid crowding your kubernetes cluster with management tools that are only used once in a while. Don't maintain, just run when needed !

Written with ❤ by derlin

helmfile: difference between sync and apply (helm 3)

Lucy Linder — Thu, 26 May 2022 08:57:43 +0000

Helm and helmfile are great tools to automate Kubernetes deployments. However, they have some subtleties that are sometimes hard to understand and may lead to catastrophic problems. One of them is the difference between helmfile sync and helmfile apply, a question raised many times, for example on Stack Overflow.

Running helmfile -h, the explanation of those two commands is:

sync → sync all resources from state file (repos, releases and chart deps)
apply → apply all resources from state file only when there are changes

But what does it mean exactly ? What are the differences and pitfalls ? Let's dive in together, starting at the basics of Helm 3 up to helmfile.

Note: if you are familiar with how Helm 3 upgrades work, you can skip directly to the last section.

Helm states

The first thing to understand is how Helm stores states.

Helm generates the Kubernetes manifests to apply to a Kubernetes cluster by "compiling" a Chart's templates against some values, which can come from the chart's values.yaml or from value overrides (defined in helmfile, passed using the --set option of the CLI, etc.). All this information together (chart, values, options) is what we will call a state.

Whenever you install a release, Helm stores this state in a secret named as follows (the trailing v1 being the revision):

sh.helm.release.v1.{RELEASE_NAME}.v1

This secret is simply a compressed, base64-encoded JSON stored in a single key - release -, which contains everything needed to reconstruct exactly the helm chart, and to re-apply it with exactly the same values to reconstruct the release.

It can be inspected using the following command (see this gist):

kubectl get secret sh.helm.release.v1.<RELEASE_NAME>.v<REV> \
  -o jsonpath='{.data.release}' | base64 -d | base64 -d | gzip -d
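The same decoding pipeline (base64, base64 again, then gzip) can be sketched in Python. This is a minimal illustration, not Helm's code: the function names are made up, and the payload is a toy dictionary built on the spot rather than a real release, so that the snippet runs without a cluster.

```python
import base64
import gzip
import json


def decode_release(data_release: str) -> dict:
    """Decode the `.data.release` field of a helm-release secret."""
    helm_payload = base64.b64decode(data_release)  # undo the Kubernetes secret encoding
    gzipped = base64.b64decode(helm_payload)       # undo Helm's own base64 layer
    return json.loads(gzip.decompress(gzipped))    # gunzip, then parse the JSON


def encode_release(release: dict) -> str:
    """Inverse operation, only used here to fabricate a toy payload."""
    gzipped = gzip.compress(json.dumps(release).encode())
    return base64.b64encode(base64.b64encode(gzipped)).decode()


# Toy release: a real one holds many more fields
toy_release = {"name": "my-release", "namespace": "myproject", "version": 1}
decoded = decode_release(encode_release(toy_release))
print(decoded["name"], decoded["version"])
```

With a real cluster, the input of decode_release would be the output of the kubectl get secret ... -o jsonpath='{.data.release}' part of the command above.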

Content of the helm-release secret
If you decode this secret, you'll see a JSON that contains:

name, namespace, and version of the release
list of all chart files (name + base64 content of all files, excluding templates/*, Chart.yaml and values.yaml)
metadata (content of Chart.yaml)
values (content of values.yaml)
templates files (name + base64 content)
config (value overrides via cmd or helmfile)
values schema (content of values.schema.json)
hooks
info → current state of the release in Kubernetes (e.g. "install complete"), dates of first/last deployment, etc.
actual Kubernetes manifest (output of all rendered templates, this time in plain text)

When you upgrade a release, the new state is stored in a new secret, with the version incremented to the new revision:

sh.helm.release.v1.{RELEASE_NAME}.v{REVISION}

How helm 3 upgrade/rollback works

Now, let's understand how Helm decides what to do during an upgrade.

⚠️ Every time you run helm upgrade or helm rollback, a new revision (and secret) is always created, whether or not there are changes.

Helm 2 and two-way merge

Back in Helm 2, the upgrade process was plain and simple: Helm reconstructed the old state by decoding the stored release record of the current revision (kept in a ConfigMap by default in Helm 2), and compared it with the desired state to create the patches to apply. This is known as a two-way merge:

old state → desired state

The desired state can be reconstructed from a helm-release secret (rollback), or from a new version of the chart + values (upgrade).

The important point is that Helm 2 didn't take the live state into account, that is, what is effectively present in the cluster. In other words, if you modified anything manually (add a value to a ConfigMap, or a sidecar container in a deployment), this change was not seen at all by Helm, and could either be left as-is, disappear, or be overwritten depending on Helm old/desired states.

Helm 3 and three-way strategic merge

Helm 3 introduced a brand new way of computing patches required for upgrades and rollbacks, known as three-way strategic merge.
The article Three-way merging: A look under the hood gives a good explanation of what three-way merging means (it is heavily used in git), while the Helm documentation's section Improved Upgrade Strategy: 3-way Strategic Merge Patches focuses on what it means in Helm.

But simply put, Helm 3 now takes the live state into account:

(old state → desired state) → (live state → desired state)

The rules are (fields = key+value):

+ (add) → new fields in the desired state not present in the old state are added (overwriting any live state)
⌫ (remove) → fields existing in the old state that are not present in the desired state are removed (even if their value changed in the live state)
± (overwrite) → fields in the live state that are also present in the desired state but have a different value are updated (whatever the old state)
∅ (ignore) → the rest is left unchanged (e.g. new fields in the live state stay)

Moreover, the patches do merge operations, meaning maps are deep-merged (vs completely replaced). This is a huge improvement from Helm 2. Among others, it means that in Helm 3:

it is possible to add a sidecar container to a deployment manually, or a new data entry in a ConfigMap. If they are not managed by Helm (no fields in the generated manifests about it), they will stay unchanged after upgrade/rollback;
if you modify fields managed by Helm manually, doing a rollback will effectively reset the fields to the Helm values.

Simple example
Let's say you install a deployment with Helm with the following labels (manifest stripped for readability):

apiVersion: apps/v1
kind: Deployment
# ...
spec:
  # ...
  template:
    # ...
    metadata:
      labels:
        label-1: install
        label-2: install
        label-3: install
    # ...

Now, you change the labels manually to:

labels:
  label-1: manual
  label-2: manual
  label-3: manual
  new-one: added   # also add one

And finally do a Helm upgrade, with the new values being:

labels:
  label-1: install  # no change
  label-2: upgrade  # change
                    # delete
  label-4: upgrade  # add

The result will be:

labels:
  label-1: install  # overwritten
  label-2: upgrade  # overwritten
                    # deleted
  label-4: upgrade  # added
  new-one: added    # ignored/kept
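The four rules and the label example above can be replayed with a few lines of Python. This is a deliberately simplified model working on flat dictionaries; Helm itself builds Kubernetes strategic merge patches on whole manifests, and the function name three_way_merge is only illustrative.

```python
def three_way_merge(old: dict, desired: dict, live: dict) -> dict:
    """Toy model of the add / remove / overwrite / ignore rules on a flat map."""
    # ignore: start from the live state, so unmanaged fields are kept
    result = dict(live)
    # remove: fields of the old state that are absent from the desired state
    for key in old:
        if key not in desired:
            result.pop(key, None)
    # add + overwrite: every field of the desired state wins over the live value
    result.update(desired)
    return result


old = {"label-1": "install", "label-2": "install", "label-3": "install"}
live = {"label-1": "manual", "label-2": "manual", "label-3": "manual", "new-one": "added"}
desired = {"label-1": "install", "label-2": "upgrade", "label-4": "upgrade"}

merged = three_way_merge(old, desired, live)
print(merged)
```

The printed map contains label-1: install, label-2: upgrade, label-4: upgrade and new-one: added, and no label-3, matching the result shown above.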

Helmfile: sync vs apply

Now that we understand how helm upgrades work, let's dive into helmfile sync vs apply.

The helmfile sync command will run helm upgrade on all releases. This means all releases will have their revision incremented by one. However, as Helm does three-way strategic merges, if there is no change between the live and desired state, no patch will actually be applied: there is just a new helm-release secret created.

The helmfile apply command will run helm upgrade only when there are changes.
To detect changes, helmfile uses the helm-diff plugin. For a long time, helm-diff only computed the difference between the old and desired states; it didn't look at the live state (similar to Helm 2). If something changed outside of Helm, helm-diff would report "no change", and the release would not be upgraded.

helm-diff added support for three-way merge diffs in v3.3.0 (January 10, 2022). As the helm-diff process inherits its environment from the helmfile process, it is now possible to run:

# use three-way merge strategy for diffing
HELM_DIFF_THREE_WAY_MERGE=true helmfile apply

This way, helmfile apply can detect and revert manual changes as well.

In other words, apply has the advantage of not creating useless new revisions, but doesn't guarantee the coherency of the live state, as manual changes may go undetected unless the helm-diff plugin is properly configured. sync is exactly the opposite: it always creates new revisions for all releases, but will detect and undo any manual change that happened on Helm-managed fields.
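This trade-off can be made concrete with a toy simulation. It is only a model under simplifying assumptions: states are flat dictionaries, the functions sync and apply are illustrative stand-ins (helmfile really shells out to helm upgrade and helm-diff), and the three_way_diff flag plays the role of HELM_DIFF_THREE_WAY_MERGE=true.

```python
def merge(old: dict, desired: dict, live: dict) -> dict:
    """Simplified three-way merge on flat maps."""
    result = {k: v for k, v in live.items() if not (k in old and k not in desired)}
    result.update(desired)
    return result


def sync(revision, old, desired, live):
    """Model of `helmfile sync`: always upgrade, so always a new revision."""
    return revision + 1, merge(old, desired, live)


def apply(revision, old, desired, live, three_way_diff=False):
    """Model of `helmfile apply`: upgrade only if the diff reports a change."""
    if three_way_diff:
        changed = merge(old, desired, live) != live  # the live state is inspected
    else:
        changed = old != desired                     # the live state is never looked at
    if not changed:
        return revision, live
    return revision + 1, merge(old, desired, live)


# The chart did not change, but someone scaled the deployment manually
old = desired = {"replicas": 2}
live = {"replicas": 5}

after_sync = sync(1, old, desired, live)
after_apply = apply(1, old, desired, live)
after_apply_3way = apply(1, old, desired, live, three_way_diff=True)
print(after_sync, after_apply, after_apply_3way)
```

In this model, sync and the three-way apply both bump the revision and restore replicas to 2, while the default apply leaves revision 1 and the manual value of 5 untouched.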

sync or apply is thus down to a trade-off, which is alleviated if you decide to never change anything manually (or always use three-way merge diffs). If you stick with this best practice (or always export the HELM_DIFF_THREE_WAY_MERGE environment variable), apply is always the way to go.

Written with ❤ by derlin
