Erik Lundevall Zara

Posted on Oct 25, 2022 • Edited on Feb 11, 2024 • Originally published at cloudgnosis.org

A tale of two tools - Pulumi and AWS CDK

#cloudops #aws #pulumi #awscdk

Two tools that have gained some adoption recently in the infrastructure-as-software space are Pulumi and AWS Cloud Development Kit (AWS CDK). In this article I compare the two and describe my experiences with them, both the good things and the bad things.

There is no clear “winner”. As is often the case, it depends on the circumstances. I will explain why I think something is good or bad. My intention is to give you some useful input for your own tool choices.

Let us get started!

Tool introduction

AWS CDK and Pulumi are roughly the same age, and both build on a similar idea: use general-purpose programming languages to describe cloud infrastructure, and manage that infrastructure through software and software engineering practices.

They are both responses to challenges with older tools. There are certainly several similarities, for example:

Both tools and their frameworks allow infrastructure to be declared using regular programming languages
Both tools support Typescript, Python, Java, C#, Go as programming languages to use.
Both tools are open source and available on GitHub
Testing support for the infrastructure is provided by both tools.
Both have command-line tools for deploying and managing infrastructure resources
Both have a service backend for managing infrastructure state (optional with Pulumi)
Both have a registry/website for packages to use

We can look at a few similarities with actual code examples.

Code examples

I have created five versions of essentially the same infrastructure, using both AWS CDK and Pulumi.

The solution comprises:

A VPC with subnets in two availability zones, public and private subnets, and a single NAT gateway.
An ECS Cluster. Containers in the cluster will be placed in the private subnets. Fargate is used, so no explicit servers to set up in the cluster.
A load balancer in the public subnets, which directs traffic to the containers
A local Dockerfile, which can build a docker image based on the Apache httpd web server.
The local Docker image is uploaded to Elastic Container Registry (ECR), and the containers running in the ECS cluster load the image from ECR.

The five versions implemented are:

Code in Python, using AWS CDK
Code in Python, using Pulumi and Pulumi Crosswalk for AWS
Code in Typescript, using AWS CDK
Code in Typescript, using Pulumi and Pulumi Crosswalk for AWS
Code in Typescript, using Pulumi and Pulumi CDK integration

The purpose is to show both how the solution can be represented in each tool, and what it looks like in different languages. Pulumi has its own higher-level component library for AWS, Crosswalk for AWS, but it can also leverage AWS CDK. Hence, there is also an implementation that uses AWS CDK code but provisions it with Pulumi.

Check for more comments after the code examples!

Dockerfile in the my-image directory

FROM httpd

Solution in Python, using AWS CDK

import os
from aws_cdk import ( App, CfnOutput, Environment, Stack )
from aws_cdk.aws_ec2 import ( Vpc )
from aws_cdk.aws_ecr_assets import DockerImageAsset
from aws_cdk.aws_ecs import (
    Cluster,
    ContainerImage,
    DeploymentCircuitBreaker,
    FargateTaskDefinition,
    LogDriver,
    PortMapping,
    Protocol
)
from aws_cdk.aws_ecs_patterns import ApplicationLoadBalancedFargateService
from aws_cdk.aws_logs import RetentionDays

# Set up a CDK app and a stack for our resources
app = App()
stack = Stack(app,
              'my-container-infrastructure',
              env=Environment(account=os.getenv('CDK_DEFAULT_ACCOUNT'),
                              region=os.getenv('CDK_DEFAULT_REGION')))

# A VPC to use
vpc = Vpc(stack, 'vpc', vpc_name='my-vpc', nat_gateways=1, max_azs=2)

# An ECS cluster
cluster = Cluster(stack, 'my-ecs-cluster', vpc=vpc)

WEBSERVER_PORT=80
taskdef = FargateTaskDefinition(stack,
                                'my-task-def',
                                cpu=512,
                                memory_limit_mib=1024)

image_asset = DockerImageAsset(stack, 'image-asset', directory='./my-image')
image = ContainerImage.from_docker_image_asset(image_asset)
containerdef = taskdef.add_container('my-container', image=image)
containerdef.add_port_mappings(PortMapping(container_port=WEBSERVER_PORT,
                                           protocol=Protocol.TCP))

lbservice = ApplicationLoadBalancedFargateService(
    stack,
    'loadbalanced-service',
    cluster=cluster,
    task_definition=taskdef,
    desired_count=2,
    service_name='my-service',
    circuit_breaker=DeploymentCircuitBreaker(rollback=True),
    public_load_balancer=True,
    listener_port=WEBSERVER_PORT)

CfnOutput(stack,
          'url',
          value=f'http://{lbservice.load_balancer.load_balancer_dns_name}')

app.synth()

Solution in Python, using Pulumi with Crosswalk for AWS

from pulumi import Config, Output, export
import pulumi_aws as aws
import pulumi_awsx as awsx

port = 80
cpu = 512
memory = 1024

vpc = awsx.ec2.Vpc(
    'my-vpc',
    number_of_availability_zones=2,
    nat_gateways=awsx.ec2.NatGatewayConfigurationArgs(
        strategy=awsx.ec2.NatGatewayStrategy.SINGLE))

# An ECS cluster
cluster = aws.ecs.Cluster('my-ecs-cluster')

# An ECR repository for the app image
repo = awsx.ecr.Repository('my-repo')

# Build and publish the image to ECR
image = awsx.ecr.Image(
    'image',
    repository_url=repo.url,
    path='./my-image')

lbsg = aws.ec2.SecurityGroup(
    'my-lb-sg',
    vpc_id=vpc.vpc_id,
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            from_port=port,
            to_port=port,
            protocol='tcp',
            cidr_blocks=['0.0.0.0/0'])
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            from_port=0,
            to_port=0,
            protocol='-1',
            cidr_blocks=['0.0.0.0/0'])
    ]
)

# An ALB for load balancing traffic from internet
loadbalancer = awsx.lb.ApplicationLoadBalancer(
    'my-lb',
    subnet_ids=vpc.public_subnet_ids,
    security_groups=[lbsg.id]
)

# Container definition for the ECS task
containerdef = awsx.ecs.TaskDefinitionContainerDefinitionArgs(
    image=image.image_uri,
    cpu=cpu,
    memory=memory,
    essential=True,
    port_mappings=[awsx.ecs.TaskDefinitionPortMappingArgs(
        container_port=port,
        target_group=loadbalancer.default_target_group
    )]
)

containersg = aws.ec2.SecurityGroup(
    'my-container-sg',
    vpc_id=vpc.vpc_id,
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            from_port=port,
            to_port=port,
            protocol='tcp',
            security_groups=[lbsg.id])
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            from_port=0,
            to_port=0,
            protocol='-1',
            cidr_blocks=['0.0.0.0/0'])
    ])

# Deploy an ECS Service on Fargate to host the application container
service = awsx.ecs.FargateService(
    'my-service',
    desired_count=2,
    cluster=cluster.arn,
    task_definition_args=awsx.ecs.FargateServiceTaskDefinitionArgs(
        container=containerdef),
    network_configuration=aws.ecs.ServiceNetworkConfigurationArgs(
        subnets=vpc.private_subnet_ids,
        security_groups=[containersg.id]
    ),
    deployment_circuit_breaker=aws.ecs.ServiceDeploymentCircuitBreakerArgs(
        enable=True,
        rollback=True))


# The endpoint URL
export("url", Output.concat("http://", loadbalancer.load_balancer.dns_name))

Solution in Typescript, using AWS CDK

import { App, CfnOutput, Stack } from 'aws-cdk-lib';
import { Vpc } from 'aws-cdk-lib/aws-ec2';
import { DockerImageAsset } from 'aws-cdk-lib/aws-ecr-assets';
import { Cluster, ContainerImage, FargateTaskDefinition, Protocol } from 'aws-cdk-lib/aws-ecs';
import { ApplicationLoadBalancedFargateService } from 'aws-cdk-lib/aws-ecs-patterns';

const app = new App();

const stack = new Stack(app, 'my-container-infrastructure', {
    env: {
        account: process.env.CDK_DEFAULT_ACCOUNT,
        region: process.env.CDK_DEFAULT_REGION,
    }
});

const vpc = new Vpc(stack, 'vpc', {
    vpcName: 'my-vpc',
    natGateways: 1,
    maxAzs: 2,
});

const cluster = new Cluster(stack, 'my-ecs-cluster', { vpc });

const webserverPort = 80;
const taskdef = new FargateTaskDefinition(stack, 'my-task-def', {
    cpu: 512,
    memoryLimitMiB: 1024,
});

const imageAsset = new DockerImageAsset(stack, 'image-asset', { directory: './my-image'});
const image = ContainerImage.fromDockerImageAsset(imageAsset);
const containerdef = taskdef.addContainer('my-container', { image });
containerdef.addPortMappings({
    containerPort: webserverPort,
    protocol: Protocol.TCP,
});

const lbservice = new ApplicationLoadBalancedFargateService(stack, 'loadbalanced-service', {
    cluster,
    taskDefinition: taskdef,
    desiredCount: 2,
    serviceName: 'my-service',
    circuitBreaker: { rollback: true },
    publicLoadBalancer: true,
    listenerPort: webserverPort,
});

new CfnOutput(stack, 'url', { value: `http://${lbservice.loadBalancer.loadBalancerDnsName}`});

Solution in Typescript, using Pulumi with Crosswalk for AWS

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
import * as awsx from "@pulumi/awsx";

const config = new pulumi.Config();
const port = 80;
const cpu = 512;
const memory = 1024;

const vpc = new awsx.ec2.Vpc('my-vpc', {
    numberOfAvailabilityZones: 2,
    natGateways: { strategy: awsx.ec2.NatGatewayStrategy.Single },
});

// An ECS cluster to deploy into
const cluster = new aws.ecs.Cluster('my-ecs-cluster');

// An ECR repository for the app image
const repo = new awsx.ecr.Repository('my-repo');

// Build and publish the image to ECR
const image = new awsx.ecr.Image('image', {
    repositoryUrl: repo.url,
    path: './my-image',
});

const lbsg = new aws.ec2.SecurityGroup('my-lb-sg', {
    vpcId: vpc.vpcId,
    ingress: [{
        fromPort: port,
        toPort: port,
        protocol: 'tcp',
        cidrBlocks: ['0.0.0.0/0'],
    }],
    egress: [{
        fromPort: 0,
        toPort: 0,
        protocol: '-1',
        cidrBlocks: ['0.0.0.0/0'],
    }],
});

// An ALB to serve the container endpoint to the internet
const loadbalancer = new awsx.lb.ApplicationLoadBalancer('my-lb', {
    subnetIds: vpc.publicSubnetIds,
    securityGroups: [lbsg.id],
});

const containersg = new aws.ec2.SecurityGroup('my-container-sg', {
    vpcId: vpc.vpcId,
    ingress: [{
        fromPort: port,
        toPort: port,
        protocol: 'tcp',
        securityGroups: [lbsg.id],
    }],
    egress: [{
        fromPort: 0,
        toPort: 0,
        protocol: '-1',
        cidrBlocks: ['0.0.0.0/0'],
    }],
});

// Deploy an ECS Service on Fargate to host the application container
const service = new awsx.ecs.FargateService('my-service', {
    desiredCount: 2,
    cluster: cluster.arn,
    taskDefinitionArgs: {
        container: {
            image: image.imageUri,
            cpu: cpu,
            memory: memory,
            essential: true,
            portMappings: [{
                containerPort: port,
                targetGroup: loadbalancer.defaultTargetGroup,
            }],
        }
    },
    networkConfiguration: {
        subnets: vpc.privateSubnetIds,
        securityGroups: [containersg.id],
    },
    deploymentCircuitBreaker: {
        enable: true,
        rollback: true,
    },
});

// The URL at which the container's HTTP endpoint will be available
export const url = pulumi.interpolate`http://${loadbalancer.loadBalancer.dnsName}`;

Solution in Typescript, using Pulumi with CDK integration

import * as pulumi from "@pulumi/pulumi";
import * as pulumicdk from '@pulumi/cdk';
import { Vpc } from 'aws-cdk-lib/aws-ec2';
import { DockerImageAsset} from 'aws-cdk-lib/aws-ecr-assets';
import { Cluster, ContainerImage, FargateTaskDefinition, Protocol } from 'aws-cdk-lib/aws-ecs';
import { ApplicationLoadBalancedFargateService } from 'aws-cdk-lib/aws-ecs-patterns'

class MyContainerInfrastructureStack extends pulumicdk.Stack {
    url: pulumi.Output<string>;

    constructor(id: string, options?: pulumicdk.StackOptions) {
        super(id, { ...options });

        const vpc = new Vpc(this, 'vpc', {
            vpcName: 'my-vpc',
            natGateways: 1,
            maxAzs: 2,
        });

        const cluster = new Cluster(this, 'my-ecs-cluster', { vpc });

        const webserverPort = 80;
        const taskdef = new FargateTaskDefinition(this, 'my-task-def', {
            cpu: 512,
            memoryLimitMiB: 1024,
        });

        const imageAsset = new DockerImageAsset(this, 'image-asset', { directory: './my-image'});
        const image = ContainerImage.fromDockerImageAsset(imageAsset);
        const containerdef = taskdef.addContainer('my-container', { image });
        containerdef.addPortMappings({
            containerPort: webserverPort,
            protocol: Protocol.TCP,
        });

        const lbservice = new ApplicationLoadBalancedFargateService(this, 'lb-service', {
            cluster,
            taskDefinition: taskdef,
            desiredCount: 2,
            serviceName: 'my-service',
            circuitBreaker: { rollback: true },
            publicLoadBalancer: true,
            listenerPort: webserverPort,
        });

        this.url = this.asOutput(`http://${lbservice.loadBalancer.loadBalancerDnsName}`);

        this.synth();
    }
}

const stack = new MyContainerInfrastructureStack('my-container-infrastructure');
export const url = stack.url;

Development environment note

All the examples here were developed using Visual Studio Code (VS Code), and for each type of solution, I had set up a VS Code devcontainer with the required software installed, adapted for each example. This included:

Node.js 16
Python 3.10
AWS CDK 2.46.0
Pulumi 3.43.1
Git CLI
Docker-from-docker feature in devcontainer
AWS CLI
Homebrew
Granted CLI

The devcontainer for the Python code was based on VS Code’s own Python devcontainer, and the devcontainer for the Typescript code was based on VS Code’s own Node.js devcontainer.

Software packages that were available as devcontainer features were installed that way.

Comments on code examples

ECR difference

For the Pulumi code, a named repository in ECR is used and the Docker image is uploaded there. For AWS CDK, this isn’t supported out of the box. Instead, it uploads the Docker image to ECR, but to a repository that AWS CDK has created itself and whose name you do not control. If you want a named repository, you pretty much need to copy the image from this CDK-internal repository into one of your own.

Languages

I think the Python development experience is better with AWS CDK as a library to define infrastructure. The Pulumi variant felt a bit more clunky to use.

For Typescript, the coding experience was fairly similar. The type system of Typescript works pretty well for this type of use case where you are mostly trying to generate a desired state definition, using more imperative mechanisms. I would say that part is roughly equal.

Typescript provides a better experience overall compared to Python when working with either tool, I think - in this context. For many other types of tasks, I would probably prefer Python, but when working with either AWS CDK or Pulumi, I think the Typescript language is nicer.

Documentation

The documentation is better overall for AWS CDK. Both AWS CDK and Pulumi have a few example code snippets for each type of resource, but the AWS CDK documentation is a bit more comprehensive there, and its API reference documentation is better than Pulumi's. For Pulumi, you also have to navigate around a bit more to find each piece of documentation, whereas the AWS CDK documentation is more grouped together.

The overall introduction and concepts documentation is a nicer experience with Pulumi. The higher level view from the Pulumi docs pages is nicer compared to the somewhat dry AWS docs.

Crosswalk vs AWS CDK

As for Crosswalk for AWS vs. the AWS CDK libraries, AWS CDK clearly covers much more than Crosswalk. The parts that Crosswalk covers are ok, although it does not have the same level of sane defaults as AWS CDK. This is one reason that the code with Crosswalk got longer. For example, the security groups for the load balancer and the container were not set up to allow ingress only on the web server port. I had to set that explicitly. AWS CDK handles this under the hood.

This is one reason I found the fifth solution alternative interesting here - Pulumi using its CDK integration. Pulumi has several interesting tool features that AWS CDK/CloudFormation is missing, while the component libraries and documentation are better on the AWS CDK side.

It is a potential combination to get the best of both worlds.

Deployment speed and feedback

I was keen to see if there were any significant differences in terms of deployment speed for AWS CDK and Pulumi, given that AWS CDK has a conversion step to CloudFormation, and that more of the execution is taking place locally with Pulumi.

The speed was roughly the same, about 4 minutes to deploy from scratch. The times to delete or update the infrastructure were also somewhat similar.

Since AWS CDK does not have an explicit preview/plan step when you deploy, and Pulumi has that, I did not include the time for the preview part for Pulumi in the measurements.
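If you want to repeat this kind of comparison yourself, here is a minimal sketch of one way to take the measurements. It is not the method used for the numbers above. The helper names are hypothetical and the command lines are my assumptions: they skip the interactive prompts, and `--skip-preview` keeps the Pulumi preview out of the timing.

```python
import subprocess
import time

# Assumed command lines (not from the measurements in this article).
# The flags avoid interactive prompts; --skip-preview leaves the Pulumi
# preview step out of the measured time.
CDK_DEPLOY = ["cdk", "deploy", "--require-approval", "never"]
PULUMI_UP = ["pulumi", "up", "--yes", "--skip-preview"]


def time_command(command: list[str]) -> float:
    """Run a command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return time.perf_counter() - start


def format_duration(seconds: float) -> str:
    """Format a number of seconds as minutes and seconds."""
    minutes, rest = divmod(int(round(seconds)), 60)
    return f"{minutes}m {rest:02d}s"
```

Running `format_duration(time_command(PULUMI_UP))` from the Pulumi project directory, and the same with `CDK_DEPLOY` from the CDK project directory, gives two comparable wall-clock figures.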

The feedback during deployment from the command line is definitely nicer with Pulumi. You get a better view of the dependencies and of the time each resource takes to provision. The AWS CDK feedback is ok, but it is harder to get a good understanding of the progress. I prefer Pulumi here, and I like that when I deploy from the command line, I get a preview of the changes by default.

With AWS CDK you need to run an explicit preview (cdk diff) to see the changes. However, cdk deploy will still show you security-related changes, i.e. IAM permission changes or security group changes, before applying them.

Note on resource count

Pulumi and AWS CDK count the number of resources differently. When you use AWS CDK, the number of resources reported equals the number of CloudFormation resources. For the examples above, that was 36 resources.

When you use Pulumi, the pure Pulumi examples count to 41 resources. Here, things like provider and the component resource that group other resources also count, it seems, so the number gets higher.

The highest count was for the Pulumi CDK case, which counted to 56 resources. The AWS CDK probably has a few more abstractions and groups of resources in its model compared to Crosswalk, which increases the resource count.

That could imply a higher cost if you use the Pulumi service, but on the other hand, the added value these component resources provide may be significantly higher.
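To see where such numbers come from, you can count the resources per type yourself. The sketch below is only an illustration with hypothetical helper names. It assumes the CloudFormation template that AWS CDK synthesizes (typically under cdk.out) and the JSON printed by `pulumi stack export`, where I assume the resources are listed under `deployment.resources`. The totals may not match the numbers printed by the CLIs exactly.

```python
import json
from collections import Counter
from pathlib import Path


def load_json(path: str) -> dict:
    """Load a JSON document such as a synthesized template or a stack export."""
    return json.loads(Path(path).read_text())


def count_cfn_resources(template: dict) -> Counter:
    """Count resources per type in a CloudFormation template."""
    resources = template.get("Resources", {})
    return Counter(resource["Type"] for resource in resources.values())


def count_pulumi_resources(stack_export: dict) -> Counter:
    """Count resources per type in a Pulumi stack export.

    Assumes the export keeps its resources under deployment.resources,
    each with a "type" field. Providers and component resources show up
    here as entries of their own.
    """
    resources = stack_export.get("deployment", {}).get("resources", [])
    return Counter(resource["type"] for resource in resources)
```

For example, `sum(count_cfn_resources(load_json(path)).values())` gives the total for one template, and the per-type counts show which entries on the Pulumi side are providers or component resources rather than actual AWS resources.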

Target audience differences

One thing that is not clear from code samples like these is the target audience. Who should use these tools?

The sweet spot for both tools is people with developer/software engineering skills who also handle infrastructure.

In that regard, the sweet spot for AWS CDK is also teams or organisations that work similarly to how AWS and Amazon work - service-oriented teams that handle everything around that service: application code, infrastructure, CI/CD pipelines, security, support - all of it.

If you can and want to describe and handle all of that with the same programming languages, then AWS CDK may be what you want, if you do (almost) everything in AWS.

If you do not do everything with AWS, and your teams are not responsible for everything themselves, then Pulumi might fit better. It covers more providers than just AWS, and integrates with and leverages a wider range of other tools.

If you don't like using programming languages, then AWS CDK isn't for you. Pulumi has YAML support, but I think the main benefit of that comes when you combine it with Pulumi components written in programming languages. So if some teams use programming languages and others use YAML configuration, then Pulumi can fit that.

A similar type of combination is possible with AWS CDK and CloudFormation.

Final notes

I have tried to cover a few aspects of AWS CDK and Pulumi to compare them. I really like both of them, and they both have their pain points.

There are many more things that could be said about these tools, but that would rather be a book than a single article. The Pulumi CDK integration is a topic I find very interesting to explore further.

I hope this article has provided you with some useful information. Comment, suggest improvements and discuss what you think about these tools.
