The Ops Community ⚙️

Kai Walter


Create a disposable Linux Azure VM with PowerShell, Bicep and cloud-init - running rootless dockerd and containerd

TL;DR

This post covers:

  1. checking the current Azure subscription with Azure PowerShell
  2. setting parameters for a Bicep template deployment with PowerShell
  3. cleaning up known_hosts after VM redeployment with PowerShell
  4. determining the FQDN of the newly deployed VM with Azure PowerShell
  5. SSH-ing into the newly created VM directly with a specified key
  6. creating/updating the Resource Group from within Bicep (subscription scope), with the VM in a separate Bicep module
  7. setting customData (cloud-init.txt passed from PowerShell) in Bicep
  8. configuring automatic VM shutdown with a Microsoft.DevTestLab resource

Motivation

In a recent post I already mentioned that I often do remote SSH development from Visual Studio Code on Windows into Linux Azure VMs, in cases where

  • build structure and performance requirements prevent me from using local WSL or GitHub Codespaces
  • I want to have an unobstructed Linux inner dev loop

So for these cases I work with disposable VMs, which I can bring up fast and tear down again once a development stage is done or when I need to rewind & redeploy to be absolutely sure that the intended dependencies are in place.

I strive to keep everything reproducible for future me and for other people.

In this post I want to share the whole scripting setup I use.

Prerequisites on Windows

  • Azure PowerShell
  • Bicep
  • OpenSSH

PowerShell script to drive the process

Although Bicep handles the bulk of resource creation I still use PowerShell to do some of the housekeeping activities around the creation process.

I will present the script in 5 parts to give each explanation better context; in reality these parts are of course assembled into one script.

Parameters

The parameter section controls the overall input to the creation process. The {...} placeholders stand for parameter defaults, which I have prefilled with the most common values for my use cases.

[CmdletBinding()]
param (
   [string] $tenantId = "{azure-tenant-id-for-vm}",
   [string] $subscriptionId = "{azure-subscription-id-for-vm}",
   [string] $computerName = "{vm-name}",
   [string] $resourceGroupName = "{azure-resource-group-for-vm}",
   [string] $vmSize = "Standard_B8ms",
   [string] $location = "{azure-region-for-vm}",
   [string] $userName = "{vm-admin-user}",
   [string] $sshPrivKeyFilename = "{name-of-ssh-private-key-file-in-home/.ssh-folder}",
   [string] $sshPubKeyFilename = "{name-of-ssh-public-key-file-in-home/.ssh-folder}",
   [string] $cloudInitFilename = "cloud-init.txt"
)


Especially for the parameters -computerName and -resourceGroupName I often work with 2 sets of inputs. That way I can delete the resource group/VM of a failed validation while already creating a new resource group/VM with the fixes.

Reading configuration file contents

SSH public key file content and cloud-init are loaded into variables to be passed to Bicep deployment later.

$privKeyFilename = Join-Path $HOME ".ssh" $sshPrivKeyFilename -Resolve

$pubKeyFilename = Join-Path $HOME ".ssh" $sshPubKeyFilename -Resolve
$pubKey = [System.IO.File]::ReadAllText($pubKeyFilename)

$cloudInitFilename = Join-Path $PSScriptRoot $cloudInitFilename -Resolve
$cloudInitText = [System.IO.File]::ReadAllText($cloudInitFilename)

I use the -Resolve capability of Join-Path to implicitly check whether a file exists. If the file does not exist, the script already breaks here.

Function Remove-SshKnownHost

Once the VM deployment has succeeded but the cloud-init process is still in progress, I want to immediately "ssh" into the VM to follow the cloud-init process. When a VM is recreated, the former known_hosts entries are invalid, hence I use this function to clean them up before "ssh-ing" into the new VM.

function Remove-SshKnownHost {
   param (
      [Parameter(Position = 1, Mandatory = $true)]
      [string]
      $ComputerName
   )

   $knownhosts = Join-Path $HOME ".ssh" "known_hosts"

   if (Test-Path $knownhosts -PathType Leaf) {

      $contents = Get-Content $knownhosts -Raw

      if ($contents) {

         # keep the line ending style the file already uses
         if ($contents -match "^[^\n]+\r\n") {
            $splitter = "\r\n"
            $joiner = "`r`n"
         }
         else {
            $splitter = "\n"
            $joiner = "`n"
         }

         # keep only entries that do not start with the VM name
         # (the name is escaped so that it is matched literally)
         $pattern = "^" + [regex]::Escape($ComputerName)
         $listIn = @([regex]::Split($contents, $splitter) | Where-Object { $_ -ne "" })
         $listOut = @($listIn | Where-Object { $_ -notmatch $pattern })

         if ($listOut.Count -ne $listIn.Count) {
            Write-Host "removed" $($listIn.Count - $listOut.Count) "lines"
            $listOut -join $joiner | Set-Content $knownhosts -NoNewline
         }
      }
      else {
         throw "file $knownhosts has no content"
      }

   }
   else {
      throw "file $knownhosts not found"
   }

}

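For readers who drive the deployment from a Linux, macOS or WSL shell instead of PowerShell, the same clean-up idea can be sketched in bash. This is only a minimal sketch under assumptions: remove_known_host is a hypothetical helper name that is not part of the script above, and it drops every known_hosts line that starts with the given name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original PowerShell script):
# removes all known_hosts entries whose line starts with the given name.
remove_known_host() {
  local name="$1"
  local file="${2:-$HOME/.ssh/known_hosts}"

  if [ ! -f "$file" ]; then
    echo "file $file not found" >&2
    return 1
  fi

  local before after tmp
  before=$(wc -l < "$file")
  tmp=$(mktemp)

  # index() compares literally, so the name needs no regex escaping;
  # a result of 1 means the line starts with the name and is dropped
  awk -v name="$name" 'index($0, name) != 1' "$file" > "$tmp"

  # write back in place to keep the permissions of the original file
  cat "$tmp" > "$file"
  rm -f "$tmp"

  after=$(wc -l < "$file")
  echo "removed $((before - after)) lines"
}

# usage (assumed VM name): remove_known_host my-devvm
```

When the exact host name is known, OpenSSH's own ssh-keygen -R <hostname> removes the matching entries as well; the prefix match above is only useful when the name is known but the full FQDN is not.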

Deployment

The main part of the script makes sure that it is operating in the desired subscription and then initiates the Bicep deployment.

$subscription = Get-AzSubscription -TenantId $tenantId `
   -SubscriptionId $subscriptionId `
   -ErrorAction SilentlyContinue

if (!$subscription) {
   Connect-AzAccount -TenantId $tenantId -UseDeviceAuthentication
   $subscription = Get-AzSubscription -TenantId $tenantId `
      -SubscriptionId $subscriptionId 
}

Select-AzSubscription -SubscriptionObject $subscription

Write-Host "creating Resource Group" $resourceGroupName

$deploymentName = $resourceGroupName + "-rg"
$templateName = Join-Path $PSScriptRoot "rg.bicep"

$parameters = @{ }
$parameters['location'] = $location
$parameters['resourceGroupName'] = $resourceGroupName
$parameters['computerName'] = $computerName
$parameters['vmSize'] = $vmSize
$parameters['adminUsername'] = $userName
$parameters['adminPasswordOrKey'] = $pubKey
$parameters['customData'] = $cloudInitText

New-AzDeployment -Name $deploymentName `
   -TemplateFile $templateName `
   -Location $location `
   -TemplateParameterObject $parameters `
   -Verbose

Observe cloud-init

As mentioned above, known_hosts is cleaned up and an SSH session is started to follow the cloud-init progress in /var/log/cloud-init-output.log. The SSH session is established directly with the private key file and the FQDN of the VM, so no entry in the $HOME\.ssh\config file is required.

Remove-SshKnownHost -ComputerName $computerName

$nicId = (Get-AzVM -ResourceGroupName $resourceGroupName -Name $computerName).NetworkProfile.NetworkInterfaces[0].Id
$publicIpId = (Get-AzNetworkInterface -ResourceId $nicId).IpConfigurations[0].PublicIpAddress.Id
$fqdn = (Get-AzResource -ResourceId $publicIpId | Get-AzPublicIpAddress).DnsSettings.Fqdn

do {
   Start-Sleep -Seconds 1
   ssh -i $privKeyFilename $userName@$($fqdn) tail -f /var/log/cloud-init-output.log
} until ($?)

The Get-AzResource | Get-AzPublicIpAddress combination is required, as the Azure PowerShell cmdlet Get-AzPublicIpAddress does not seem to support a -ResourceId parameter directly.
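As a cross-check for the FQDN lookup: in the public Azure cloud, the DNS name of a public IP address follows the pattern <label>.<region>.cloudapp.azure.com, and the vm.bicep template in this post sets the domainNameLabel to the computer name. The following bash sketch composes that name locally. It rests on assumptions: azure_public_fqdn is a hypothetical helper, the region has to be the short region name (for example westeurope), and sovereign clouds use other domain suffixes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: composes the expected FQDN of a public IP in the
# public Azure cloud from its DNS label and the short region name.
azure_public_fqdn() {
  local label="$1"
  local region="$2"
  # DNS names are lowercase, so normalize the composed name
  printf '%s.%s.cloudapp.azure.com\n' "$label" "$region" | tr '[:upper:]' '[:lower:]'
}

# usage (assumed values): azure_public_fqdn my-devvm westeurope
```

Querying the deployed resources, as the script does, remains the more reliable approach, because it returns whatever Azure actually assigned.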

The process is finished when tail shows something like:

...
NEEDRESTART-VER: 3.5
NEEDRESTART-KCUR: 5.15.0-1017-azure
NEEDRESTART-KEXP: 5.15.0-1019-azure
NEEDRESTART-KSTA: 3
NEEDRESTART-SVC: packagekit.service
NEEDRESTART-SVC: udisks2.service
NEEDRESTART-SVC: unattended-upgrades.service
Cloud-init v. 22.2-0ubuntu1~22.04.3 finished at Thu, 01 Sep 2022 06:57:15 +0000. Datasource DataSourceAzure [seed=/dev/sr0].  Up 680.07 seconds

Sometimes tail does not manage to follow the progress of cloud-init and the screen freezes; then I just Ctrl-C the SSH session, ssh in again and check the progress with tail /var/log/cloud-init-output.log.
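Instead of watching the log manually, completion can also be checked with a small bash function on the VM. This is a sketch under assumptions: cloud_init_finished is a hypothetical helper, and it relies on the log containing a "Cloud-init v. ... finished at" line like the one in the sample output of this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds once the cloud-init output log contains
# the "finished at" line that cloud-init writes at the end of a run.
cloud_init_finished() {
  local log="${1:-/var/log/cloud-init-output.log}"
  grep -Eq '^Cloud-init v\. .+ finished at ' "$log"
}

# usage on the VM:
#   until cloud_init_finished; do sleep 5; done; echo "cloud-init is done"
```

cloud-init also ships its own command for this: cloud-init status --wait blocks until the run has completed and then prints the final status.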

Template rg.bicep

For deploying the disposable resource group I also use Bicep, which has a nice feature: the deployment scope can be switched from resource group to subscription level in order to perform such resource group operations. The VM is then created in the scope of the resource group by another Bicep module:

targetScope = 'subscription' // Resource group must be deployed under 'subscription' scope

param location string
param resourceGroupName string
param computerName string
param vmSize string = 'Standard_DS1_v2'
param adminUsername string = 'admin'
@secure()
param adminPasswordOrKey string
param customData string

resource rg 'Microsoft.Resources/resourceGroups@2021-01-01' = {
  name: resourceGroupName
  location: location
}

module vm 'vm.bicep' = {
  name: 'vm'
  scope: rg
  params: {
    location: location
    computerName: computerName
    vmSize: vmSize
    adminUsername: adminUsername
    adminPasswordOrKey: adminPasswordOrKey
    customData: customData
  }
}

Template vm.bicep

Only essential parameters are passed from PowerShell via rg.bicep to the VM creation. Assumptions and values that do not change that often I just keep in variables.

param location string = resourceGroup().location
param computerName string
param vmSize string = 'Standard_DS1_v2'

param adminUsername string = 'admin'
@secure()
param adminPasswordOrKey string
param customData string = 'echo customData'

var authenticationType = 'sshPublicKey'
var vmImagePublisher = 'canonical'
var vmImageOffer = '0001-com-ubuntu-server-jammy'
var vmImageSku = '22_04-lts-gen2'
var vnetAddressPrefix = '192.168.43.0/27'

var vmPublicIPAddressName = '${computerName}-ip'
var vmVnetName = '${computerName}-vnet'
var vmNsgName = '${computerName}-nsg'
var vmNicName = '${computerName}-nic'
var vmDiagnosticStorageAccountName = '${replace(computerName, '-', '')}${uniqueString(resourceGroup().id)}'

var shutdownTime = '2200'
var shutdownTimeZone = 'W. Europe Standard Time'

var linuxConfiguration = {
  disablePasswordAuthentication: true
  ssh: {
    publicKeys: [
      {
        path: '/home/${adminUsername}/.ssh/authorized_keys'
        keyData: adminPasswordOrKey
      }
    ]
  }
}

var resourceTags = {
  vmName: computerName
}

resource vmNsg 'Microsoft.Network/networkSecurityGroups@2022-01-01' = {
  name: vmNsgName
  location: location
  tags: resourceTags
  properties: {
    securityRules: [
      {
        name: 'in-SSH'
        properties: {
          protocol: 'Tcp'
          sourcePortRange: '*'
          destinationPortRange: '22'
          sourceAddressPrefix: '*'
          destinationAddressPrefix: '*'
          access: 'Allow'
          priority: 1000
          direction: 'Inbound'
        }
      }
    ]
  }
}

resource vmVnet 'Microsoft.Network/virtualNetworks@2022-01-01' = {
  name: vmVnetName
  location: location
  tags: resourceTags
  properties: {
    addressSpace: {
      addressPrefixes: [
        vnetAddressPrefix
      ]
    }
  }
}

resource vmSubnet 'Microsoft.Network/virtualNetworks/subnets@2022-01-01' = {
  name: 'vm'
  parent: vmVnet
  properties: {
    addressPrefix: vnetAddressPrefix
    networkSecurityGroup: {
      id: vmNsg.id
    }
  }
}

resource vmDiagnosticStorage 'Microsoft.Storage/storageAccounts@2019-06-01' = {
  name: vmDiagnosticStorageAccountName
  location: location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'Storage'
  tags: resourceTags
  properties: {}
}

resource vmPublicIP 'Microsoft.Network/publicIPAddresses@2019-11-01' = {
  name: vmPublicIPAddressName
  location: location
  tags: resourceTags
  properties: {
    publicIPAllocationMethod: 'Dynamic'
    dnsSettings: {
      domainNameLabel: computerName
    }
  }
}

resource vmNic 'Microsoft.Network/networkInterfaces@2022-01-01' = {
  name: vmNicName
  location: location
  tags: resourceTags
  properties: {
    ipConfigurations: [
      {
        name: 'ipconfig1'
        properties: {
          privateIPAllocationMethod: 'Dynamic'
          publicIPAddress: {
            id: vmPublicIP.id
          }
          subnet: {
            id: vmSubnet.id
          }
        }
      }
    ]
  }
}

resource vm 'Microsoft.Compute/virtualMachines@2022-03-01' = {
  name: computerName
  location: location
  tags: resourceTags
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    hardwareProfile: {
      vmSize: vmSize
    }
    storageProfile: {
      imageReference: {
        publisher: vmImagePublisher
        offer: vmImageOffer
        sku: vmImageSku
        version: 'latest'
      }
      osDisk: {
        createOption: 'FromImage'
        diskSizeGB: 1024
      }
    }
    osProfile: {
      computerName: computerName
      adminUsername: adminUsername
      adminPassword: adminPasswordOrKey
      customData: base64(customData)
      linuxConfiguration: ((authenticationType == 'password') ? json('null') : linuxConfiguration)
    }
    networkProfile: {
      networkInterfaces: [
        {
          id: vmNic.id
        }
      ]
    }
    diagnosticsProfile: {
      bootDiagnostics: {
        enabled: true
        storageUri: vmDiagnosticStorage.properties.primaryEndpoints.blob
      }
    }
  }
}

resource vmShutdown 'Microsoft.DevTestLab/schedules@2018-09-15' = {
  name: 'shutdown-computevm-${computerName}'
  location: location
  tags: resourceTags
  properties: {
    status: 'Enabled'
    taskType: 'ComputeVmShutdownTask'
    dailyRecurrence: {
      time: shutdownTime
    }
    timeZoneId: shutdownTimeZone
    notificationSettings: {
      status: 'Disabled'
    }
    targetResourceId: vm.id
  }
}

Important side note: automatic VM shutdown with Microsoft.DevTestLab only works in Azure environments where this service is available; e.g. Azure China does not have this service, so other means like Azure Automation need to be used there.

cloud-init

In my previous post on cloud-init and user space configuration for the Rust toolchain I already explored some of the quirks of applying user-level configuration during the cloud-init process.

I applied that knowledge to additionally bring up rootless dockerd and containerd

  • for some tinkering I want to do with Rust, WASM, WASI and a new microservice abstraction framework called SpiderLightning (slight)
  • to prepare myself for the move from Docker to whatever can replace Docker for my use cases

The essence again is how much user space needs to be prepared during cloud-init, in order to get the setup scripts running. It's not straightforward, not well documented and needs some trial and error.

#cloud-config
package_upgrade: true
packages:
- apt-transport-https
- build-essential
- cmake
- libssl-dev
- openssl
- unzip
- pkg-config
- jq
- uidmap
- dbus-user-session
write_files:
  - path: /tmp/install-containerd-tools.sh
    content: | 
      #!/bin/bash
      mkdir -p ~/.local/bin
      tar -C ~/.local/bin/ -xzf /tmp/nerdctl.tar.gz --strip-components 1 bin/nerdctl
      tar -C ~/.local/bin/ -xzf /tmp/nerdctl.tar.gz --strip-components 1 --wildcards bin/containerd-rootless*
      tar -C ~/.local -xzf /tmp/nerdctl.tar.gz libexec
      echo 'export CNI_PATH=~/.local/libexec/cni' >> ~/.bashrc
      echo 'preliminary env setup ----------'
      export PATH=$PATH:~/.local/bin
      env
      echo 'preliminary systemctl setup ----------'
      export XDG_RUNTIME_DIR=$(loginctl show-user "$(id -un)" -P RuntimePath)
      export DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/$(id -u)/bus
      ~/.local/bin/containerd-rootless-setuptool.sh install
      dockerd-rootless-setuptool.sh install
      echo 'export DOCKER_HOST=unix:///run/user/1000/docker.sock' >> ~/.bashrc
    permissions: '0755'  
runcmd:
- export USER=$(awk -v uid=1000 -F":" '{ if($3==uid){print $1} }' /etc/passwd)

- export RUSTUP_HOME=/opt/rust
- export CARGO_HOME=/opt/rust
- curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | bash -s -- -y --no-modify-path --default-toolchain stable --profile default
- echo '\n\n# added by cloud init\nsource /opt/rust/env' >> /home/$USER/.profile
- sudo -H -u $USER bash -c 'source /opt/rust/env && rustup default stable'

- curl -fsSL https://get.docker.com -o get-docker.sh
- sudo sh get-docker.sh
- wget -q -O /usr/libexec/docker/cli-plugins/docker-buildx $(curl -s https://api.github.com/repos/docker/buildx/releases/latest | jq -r ".assets[] | select(.name | test(\"linux-amd64\")) | .browser_download_url")
- chmod u+x /usr/libexec/docker/cli-plugins/docker-buildx

- wget -q -O /tmp/nerdctl.tar.gz $(curl -s https://api.github.com/repos/containerd/nerdctl/releases/latest | jq -r ".assets[] | select(.name | test(\"full.*linux-amd64\")) | .browser_download_url")
- loginctl enable-linger $USER
- sudo -H -u $USER bash -c '/tmp/install-containerd-tools.sh'

- curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
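The install script needs a few per-user session variables before the rootless setup tools can talk to the user's systemd instance. The following bash sketch shows how these values relate to the user id. It is not part of the cloud-init file above: print_user_session_env is a hypothetical helper, and it assumes the systemd default runtime directory /run/user/<uid>, which is also where rootless Docker places its socket.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the session variables a rootless
# dockerd/containerd setup relies on, derived from a user id.
# Assumes the systemd default runtime directory /run/user/<uid>.
print_user_session_env() {
  local uid="${1:-$(id -u)}"
  echo "export XDG_RUNTIME_DIR=/run/user/${uid}"
  echo "export DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/${uid}/bus"
  echo "export DOCKER_HOST=unix:///run/user/${uid}/docker.sock"
}

# usage: eval "$(print_user_session_env)"
```

Deriving the values from id -u avoids hard-coding uid 1000, which only holds as long as the admin user is the first regular user created on the VM.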

Result

Now I can ssh into the VM and use Docker and containerd rootless:

user@my-devvm:~$ docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
user@my-devvm:~$ docker buildx ls
NAME/NODE  DRIVER/ENDPOINT STATUS  BUILDKIT PLATFORMS
default *  docker
  default  default         running 20.10.17 linux/amd64, linux/386
rootless   docker
  rootless rootless        running 20.10.17 linux/amd64, linux/386
user@my-devvm:~$ nerdctl ps
CONTAINER ID    IMAGE    COMMAND    CREATED    STATUS    PORTS    NAMES
user@my-devvm:~$ rustc -V
rustc 1.63.0 (4b91a6ea7 2022-08-08)
user@my-devvm:~$ cargo -V
cargo 1.63.0 (fd9c4297c 2022-07-01)

Clean up

To clean up, a brief

Remove-AzResourceGroup -Name {azure-resource-group-for-vm} -Force

is required.
