Erik Lundevall Zara

Posted on May 3, 2023 • Edited on Mar 23, 2024 • Originally published at cloudgnosis.org

Tidy Cloud AWS #47 - Infrastructure as SQL

#cloudops #aws #terraform #productivity

Hello all!

Welcome to the next issue of the Tidy Cloud AWS newsletter! In this issue, we will have a quick look at a very nice tool called Steampipe.

Enjoy!

Steampipe intro

Steampipe is a neat open source tool from Turbot, which allows you to treat many types of data as databases to query, using SQL. This includes resources from cloud providers, SaaS solutions, and much more.

You can treat your AWS environment as a database, your GitHub, local configuration files, your Terraform code, Kubernetes clusters, and much more. All of it can be queried using SQL.

On macOS, you can install steampipe using Homebrew for example:

brew install steampipe

For Linux and Windows (using WSL - Windows Subsystem for Linux) there is an install script to download and run.

sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)"

You cannot do that much though with just steampipe you also need plugins. There are plenty of different plugins, I am going to include a few here (AWS, AWS CloudFormation, Terraform, Config):

steampipe plugin install aws
steampipe plugin install awscfn
steampipe plugin install terraform
steampipe plugin install config

One way of running steampipe is to run the command steampipe query, which sets you up with a query prompt. There are a few meta commands you can use there, for example:

> steampipe query
Welcome to Steampipe v0.19.4
For more information, type .help

>.tables terraform
+-----------------------+------------------------------------+
| table                 | description                        |
+-----------------------+------------------------------------+
| terraform_data_source | Terraform data source information. |
| terraform_local       | Terraform local information.       |
| terraform_module      | Terraform module information.      |
| terraform_output      | Terraform output information.      |
| terraform_provider    | Terraform provider information.    |
| terraform_resource    | Terraform resource information.    |
+-----------------------+------------------------------------+



> .tables config
+----------------+---------------------------------------------------------------------+
| table          | description                                                         |
+----------------+---------------------------------------------------------------------+
| ini_key_value  | Table representation of an INI file.                                |
| ini_section    | Retrieves a list of sections and subsections defined in an INI file. |
| json_file      | Represents the JSON file content.                                   |
| json_key_value | List all key-value pairs from given JSON file.                      |
| yml_file       | Represents the YML file content into JSON format.                   |
| yml_key_value  | List all key-value pairs from given YML file.                       |
+----------------+---------------------------------------------------------------------+

>.tables 
 ==> aws
+--------------------------------------------------------------+--------------------------------------------------------------------+
| table                                                        | description                                                        |
+--------------------------------------------------------------+--------------------------------------------------------------------+
| aws_accessanalyzer_analyzer                                  | AWS Access Analyzer                                                |
| aws_account                                                  | AWS Account                                                        |
| aws_account_alternate_contact                                | AWS Account Alternate Contact                                      |
| aws_account_contact                                          | AWS Account Contact                                                |
| aws_acm_certificate                                          | AWS ACM Certificate                                                |
| aws_amplify_app                                              | AWS Amplify App                                                    |
| aws_api_gateway_api_key                                      | AWS API Gateway API Key                                            |

.
.
.

You can also use .inspect to look at the table definitions:

> .inspect aws_vpc
+---------------------------------+---------+----------------------------------------------------------------------------------+
| column                          | type    | description                                                                      |
+---------------------------------+---------+----------------------------------------------------------------------------------+
| _ctx                            | jsonb   | Steampipe context in JSON form, e.g. connection_name.                            |
| account_id                      | text    | The AWS Account ID in which the resource is located.                             |
| akas                            | jsonb   | Array of globally unique identifier strings (also known as) for the resource.    |
| arn                             | text    | The Amazon Resource Name (ARN) specifying the vpc.                               |
| cidr_block                      | cidr    | The primary IPv4 CIDR block for the VPC.                                         |
| cidr_block_association_set      | jsonb   | Information about the IPv4 CIDR blocks associated with the VPC.                  |
| dhcp_options_id                 | text    | Contains the ID of the set of DHCP options, associated with the VPC.             |
| instance_tenancy                | text    | The allowed tenancy of instances launched into the VPC.                          |
| ipv6_cidr_block_association_set | jsonb   | Information about the IPv6 CIDR blocks associated with the VPC.                  |
| is_default                      | boolean | Indicates whether the VPC is the default VPC.                                    |
| owner_id                        | text    | Contains ID of the AWS account that owns the VPC.                                |
| partition                       | text    | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). |
| region                          | text    | The AWS Region in which the resource is located.                                 |
| state                           | text    | Contains the current state of the VPC.                                           |
| tags                            | jsonb   | A map of tags for the resource.                                                  |
| tags_src                        | jsonb   | A list of tags that are attached with the VPC.                                   |
| title                           | text    | Title of the resource.                                                           |
| vpc_id                          | text    | The ID of the VPC.                                                               |
+---------------------------------+---------+----------------------------------------------------------------------------------+

I am simply going to go through a few examples to show what you can do with steampipe.

Steampipe vs AWS CLI

Before steampipe, my go-to-tool for checking various resource information was the AWS CLI. You can run the CLI tool to get much information about your AWS resource. You can also do a lot of filtering to get the specific data you want. A problem is though that the query language that AWS uses is JMESPath. It is quite capable, but the syntax is not something I use daily and not that intuitive to me. I struggle a fair amount when trying to use it.

Steampipe uses plain SQL - PostgreSQL. I might not use that daily either, but often enough that the syntax comes more naturally than JMESpath.

For example, if I want to check what VPCs are in an account and the Name and Environment tags for these, it is fairly straightforward:

> select vpc_id, cidr_block, tags -> 'Name' as vpc_name, tags -> 'Environment' as env from aws_vpc;

+-----------------------+---------------+---------------------------------------------------------------+---------------+
| vpc_id                | cidr_block    | vpc_name                                                      | env           |
+-----------------------+---------------+---------------------------------------------------------------+---------------+
| vpc-0e09b81bff5b1f4b1 | 10.100.0.0/16 | "base-infrastructure-pipeline/dev/base-network-main/base-vpc" | "Development" |
+-----------------------+---------------+---------------------------------------------------------------+---------------+

Find my way in config files

I have a directory with several git repositories, which all use YAML files for configuration data. I have used elz in some places in the config files I have locally to identify some own testing resources of mine. It is possible to use grep, but I lose some context in that case.

> select count(*) from yml_file
+-------+
| count |
+-------+
| 43    |
+-------+

> select key_path, value, start_line, path from yml_key_value where value like '%elz%'
+------------------------------------------+-----------+------------+---------------------------------------------------------------------------------------------+
| key_path                                 | value     | start_line | path                                                                                        |
+------------------------------------------+-----------+------------+---------------------------------------------------------------------------------------------+
| monitoring.notification_channels.0.topic | elz-tests | 100        | /Users/erikl/Documents/xxxx/repos/cicd-customer-testwork/configs/xxxx-x/dev.yaml |
+------------------------------------------+-----------+------------+---------------------------------------------------------------------------------------------+

Terraform provider versions

I want to check the Terraform provider versions stated in a bunch of terraform files in a directory structure. I do not want it to print out the whole path for the file locations. Also, I do not want it to include version information in modules in any .terraform directory.

> select name, version, replace(path, '/Users/erikl/Documents/YYYY/tfrepos/aws-infra', '...') as subpath from terraform_provider where version is not NULL and path not like '%.terraform%'
+------------+---------+--------------------------------------------------+
| name       | version | subpath                                          |
+------------+---------+--------------------------------------------------+
| kubernetes | 1.13.2  | .../yyyy-infra/main/main.tf                  |
| aws        | >= 3.8  | .../yyyy-service-backend-prod/main/main.tf   |
| aws        | >= 3.8  | .../yyyy-service-backend-prod/main/main.tf   |
| kubernetes | 1.13.2  | .../yyyy-service-backend-prod/main/main.tf   |
| aws        | >= 3.8  | .../yyyy-service-backend-prod/public_apis/main.tf |
| kubernetes | 1.13.2  | .../yyyy-test/main/main.tf                   |
| aws        | ~> 2.27 | .../root/main.tf                                 |
+------------+---------+--------------------------------------------------+

Steampipe service

Instead of running queries on the command-line, you can also use your favourite PostgreSQL tool to do the queries. Instead of using steampipe query to run queries directly through the command line, you can start it as a database engine with.

steampipe service start

You can then connect to it using any tool that can use PostgreSQL and do your queries and reports. This includes steampipe query as well.

Do not forget to stop the service when you are done with it, otherwise you may not get what you expect if you try to run steampipe query later. Been there, done that.

Final words

There are much more to cover around steampipe, but I do not want to keep this too long. Check it out on the Steampipe website for more!

You can find older newsletters and more at Tidy Cloud AWS. You will also find other useful articles around AWS automation and infrastructure-as-software.

Until next time,

/Erik

The Ops Community ⚙️