Javier Marasco

Posted on Aug 31, 2022

How to auto-document your deployments

#python #tutorials #devops

A bit of background

For most of my infrastructure deployments (if not all) I use a combination of Terraform and Terragrunt (I could write an article on it if you like, let me know in the comments), which relies on a configuration file to know what the automation should build.
You can think of it as a pipeline that reads a config file (JSON), extracts the information from it and passes it to Terraform/Terragrunt to run the build.

For each automation I create, I write documentation on how to use it, what each part does, where every piece of code lives and how to extend the functionality if needed. This is very important for my team (so everyone can use it even when I am not around), but it also serves as a guideline for other automations I will build in the future, so I can reuse some patterns.

Functional documentation vs build documentation

Once I have documented how the solution works (functional documentation), there is another kind of documentation that is far less common: a place to store the details of what my automation actually built. As you can imagine, this gets harder to track as more runs are done on the pipeline (remember, anyone on my team can now run it after reading my documentation).

Approach (what if.....)

A few days ago I woke up with an idea and started coding it right away (yeah... at about 6 am on a Monday...). I was excited to test whether the idea was actually possible and how it would work out in reality.
The idea was apparently simple: "Every time the pipeline runs, it will read the config files (JSON files), extract information from them, format it nicely in HTML and publish that somewhere". That sounds easy, so let's see what else we can add.

Folders, files and configs

OK, I wrote a very simple folder structure to simulate an imaginary repository. It looks like this:

/folder/app1/config.json
/folder/app2/config.json
/folder/app3/config.json

Each config.json contains this structure, with different values:

{
    "Name": "My resource in one",
    "UUID": "232131232-13eqwesd2-12321w-dsdasda",
    "ConnectionString": "My-connection-one.com:2020",
    "IPs": [
        "10.1.1.2",
        "23.2.2.12",
        "192.233.123.3"
    ]
}

This is a very simple configuration file; imagine it being sent as input to a Terraform module that would build something for you.
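Since the rest of the script depends on these keys, a small check can catch a malformed file early. This is only a minimal sketch: the function name validate_config and the list of required keys are my assumptions, taken from the sample file above.

```python
import json

# Keys present in the sample config.json above; adjust to your own schema (assumption)
REQUIRED_KEYS = ("Name", "UUID", "ConnectionString", "IPs")


def validate_config(raw_text):
    """Parse a config.json string and return it as a dict, or raise ValueError."""
    config = json.loads(raw_text)  # invalid JSON raises a ValueError subclass
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    if not isinstance(config["IPs"], list):
        raise ValueError("IPs must be a list")
    return config


sample = '''
{
    "Name": "My resource in one",
    "UUID": "232131232-13eqwesd2-12321w-dsdasda",
    "ConnectionString": "My-connection-one.com:2020",
    "IPs": ["10.1.1.2", "23.2.2.12", "192.233.123.3"]
}
'''
config = validate_config(sample)
print(config["Name"], len(config["IPs"]))
```

Failing early like this gives the pipeline a clear error message instead of a KeyError halfway through building the table.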

Code approach

I will be writing this in Python. First we define a basic HTML structure and a variable that will hold our dynamic content.

import os
import json

base_template = '''
<body>
    <table style="border: 1px solid black;">
        <tr>
            <th style="border: 1px solid black;">Name</th>
            <th style="border: 1px solid black;">Connection string</th>
            <th style="border: 1px solid black;">IPs</th>
        </tr>
        {tr}
    </table>
</body>
'''

temporal_tr = ''

base_template contains an HTML table with a template placeholder {tr}, which will later be replaced by the content of temporal_tr.
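If you have not used this kind of placeholder before, here is a tiny illustration of how Python's str.format fills it. The template and row values are made up for the example. One thing to watch for: str.format treats every curly brace as a placeholder, so literal braces (for instance in a style block) have to be doubled.

```python
# A reduced version of the template, just to show the substitution
template = '<table>{tr}</table>'
rows = '<tr><td>app1</td></tr>'

page = template.format(tr=rows)
print(page)

# Literal braces must be written as {{ and }} so format() leaves them alone
styled = '<style>td {{ border: 1px solid black; }}</style>{tr}'.format(tr=rows)
print(styled)
```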

Then we need to iterate over every .json file in the folder structure that holds our configuration files (the script expects it in a configs folder next to the script itself).

config_root = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'configs')
for root, subdirectories, files in os.walk(config_root):
    for file in files:
        # Skip anything that is not a JSON file
        if not file.endswith('.json'):
            continue
        # root already holds the full path of the current folder
        config_file = os.path.join(root, file)
        print(config_file)
        with open(config_file, 'r') as read_json:
            content_json = json.load(read_json)
        # One table row per config file, one cell per field
        temporal_tr += '<tr>'
        temporal_tr += '<td style="border: 1px solid black;">'
        temporal_tr += content_json['Name']
        temporal_tr += '</td>'
        temporal_tr += '<td style="border: 1px solid black;">'
        temporal_tr += content_json['ConnectionString']
        temporal_tr += '</td>'
        temporal_tr += '<td style="border: 1px solid black;">'
        temporal_tr += ' '.join(content_json['IPs'])
        temporal_tr += '</td>'
        temporal_tr += '</tr>'

Keep in mind this is VERY hardcoded to match the config files I am using in this test; it definitely needs adjusting to match your case, or to perform better.
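One way to make it less hardcoded is to describe the columns once and build each row from that description. The sketch below is not what I run in the test above. The names COLUMNS, CELL_STYLE and build_row are hypothetical, and it also escapes the values with html.escape so a stray "<" in a config file cannot break the table.

```python
import html

# (header shown in the table, key in config.json); these pairs mirror the sample config
COLUMNS = [("Name", "Name"), ("Connection string", "ConnectionString"), ("IPs", "IPs")]
CELL_STYLE = 'style="border: 1px solid black;"'


def build_row(config):
    """Return one <tr> for a parsed config dict, escaping every value."""
    cells = []
    for _header, key in COLUMNS:
        value = config.get(key, "")
        # Lists (like IPs) are shown as a space-separated string
        if isinstance(value, list):
            value = " ".join(str(item) for item in value)
        cells.append("<td %s>%s</td>" % (CELL_STYLE, html.escape(str(value))))
    return "<tr>" + "".join(cells) + "</tr>"


row = build_row({"Name": "a<b", "ConnectionString": "host:2020", "IPs": ["10.1.1.2", "23.2.2.12"]})
print(row)
```

Adding a column then means adding one pair to COLUMNS instead of three more lines of string concatenation.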

Lastly, we replace the {tr} placeholder in the template with the rows we built and save the result as index.html.

path_html = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'index.html')
# Fill the {tr} placeholder and write the finished page in a single step
text = base_template.format(tr=temporal_tr)
with open(path_html, 'w') as writer:
    writer.write(text)

At this point, if we execute this file, it will create (or update) an index.html file with the table and the dynamic content extracted from our files.

You can try adding more config files (following the folder structure) and run the script again to see the changes in the file.

Just one more thing

Why stop here with just an index.html file holding our content? Why not do something more? Since Confluence is so widely used and has a free tier, I opened a space there and added functionality to update a page in a particular Confluence space with the content of the index.html file.

Before we can start this section, you need to complete a few steps:

Go to Confluence and open a free cloud account
Log in and create an API token for your account
Create an empty page and retrieve the ID of the page

How does that work? I wrote a simple function that does this for you:

Gets the current version of your page
Increases the version by 1
Creates the content of the page based on our previous work
Publishes the content to the Confluence page

def publish_to_confluence(user,token,url,page_id,title,content):
    import requests
    # Ask Confluence for the current version number of the page
    confluence_page_version_url = '%s?expand=version' % (url)
    response = requests.get(url=confluence_page_version_url, auth=(user,token))
    response.raise_for_status()
    page_current_version = response.json()['version']['number']
    # Build the payload as a dict so requests takes care of the JSON escaping
    body = {
        'id': page_id,
        'type': 'page',
        'title': title,
        'body': {'storage': {'value': content, 'representation': 'storage'}},
        'version': {'number': page_current_version + 1},
    }
    update_page_content = requests.put(url=url, auth=(user,token), json=body)
    update_page_content.raise_for_status()

At this point your Confluence page is updated, and it will be updated again every time your pipeline runs.
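I did not spell out what the url argument looks like. Judging by the ?expand=version suffix the function appends, it is the REST address of the page itself. As a hedged sketch, assuming a Confluence Cloud site and the v1 content API path (check Atlassian's current REST documentation for your site), it could be built like this. The helper name and the placeholder values are hypothetical.

```python
def confluence_page_url(site_name, page_id):
    """Return the REST URL of one page, assuming the Confluence Cloud v1 content API path."""
    return "https://%s.atlassian.net/wiki/rest/api/content/%s" % (site_name, page_id)


# "example-team" and "123456" are placeholder values, not real ones
url = confluence_page_url("example-team", "123456")
print(url)
```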

We have a small problem here: you will eventually end up with a lot of versions, so we can do some cleanup and retain only the last X versions, removing anything older. For that I have this other function:

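def clean_old_versions(user,token,url,threshold):
    import requests
    versions = requests.get(url='%s/version' % (url), auth=(user,token))
    versions.raise_for_status()
    # Version numbers in the response, oldest first (only the first page of results is read)
    numbers = sorted(version['number'] for version in versions.json()['results'])
    # Keep the newest `threshold` versions, and always at least the current one
    keep = max(threshold, 1)
    old_numbers = numbers[:max(len(numbers) - keep, 0)]
    # Delete from the highest old number down, so the numbers still to be deleted
    # stay valid even if Confluence renumbers later versions after a deletion
    for number in reversed(old_numbers):
        deleted = requests.delete(url='%s/version/%s' % (url, number), auth=(user,token))
        deleted.raise_for_status()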
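To make the retention rule concrete, here is the selection on its own, without any network calls. It is a sketch of the idea "keep the newest X, remove the rest, oldest versions only". The helper name versions_to_delete is hypothetical, and the numbers are made up.

```python
def versions_to_delete(numbers, threshold):
    """Return the version numbers to remove, highest first, keeping the newest `threshold`."""
    keep = max(threshold, 1)  # never select every version
    ordered = sorted(numbers)
    old = ordered[:max(len(ordered) - keep, 0)]
    # Highest first, so earlier numbers are unaffected by any renumbering
    return list(reversed(old))


# Ten versions, keep the newest five
print(versions_to_delete(range(1, 11), 5))  # prints [5, 4, 3, 2, 1]
```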

What if I don't want to use confluence but I still want to show this to others?

Well... there is another possibility. Azure storage accounts have a feature called static websites: basically, they provide a container where you can upload an index.html file and expose it to the world using a well-known Azure DNS FQDN (you can later configure your DNS to use a friendlier name if you want).

I just created a storage account in Azure and enabled "Static website". This gives us a URL (https://javilabsstaticpage.z6.web.core.windows.net/ in my case)
and also creates a $web container where we store our static page. You need to specify the page you will be serving and, optionally, an error page; I called mine index.html, so that is the file you should upload into this container.

I will use the Python SDK for this, so we keep the same language and can integrate all of this in a single big script.

You will need to install the storage package of the Azure SDK (we authenticate with a connection string, so nothing else is required).

pip install azure-storage-blob

Once it is installed, you can run this code:

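def publish_to_storageaccount(account_conn_str,content_file_location):
    from azure.storage.blob import BlobServiceClient, ContentSettings
    container = '$web'
    service_client = BlobServiceClient.from_connection_string(account_conn_str)
    my_blob = service_client.get_blob_client(container=container, blob='index.html')
    with open(content_file_location, "rb") as blob:
        # overwrite=True replaces any previous index.html, so no separate delete is needed
        # (a delete_blob call would fail on the first run, when the blob does not exist yet).
        # The text/html content type lets browsers render the page instead of downloading it.
        my_blob.upload_blob(blob, overwrite=True,
                            content_settings=ContentSettings(content_type='text/html'))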

With this you are uploading the HTML file you created previously (the same one used to update your Confluence page) into the $web container.

This way, the information is also stored in cheap storage that can be accessed from the internet.

And how does all this work with a pipeline?

Well, at this point you have all the code needed: use any CI/CD tool you know and execute this script, pointing it at the place where your configuration files are.
Every time the pipeline runs, you will update your Confluence page and/or your storage account static page: no more manual updates, and a real reflection of your deployed resources in a handy place.
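Most CI/CD tools pass secrets to a script as environment variables, so one possible way to wire this up is to let the environment decide which publishers run. This is only a sketch under that assumption. The function name select_targets and the variable names are hypothetical, so use whatever your pipeline defines.

```python
import os


def select_targets(env):
    """Decide which publishers to run from environment variables (names are hypothetical)."""
    targets = []
    # Publish to Confluence only when all of its settings are present
    confluence_names = ("CONFLUENCE_USER", "CONFLUENCE_TOKEN", "CONFLUENCE_URL")
    if all(env.get(name) for name in confluence_names):
        targets.append("confluence")
    # Publish to the storage account only when a connection string is present
    if env.get("AZURE_STORAGE_CONNECTION_STRING"):
        targets.append("storage")
    return targets


if __name__ == "__main__":
    # In the real script, each selected target would call its publish function
    print(select_targets(os.environ))
```

Keeping the tokens out of the script this way also means the same file can run locally (publishing nothing) and in the pipeline (publishing everywhere) without changes.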

Wrap up

So in this article you saw how to make your pipeline update a document on each run. On every run, the documentation is refreshed with the list of what the pipeline created, and you don't need to do anything to keep it current.

I hope you find this article useful. Let me know if you have any questions, and if you find this interesting I would appreciate it if you shared it with others, left a heart or a comment, or followed me on any of my networks.

Thank you for reading!

The Ops Community ⚙️