The Ops Community ⚙️

Mandi Walls for PagerDuty Community

PagerDuty Garage Walkthrough: Sending On Call Info to Slack, Part 2

In Part 1 of this post series, I walked through some basic calls to collect the current on-call responder for each schedule in a PagerDuty account. That solution might not cover everything you could be looking for when querying the active on-call responders in your organization. Another potential solution uses a different API endpoint, the /oncalls endpoint.

You’ll find this code at GitHub.

How /oncalls is Different

In Part 1, we started with the /schedules endpoint, since here at PagerDuty each team sets up a schedule for use in all of their services. That means in our account there is (mostly) a 1:1 relationship between an active team and the schedule that includes the members of that team. Each of those teams also has a dedicated channel in our Slack workspace. So our solution is very much tied to our organizational setup. If that fits for your org, too, great! If not, we can take a look at another option.

The /oncalls endpoint returns all of the current responders in your organization, in every rule of every escalation policy. This will likely be a much larger dataset than what is available via /schedules, since an escalation policy can include schedules or individuals in any rule.

That also means that the data returned might be more complicated. Escalation policy rules can have more than one individual included:

A PagerDuty escalation policy showing two individual users on call at the same level.

Those individuals might be assigned via round robin:

A PagerDuty escalation policy showing two individual users on call at the same level but assigned round robin.

The escalation policy can have a schedule:

A PagerDuty escalation policy showing a user assigned on call based on a schedule.

Or even a mix of both users and schedules:

A PagerDuty escalation policy showing a user assigned on call directly and one assigned via an included schedule.

Teams find different ways that any of these combinations can be helpful, and any combination can be applied to up to 9 escalation rules in a single escalation policy.

So there’s a lot more going on here than what we saw with /schedules. It’s a more complete, though more complicated, picture of what is going on in your PagerDuty account.

Let’s dig in!

Prerequisites and Libraries

  • An API key for the PagerDuty REST API (this is different from an Events API key). If you’re only interested in the on-call for a single team, you could use a personal key. If you want to look across multiple teams in an account, you’ll want an account-wide key, and you’ll need admin permissions to create one.
  • A webhook or target to send information to Slack. YMMV on how you’d like to integrate with Slack. You have a couple of different options - I’m following along with the instructions here as a prerequisite for this setup, and I’m using the same endpoint I used for my other post. ;)

In Python

I’m going to use Python 3.9 for this example. I’ve not tested all of the bits with earlier versions of Python, so there might be some things that don’t quite work.

The PagerDuty API library for Python is called pdpyras, and I’ll be using that to make requests to the API. You can also use the requests library, if that’s more to your liking. Some of the data structures will be different from what’s below, but aren’t too wild. The requests library is still necessary for sending information to the Slack webhook.

Other packages I’m using are json for reading and creating JSON objects and os for pulling keys out of the running shell environment (you can use a vault of some sort instead).
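If you'd like the script to fail with a readable message when a key is missing from the environment, a small helper along these lines could work. This is only a sketch: require_env is a hypothetical name that is not part of the original script, and PD_API_KEY and SLACK_CHANNEL_URL are the variable names used in the setup code later in this post.

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of an environment variable, or exit with a clear message.

    require_env is a hypothetical helper, not part of the original script.
    """
    value = environ.get(name)
    if not value:
        raise SystemExit("Missing required environment variable: " + name)
    return value


# Example with an explicit mapping, so nothing here depends on your real shell:
fake_env = {"PD_API_KEY": "example-token"}
print(require_env("PD_API_KEY", fake_env))
```

In the real script you would call require_env("PD_API_KEY") and require_env("SLACK_CHANNEL_URL") without the second argument, so the values come from the running shell.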

What We’ll Get

As in Part 1, we’ll send a string of output from the PagerDuty API to a webhook in Slack. This will be processed as a single message in a single channel. The output will include the name of the escalation policy, a link to the policy in the web UI, and the responders currently on-call at each level of the policy.

The Highlights

  1. Request all the on-calls in the account using the /oncalls endpoint
  2. From each on-call in the response, pull some important information and ignore other stuff
  3. Build a data structure to manage all the data. We’ll get into why this step is helpful.
  4. Then we’ll wrap it all up in a structure we can send to Slack!

The Code

Set Up

First we’ll import libraries, read some keys from the environment, set up the API session, and initialize the Slack message blocks.

import json
import os
import requests
from pdpyras import APISession

api_token = os.environ['PD_API_KEY']

session = APISession(api_token)

slack_url = os.environ['SLACK_CHANNEL_URL']

blocks = []
header_block = {
    "type": "header",
    "text": {
        "type": "plain_text",
        "text": "Oncall Now:"
    }
}
blocks.append(header_block)

# all_eps is going to capture just the escalation policy id, summary, and on-calls at each rule
all_eps = {}

Request the On Calls

We only need to query one endpoint for this script, /oncalls.

# request the on-calls for every active escalation policy;
# list_all follows pagination, while a plain rget returns only the first page of results
all_oncalls = session.list_all("oncalls")

Walk the Data

Now pull apart the returned data and pick out the bits of info we want to display. There’s some information in the raw return that we don’t necessarily want for this use case. Each current responder in your organization will have an object in the returned data. These objects include information like the escalation policy the user is referenced in, whether they are configured as a user or are part of a schedule, and what rule level they are included in.
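To make that shape concrete, here is a trimmed sketch of what one on-call entry might look like, along with the fields this walkthrough reads from it. The IDs, names, and URL are invented for illustration, describe is a hypothetical helper, and a real response carries more fields than shown here.

```python
# A trimmed, hypothetical on-call entry; every value below is made up.
sample_oncall = {
    "escalation_policy": {
        "id": "PABC123",
        "summary": "Example Escalation Policy",
        "html_url": "https://example.pagerduty.com/escalation_policies/PABC123",
    },
    "escalation_level": 1,
    "schedule": None,  # None when the user is listed directly on the rule
    "user": {"id": "PUSER01", "summary": "Example Responder"},
}


def describe(oncall):
    """Return a one-line summary of the fields this walkthrough cares about."""
    source = "schedule" if oncall.get("schedule") else "direct assignment"
    return "{} / level {} / {} ({})".format(
        oncall["escalation_policy"]["summary"],
        oncall["escalation_level"],
        oncall["user"]["summary"],
        source,
    )


print(describe(sample_oncall))
```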

We’ll organize the data by escalation policy in the all_eps data structure.

When there is more than one responder in a given escalation rule - like in the first example above - they are treated as two on-call objects, and they are not necessarily grouped together in the data returned from the API. So to make sure I don’t miss someone, I’m looking for the escalation policy id. If I have already seen this id, add data to its structure. If I haven’t seen this id, create a new entry in the all_eps structure.

The same applies in the substructure, where the rule levels appear. There might be up to 9 rules, and any rule might have more than one responder. The eventual structure is:

{ "ID":
    {
    "summary": "Name of the Escalation Policy",
    "html_url": "URL of the EP in the Web UI",
    "levels": {
        1: {"people": "person1, person2"},
        2: {"people": "person3"},
        …
    }
    }
}

Where there is more than one person included as a responder at a level, I’m concatenating them to the string value for that level. I could create another array here, but I don’t really need it for the output I’m going to create later. I’m also not paying attention to whether the user was included as an individual or via a schedule. For this example, that’s not important, but it’s there in the API data.

So this code creates the all_eps structure from the API data:

for oncall in all_oncalls:
    ep_id = oncall['escalation_policy']['id']

    # first time we've seen this escalation policy: create its entry
    if ep_id not in all_eps:
        all_eps[ep_id] = {
            "summary": oncall['escalation_policy']['summary'],
            "html_url": oncall['escalation_policy']['html_url'],
            "levels": {}
        }
    esc_level = oncall['escalation_level']
    if esc_level not in all_eps[ep_id]['levels']:
        all_eps[ep_id]['levels'][esc_level] = {"people": oncall['user']['summary']}
    else:
        all_eps[ep_id]['levels'][esc_level]["people"] += ", " + oncall['user']['summary']
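If you do want to keep each responder separate, or track whether they came from a schedule, a variation along these lines might help. It is a sketch rather than the code from this post: group_oncalls is a hypothetical function name, the sample entries are invented, and it assumes each entry has the escalation_policy, escalation_level, user, and schedule fields discussed above.

```python
from collections import defaultdict


def group_oncalls(oncalls):
    """Group on-call entries by escalation policy id, then by escalation level.

    Each level maps to a list of (responder name, "schedule" or "direct") tuples.
    """
    grouped = {}
    for oncall in oncalls:
        ep = oncall["escalation_policy"]
        entry = grouped.setdefault(ep["id"], {
            "summary": ep["summary"],
            "html_url": ep["html_url"],
            "levels": defaultdict(list),
        })
        via = "schedule" if oncall.get("schedule") else "direct"
        entry["levels"][oncall["escalation_level"]].append(
            (oncall["user"]["summary"], via)
        )
    return grouped


# Invented sample data: two responders at level 1, one at level 2.
example_ep = {"id": "PEP0001", "summary": "Example EP", "html_url": "https://example.invalid/ep"}
sample = [
    {"escalation_policy": example_ep, "escalation_level": 1,
     "user": {"summary": "Ana"}, "schedule": None},
    {"escalation_policy": example_ep, "escalation_level": 1,
     "user": {"summary": "Ben"}, "schedule": {"id": "PSCHED1", "summary": "Primary"}},
    {"escalation_policy": example_ep, "escalation_level": 2,
     "user": {"summary": "Cy"}, "schedule": None},
]

grouped = group_oncalls(sample)
print(dict(grouped["PEP0001"]["levels"]))
```

Keeping a list per level means the formatting decision (comma-separated names, or marking schedule-based responders) can be made later, when the message is built.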

Build the Message Blocks for Slack

Now I have all the on-call responders, grouped together via their escalation policies and organized by rule level. I can use all_eps to create my message blocks. Each escalation policy serves as a key.

These messages have a bit more fancy formatting than the messages in Part 1. I’m including the link to the escalation policy in the web UI as part of the message, and then using bold text to highlight each escalation rule level. Each escalation policy will have a single line of output included in the message.

for ep in all_eps:
    # build the message block for this escalation policy
    msg_str = "*<{}|{}>* ".format(all_eps[ep]['html_url'], all_eps[ep]['summary'])
    for level in sorted(all_eps[ep]['levels'].keys()):
        msg_str = msg_str + "| *Level {}*: {} ".format(level, all_eps[ep]['levels'][level]['people'])
    msg_block = {
        "type": "section",
        "text": {
            "type": "mrkdwn",
            "text": msg_str
        }
    }
    blocks.append(msg_block)

Send to Slack

Now the message is built: each escalation policy is a single “block”, which corresponds to one line in the Slack message. Use json.dumps() to turn the blocks into JSON for Slack, and send the output using requests.post().

# build the json payload
payload = {
    "blocks": blocks
}
j_payload = json.dumps(payload)

# create and send the request to the slack webhook url
slack_headers = {"Content-Type": "application/json"}
sent_msg = requests.post(slack_url, headers=slack_headers, data=j_payload)
sent_msg.raise_for_status()
print(sent_msg.text)
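One thing to watch in larger accounts: Slack documents a limit of 50 blocks per message, and this script adds one block per escalation policy plus the header. If your account has more policies than that, one possible approach is to split the blocks across several messages. The sketch below assumes that limit applies to your webhook; chunk_blocks and build_payloads are hypothetical helpers that are not part of the original script.

```python
import json

MAX_BLOCKS = 50  # assumed per-message block limit, per Slack's Block Kit docs


def chunk_blocks(blocks, size=MAX_BLOCKS):
    """Yield consecutive slices of the block list, each at most `size` long."""
    for start in range(0, len(blocks), size):
        yield blocks[start:start + size]


def build_payloads(blocks, size=MAX_BLOCKS):
    """Return one JSON payload string per chunk of blocks."""
    return [json.dumps({"blocks": chunk}) for chunk in chunk_blocks(blocks, size)]


# Invented example: 120 placeholder blocks become three payloads.
dummy_blocks = [
    {"type": "section", "text": {"type": "mrkdwn", "text": "policy {}".format(i)}}
    for i in range(120)
]
payloads = build_payloads(dummy_blocks)
print(len(payloads))
```

Each string in payloads could then be sent with its own requests.post call to the webhook URL, in order.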

Sample Output

My dev account is pretty simple, and since there’s a limit on the number of responders available in dev accounts on PagerDuty, it looks pretty repetitive, but you can see how the escalation policies are organized with each level rule.

Slack channel message with on-call listings for the 9 escalation policies in this account.

Contrast that to the sample output for the use case in Part 1:

Slack channel message with on-call listings for 5 schedules in this account.

In this account, there are more escalation policies than there are schedules, and there are escalation rules that include individuals, so the Part 1 use case only covers part of the active on-call responders across the entire account.

Next Steps

Now that we’re well-versed in the schedule and escalation_policy objects in the API, the last post in this series will tackle the information from the other side: we’ll take a look at what is needed to find the current on-call responders for a specific service in a PagerDuty account.
