The Open-Source On-Call Integration


Mar 22, 2025 - 18:04

This guide walks through integrating Versus Incident with AWS Incident Manager to set up on-call escalation. With the integration in place, alerts are escalated automatically to on-call teams when an incident is not acknowledged within a specified time.

Prerequisites

Before you begin, ensure you have:

  • An AWS account with access to AWS Incident Manager.
  • Versus Incident deployed (instructions provided later).
  • Prometheus Alertmanager set up to monitor your systems.

Setting Up AWS Incident Manager for On-Call

AWS Incident Manager needs several components configured before it can manage on-call workflows. The example below uses six contacts, two teams, and a two-stage response plan. Use the AWS Console to set these up.

Contacts

Contacts are individuals who will be notified during an incident.

  1. In the AWS Console, navigate to Systems Manager > Incident Manager > Contacts.
  2. Click Create contact.
  3. For each contact:
       • Enter a Name (e.g., "Natsu Dragneel").
       • Add Contact methods (e.g., SMS: +1-555-123-4567, Email: natsu@devopsvn.tech).
       • Save the contact.

Repeat to create 6 contacts (e.g., Natsu, Zeref, Igneel, Gray, Gajeel, Laxus).
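
If you later script contact creation instead of using the console, note that the Incident Manager contacts API expects SMS and voice addresses in E.164 form (a leading "+" followed by digits only). The sketch below is a small, hypothetical helper (`to_e164` is not part of Versus or any AWS SDK) that normalizes a number written like the example above:

```python
import re

# E.164: "+" followed by up to 15 digits, the first of which is non-zero.
E164 = re.compile(r"\+[1-9]\d{1,14}")


def to_e164(raw: str) -> str:
    """Drop spaces, dashes, dots and parentheses, then validate as E.164."""
    cleaned = re.sub(r"[\s\-().]", "", raw)
    if not E164.fullmatch(cleaned):
        raise ValueError(f"not an E.164 phone number: {raw!r}")
    return cleaned


print(to_e164("+1-555-123-4567"))  # prints +15551234567
```

The helper only checks the format; it does not confirm that the number exists or can receive SMS.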

Escalation Plan

An escalation plan defines the order in which contacts are engaged.

  1. Go to Incident Manager > Escalation plans > Create escalation plan.
  2. Name it (e.g., TeamA_Escalation).
  3. Add contacts (e.g., Natsu, Zeref, and Igneel) and set them to engage simultaneously or sequentially.
  4. Save the plan.
  5. Create a second plan (e.g., TeamB_Escalation) for Gray, Gajeel, and Laxus.
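
The difference between simultaneous and sequential engagement can be sketched as data. The helper below builds a request body shaped like the ssm-contacts `CreateContact` call for an escalation plan, as I understand that API; verify the field names against the current AWS reference before using it. The function name, the region, and the contact ARNs are hypothetical, and nothing here calls AWS.

```python
import json


def escalation_plan(alias, stages):
    """Build an escalation-plan request body.

    stages: list of (duration_minutes, [contact_arn, ...]) tuples.
    Contacts in the same stage are engaged together; each later stage
    starts after the previous stage's duration has elapsed.
    """
    return {
        "Alias": alias,
        "Type": "ESCALATION",
        "Plan": {
            "Stages": [
                {
                    "DurationInMinutes": minutes,
                    "Targets": [
                        {"ContactTargetInfo": {"ContactId": arn, "IsEssential": False}}
                        for arn in arns
                    ],
                }
                for minutes, arns in stages
            ]
        },
    }


# Hypothetical contact ARNs for Team A.
base = "arn:aws:ssm-contacts:us-east-1:111122223333:contact/"
team_a = [base + name for name in ("natsu", "zeref", "igneel")]

# Simultaneous: all three contacts in a single stage.
simultaneous = escalation_plan("TeamA_Escalation", [(5, team_a)])

# Sequential: one contact per stage, five minutes apart.
sequential = escalation_plan("TeamA_Escalation", [(5, [arn]) for arn in team_a])

print(json.dumps(simultaneous, indent=2))
```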

RunBook (Optional)

RunBooks automate incident resolution steps. For this guide, we’ll skip RunBook creation, but you can define one in AWS Systems Manager Automation if needed.

Response Plan

A response plan ties contacts and escalation plans into a structured response.

  1. Go to Incident Manager > Response plans > Create response plan.
  2. Name it (e.g., CriticalIncidentResponse).
  3. Define two stages:
       • Stage 1: Engage TeamA_Escalation (Natsu, Zeref, and Igneel) with a 5-minute timeout.
       • Stage 2: If unacknowledged, engage TeamB_Escalation (Gray, Gajeel, and Laxus).
  4. Save the plan and note its ARN (e.g., arn:aws:ssm-incidents::111122223333:response-plan/CriticalIncidentResponse).
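
To make the two-stage behaviour concrete, here is a minimal model of the timeline described in the steps above. It is only an illustration of the intended logic (the `STAGES` table and `active_stage` function are hypothetical names), not how AWS Incident Manager is implemented.

```python
from typing import Optional

# (escalation plan, timeout in minutes); None means the final stage has no timeout.
STAGES = [("TeamA_Escalation", 5), ("TeamB_Escalation", None)]


def active_stage(elapsed: float, acked_at: Optional[float] = None) -> Optional[str]:
    """Return the plan engaged `elapsed` minutes after the incident starts.

    Returns None once the incident has been acknowledged.
    """
    if acked_at is not None and elapsed >= acked_at:
        return None
    start = 0.0
    for name, timeout in STAGES:
        if timeout is None or elapsed < start + timeout:
            return name
        start += timeout
    return None


print(active_stage(2))               # TeamA_Escalation
print(active_stage(5))               # TeamB_Escalation
print(active_stage(7, acked_at=3))   # None
```

An acknowledgement during Stage 1 means Team B is never engaged, which is the point of putting the timeout on the first stage.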

Define IAM Role for Versus

Versus needs permissions to interact with AWS Incident Manager.

  1. In the AWS Console, go to IAM > Roles > Create role.
  2. Choose AWS service as the trusted entity and select EC2 (or your deployment type, e.g., ECS).
  3. Attach a custom policy with these permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssm-incidents:StartIncident",
                "ssm-incidents:GetResponsePlan"
            ],
            "Resource": "*"
        }
    ]
}
  4. Name the role (e.g., VersusIncidentRole) and create it.
  5. Note the Role ARN (e.g., arn:aws:iam::111122223333:role/VersusIncidentRole).
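
The policy above grants both actions on every resource. If you prefer least privilege, one option is to scope it to the single response plan. The sketch below generates such a policy from the response plan ARN; it assumes both actions accept a response plan as their resource (check the AWS Service Authorization Reference for ssm-incidents) and that the ARN is in the standard `aws` partition. `scoped_policy` is a hypothetical helper name.

```python
import json
import re

# Matches ARNs shaped like the response plan example in this guide.
RESPONSE_PLAN_ARN = re.compile(
    r"^arn:aws:ssm-incidents::\d{12}:response-plan/[\w-]+$"
)


def scoped_policy(response_plan_arn: str) -> dict:
    """Return the same two-action policy, limited to one response plan."""
    if not RESPONSE_PLAN_ARN.match(response_plan_arn):
        raise ValueError(f"unexpected response plan ARN: {response_plan_arn!r}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ssm-incidents:StartIncident",
                    "ssm-incidents:GetResponsePlan",
                ],
                "Resource": response_plan_arn,
            }
        ],
    }


arn = "arn:aws:ssm-incidents::111122223333:response-plan/CriticalIncidentResponse"
print(json.dumps(scoped_policy(arn), indent=4))
```

Paste the printed JSON into the custom policy editor in place of the wildcard version.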

Deploy Versus Incident

Deploy Versus using Docker or Kubernetes.

Docker Deployment

Create a directory for your configuration files:

mkdir -p ./config

Create config/config.yaml with the following content:

name: versus
host: 0.0.0.0
port: 3000
public_host: https://your-ack-host.example

alert:
  debug_body: true

  slack:
    enable: true
    token: ${SLACK_TOKEN}
    channel_id: ${SLACK_CHANNEL_ID}
    template_path: "config/slack_message.tmpl"

oncall:
  enable: true
  wait_minutes: 3

  aws_incident_manager:
    response_plan_arn: ${AWS_INCIDENT_MANAGER_RESPONSE_PLAN_ARN}

redis: # Required for on-call functionality
  insecure_skip_verify: true # dev only
  host: ${REDIS_HOST}
  port: ${REDIS_PORT}
  password: ${REDIS_PASSWORD}
  db: 0
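
The configuration above relies on several `${VAR}` placeholders. Before starting the container it can help to confirm that each one has a value. The sketch below is a standalone pre-flight check, not part of Versus; how Versus itself resolves placeholders is not covered here, and `missing_vars` is a hypothetical helper name.

```python
import os
import re

# Matches placeholders such as ${SLACK_TOKEN}.
PLACEHOLDER = re.compile(r"\$\{([A-Z0-9_]+)\}")


def missing_vars(config_text: str, env: dict) -> list:
    """Return the placeholder names in config_text that are unset or empty in env."""
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if not env.get(name)]


sample = """
slack:
  token: ${SLACK_TOKEN}
redis:
  host: ${REDIS_HOST}
  port: ${REDIS_PORT}
"""

# In real use, read config/config.yaml and pass dict(os.environ) instead.
print(missing_vars(sample, {"SLACK_TOKEN": "xoxb-example", "REDIS_PORT": "6379"}))
# prints ['REDIS_HOST']
```

To check the real file, call `missing_vars(open("config/config.yaml").read(), dict(os.environ))` and stop the deployment if the returned list is not empty.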

Create the Slack template config/slack_message.tmpl: