Merge pull request #18 from DEFRA/handle-pager-duty-label
will not send alerts to pagerDuty if they lack pagerDuty=true label
christopherjturner authored Feb 6, 2025
2 parents bcf4e25 + 84875fc commit 1c71c7b
Showing 4 changed files with 188 additions and 7 deletions.
90 changes: 90 additions & 0 deletions README.md
@@ -1,5 +1,95 @@
# cdp-notify

How the pager duty integration works.

`cdp-notify` receives grafana alerts, the same as the email notification service.

Example grafana alert message:

```
{
  environment: 'prod',
  service: 'test-service',
  team: 'Platform',
  alertName: 'test-service - prod',
  status: 'firing',
  startsAt: '2024-11-04 12:53:20 +0000 UTC',
  endsAt: '2024-11-04 12:53:20 +0000 UTC',
  summary: 'A test suite',
  description: '',
  series: '',
  runbookUrl: '',
  alertURL: 'https://grafana/alerting/grafana/0000/view'
}
```

When processing an event the pager duty handler does the following:

1. Checks for the `pagerDuty` label.
If the alert does not contain the field `pagerDuty="true"` (note: `"true"` is a quoted string, NOT a boolean), no alert is sent.

2. Decides if the alert comes from an environment we want to handle.
This is done by comparing the `environment` field in the alert to either `alertEnvironments` from config (default: `[prod]`) or the environment field in `~/src/config/pagerduty-service-overides.js`, if an entry exists for that service.
If there is no match, the alert is not processed.

3. Find all pager-duty integration keys for the alert
This is done in several steps:

- First, find the team that owns the service.
- If there is an entry for the service in `~/src/config/pagerduty-service-overides.js` use the teams set there.
- If not, look up the teams for the service from cdp-portal-backend.
- For each team see if there is a config entry matching `pagerduty.teams.${team}.integrationKey`
- If no matches are found check the config for `pagerduty.services.${service}.integrationKey`
- If no matches are found the alert is not processed.

4. Check if sending alerts is enabled
If `pagerduty.sendAlerts` is set to false in config, no alerts will be sent.
5. Check if alert has a status of either `firing` or `resolved`.
The status is remapped from `firing` to `trigger` and `resolved` to `resolve`. Alerts with any other status are not sent.
6. Generate a deduplication key
This is an MD5 hash of the service, environment and alertURL fields.
7. Call the pager duty api
A payload is built using the integration key, alert details, status and deduplication key.
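
Steps 1, 5 and 6 above can be sketched roughly as follows. This is a minimal illustration rather than the service's actual code: the helper names `hasPagerDutyLabel`, `toEventAction` and `createDedupeKey` are hypothetical, and the way the three fields are joined before hashing is an assumption (the README only says the key is an MD5 hash of those fields).

```javascript
const { createHash } = require('node:crypto')

// Step 1: the label must be the string 'true'; a boolean true is rejected.
function hasPagerDutyLabel(alert) {
  return alert?.pagerDuty === 'true'
}

// Step 5: map Grafana statuses to PagerDuty event actions.
const eventActions = new Map([
  ['firing', 'trigger'],
  ['resolved', 'resolve']
])

// Returns null for any status that should not be forwarded.
function toEventAction(status) {
  return eventActions.get(status) ?? null
}

// Step 6: hash the fields that identify the alert.
// Assumption: the fields are simply concatenated before hashing.
function createDedupeKey(alert) {
  return createHash('md5')
    .update(`${alert.service}${alert.environment}${alert.alertURL}`)
    .digest('hex')
}
```

Because the status is not part of the hashed fields, a `firing` alert and its later `resolved` counterpart produce the same key, which is what lets PagerDuty match the resolve event to the incident it opened.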

## PagerDuty Config

### Adding a new tenant service to pager-duty

Find the team name as it appears in portal for the owner of the service.
Ensure the team is set up in pager-duty and an integration key is generated.
In cdp-notify, add a new entry under `pagerduty.teams` in `~/src/config/index.js`:

```
pagerduty: {
  teams: {
    'my-new-team': {
      integrationKey: {
        doc: 'Integration key for digital service',
        format: String,
        default: 'key',
        env: 'MY_NEW_TEAM_INTEGRATION_KEY'
      }
    }
  }
}

Add the team's pagerduty integration key as a secret to cdp-notify with an ID matching the `env` value set in the new config entry (e.g. `MY_NEW_TEAM_INTEGRATION_KEY`).
Redeploy the service.
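
To illustrate how a team entry like this is used, here is a rough sketch of the key lookup described earlier (team keys first, then the per-service fallback). It is an assumption-laden illustration: `findIntegrationKeys` is a hypothetical helper, and `pagerdutyConfig` is a plain object with placeholder keys standing in for the real config.

```javascript
// Plain object standing in for the `pagerduty` config section (placeholder values).
const pagerdutyConfig = {
  teams: { 'my-new-team': { integrationKey: 'team-key' } },
  services: { 'my-service': { integrationKey: 'service-key' } }
}

// Hypothetical helper: collect keys for the owning teams, falling back to a
// service-level key only when no team has one configured.
function findIntegrationKeys(teams, service, pagerduty) {
  const teamKeys = teams
    .map((team) => pagerduty.teams?.[team]?.integrationKey)
    .filter(Boolean)
  if (teamKeys.length > 0) {
    return teamKeys
  }

  const serviceKey = pagerduty.services?.[service]?.integrationKey
  return serviceKey ? [serviceKey] : []
}

// An empty result means the alert would not be processed.
const keys = findIntegrationKeys(['my-new-team'], 'my-service', pagerdutyConfig)
```

If the team name in the config does not exactly match the name in portal, the lookup misses and, unless a service-level key exists, no alert is sent.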

### Sending alerts for non-tenant services

To send pager duty alerts for non-tenant services (e.g. mongodb, lambdas, workflows etc) add a new entry to `pagerduty-service-override.js`.

```
{
  'my-non-tenant-service': {
    teams: ['Team-to-Alert']
  }
}
```

Optionally, you can specify an array of environment names if you want cdp-notify to alert from non-prod environments.
The entry name in the overrides file must match the `service` field of the grafana alert. The team must also be configured in cdp-notify (see above).
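
As a sketch of how an override with environments might be written and applied: the `environments` key name, the `perf-test` value and the `isHandledEnvironment` helper are all assumptions made for illustration, so check the overrides file for the real field name before copying this.

```javascript
// Hypothetical override entry; the `environments` key name is an assumption.
const serviceOverrides = {
  'my-non-tenant-service': {
    teams: ['Team-to-Alert'],
    environments: ['prod', 'perf-test']
  }
}

// Hypothetical helper: use the override's environments when the service has
// an entry that sets them, otherwise fall back to the configured defaults.
function isHandledEnvironment(alert, defaultEnvironments, overrides) {
  const allowed = overrides[alert.service]?.environments ?? defaultEnvironments
  return allowed.includes(alert.environment)
}
```

With a default of `['prod']`, a `perf-test` alert for `my-non-tenant-service` would be handled here, while the same alert for a service without an override would be dropped.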

## Local Dev

Dependencies:
1 change: 1 addition & 0 deletions src/listeners/grafana/email/handle-grafana-email-alerts.js
@@ -151,6 +151,7 @@ export { handleGrafanaEmailAlert }
* @property {string} team
* @property {string} service
* @property {string} alertName
* @property {string} pagerDuty
* @property {string} status
* @property {string} startsAt
* @property {string} endsAt
@@ -96,6 +96,14 @@ export function findIntegrationKeyForService(alert) {
export async function handleGrafanaPagerDutyAlert(message) {
const payload = JSON.parse(message.Body)

// reject alerts that are not flagged for pagerDuty
if (payload?.pagerDuty !== 'true') {
logger.info(
`ignoring alert ${message.MessageId} does not have a pagerDuty=true label`
)
return
}

if (!shouldSendAlert(payload, config.get('alertEnvironments'))) {
return
}
@@ -190,31 +190,113 @@ describe('#sendAlertsToPagerduty', () => {
test('should trigger pagerduty message when grafana alert is fired', async () => {
const integrationKey = '1234567890'
const team = 'team1'
- const service = 'service1'
+ const service1 = 'service1'
jest.mocked(fetchService).mockResolvedValue({ teams: [{ name: team }] })
jest.mocked(sendAlert).mockResolvedValue({ text: () => 'ok' })

config.set(`pagerduty.teams.${team}.integrationKey`, integrationKey)
config.set('pagerduty.sendAlerts', true)

- const payload = {
- service,
+ const grafanaAlert = {
+ service: service1,
environment: 'prod',
- status: 'firing'
+ status: 'firing',
+ pagerDuty: 'true'
}
const message = {
MessageId: '123',
- Body: JSON.stringify(payload)
+ Body: JSON.stringify(grafanaAlert)
}
await handleGrafanaPagerDutyAlert(message)

- expect(fetchService).toHaveBeenCalledWith(service)
+ expect(fetchService).toHaveBeenCalledWith(service1)
expect(sendAlert).toHaveBeenCalledWith(
integrationKey,
- payload,
+ grafanaAlert,
[team],
expect.any(String),
'trigger'
)
})

test('should not trigger pagerduty if alert has no pagerDuty=true flag', async () => {
const integrationKey = '3453445'
const team2 = 'team2'
const service2 = 'service2'
jest.mocked(fetchService).mockResolvedValue({ teams: [{ name: team2 }] })
jest.mocked(sendAlert).mockResolvedValue({ text: () => 'ok' })

config.set(`pagerduty.teams.${team2}.integrationKey`, integrationKey)
config.set('pagerduty.sendAlerts', true)

const grafanaAlert = {
service: service2,
environment: 'prod',
status: 'firing'
}
const message = {
MessageId: '123',
Body: JSON.stringify(grafanaAlert)
}
await handleGrafanaPagerDutyAlert(message)

expect(fetchService).not.toHaveBeenCalled()
expect(sendAlert).not.toHaveBeenCalled()
})

test('should not trigger pagerduty no integration keys are set for team', async () => {
const team3 = 'team3'
const service3 = 'service3'
jest.mocked(fetchService).mockResolvedValue({ teams: [{ name: team3 }] })
jest.mocked(sendAlert).mockResolvedValue({ text: () => 'ok' })

config.set('pagerduty.sendAlerts', true)

const grafanaAlert = {
service: service3,
environment: 'prod',
status: 'firing',
pagerDuty: 'true'
}
const message = {
MessageId: '123',
Body: JSON.stringify(grafanaAlert)
}
await handleGrafanaPagerDutyAlert(message)

expect(fetchService).toHaveBeenCalledWith(service3)
expect(sendAlert).not.toHaveBeenCalled()
})

test('should trigger pagerduty using the fallback service key', async () => {
const integrationKey = 'service-level-key'
const team4 = 'team4'
const service4 = 'service4'
jest.mocked(fetchService).mockResolvedValue({ teams: [{ name: team4 }] })
jest.mocked(sendAlert).mockResolvedValue({ text: () => 'ok' })

config.set('pagerduty.sendAlerts', true)
config.set(`pagerduty.services.${service4}.integrationKey`, integrationKey)

const grafanaAlert = {
service: service4,
environment: 'prod',
status: 'firing',
pagerDuty: 'true'
}
const message = {
MessageId: '123',
Body: JSON.stringify(grafanaAlert)
}
await handleGrafanaPagerDutyAlert(message)

expect(fetchService).toHaveBeenCalledWith(service4)
expect(sendAlert).toHaveBeenCalledWith(
integrationKey,
grafanaAlert,
[team4],
expect.any(String),
'trigger'
)
})
})
