Commit 1c71c7b

Merge pull request #18 from DEFRA/handle-pager-duty-label
will not send alerts to pagerDuty if they lack pagerDuty=true label
2 parents bcf4e25 + 84875fc commit 1c71c7b

4 files changed: +188 -7 lines changed

README.md

Lines changed: 90 additions & 0 deletions
@@ -1,5 +1,95 @@
 # cdp-notify

+How the pager duty integration works.
+
+`cdp-notify` receives grafana alerts, the same as the email notification service.
+
+Example grafana alert message:
+
+```
+{
+  environment: 'prod',
+  service: 'test-service',
+  team: 'Platform',
+  alertName: 'test-service - prod',
+  status: 'firing',
+  startsAt: '2024-11-04 12:53:20 +0000 UTC',
+  endsAt: '2024-11-04 12:53:20 +0000 UTC',
+  summary: 'A test suite',
+  description: '',
+  series: '',
+  runbookUrl: '',
+  alertURL: 'https://grafana/alerting/grafana/0000/view'
+}
+```
+
+When processing an event the pager duty handler does the following:
+
+1. If the alert does not contain the field `pagerDuty="true"` (note: `"true"` is a quoted string, NOT a boolean) then no alert is sent.
+
+2. Decide if the alert comes from an environment we want to handle.
+   This is done by comparing the `environment` field in the alert to either `alertEnvironments` from config (default: `[prod]`) or the environment field in `~/src/config/pagerduty-service-overides.js` if an entry exists for that service.
+   If there is no match then the alert is not processed.
+
+3. Find all pager-duty integration keys for the alert.
+   This is done in several steps:
+
+   - First, find the team that owns the service.
+     - If there is an entry for the service in `~/src/config/pagerduty-service-overides.js` use the teams set there.
+     - If not, look up the teams for the service from cdp-portal-backend.
+   - For each team see if there is a config entry matching `pagerduty.teams.${team}.integrationKey`.
+   - If no matches are found check the config for `pagerduty.services.${service}.integrationKey`.
+   - If no matches are found the alert is not processed.
+
+4. Check if sending alerts is enabled.
+   If `pagerduty.sendAlerts` is set to false in config no alerts will be sent.
+5. Check if the alert has a status of either `firing` or `resolved`.
+   The status is remapped from `firing` to `trigger` and `resolved` to `resolve`.
+6. Generate a deduplication key.
+   This is an MD5 hash of the `service`, `environment` and `alertURL` fields.
+7. Call the pager duty API.
+   A payload is built using the integration key, alert details, status and deduplication key.
+
+## PagerDuty Config
+
+### Adding a new tenant service to pager-duty
+
+Find the team name as it appears in portal for the owner of the service.
+Ensure the team is set up in pager-duty and an integration key is generated.
+In cdp-notify add a new entry to `pagerduty.teams` in `~/src/config/index.js`:
+
+```
+pagerduty: {
+  teams: {
+    'my-new-team': {
+      integrationKey: {
+        doc: 'Integration key for digital service',
+        format: String,
+        default: 'key',
+        env: 'MY_NEW_TEAM_INTEGRATION_KEY'
+      }
+    }
+  }
+```
+
+Add the team's pagerduty integration key as a secret to cdp-notify with an ID matching the `env` value set in the new config entry (e.g. MY_NEW_TEAM_INTEGRATION_KEY).
+Redeploy the service.
+
+### Sending alerts for non-tenant services
+
+To send pager duty alerts for non-tenant services (e.g. mongodb, lambdas, workflows etc.) add a new entry to `pagerduty-service-override.js`.
+
+```
+{
+  'my-non-tenant-service': {
+    teams: ['Team-to-Alert']
+  }
+}
+```
+
+Optionally, you can specify an array of environment names if you want cdp-notify to alert from non-prod environments.
+The name entry in the overrides file must match the `service` field of the grafana alert. The team must also be configured in cdp-notify (see above).
+
 ## Local Dev

 Dependencies:
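
Steps 5 and 6 of the README text in this diff (status remapping and the deduplication key) can be illustrated with a minimal sketch. The helper names `toEventAction` and `createDedupeKey` are hypothetical, and joining the three fields without a separator before hashing is an assumption: the README only says the key is an MD5 hash of the service, environment and alertURL fields.

```javascript
const { createHash } = require('node:crypto')

// Grafana status -> PagerDuty action, as listed in step 5 of the README.
const statusToAction = new Map([
  ['firing', 'trigger'],
  ['resolved', 'resolve']
])

// Hypothetical helper: returns undefined for any status that is not handled.
function toEventAction(status) {
  return statusToAction.get(status)
}

// Hypothetical helper: MD5 over service, environment and alertURL (step 6).
// Assumption: the fields are concatenated as-is; the real code may differ.
function createDedupeKey({ service, environment, alertURL }) {
  return createHash('md5')
    .update(`${service}${environment}${alertURL}`)
    .digest('hex')
}

const firing = {
  service: 'test-service',
  environment: 'prod',
  alertURL: 'https://grafana/alerting/grafana/0000/view',
  status: 'firing'
}
const resolved = { ...firing, status: 'resolved' }

// The status is not part of the hash, so both events share one key.
console.log(createDedupeKey(firing) === createDedupeKey(resolved)) // true
console.log(toEventAction(firing.status), toEventAction(resolved.status)) // trigger resolve
```

Because the status is left out of the hash, the `resolve` event carries the same key as the `trigger` event that opened the incident, which is what lets the two be matched up.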

src/listeners/grafana/email/handle-grafana-email-alerts.js

Lines changed: 1 addition & 0 deletions
@@ -151,6 +151,7 @@ export { handleGrafanaEmailAlert }
  * @property {string} team
  * @property {string} service
  * @property {string} alertName
+ * @property {string} pagerDuty
  * @property {string} status
  * @property {string} startsAt
  * @property {string} endsAt

src/listeners/grafana/pagerduty/handle-grafana-pagerduty-alerts.js

Lines changed: 8 additions & 0 deletions
@@ -96,6 +96,14 @@ export function findIntegrationKeyForService(alert) {
 export async function handleGrafanaPagerDutyAlert(message) {
   const payload = JSON.parse(message.Body)

+  // reject alerts that are not flagged for pagerDuty
+  if (payload?.pagerDuty !== 'true') {
+    logger.info(
+      `ignoring alert ${message.MessageId} does not have a pagerDuty=true label`
+    )
+    return
+  }
+
   if (!shouldSendAlert(payload, config.get('alertEnvironments'))) {
     return
   }
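
The guard added in this hunk compares against the string `'true'`, so an alert whose label arrives as a JSON boolean is ignored as well. A small standalone sketch of that behaviour follows; `isFlaggedForPagerDuty` is a hypothetical helper that mirrors the condition, not a function from the repository.

```javascript
// Hypothetical helper mirroring the guard: only the string 'true' passes.
function isFlaggedForPagerDuty(payload) {
  return payload?.pagerDuty === 'true'
}

const asString = JSON.parse('{"service":"service1","pagerDuty":"true"}')
const asBoolean = JSON.parse('{"service":"service1","pagerDuty":true}')
const missing = JSON.parse('{"service":"service1"}')

console.log(isFlaggedForPagerDuty(asString)) // true
console.log(isFlaggedForPagerDuty(asBoolean)) // false (boolean, not a string)
console.log(isFlaggedForPagerDuty(missing)) // false
console.log(isFlaggedForPagerDuty(null)) // false (optional chaining avoids a throw)
```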

src/listeners/grafana/pagerduty/handle-grafana-pagerduty-alerts.test.js

Lines changed: 89 additions & 7 deletions
@@ -190,31 +190,113 @@ describe('#sendAlertsToPagerduty', () => {
   test('should trigger pagerduty message when grafana alert is fired', async () => {
     const integrationKey = '1234567890'
     const team = 'team1'
-    const service = 'service1'
+    const service1 = 'service1'
     jest.mocked(fetchService).mockResolvedValue({ teams: [{ name: team }] })
     jest.mocked(sendAlert).mockResolvedValue({ text: () => 'ok' })

     config.set(`pagerduty.teams.${team}.integrationKey`, integrationKey)
     config.set('pagerduty.sendAlerts', true)

-    const payload = {
-      service,
+    const grafanaAlert = {
+      service: service1,
       environment: 'prod',
-      status: 'firing'
+      status: 'firing',
+      pagerDuty: 'true'
     }
     const message = {
       MessageId: '123',
-      Body: JSON.stringify(payload)
+      Body: JSON.stringify(grafanaAlert)
     }
     await handleGrafanaPagerDutyAlert(message)

-    expect(fetchService).toHaveBeenCalledWith(service)
+    expect(fetchService).toHaveBeenCalledWith(service1)
     expect(sendAlert).toHaveBeenCalledWith(
       integrationKey,
-      payload,
+      grafanaAlert,
       [team],
       expect.any(String),
       'trigger'
     )
   })
+
+  test('should not trigger pagerduty if alert has no pagerDuty=true flag', async () => {
+    const integrationKey = '3453445'
+    const team2 = 'team2'
+    const service2 = 'service2'
+    jest.mocked(fetchService).mockResolvedValue({ teams: [{ name: team2 }] })
+    jest.mocked(sendAlert).mockResolvedValue({ text: () => 'ok' })
+
+    config.set(`pagerduty.teams.${team2}.integrationKey`, integrationKey)
+    config.set('pagerduty.sendAlerts', true)
+
+    const grafanaAlert = {
+      service: service2,
+      environment: 'prod',
+      status: 'firing'
+    }
+    const message = {
+      MessageId: '123',
+      Body: JSON.stringify(grafanaAlert)
+    }
+    await handleGrafanaPagerDutyAlert(message)
+
+    expect(fetchService).not.toHaveBeenCalled()
+    expect(sendAlert).not.toHaveBeenCalled()
+  })
+
+  test('should not trigger pagerduty if no integration keys are set for team', async () => {
+    const team3 = 'team3'
+    const service3 = 'service3'
+    jest.mocked(fetchService).mockResolvedValue({ teams: [{ name: team3 }] })
+    jest.mocked(sendAlert).mockResolvedValue({ text: () => 'ok' })
+
+    config.set('pagerduty.sendAlerts', true)
+
+    const grafanaAlert = {
+      service: service3,
+      environment: 'prod',
+      status: 'firing',
+      pagerDuty: 'true'
+    }
+    const message = {
+      MessageId: '123',
+      Body: JSON.stringify(grafanaAlert)
+    }
+    await handleGrafanaPagerDutyAlert(message)
+
+    expect(fetchService).toHaveBeenCalledWith(service3)
+    expect(sendAlert).not.toHaveBeenCalled()
+  })
+
+  test('should trigger pagerduty using the fallback service key', async () => {
+    const integrationKey = 'service-level-key'
+    const team4 = 'team4'
+    const service4 = 'service4'
+    jest.mocked(fetchService).mockResolvedValue({ teams: [{ name: team4 }] })
+    jest.mocked(sendAlert).mockResolvedValue({ text: () => 'ok' })
+
+    config.set('pagerduty.sendAlerts', true)
+    config.set(`pagerduty.services.${service4}.integrationKey`, integrationKey)
+
+    const grafanaAlert = {
+      service: service4,
+      environment: 'prod',
+      status: 'firing',
+      pagerDuty: 'true'
+    }
+    const message = {
+      MessageId: '123',
+      Body: JSON.stringify(grafanaAlert)
+    }
+    await handleGrafanaPagerDutyAlert(message)
+
+    expect(fetchService).toHaveBeenCalledWith(service4)
+    expect(sendAlert).toHaveBeenCalledWith(
+      integrationKey,
+      grafanaAlert,
+      [team4],
+      expect.any(String),
+      'trigger'
+    )
+  })
 })
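
The tests in this file cover three lookup outcomes: a team-level key, no key at all, and the service-level fallback. The sketch below is one way that precedence could be expressed, assuming the order described in the README. `findIntegrationKeys` and the plain `keys` object are hypothetical stand-ins for the repository's own lookup and its config store, so this is not the actual implementation.

```javascript
// Hypothetical, simplified lookup. `keys` is a plain object standing in
// for the `pagerduty.teams.*` and `pagerduty.services.*` config entries.
function findIntegrationKeys(service, teams, keys) {
  // Team-level keys take precedence; teams without a key are skipped.
  const teamKeys = teams
    .map((team) => keys.teams?.[team]?.integrationKey)
    .filter(Boolean)
  if (teamKeys.length > 0) {
    return teamKeys
  }
  // Fall back to a service-level key; an empty array means "do not alert".
  const serviceKey = keys.services?.[service]?.integrationKey
  return serviceKey ? [serviceKey] : []
}

const teamLevel = { teams: { team1: { integrationKey: '1234567890' } } }
const serviceLevel = {
  services: { service4: { integrationKey: 'service-level-key' } }
}

console.log(findIntegrationKeys('service1', ['team1'], teamLevel)) // [ '1234567890' ]
console.log(findIntegrationKeys('service3', ['team3'], {})) // []
console.log(findIntegrationKeys('service4', ['team4'], serviceLevel)) // [ 'service-level-key' ]
```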
