Skip to content

Commit 5e9327f

Browse files
authored
ref(node): Improve Express URL Parameterization (#5450)
Improve the URL parameterization for transaction names in our Express integration. Previously, we would only obtain a parameterized version of the incoming request's URL **after** the registered handler(s) finished their job, right before we called `transaction.finish()`. This is definitely too late for DSC propagation, if the handlers made any outgoing request. In that case, the DSC (propagated via the `baggage` header) would not contain a transaction name. As of this PR, we patch the `Express.Router` prototype, more precisely the `process_params` function. This function contains a reference to the incoming `req` object as well as to the `layer` that matched the route. We hook into the function to extract the matched and parameterized partial route of the layer and we attach it to the `req` object. This happens for every matched layer, until the entire route is resolved, in which stichtched together the entire parameterized route. For the time being, we deliberately ignore route registrations with wildcards (e.g. `app.get('/*', ...)`) when stitching together the parameterized route. After we added a new part to the reconstructed route, we check if it has the same number of segments as the original (raw) URL. In case it has, we assume that the parameterized route is complete and we update the active transaction with the parameterized name. Additionally, we set the transaction source to `route` at this point. In case we never get to the point where we have an equal amount of URL segments, the transaction name is never updated, meaning its source stays `url` (and therefore, it's not added to the DSC). In that case, we continue to update the transaction name like before right before the end of the transaction. After reading the Express source code, we confirmed that the process for resolving parts of a route and handling it is performed sequentially **for each matched route segment**. This means that each matched layer's handle function is invoked before the next part of the URL is matched. We therefore still have timing issues around DSC population if any of those intermediate handler functions make outgoing requests. A simple example: The incoming request is `/api/users/123` * We intercept the request and start the transaction with the name `/api/users/123` and source `url` * The first layer that matches is matches the route `/*`. * We start reconstructing the parameterized route with `/` * The handle function of this layer is invoked (e.g. to check authentication) * the handler makes an XHR request * at this point, we populate and freeze the DSC (without transaction name) * The second handler matches the route `/api/users/123` * We obtain the parameterized route and our reconstructed route is now `/api/users/:id` * Now we have 3 segments in the original and in the reconstructed route * We update the transaction name with `/api/users/:id` and set its source to `route` * The handle function of this layer is invoked * every request that might happen here will still propagate the DSC from layer 1 because it can't be modified * We finish the transaction This example shows that we still have timing issues w.r.t DSC propagation. That is, assuming that the Node SDK is in this case the head of the trace and it didn't intercept incoming DSC from the incoming request. However, with this PR we now at least have the chance to get the correct transaction name early enough. ref: #5342
1 parent 3f94dc1 commit 5e9327f

File tree

9 files changed

+292
-40
lines changed

9 files changed

+292
-40
lines changed

packages/node-integration-tests/suites/express/sentry-trace/baggage-header-assign/test.ts

+2-2
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,7 @@ test('Should populate and propagate sentry baggage if sentry-trace header does n
8080
host: 'somewhere.not.sentry',
8181
// TraceId changes, hence we only expect that the string contains the traceid key
8282
baggage: expect.stringContaining(
83-
'sentry-environment=prod,sentry-release=1.0,sentry-public_key=public,sentry-trace_id=',
83+
'sentry-environment=prod,sentry-release=1.0,sentry-transaction=GET%20%2Ftest%2Fexpress,sentry-public_key=public,sentry-trace_id=',
8484
),
8585
},
8686
});
@@ -99,7 +99,7 @@ test('Should populate Sentry and ignore 3rd party content if sentry-trace header
9999
host: 'somewhere.not.sentry',
100100
// TraceId changes, hence we only expect that the string contains the traceid key
101101
baggage: expect.stringContaining(
102-
'sentry-environment=prod,sentry-release=1.0,sentry-public_key=public,sentry-trace_id=',
102+
'sentry-environment=prod,sentry-release=1.0,sentry-transaction=GET%20%2Ftest%2Fexpress,sentry-public_key=public,sentry-trace_id=',
103103
),
104104
},
105105
});
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
import * as Sentry from '@sentry/node';
2+
import * as Tracing from '@sentry/tracing';
3+
import cors from 'cors';
4+
import express from 'express';
5+
import http from 'http';
6+
7+
const app = express();
8+
9+
export type TestAPIResponse = { test_data: { host: string; 'sentry-trace': string; baggage: string } };
10+
11+
Sentry.init({
12+
dsn: 'https://[email protected]/1337',
13+
release: '1.0',
14+
environment: 'prod',
15+
integrations: [new Sentry.Integrations.Http({ tracing: true }), new Tracing.Integrations.Express({ app })],
16+
tracesSampleRate: 1.0,
17+
});
18+
19+
Sentry.setUser({ id: 'user123', segment: 'SegmentA' });
20+
21+
app.use(Sentry.Handlers.requestHandler());
22+
app.use(Sentry.Handlers.tracingHandler());
23+
24+
app.use(cors());
25+
26+
app.get('/test/express', (_req, res) => {
27+
const transaction = Sentry.getCurrentHub().getScope()?.getTransaction();
28+
if (transaction) {
29+
transaction.traceId = '86f39e84263a4de99c326acab3bfe3bd';
30+
transaction.metadata.source = undefined;
31+
}
32+
const headers = http.get('http://somewhere.not.sentry/').getHeaders();
33+
34+
// Responding with the headers outgoing request headers back to the assertions.
35+
res.send({ test_data: headers });
36+
});
37+
38+
app.use(Sentry.Handlers.errorHandler());
39+
40+
export default app;
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
import * as path from 'path';
2+
3+
import { getAPIResponse, runServer } from '../../../../utils/index';
4+
import { TestAPIResponse } from '../server';
5+
6+
test('Does not include transaction name if transaction source is not set', async () => {
7+
const url = await runServer(__dirname, `${path.resolve(__dirname, '.')}/server.ts`);
8+
9+
const response = (await getAPIResponse(new URL(`${url}/express`))) as TestAPIResponse;
10+
const baggageString = response.test_data.baggage;
11+
12+
expect(response).toBeDefined();
13+
expect(response).toMatchObject({
14+
test_data: {
15+
host: 'somewhere.not.sentry',
16+
},
17+
});
18+
expect(baggageString).not.toContain('sentry-transaction=');
19+
});

packages/node-integration-tests/suites/express/sentry-trace/baggage-header-out/test.ts

+1-17
Original file line numberDiff line numberDiff line change
@@ -13,24 +13,8 @@ test('should attach a `baggage` header to an outgoing request.', async () => {
1313
test_data: {
1414
host: 'somewhere.not.sentry',
1515
baggage:
16-
'sentry-environment=prod,sentry-release=1.0,sentry-user_segment=SegmentA' +
16+
'sentry-environment=prod,sentry-release=1.0,sentry-transaction=GET%20%2Ftest%2Fexpress,sentry-user_segment=SegmentA' +
1717
',sentry-public_key=public,sentry-trace_id=86f39e84263a4de99c326acab3bfe3bd,sentry-sample_rate=1',
1818
},
1919
});
2020
});
21-
22-
test('Does not include transaction name if transaction source is not set', async () => {
23-
const url = await runServer(__dirname, `${path.resolve(__dirname, '.')}/server.ts`);
24-
25-
const response = (await getAPIResponse(new URL(`${url}/express`))) as TestAPIResponse;
26-
const baggageString = response.test_data.baggage;
27-
28-
expect(response).toBeDefined();
29-
expect(response).toMatchObject({
30-
test_data: {
31-
host: 'somewhere.not.sentry',
32-
},
33-
});
34-
expect(baggageString).not.toContain('sentry-user_id=');
35-
expect(baggageString).not.toContain('sentry-transaction=');
36-
});

packages/node-integration-tests/suites/express/sentry-trace/baggage-other-vendors-with-sentry-entries/test.ts

+1-1
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@ test('should ignore sentry-values in `baggage` header of a third party vendor an
3030
test_data: {
3131
host: 'somewhere.not.sentry',
3232
baggage: expect.stringContaining(
33-
'other=vendor,foo=bar,third=party,last=item,sentry-environment=prod,sentry-release=1.0,sentry-public_key=public',
33+
'other=vendor,foo=bar,third=party,last=item,sentry-environment=prod,sentry-release=1.0,sentry-transaction=GET%20%2Ftest%2Fexpress,sentry-public_key=public',
3434
),
3535
},
3636
});

packages/node/src/handlers.ts

+3-2
Original file line numberDiff line numberDiff line change
@@ -40,12 +40,13 @@ export function tracingHandler(): (
4040

4141
const baggage = parseBaggageSetMutability(rawBaggageString, traceparentData);
4242

43+
const [name, source] = extractPathForTransaction(req, { path: true, method: true });
4344
const transaction = startTransaction(
4445
{
45-
name: extractPathForTransaction(req, { path: true, method: true }),
46+
name,
4647
op: 'http.server',
4748
...traceparentData,
48-
metadata: { baggage },
49+
metadata: { baggage, source },
4950
},
5051
// extra context passed to the tracesSampler
5152
{ request: extractRequestData(req) },

packages/node/test/requestdata.test.ts

+98-1
Original file line numberDiff line numberDiff line change
@@ -6,11 +6,12 @@
66

77
// TODO (v8 / #5257): Remove everything above
88

9-
import { Event, User } from '@sentry/types';
9+
import { Event, TransactionSource, User } from '@sentry/types';
1010
import {
1111
addRequestDataToEvent,
1212
AddRequestDataToEventOptions,
1313
CrossPlatformRequest,
14+
extractPathForTransaction,
1415
extractRequestData as newExtractRequestData,
1516
} from '@sentry/utils';
1617
import * as cookie from 'cookie';
@@ -485,3 +486,99 @@ describe.each([oldExtractRequestData, newExtractRequestData])(
485486
});
486487
},
487488
);
489+
490+
describe('extractPathForTransaction', () => {
491+
it.each([
492+
[
493+
'extracts a parameterized route and method if available',
494+
{
495+
method: 'get',
496+
baseUrl: '/api/users',
497+
route: { path: '/:id/details' },
498+
originalUrl: '/api/users/123/details',
499+
} as CrossPlatformRequest,
500+
{ path: true, method: true },
501+
'GET /api/users/:id/details',
502+
'route' as TransactionSource,
503+
],
504+
[
505+
'ignores the method if specified',
506+
{
507+
method: 'get',
508+
baseUrl: '/api/users',
509+
route: { path: '/:id/details' },
510+
originalUrl: '/api/users/123/details',
511+
} as CrossPlatformRequest,
512+
{ path: true, method: false },
513+
'/api/users/:id/details',
514+
'route' as TransactionSource,
515+
],
516+
[
517+
'ignores the path if specified',
518+
{
519+
method: 'get',
520+
baseUrl: '/api/users',
521+
route: { path: '/:id/details' },
522+
originalUrl: '/api/users/123/details',
523+
} as CrossPlatformRequest,
524+
{ path: false, method: true },
525+
'GET',
526+
'route' as TransactionSource,
527+
],
528+
[
529+
'returns an empty string if everything should be ignored',
530+
{
531+
method: 'get',
532+
baseUrl: '/api/users',
533+
route: { path: '/:id/details' },
534+
originalUrl: '/api/users/123/details',
535+
} as CrossPlatformRequest,
536+
{ path: false, method: false },
537+
'',
538+
'route' as TransactionSource,
539+
],
540+
[
541+
'falls back to the raw URL if no parameterized route is available',
542+
{
543+
method: 'get',
544+
baseUrl: '/api/users',
545+
originalUrl: '/api/users/123/details',
546+
} as CrossPlatformRequest,
547+
{ path: true, method: true },
548+
'GET /api/users/123/details',
549+
'url' as TransactionSource,
550+
],
551+
])(
552+
'%s',
553+
(
554+
_: string,
555+
req: CrossPlatformRequest,
556+
options: { path?: boolean; method?: boolean },
557+
expectedRoute: string,
558+
expectedSource: TransactionSource,
559+
) => {
560+
const [route, source] = extractPathForTransaction(req, options);
561+
562+
expect(route).toEqual(expectedRoute);
563+
expect(source).toEqual(expectedSource);
564+
},
565+
);
566+
567+
it('overrides the requests information with a custom route if specified', () => {
568+
const req = {
569+
method: 'get',
570+
baseUrl: '/api/users',
571+
route: { path: '/:id/details' },
572+
originalUrl: '/api/users/123/details',
573+
} as CrossPlatformRequest;
574+
575+
const [route, source] = extractPathForTransaction(req, {
576+
path: true,
577+
method: true,
578+
customRoute: '/other/path/:id/details',
579+
});
580+
581+
expect(route).toEqual('GET /other/path/:id/details');
582+
expect(source).toEqual('route');
583+
});
584+
});

packages/tracing/src/integrations/node/express.ts

+101-1
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
1+
/* eslint-disable max-lines */
12
import { Integration, Transaction } from '@sentry/types';
2-
import { logger } from '@sentry/utils';
3+
import { CrossPlatformRequest, extractPathForTransaction, logger } from '@sentry/utils';
34

45
type Method =
56
| 'all'
@@ -32,6 +33,32 @@ type Router = {
3233
[method in Method]: (...args: any) => any; // eslint-disable-line @typescript-eslint/no-explicit-any
3334
};
3435

36+
/* Extend the CrossPlatformRequest type with a patched parameter to build a reconstructed route */
37+
type PatchedRequest = CrossPlatformRequest & { _reconstructedRoute?: string };
38+
39+
/* Type used for pathing the express router prototype */
40+
type ExpressRouter = Router & {
41+
_router?: ExpressRouter;
42+
stack?: Layer[];
43+
lazyrouter?: () => void;
44+
settings?: unknown;
45+
process_params: (
46+
layer: Layer,
47+
called: unknown,
48+
req: PatchedRequest,
49+
res: ExpressResponse,
50+
done: () => void,
51+
) => unknown;
52+
};
53+
54+
/* Type used for pathing the express router prototype */
55+
type Layer = {
56+
match: (path: string) => boolean;
57+
handle_request: (req: PatchedRequest, res: ExpressResponse, next: () => void) => void;
58+
route?: { path: string };
59+
path?: string;
60+
};
61+
3562
interface ExpressResponse {
3663
once(name: string, callback: () => void): void;
3764
}
@@ -83,6 +110,7 @@ export class Express implements Integration {
83110
return;
84111
}
85112
instrumentMiddlewares(this._router, this._methods);
113+
instrumentRouter(this._router as ExpressRouter);
86114
}
87115
}
88116

@@ -211,3 +239,75 @@ function patchMiddleware(router: Router, method: Method): Router {
211239
function instrumentMiddlewares(router: Router, methods: Method[] = []): void {
212240
methods.forEach((method: Method) => patchMiddleware(router, method));
213241
}
242+
243+
/**
244+
* Patches the prototype of Express.Router to accumulate the resolved route
245+
* if a layer instance's `match` function was called and it returned a successful match.
246+
*
247+
* @see https://github.com/expressjs/express/blob/master/lib/router/index.js
248+
*
249+
* @param appOrRouter the router instance which can either be an app (i.e. top-level) or a (nested) router.
250+
*/
251+
function instrumentRouter(appOrRouter: ExpressRouter): void {
252+
// This is how we can distinguish between app and routers
253+
const isApp = 'settings' in appOrRouter;
254+
255+
// In case the app's top-level router hasn't been initialized yet, we have to do it now
256+
if (isApp && appOrRouter._router === undefined && appOrRouter.lazyrouter) {
257+
appOrRouter.lazyrouter();
258+
}
259+
260+
const router = isApp ? appOrRouter._router : appOrRouter;
261+
const routerProto = Object.getPrototypeOf(router) as ExpressRouter;
262+
263+
const originalProcessParams = routerProto.process_params;
264+
routerProto.process_params = function process_params(
265+
layer: Layer,
266+
called: unknown,
267+
req: PatchedRequest,
268+
res: ExpressResponse & SentryTracingResponse,
269+
done: () => unknown,
270+
) {
271+
// Base case: We're in the first part of the URL (thus we start with the root '/')
272+
if (!req._reconstructedRoute) {
273+
req._reconstructedRoute = '';
274+
}
275+
276+
// If the layer's partial route has params, the route is stored in layer.route. Otherwise, the hardcoded path
277+
// (i.e. a partial route without params) is stored in layer.path
278+
const partialRoute = layer.route?.path || layer.path || '';
279+
280+
// Normalize the partial route so that it doesn't contain leading or trailing slashes
281+
// and exclude empty or '*' wildcard routes.
282+
// The exclusion of '*' routes is our best effort to not "pollute" the transaction name
283+
// with interim handlers (e.g. ones that check authentication or do other middleware stuff).
284+
// We want to end up with the parameterized URL of the incoming request without any extraneous path segments.
285+
const finalPartialRoute = partialRoute
286+
.split('/')
287+
.filter(segment => segment.length > 0 && !segment.includes('*'))
288+
.join('/');
289+
290+
// If we found a valid partial URL, we append it to the reconstructed route
291+
if (finalPartialRoute.length > 0) {
292+
req._reconstructedRoute += `/${finalPartialRoute}`;
293+
}
294+
295+
// Now we check if we are in the "last" part of the route. We determine this by comparing the
296+
// number of URL segments from the original URL to that of our reconstructed parameterized URL.
297+
// If we've reached our final destination, we update the transaction name.
298+
const urlLength = req.originalUrl?.split('/').filter(s => s.length > 0).length;
299+
const routeLength = req._reconstructedRoute.split('/').filter(s => s.length > 0).length;
300+
if (urlLength === routeLength) {
301+
const transaction = res.__sentry_transaction;
302+
if (transaction && transaction.metadata.source !== 'custom') {
303+
// If the request URL is '/' or empty, the reconstructed route will be empty.
304+
// Therefore, we fall back to setting the final route to '/' in this case.
305+
const finalRoute = req._reconstructedRoute || '/';
306+
307+
transaction.setName(...extractPathForTransaction(req, { path: true, method: true, customRoute: finalRoute }));
308+
}
309+
}
310+
311+
return originalProcessParams.call(this, layer, called, req, res, done);
312+
};
313+
}

0 commit comments

Comments
 (0)