Description
What version of OpenTelemetry are you using?
Current main branch of open-telemetry/opentelemetry-js-contrib with @opentelemetry/[email protected]
What version of Node are you using?
v24.13.0
What did you do?
The Node.js package auto-instrumentations-node installs a shutdown hook. This hook can block process termination for ~5 seconds in certain error scenarios, for example if the configured exporter cannot connect to its downstream collector (e.g. OTEL_EXPORTER_OTLP_ENDPOINT points to something that is currently not reachable over the network).
As far as I can tell, this has always been the case for certain error conditions, but the likelihood of this happening seems to have increased with this change in otlp-exporter-base.
I agree that flushing telemetry via sdk.shutdown at process termination is the right call, and blocking process termination for a few tens or maybe hundreds of milliseconds is acceptable for that. Blocking process termination for up to 5 seconds can be problematic though, in particular in environments such as Kubernetes, where pod restart times are important.
What did you expect to see?
Process termination not being blocked by auto-instrumentations-node.
What did you see instead?
Process termination is blocked for ~5 seconds.
Additional context
There is a reproducer integration test here: #3348
To be honest, I'm not entirely sure whether this can be solved in opentelemetry-js-contrib or whether it needs to be solved in opentelemetry-js. My guess is that the shutdown call should perhaps accept an optional timeout parameter? Not sure.
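To illustrate the timeout idea, here is a minimal sketch of how a caller could bound the wait today without any API change. It is not the implementation used by auto-instrumentations-node: `shutdownWithTimeout` and `ShutdownResult` are hypothetical names, and the only assumption about the SDK is that `sdk.shutdown()` returns a promise.

```typescript
// Hypothetical helper, not part of any OpenTelemetry package.
// It races an arbitrary async shutdown function against a timer.
type ShutdownResult = "completed" | "timed-out";

async function shutdownWithTimeout(
  shutdown: () => Promise<void>,
  timeoutMs: number,
): Promise<ShutdownResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Resolves with "timed-out" once the deadline has passed.
  const timeout = new Promise<ShutdownResult>((resolve) => {
    timer = setTimeout(() => resolve("timed-out"), timeoutMs);
  });

  try {
    // Whichever settles first wins; a rejection from shutdown() propagates.
    return await Promise.race([
      shutdown().then((): ShutdownResult => "completed"),
      timeout,
    ]);
  } finally {
    // Clear the timer so it does not keep the event loop alive by itself.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Possible usage in a signal handler (assumes an `sdk` object with shutdown()):
//   const result = await shutdownWithTimeout(() => sdk.shutdown(), 500);
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying shutdown. Any sockets or timers the exporter still holds may keep the process alive, so a caller would probably still need to exit explicitly after a timeout.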