| 1 | +# **RFC0009 for Presto** |
| 2 | + |
| 3 | +## Enhancing the OpenTelemetry Implementation in Presto |
| 4 | + |
| 5 | +Proposers |
| 6 | + |
| 7 | +* Suresh Babu Areekara |
| 8 | +* Siddarth Ajay |
| 9 | +* Ben Tony Joe |
| 10 | + |
| 11 | +## [Related Issues] |
| 12 | + |
| 13 | +* https://github.com/prestodb/presto/issues/23975 |
| 14 | + |
| 15 | +## Summary |
| 16 | + |
| 17 | +The existing OpenTelemetry implementation (https://github.com/prestodb/presto/pull/18534) was an experimental feature. It captured a limited set of telemetry data (query state changes) and had no concept of child spans. The proposed implementation makes tracing in Presto more flexible by supporting both parent and child spans. Additionally, traces can be propagated to the worker nodes. |
| 18 | + |
| 19 | +## Background |
| 20 | + |
| 21 | +OpenTelemetry is an observability framework that helps to gain insight into the performance and behaviour of systems. It facilitates the generation, collection, and management of telemetry data such as traces. |
| 22 | + |
| 23 | +OSS Presto has only a basic implementation of OpenTelemetry. |
| 24 | + |
| 25 | + |
| 26 | + |
| 27 | +## Proposed Implementation |
| 28 | + |
| 29 | +Presto will be instrumented manually, which has the following advantages: |
| 30 | +- More flexibility and control over instrumentation |
| 31 | +- Easier customization of which operations are monitored |
| 32 | +- Ability to pass additional information as span attributes and events |
| 33 | + |
| 34 | + |
| 35 | +- The OpenTelemetry SDK provides libraries for instrumenting applications to capture telemetry data (traces). It includes built-in integrations for common frameworks and supports custom instrumentation. |
| 36 | +- Presto is instrumented using the OpenTelemetry API. |
| 37 | +- Once instrumented, Presto starts a span and registers it with the OpenTelemetry SDK. |
| 38 | +- The SDK creates a context, which is the actual association to the flow, and attaches it to the current (parent) span. |
| 39 | +- While performing an operation, Presto adds the required attributes and events to the respective span. |
| 40 | +- For sub-operations (child spans), Presto creates a child span, extracts the parent context, and attaches it to the child span as its parent context, so that all parent and child spans are connected. |
| 41 | +- After the operation, the spans are ended in the order of creation and the span state is updated. |
| 42 | +- The SDK keeps checking the flush trigger; once the batch threshold is reached, the queued spans are batched and sent to the backend. |
| 43 | +- The backend is a system that stores, analyses, and visualizes this telemetry data. Common backends include Jaeger, Instana, and the Grafana stack. |
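
The batching step in the list above can be pictured with a small toy model. This is only a sketch under stated assumptions: `BatchingSketch` and its methods are hypothetical names, spans are reduced to plain strings, and the logic is a simplification for illustration, not the OpenTelemetry SDK's batch processor. It assumes ended spans are queued, that a batch is exported once the queue holds a full batch, and that a span is dropped when the queue is already full.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of span batching; hypothetical, not SDK or Presto code.
final class BatchingSketch {
    private final int maxQueueSize;   // plays the role of max-queue-size
    private final int maxBatchSize;   // plays the role of max-exporter-batch-size
    private final Queue<String> queue = new ArrayDeque<>();
    private final List<List<String>> exported = new ArrayList<>();

    BatchingSketch(int maxQueueSize, int maxBatchSize) {
        this.maxQueueSize = maxQueueSize;
        this.maxBatchSize = maxBatchSize;
    }

    // Called when a span ends. Returns false if the span was dropped
    // because the queue was already full (an assumption of this sketch).
    boolean onEnd(String spanName) {
        if (queue.size() >= maxQueueSize) {
            return false;
        }
        queue.add(spanName);
        if (queue.size() >= maxBatchSize) {
            flush();
        }
        return true;
    }

    // Exports at most one batch. A periodic timer (compare schedule-delay)
    // would call this as well, so partially filled batches also get sent.
    void flush() {
        List<String> batch = new ArrayList<>();
        while (!queue.isEmpty() && batch.size() < maxBatchSize) {
            batch.add(queue.poll());
        }
        if (!batch.isEmpty()) {
            exported.add(batch);
        }
    }

    List<List<String>> exportedBatches() {
        return exported;
    }

    int queued() {
        return queue.size();
    }
}
```

With a batch size of 2, ending two spans triggers one export immediately, while a third span stays queued until the next flush.
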
| 44 | + |
| 45 | + |
| 46 | + |
| 47 | +Using context propagation, signals can be correlated with each other regardless of where they are generated. |
| 48 | + |
| 49 | +Context contains the information for the sending and receiving service, or execution unit, to correlate one signal with another. For example, if service A calls service B, then a span from service A whose ID is in context can be used as the parent span for the next span created in service B. |
| 50 | + |
| 51 | +Propagation is the mechanism that moves context between services and processes. It serializes or deserializes the context object and provides the relevant information to be propagated from one service to another. |
| 52 | + |
| 53 | +Propagation is usually handled by instrumentation libraries and is transparent to the user. In the event that you need to manually propagate context, you can use the Propagators API. |
| 54 | + |
| 55 | +OpenTelemetry maintains several official propagators. The default propagator uses the headers specified by the W3C Trace Context specification. |
| 56 | +- In the areas of Presto where REST calls are involved, we use the header for context propagation, as shown in the image above. |
| 57 | + |
| 58 | +- The Presto coordinator fetches the current span context and injects it as the `traceparent` HTTP header. The worker then extracts this header and uses it to create child spans with the parent context. |
| 59 | + |
| 60 | +- In other areas, the parent context is already available in the child context, and we set the parent context on the child spans directly. |
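
To make the header-based hand-off concrete, here is a minimal sketch of the `traceparent` value as defined by the W3C Trace Context specification (version `00`: `version-traceid-parentid-flags`, lowercase hex). `TraceParentSketch` and its methods are hypothetical names for illustration. In the actual implementation the OpenTelemetry propagators would normally perform this serialization, so this is not Presto's or the SDK's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the traceparent header layout.
final class TraceParentSketch {
    // version "00", 32 hex chars of trace ID, 16 hex chars of parent span ID, 2 hex chars of flags
    private static final Pattern FORMAT =
            Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    // Coordinator side: serialize the current span's identity into the header value.
    static String inject(String traceId, String spanId, boolean sampled) {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }

    // Worker side: returns { traceId, parentSpanId, flags }, or null if the header is malformed.
    static String[] extract(String header) {
        if (header == null) {
            return null;
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return null;
        }
        String traceId = m.group(1);
        String parentSpanId = m.group(2);
        // All-zero IDs are invalid according to the specification.
        if (traceId.matches("0+") || parentSpanId.matches("0+")) {
            return null;
        }
        return new String[] {traceId, parentSpanId, m.group(3)};
    }
}
```

A worker that receives the header keeps the extracted trace ID for its own spans and records the extracted span ID as their parent, which is what links coordinator and worker spans into one trace.
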
| 61 | + |
| 62 | + |
| 63 | +## [Optional] Other Approaches Considered |
| 64 | + |
| 65 | +Based on the discussion, this may need to be updated with feedback from reviewers. |
| 66 | + |
| 67 | +## Adoption Plan |
| 68 | + |
| 69 | +OpenTelemetry in Presto can be configured by modifying the values in `presto-main/etc/telemetry.properties`: |
| 70 | + |
| 71 | +```properties |
| 72 | +otel-factory.name=otel |
| 73 | +tracing-enabled=false |
| 74 | +tracing-backend-url=<backend endpoint> |
| 75 | +max-exporter-batch-size=256 |
| 76 | +max-queue-size=1024 |
| 77 | +schedule-delay=1000 |
| 78 | +exporter-timeout=1024 |
| 79 | +span-sampling=true |
| 80 | +``` |
| 81 | + |
| 82 | +***otel-factory.name***: unique identifier for OpenTelemetry factory implementation to be registered |
| 83 | + |
| 84 | +***tracing-enabled***: boolean value controlling if tracing is on or off |
| 85 | + |
| 86 | +***tracing-backend-url***: endpoint of the OpenTelemetry collector or backend to which telemetry data is exported |
| 87 | + |
| 88 | +***max-exporter-batch-size***: maximum number of spans that will be exported in one batch |
| 89 | + |
| 90 | +***max-queue-size***: maximum number of spans that can be queued before being processed for export |
| 91 | + |
| 92 | +***schedule-delay***: delay between batches of span export, controlling how frequently spans are exported |
| 93 | + |
| 94 | +***exporter-timeout***: how long the span exporter will wait for a batch of spans to be successfully sent before timing out |
| 95 | + |
| 96 | +***span-sampling***: boolean to enable/disable sampling. If enabled, spans are only generated for major operations |
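
As an illustration of how these keys might be read, the sketch below loads a few of them with `java.util.Properties`. `TelemetryConfigSketch` is a hypothetical class, not the configuration class used by Presto. The fallback values simply mirror the sample file above, and the check that the batch size does not exceed the queue size is an assumption of this sketch, not a documented rule.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical reader for a subset of telemetry.properties.
final class TelemetryConfigSketch {
    final boolean tracingEnabled;
    final int maxExporterBatchSize;
    final int maxQueueSize;

    TelemetryConfigSketch(Properties p) {
        // Fallbacks mirror the sample file; they are assumptions, not documented defaults.
        tracingEnabled = Boolean.parseBoolean(p.getProperty("tracing-enabled", "false"));
        maxExporterBatchSize = Integer.parseInt(p.getProperty("max-exporter-batch-size", "256"));
        maxQueueSize = Integer.parseInt(p.getProperty("max-queue-size", "1024"));
        // Sanity check assumed by this sketch: a batch cannot be larger than the queue feeding it.
        if (maxExporterBatchSize > maxQueueSize) {
            throw new IllegalArgumentException(
                    "max-exporter-batch-size must not exceed max-queue-size");
        }
    }

    // Parses the text of a properties file.
    static TelemetryConfigSketch parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new TelemetryConfigSketch(p);
    }
}
```

For example, `TelemetryConfigSketch.parse("tracing-enabled=true\n")` yields a configuration with tracing switched on and the remaining values taken from the fallbacks.
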
| 97 | + |
| 98 | +## Test Plan |
| 99 | + |
| 100 | +We have added unit tests (UTs) for all the OTel implementations, as well as UT span assertions for a few major classes where spans are actually generated. |