-
Notifications
You must be signed in to change notification settings - Fork 437
feat(llmobs): support evp proxy for evaluation metrics #12966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Bootstrap import analysisComparison of import times between this PR and base. SummaryThe average import time from this PR is: 229 ± 2 ms. The average import time from base is: 231 ± 1 ms. The import time difference between this PR and base is: -2.02 ± 0.07 ms. Import time breakdownThe following import paths have shrunk:
|
BenchmarksBenchmark execution time: 2025-04-17 16:19:26 Comparing candidate commit 832dfda in PR branch Found 7 performance improvements and 0 performance regressions! Performance is the same for 487 metrics, 2 unstable metrics. scenario:iast_aspects-ospathsplitdrive_aspect
scenario:iast_aspects-ospathsplitext_aspect
scenario:otelspan-start-finish
scenario:otelspan-start-finish-telemetry
scenario:span-start-finish
scenario:span-start-finish-telemetry
scenario:span-start-finish-traceid128
|
243a0b1
to
cd62fb9
Compare
bfe40fe
to
f561ec9
Compare
abd46cb
to
b3b960b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did a first pass, looks really good to me!! this should make editing/testing/debugging the writers a lot easier. and thanks for throwing agent-proxy evals in too! mostly just left some questions/suggestions and a couple style nits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did another pass - just a couple questions and one bug i found when trying it out locally, but i think it should be good after!
d6a7eac
to
2544c96
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works locally for me now, everything looks great!
Resolves #13336. Credit to @IAL32 and a cherry-pick from #13338. When we made the jump from using the shared HTTPWriter to our own BaseLLMObsWriter class to submit spans and evals #12966, we used our own `_get_connection()` to return HTTP/HTTPS connections. However we forgot to include UDSHTTP connection (for the unix socket case), which means we broke UDS support until now. ### Why was this a problem in the first place? We used our own `_get_connection()` in #12966 because of an issue where creating the shared HTTPConnection helper class was leading to MRO superclass constructor issues in our tests. At the time we thought this was due to the shared HTTPConnection helper class having multiple superclasses and an issue with Python 3.10 in general, but this turns out to be due to vcrpy mocking HTTPConnection entirely and only being an issue in tests that rely on vcrpy. This PR makes some changes to avoid using vcrpy when not necessary, and making better assertions to ensure that spans are being sent (not necessary in most tests to have them be accepted). ## Checklist - [x] PR author has checked that all the criteria below are met - The PR description includes an overview of the change - The PR description articulates the motivation for the change - The change includes tests OR the PR description describes a testing strategy - The PR description notes risks associated with the change, if any - Newly-added code is easy to change - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) - The change includes or references documentation updates if necessary - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Reviewer has checked that all the criteria below are met - Title is accurate - All changes are related to the pull request's stated goal - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes - Testing strategy adequately addresses listed risks - Newly-added code is easy to change - Release note makes sense to a user of the library - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) [](https://datadoghq.atlassian.net/browse/MLOB-2725) --------- Co-authored-by: IAL32 <[email protected]>
MLOB-2495
This PR does four things:
BaseLLMObsWriter
class. Additionally we simplify the writer constructor such that only interval/timeout is required, and api_key/site/agentless_enabled/_agentless_url are optional and otherwise inferred internally. _agentless_url is now a internal optional override param for testing.TestLLMObsSpanWriter
class to be defined in a singletests/llmobs/_utils.py
file rather than per integration test conftest file.Eval support for Agent Proxy
We were previously only submitting eval metrics via agentless intake. However we recently discovered that agent proxy does support submitting to direct LLMObs API intake (with the correct subdomain headers). This PR adds that support so that users do not need to specify an API key to submit eval metrics, especially if they are relying on agent proxy to submit traces.
Writer Refactor
We were previously making
LLMObsSpanWriter
subclass offHTTPWriter
, which is the APM tracer's base writer class for submitting traces. We initially reused this class as it provided a lot of convenience methods/functionality like retry logic, payload submission logic specifically to the agent (when we were implementing agent proxy support for spans).However, we've decided that there is a bit too much functionality and abstraction in the
HTTPWriter
class, including the writer client / encoder structure, vague error logs on unsuccessful writes, as well as the slightly convoluted codepath for actually flushing/submitting payloads. For example, the HTTPWriter structure involves:enqueue()
-->write()
-->write_with_client()
-->client.encoder.put()
periodic()
-->flush_queue()
-->_flush_queue_with_client()
-->client.encoder.encode()
+_send_payload()
This mismatch in functionality/requirements is due in part because LLMObs spans and eval metric payloads are JSON encoded, meaning there is no real need for a separate encoder-specific logic.
This new refactored writer directly manages its buffer, enqueuing, encoding, and flushing. This logic looks something like this:
enqueue(event)
-->BaseLLMObsWriter._buffer.append(event)
.periodic()
-->BaseLLMObsWriter._encode()
+BaseLLMObsWriter._send_payload()
Once we eventually specify the encoding schema to be more complex than the simple
json.dumps()
that we are currently doing, we can reuse theHTTPWriter
class and its encoder/client pattern. Until then, we don't really need the complex functionality provided byHTTPWriter
.Writer Initialization
The Writer classes now take in the following arguments:
tests/llmobs/conftest.py
)Instead of the previous design where subclass constructors individually handled setting intake/endpoint/url/headers for its own case, now each Span/Eval writer class has the following class attributes that are used to build up the actual intake/endpoint/headers in the base class constructor:
Checklist
Reviewer Checklist