
Conversation

@adharshctr
Contributor

@adharshctr adharshctr commented Dec 8, 2025

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

Important

Adds metrics support to Google Generative AI instrumentation, introducing histograms for token usage and operation duration.

  • Metrics Support:
    • Introduces token_histogram and duration_histogram in __init__.py for tracking token usage and operation duration.
    • Adds is_metrics_enabled() to check if metrics are enabled via TRACELOOP_METRICS_ENABLED environment variable.
    • Implements _create_metrics() to create histograms for tokens and duration (a sketch follows below).
  • Function Modifications:
    • Updates _awrap() and _wrap() in __init__.py to record operation duration using duration_histogram.
    • Modifies _build_from_streaming_response(), _abuild_from_streaming_response(), and _handle_response() to include token_histogram.
    • Changes set_model_response_attributes() in span_utils.py to record token counts using token_histogram.

This description was created by Ellipsis for 012121e. You can customize this summary. It will automatically update as commits are pushed.
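For orientation, here is a minimal sketch of what the two metrics helpers summarized above might look like. The exact signatures, metric names, units, and descriptions are assumptions inferred from this thread, not the PR's actual code:

```python
import os

from opentelemetry.metrics import Meter
from opentelemetry.semconv_ai import Meters


def is_metrics_enabled() -> bool:
    # Metrics default to enabled when TRACELOOP_METRICS_ENABLED is unset.
    return (os.getenv("TRACELOOP_METRICS_ENABLED") or "true").lower() == "true"


def _create_metrics(meter: Meter):
    # One histogram for token counts, one for end-to-end request duration.
    token_histogram = meter.create_histogram(
        name=Meters.LLM_TOKEN_USAGE,
        unit="token",
        description="Number of input and output tokens used",
    )
    duration_histogram = meter.create_histogram(
        name=Meters.LLM_OPERATION_DURATION,
        unit="s",
        description="Duration of the GenAI operation",
    )
    return token_histogram, duration_histogram
```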

Summary by CodeRabbit

  • New Features

    • Optional metrics for Google Generative AI calls: captures input/output token counts and request duration when enabled; per-call metrics include system and model attributes.
    • Public check and initialization for metrics support when a meter provider is configured.
  • Tests

    • Added tests and recorded cassettes validating client spans, token-usage metrics, duration metrics, and instrumentation lifecycle (instrument/uninstrument idempotency).


@coderabbitai

coderabbitai bot commented Dec 8, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds optional metrics collection to the Google Generative AI instrumentation: creates token and duration histograms when a meter provider is present, threads those histograms through wrapper and handler call paths, and records token counts and request durations during response handling.

Changes

  • Metrics infra & instrumentation
    packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
    Adds is_metrics_enabled() and _create_metrics(); imports Meter, get_meter, and Meters; creates token and duration histograms during _instrument() when a meter provider is provided; expands wrapper/handler signatures to accept token_histogram and duration_histogram; times requests and records duration metrics; propagates histograms into streaming builders and response handlers.
  • Span utils & token recording
    packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
    Updates the set_model_response_attributes(...) signature to accept token_histogram; when it is provided and usage metadata exists, records prompt and candidate token counts to token_histogram with system/model/token_type attributes while preserving existing span attribute behavior.
  • Tests, fixtures & cassettes
    packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py, packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py, packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py, packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/**
    Reworks fixtures: a session-scoped exporter (InMemorySpanExporter) and a new metrics_test_context (MeterProvider + InMemoryMetricReader), autouse cleanup, and an environment fixture; adjusts the genai_client fixture return value; adds and updates tests asserting spans and metrics; adds new HTTP cassette YAMLs for generate_content interactions.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Instr as Instrumentor
    participant Meter as Meter / MeterProvider
    participant Wrapper as Wrapper (sync/async)
    participant Model as Remote Model / Client
    participant SpanUtil as Span Utils

    Instr->>Meter: get_meter() / _create_metrics() when meter provider present
    Meter-->>Instr: token_histogram, duration_histogram
    Instr->>Wrapper: create wrapper, pass token_histogram & duration_histogram

    Wrapper->>Wrapper: start timer (if duration_histogram)
    Wrapper->>Model: perform request / stream
    Model-->>Wrapper: response / streaming chunks
    Wrapper->>SpanUtil: set_model_response_attributes(span, response, model, token_histogram)
    SpanUtil->>Meter: record prompt/output token counts to token_histogram
    Wrapper->>Meter: record elapsed time to duration_histogram
    Wrapper-->>Instr: complete
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify _create_metrics() and _instrument() correctly obtain the meter and create histograms with appropriate semantic attributes.
  • Confirm all wrapper/handler signatures (sync and async) were updated consistently and no call sites omitted the new args.
  • Inspect set_model_response_attributes() token extraction and histogram.record() usage for None-safety and typing.
  • Validate tests/fixtures initialize and tear down metric providers/readers and that cassettes align with test requests.

Poem

🐰 I hopped through code to count each token,
Histograms bloom where responses awoken.
Timers tick softly; spans hum the tale,
Counts and durations trail the rabbit's trail.
A small hare cheers: metrics stitched without fail!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 21.43%, below the required 80.00% threshold. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title 'feat(google-generativeai): Add metrics support' accurately describes the main change, adding metrics support to the Google Generative AI instrumentation. It is specific and directly aligns with the primary objective of introducing token and duration histograms.


Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed everything up to 012121e in 2 minutes and 29 seconds.
  • Reviewed 294 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py:308
  • Draft comment:
    Clarify default behavior: is_metrics_enabled() defaults to true if the env var is unset. Confirm if this is intentional or if false should be the default.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py:481
  • Draft comment:
    Minor style note: The indentation of the second token_histogram.record call is inconsistent compared to the first call. Consider aligning them for readability.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_ewO5ELdqUUn3w64P

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)

348-378: NameError when metrics are disabled.

token_histogram and duration_histogram are only defined inside the if is_metrics_enabled(): block at lines 351-352, but wrapper_args at line 366 references them unconditionally. When TRACELOOP_METRICS_ENABLED=false, these variables will be undefined, causing a NameError.

Apply this diff to initialize the histograms to None before the conditional:

```diff
         meter_provider = kwargs.get("meter_provider")
         meter = get_meter(__name__, __version__, meter_provider)

+        token_histogram = None
+        duration_histogram = None
         if is_metrics_enabled():
             token_histogram, duration_histogram = _create_metrics(meter)
```

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)

39-39: Unused import Meter.

Meter is imported but only get_meter is used. The type hint on _create_metrics could use it, but currently it's just documentation.

```diff
-from opentelemetry.metrics import Meter, get_meter
+from opentelemetry.metrics import get_meter
```

Alternatively, if you want to keep it for type annotation, that's also fine.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f12aaec and 012121e.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (12 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
  • set_model_response_attributes (449-490)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/utils.py (2)
  • should_emit_events (44-50)
  • wrapper (23-33)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Lint

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)

348-372: Undefined token_histogram / duration_histogram when metrics are disabled

If TRACELOOP_METRICS_ENABLED is set to something other than "true", is_metrics_enabled() returns False and the _create_metrics call is skipped. In that case, token_histogram and duration_histogram are never defined, but they are still passed in wrapper_args. This will raise an UnboundLocalError the first time _instrument runs with metrics disabled.

This matches the earlier review feedback and should be fixed by always initializing the variables before the if is_metrics_enabled() guard.

A minimal, backward‑compatible fix:

```diff
         meter_provider = kwargs.get("meter_provider")
-        meter = get_meter(__name__, __version__, meter_provider)
-
-        if is_metrics_enabled():
-            token_histogram, duration_histogram = _create_metrics(meter)
+        meter = get_meter(__name__, __version__, meter_provider)
+
+        token_histogram = None
+        duration_histogram = None
+
+        if is_metrics_enabled():
+            token_histogram, duration_histogram = _create_metrics(meter)
```

Also, please double‑check that get_meter(__name__, __version__, meter_provider) matches the OpenTelemetry API version you target; in many setups, a provided meter provider is used via meter_provider.get_meter(...) and the third get_meter argument is a schema URL rather than a provider.

Verify the recommended way to obtain a `Meter` in the current OpenTelemetry Python metrics API: should instrumentations call `get_meter(__name__, __version__)` and rely on the global provider, or use a provided `meter_provider.get_meter(__name__, __version__)` when a custom provider is passed in?
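For reference, the two call patterns being contrasted look roughly like this; a sketch assuming a recent opentelemetry-api, so verify against the version this package pins:

```python
from opentelemetry.metrics import get_meter
from opentelemetry.sdk.metrics import MeterProvider

meter_provider = MeterProvider()

# Module-level helper: the third positional parameter is an optional
# MeterProvider that falls back to the global provider when None.
meter_a = get_meter("my.instrumentation", "0.1.0", meter_provider)

# Provider method: any extra argument here is a schema URL, not a provider.
meter_b = meter_provider.get_meter("my.instrumentation", "0.1.0")
```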
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)

449-490: Token metrics integration in set_model_response_attributes looks correct

The new token_histogram recording is properly gated by span.is_recording() and hasattr(response, "usage_metadata"), and it cleanly handles the None case for token_histogram while preserving existing span attributes and status behavior.

If you want to tidy later, you could avoid repeating hasattr(response, "usage_metadata") by grabbing usage = getattr(response, "usage_metadata", None) once and using it for both span attributes and histogram recording, but this is purely cosmetic.
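A sketch of that cosmetic tidy-up, pulled out as a standalone helper for illustration; the attribute keys and usage_metadata field names are assumptions based on this review, not the PR's code:

```python
from typing import Optional

from opentelemetry.metrics import Histogram


def _record_token_usage(response, model_name: str,
                        token_histogram: Optional[Histogram]) -> None:
    # Fetch usage metadata once instead of repeating hasattr() checks.
    usage = getattr(response, "usage_metadata", None)
    if not (token_histogram and usage):
        return
    shared = {"gen_ai.system": "gemini", "gen_ai.response.model": model_name}
    token_histogram.record(
        usage.prompt_token_count,
        attributes={**shared, "gen_ai.token.type": "input"},
    )
    token_histogram.record(
        usage.candidates_token_count,
        attributes={**shared, "gen_ai.token.type": "output"},
    )
```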

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)

308-325: Metrics helpers (is_metrics_enabled, _create_metrics) align with existing patterns

The is_metrics_enabled() helper matches the existing Bedrock pattern (env var with "true" default), and _create_metrics() defines LLM_TOKEN_USAGE and LLM_OPERATION_DURATION histograms with sensible units and descriptions. Once the initialization bug in _instrument is fixed, these helpers should behave as intended.

Please just confirm that Meters.LLM_TOKEN_USAGE and Meters.LLM_OPERATION_DURATION match the names used in your metrics backends and any dashboards, to avoid creating duplicate or unused instruments.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 012121e and 28f61d1.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (12 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
🧬 Code graph analysis (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (1)
  • set_model_response_attributes (293-313)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (3)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)
  • set_model_response_attributes (449-490)
  • set_response_attributes (418-446)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/utils.py (1)
  • should_emit_events (44-50)
packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/__init__.py (2)
  • is_metrics_enabled (111-112)
  • _create_metrics (467-556)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Lint
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Build Packages (3.11)
🔇 Additional comments (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (2)

207-221: Duration histogram recording is correctly implemented

Using time.perf_counter() to measure durations and recording them only when duration_histogram is truthy is a good pattern here. The attributes (GEN_AI_SYSTEM and GEN_AI_RESPONSE_MODEL) are appropriate and consistent for both async (_awrap) and sync (_wrap) paths.

Also applies to: 278-291
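The timing pattern being praised, distilled into a self-contained sketch; the helper name and attribute values are illustrative, not the PR's code:

```python
import time
from typing import Callable, Optional

from opentelemetry.metrics import Histogram


def _timed_call(wrapped: Callable, duration_histogram: Optional[Histogram],
                attributes: dict, *args, **kwargs):
    # perf_counter() is monotonic, so the delta is safe for durations;
    # recording is skipped entirely when metrics are disabled (histogram is None).
    start_time = time.perf_counter()
    response = wrapped(*args, **kwargs)
    if duration_histogram:
        duration_histogram.record(time.perf_counter() - start_time,
                                  attributes=attributes)
    return response
```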


81-87: Token histogram propagation through streaming and response handlers looks sound

Passing token_histogram into _build_from_streaming_response, _abuild_from_streaming_response, and _handle_response, and then into set_model_response_attributes, ensures token metrics are recorded for both streaming and non‑streaming responses without changing existing span attribute behavior. The use of last_chunk or response / last_chunk if last_chunk else response is reasonable for picking up usage_metadata on streaming responses.

Also applies to: 97-103, 105-122, 135-143, 221-232, 293-302

@gitguardian

gitguardian bot commented Dec 9, 2025

✅ There are no secrets present in this pull request anymore.

If these secrets were true positives and are still valid, we highly recommend that you revoke them. While these secrets were previously flagged, we no longer have a reference to the specific commits where they were detected. Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately. See GitGuardian's documentation for more information about the risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (1)

128-130: Also filter x-goog-api-key in VCR config

Given that the cassette for test_client_spans includes an x-goog-api-key header, this header should also be filtered here to prevent future recordings from capturing real API keys.

Suggested change:

-    return {"filter_headers": ["authorization"]}
+    return {"filter_headers": ["authorization", "x-goog-api-key"]}

This, combined with scrubbing the existing cassette, prevents secret leakage in recorded HTTP fixtures. [As per coding guidelines, …]
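A sketch of a fuller VCR config along those lines; the before_record_response hook is an illustrative extra for scrubbing response headers, not part of this PR:

```python
import pytest


def _scrub_response(response):
    # Drop cookies (and anything else sensitive) from recorded responses.
    response["headers"].pop("Set-Cookie", None)
    return response


@pytest.fixture(scope="module")
def vcr_config():
    return {
        "filter_headers": ["authorization", "x-goog-api-key"],
        "before_record_response": _scrub_response,
    }
```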

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)

36-67: Client span assertions look solid; consider slightly more defensive checks

The new test_client_spans exercises the right path and validates key GenAI + LLM attributes and token usage; this is good coverage for the new metrics/plumbing.

If you want to make the test a bit more robust against potential future schema changes, you could optionally guard direct index lookups with in attrs (as you already do for the prompt/completion content keys) before accessing values like SpanAttributes.LLM_USAGE_TOTAL_TOKENS to get clearer assertion errors when attributes go missing, but this is not strictly necessary.
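For instance, a guarded variant of the direct lookup (a sketch only; attrs here stands for the span's attribute mapping in the test):

```python
key = SpanAttributes.LLM_USAGE_TOTAL_TOKENS
assert key in attrs, f"span is missing attribute {key}"
assert attrs[key] > 0
```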

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 98655d7 and e524d2b.

📒 Files selected for processing (4)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_client_spans.yaml (1 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (3 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (2 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
**/cassettes/**/*.{yaml,yml,json}

📄 CodeRabbit inference engine (CLAUDE.md)

Never commit secrets or PII in VCR cassettes; scrub sensitive data

Files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_client_spans.yaml
🧠 Learnings (3)
📚 Learning: 2025-12-02T21:09:48.690Z
Learnt from: duanyutong
Repo: traceloop/openllmetry PR: 3487
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:177-178
Timestamp: 2025-12-02T21:09:48.690Z
Learning: The opentelemetry-instrumentation-openai and opentelemetry-instrumentation-openai-agents packages must remain independent and not share code, so code duplication between them is acceptable.

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-09-02T09:00:53.586Z
Learnt from: nina-kollman
Repo: traceloop/openllmetry PR: 3358
File: packages/sample-app/sample_app/gemini.py:28-31
Timestamp: 2025-09-02T09:00:53.586Z
Learning: In the google.genai package, async operations are accessed through client.aio (e.g., client.aio.models.generate_content), not through a separate AsyncClient class.

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧬 Code graph analysis (2)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  • GoogleGenerativeAiInstrumentor (328-396)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  • GoogleGenerativeAiInstrumentor (328-396)
packages/opentelemetry-instrumentation-milvus/tests/conftest.py (2)
  • reader (37-41)
  • meter_provider (45-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Lint
  • GitHub Check: Build Packages (3.11)
🔇 Additional comments (3)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (2)

79-94: Metrics test context wiring looks correct; no explicit uninstrument required

The metrics_test_context + clear_metrics_test_context pair correctly sets up an InMemoryMetricReader, MeterProvider, and global meter provider, then shuts them down at the end of the test session. Passing meter_provider=provider into GoogleGenerativeAiInstrumentor().instrument() aligns with the new metrics API and ensures histograms can be created.

Leaving instrumentation active at session scope is acceptable here since tests that need different instrumentation behavior already manage it via their own instrumentor fixtures.


67-76: Ensure tracer_provider fixture still exists or update instrument_ fixtures*

instrument_legacy, instrument_with_content, and instrument_with_no_content all depend on a tracer_provider fixture, but this file no longer defines one. If there isn't a shared tracer_provider fixture elsewhere, these fixtures will cause pytest to fail with "fixture 'tracer_provider' not found".

Two options:

  1. If a global tracer_provider fixture exists elsewhere
    Confirm the lifecycle is compatible with the new exporter fixture and that it still sets the provider before instrumentation.

  2. If no such fixture exists anymore
    Either reintroduce a local tracer_provider fixture, or refactor these fixtures to rely on the session-level setup now done in exporter (e.g., drop the tracer_provider parameter and let the instrumentor use the global provider).

packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py (1)

2-51: Lifecycle test and _is_instrumented helper look correct and valuable

Using wrapt and __wrapped__ to detect instrumentation is consistent with how OpenTelemetry instrumentations wrap functions, and the lifecycle test thoroughly exercises uninstrument → instrument → idempotent instrument → uninstrument flows for both Models and AsyncModels. This should catch regressions in WRAPPED_METHODS or uninstrument behavior.
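A plausible shape for that helper, reconstructed from the description (assumed; the actual test code may differ):

```python
import wrapt


def _is_instrumented(cls: type, method_name: str) -> bool:
    # wrapt-based instrumentation replaces the method with a proxy object
    # that exposes the original callable via __wrapped__.
    method = getattr(cls, method_name, None)
    return isinstance(method, wrapt.ObjectProxy) or hasattr(method, "__wrapped__")
```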

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)

130-132: Consider renaming unused loop variable.

The loop control variable token_type is unpacked but not used within the loop body. While the code is functionally correct, renaming it to _ or _token_type would follow Python conventions for intentionally unused variables and silence the static analysis warning.

Apply this diff:

```diff
-    for token_type, dp in token_points_by_type.items():
+    for _, dp in token_points_by_type.items():
         assert dp.count >= 1
         assert dp.sum >= 0
```
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e524d2b and b20e9e7.

📒 Files selected for processing (4)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_client_spans.yaml (1 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_generate_metrics.yaml (1 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (4 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_generate_metrics.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_client_spans.yaml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
🧠 Learnings (4)
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-09-02T09:00:53.586Z
Learnt from: nina-kollman
Repo: traceloop/openllmetry PR: 3358
File: packages/sample-app/sample_app/gemini.py:28-31
Timestamp: 2025-09-02T09:00:53.586Z
Learning: In the google.genai package, async operations are accessed through client.aio (e.g., client.aio.models.generate_content), not through a separate AsyncClient class.

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/conftest.py : Use VCR filters (e.g., filter_headers, before_record) or framework equivalents to scrub secrets/PII during recording

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/*.py : Tests that make API calls must utilize VCR cassettes

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧬 Code graph analysis (2)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (3)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (2)
  • export (45-51)
  • InMemorySpanExporter (22-61)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  • GoogleGenerativeAiInstrumentor (328-396)
packages/opentelemetry-instrumentation-milvus/tests/conftest.py (2)
  • reader (37-41)
  • meter_provider (45-50)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  • GoogleGenerativeAiInstrumentor (328-396)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (2)
  • SpanAttributes (64-245)
  • Meters (36-61)
🪛 Ruff (0.14.8)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py

130-130: Loop control variable token_type not used within loop body

Rename unused token_type to _token_type

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Lint
🔇 Additional comments (8)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (3)

1-12: LGTM!

The imports are well-organized and include all necessary modules for testing spans, metrics, and OpenTelemetry instrumentation.


37-68: LGTM!

The test correctly validates span creation, span attributes, and token usage for the Google Generative AI instrumentation. The assertions are comprehensive and cover all critical span attributes.


29-34: Remove the unused mock_instrumentor fixture or document its intended purpose.

The fixture mocks the instrument() and uninstrument() methods but is not referenced by any tests in this file. If it's not used elsewhere in the test suite, remove it to reduce code maintenance overhead. If it's intended for future tests, add a comment explaining its purpose.

packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (5)

13-13: LGTM!

The new imports provide the necessary OpenTelemetry SDK components for metrics and tracing infrastructure in tests.

Also applies to: 20-26, 28-30


63-66: LGTM!

Returning the full client object instead of client.models provides better flexibility for tests to access different client APIs (e.g., client.chats, client.models).


130-132: LGTM!

Adding x-goog-api-key to the filtered headers is essential for preventing API keys from being recorded in VCR cassettes.


135-138: LGTM!

The environment fixture ensures the Google API key is available for all tests, providing a sensible default for test environments.


36-47: Critical: Double instrumentation detected.

The exporter fixture (line 45) and metrics_test_context fixture (line 87) both create separate GoogleGenerativeAiInstrumentor instances and call instrument(). Since both fixtures are session-scoped, this results in double-wrapping the same methods, which can cause duplicate spans, metrics, and unpredictable behavior.

Recommended solution: Create a single session-scoped fixture that sets up both tracing and metrics providers, then instruments once with both:

```python
@pytest.fixture(scope="session")
def instrumentation_setup():
    # Tracing setup
    span_exporter = InMemorySpanExporter()
    span_processor = SimpleSpanProcessor(span_exporter)
    tracer_provider = TracerProvider()
    tracer_provider.add_span_processor(span_processor)
    set_tracer_provider(tracer_provider)

    # Metrics setup
    resource = Resource.create()
    metric_reader = InMemoryMetricReader()
    meter_provider = MeterProvider(metric_readers=[metric_reader], resource=resource)
    metrics.set_meter_provider(meter_provider)

    # Single instrumentation with both providers
    instrumentor = GoogleGenerativeAiInstrumentor()
    instrumentor.instrument(tracer_provider=tracer_provider, meter_provider=meter_provider)

    yield span_exporter, metric_reader, meter_provider

    metric_reader.shutdown()
    meter_provider.shutdown()
    instrumentor.uninstrument()


@pytest.fixture(scope="session")
def exporter(instrumentation_setup):
    span_exporter, _, _ = instrumentation_setup
    return span_exporter


@pytest.fixture(scope="session")
def metrics_test_context(instrumentation_setup):
    _, metric_reader, meter_provider = instrumentation_setup
    return meter_provider, metric_reader
```
⛔ Skipped due to learnings
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (2)

69-78: This fixture depends on missing tracer_provider.

As noted above, this fixture will fail because tracer_provider is no longer defined. See the earlier comment for the fix.


99-128: These fixtures depend on missing tracer_provider.

Both instrument_with_content and instrument_with_no_content depend on the undefined tracer_provider fixture. See the earlier comment for the fix.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b20e9e7 and 048b73c.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧠 Learnings (4)
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-09-02T09:00:53.586Z
Learnt from: nina-kollman
Repo: traceloop/openllmetry PR: 3358
File: packages/sample-app/sample_app/gemini.py:28-31
Timestamp: 2025-09-02T09:00:53.586Z
Learning: In the google.genai package, async operations are accessed through client.aio (e.g., client.aio.models.generate_content), not through a separate AsyncClient class.

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/conftest.py : Use VCR filters (e.g., filter_headers, before_record) or framework equivalents to scrub secrets/PII during recording

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/*.py : Tests that make API calls must utilize VCR cassettes

Applied to files:

  • packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  • GoogleGenerativeAiInstrumentor (328-396)
packages/opentelemetry-instrumentation-milvus/tests/conftest.py (2)
  • reader (37-41)
  • meter_provider (45-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Lint
🔇 Additional comments (5)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (5)

13-30: LGTM!

The new imports are correctly structured and necessary for the metrics and tracing infrastructure changes.


63-66: LGTM!

The fixture correctly reads the API key from an environment variable, adhering to secure credential handling practices. Based on learnings, this follows the pattern for test fixtures in this repository.


91-96: Previous issue addressed - fixture lifecycle is now correct.

The fixture properly yields before performing cleanup, ensuring metrics infrastructure remains available during test execution.


131-133: Good addition of API key header filtering.

Filtering x-goog-api-key from VCR recordings prevents accidental secret exposure in test cassettes. Based on learnings, this follows the recommended pattern for scrubbing secrets during recording.


136-139: LGTM!

The fallback to a placeholder API key enables tests to run without requiring real credentials, which is appropriate for unit tests using VCR cassettes.
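An assumed shape for that environment fixture, for concreteness:

```python
import os

import pytest


@pytest.fixture(autouse=True)
def environment():
    # Provide a placeholder key so VCR-backed tests run without real credentials.
    os.environ.setdefault("GOOGLE_API_KEY", "test-api-key")
```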

Comment on lines +36 to +47

```diff
 @pytest.fixture(scope="session")
 def exporter():
     exporter = InMemorySpanExporter()
-    yield exporter
-
-
-@pytest.fixture(scope="function", name="tracer_provider")
-def fixture_tracer_provider(span_exporter):
-    provider = TracerProvider()
-    provider.add_span_processor(SimpleSpanProcessor(span_exporter))
-    return provider
+    processor = SimpleSpanProcessor(exporter)
+
+    provider = TracerProvider()
+    provider.add_span_processor(processor)
+    set_tracer_provider(provider)
+
+    GoogleGenerativeAiInstrumentor().instrument()
+
+    return exporter
```

⚠️ Potential issue | 🔴 Critical

Missing tracer_provider fixture breaks dependent fixtures.

The tracer_provider fixture was inlined into exporter, but instrument_legacy (line 70), instrument_with_content (line 100), and instrument_with_no_content (line 116) still depend on it. These fixtures will fail with a fixture not found error.

Either re-add the tracer_provider fixture or update the dependent fixtures:

```diff
 @pytest.fixture(scope="session")
 def exporter():
     exporter = InMemorySpanExporter()
     processor = SimpleSpanProcessor(exporter)

     provider = TracerProvider()
     provider.add_span_processor(processor)
     set_tracer_provider(provider)

     GoogleGenerativeAiInstrumentor().instrument()

     return exporter
+
+
+@pytest.fixture(scope="function")
+def tracer_provider():
+    provider = TracerProvider()
+    return provider
```

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
around lines 36 to 47, the tracer_provider fixture was inlined into exporter
which breaks other fixtures that still declare a dependency on tracer_provider;
restore a session-scoped tracer_provider fixture that creates a TracerProvider,
calls set_tracer_provider(provider), and returns it, and then change the
exporter fixture signature to accept tracer_provider (so pytest injects it)
while keeping the exporter setup (InMemorySpanExporter, SimpleSpanProcessor,
instrument call) unchanged.

Comment on lines +81 to +88

```python
@pytest.fixture(scope="session")
def metrics_test_context():
    resource = Resource.create()
    reader = InMemoryMetricReader()
    provider = MeterProvider(metric_readers=[reader], resource=resource)
    metrics.set_meter_provider(provider)
    GoogleGenerativeAiInstrumentor().instrument(meter_provider=provider)
    return provider, reader
```

⚠️ Potential issue | 🟠 Major

Double instrumentation will cause issues.

Both exporter (line 45) and metrics_test_context (line 87) call GoogleGenerativeAiInstrumentor().instrument(). Since both are session-scoped, this will wrap methods twice, potentially causing duplicate spans/metrics or errors.

Consider consolidating instrumentation into a single fixture, or have the exporter fixture instrument with both tracer and meter providers:

```diff
 @pytest.fixture(scope="session")
-def exporter():
+def exporter(metrics_test_context):
     exporter = InMemorySpanExporter()
     processor = SimpleSpanProcessor(exporter)

     provider = TracerProvider()
     provider.add_span_processor(processor)
     set_tracer_provider(provider)

-    GoogleGenerativeAiInstrumentor().instrument()
+    meter_provider, _ = metrics_test_context
+    GoogleGenerativeAiInstrumentor().instrument(meter_provider=meter_provider)

     return exporter


 @pytest.fixture(scope="session")
 def metrics_test_context():
     resource = Resource.create()
     reader = InMemoryMetricReader()
     provider = MeterProvider(metric_readers=[reader], resource=resource)
     metrics.set_meter_provider(provider)
-    GoogleGenerativeAiInstrumentor().instrument(meter_provider=provider)
     return provider, reader
```

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
around lines 81 to 88, the test setup double-instruments the library because
both the exporter fixture (around line 45) and metrics_test_context (lines
81–88) call GoogleGenerativeAiInstrumentor().instrument(), which can wrap
methods twice; fix by consolidating instrumentation into one place: remove the
instrument() call from one fixture and instead pass the needed tracer_provider
and meter_provider into the single instrumentation call (or make a single
session-scoped instrumentation fixture used by both exporter and
metrics_test_context), ensuring instrumentation is executed exactly once and
that the exporter fixture uses the already-instrumented providers.

