Commit 87a5c75

fix(llmobs): fix content arg extraction for vertex ai integration [backport #12034 to 2.19] (#12061)
Backports #12034 to 2.19.

In [MLOS-42](https://datadoghq.atlassian.net/browse/MLOS-42) a customer was experiencing the following error:

```
ddtrace.internal.utils.ArgumentError: contents (at position 0)
```

The `content` argument was not being extracted properly from the keyword arguments passed to the `chat.send_message` method. This is because the Vertex AI integration looks for a `contents` keyword argument, but the field is named `content` in the [send_message](https://github.com/google-gemini/generative-ai-python/blob/main/google/generativeai/generative_models.py#L514) method and `contents` in the [generate_content](https://github.com/google-gemini/generative-ai-python/blob/main/google/generativeai/generative_models.py#L239) method, so the two cases must be distinguished. This PR is a small fix that corrects the error by differentiating between chat and completion requests and extracting `content` or `contents`, respectively.

## Testing

### Automatic Testing

I updated some of the existing tests to pass the content as a keyword argument rather than a positional argument, confirming that this fix resolves the error.

### Manual Testing

Running the following code reproduced the error; with this fix, the error is no longer present and the request completes successfully.

```
llm = GenerativeModel("gemini-1.5-flash")
chat = llm.start_chat()
resp = chat.send_message(content="hello")
```

I also verified that the following code, which uses the `generate_content` method, is not impacted by this fix (it continues to work as before).

```
llm = GenerativeModel("gemini-1.5-flash")
resp = llm.generate_content(contents="hello")
```

## Checklist

- [x] PR author has checked that all the criteria below are met
  - The PR description includes an overview of the change
  - The PR description articulates the motivation for the change
  - The change includes tests OR the PR description describes a testing strategy
  - The PR description notes risks associated with the change, if any
  - Newly-added code is easy to change
  - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
  - The change includes or references documentation updates if necessary
  - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Reviewer has checked that all the criteria below are met
  - Title is accurate
  - All changes are related to the pull request's stated goal
  - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
  - Testing strategy adequately addresses listed risks
  - Newly-added code is easy to change
  - Release note makes sense to a user of the library
  - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

[MLOS-42]: https://datadoghq.atlassian.net/browse/MLOS-42?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Yun Kim <[email protected]>

(cherry picked from commit 67a5a9c)
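For context, the failure mode is easy to reproduce with a simplified stand-in for the argument-extraction helper. This is an illustrative sketch only: the real `ddtrace.internal.utils.get_argument_value` is more involved, and only the prefer-positional-then-keyword behavior and the error message format are taken from the report above.

```python
# Illustrative stand-in for ddtrace.internal.utils.get_argument_value
# (simplified; assumes it prefers the positional argument and falls back
# to the named keyword argument -- not the actual implementation).
class ArgumentError(ValueError):
    pass


def get_argument_value(args, kwargs, position, kw_name):
    if len(args) > position:
        return args[position]
    if kw_name in kwargs:
        return kwargs[kw_name]
    raise ArgumentError("%s (at position %d)" % (kw_name, position))


# chat.send_message(content="hello") reaches the integration as
# args=(), kwargs={"content": "hello"}, so looking up "contents" fails:
try:
    get_argument_value((), {"content": "hello"}, 0, "contents")
except ArgumentError as e:
    print(e)  # -> contents (at position 0)
```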
1 parent 88bc85b commit 87a5c75

File tree

6 files changed: +22 -13 lines changed

ddtrace/contrib/internal/vertexai/_utils.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -177,13 +177,13 @@ def _tag_request_content(span, integration, content, content_idx):
         tag_request_content_part_google("vertexai", span, integration, part, part_idx, content_idx)
 
 
-def tag_request(span, integration, instance, args, kwargs):
+def tag_request(span, integration, instance, args, kwargs, is_chat):
     """Tag the generation span with request details.
     Includes capturing generation configuration, system prompt, and messages.
     """
     # instance is either a chat session or a model itself
     model_instance = instance if isinstance(instance, GenerativeModel) else instance._model
-    contents = get_argument_value(args, kwargs, 0, "contents")
+    contents = get_argument_value(args, kwargs, 0, "content" if is_chat else "contents")
     history = _get_attr(instance, "_history", [])
     if history:
         if isinstance(contents, list):
```
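With the new `is_chat` flag, `tag_request` picks the keyword name that matches the calling method. Continuing the illustrative stand-in defined above (not the real helper), both call styles now resolve:

```python
# Chat path: send_message(content=...) -> look up "content".
assert get_argument_value((), {"content": "hello"}, 0, "content") == "hello"
# Completion path: generate_content(contents=...) -> look up "contents".
assert get_argument_value((), {"contents": "hello"}, 0, "contents") == "hello"
# Positional calls are unaffected in either path.
assert get_argument_value(("hello",), {}, 0, "content") == "hello"
```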

ddtrace/contrib/internal/vertexai/patch.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -64,7 +64,7 @@ def _traced_generate(vertexai, pin, func, instance, args, kwargs, model_instance
     # history must be copied since it is modified during the LLM interaction
     history = getattr(instance, "history", [])[:]
     try:
-        tag_request(span, integration, instance, args, kwargs)
+        tag_request(span, integration, instance, args, kwargs, is_chat)
         generations = func(*args, **kwargs)
         if stream:
             return TracedVertexAIStreamResponse(
@@ -99,7 +99,7 @@ async def _traced_agenerate(vertexai, pin, func, instance, args, kwargs, model_instance
     # history must be copied since it is modified during the LLM interaction
     history = getattr(instance, "history", [])[:]
     try:
-        tag_request(span, integration, instance, args, kwargs)
+        tag_request(span, integration, instance, args, kwargs, is_chat)
         generations = await func(*args, **kwargs)
         if stream:
             return TracedAsyncVertexAIStreamResponse(
```

ddtrace/llmobs/_integrations/vertexai.py

Lines changed: 6 additions & 1 deletion

```diff
@@ -5,6 +5,7 @@
 from typing import Optional
 
 from ddtrace import Span
+from ddtrace.internal.utils import ArgumentError
 from ddtrace.internal.utils import get_argument_value
 from ddtrace.llmobs._constants import INPUT_MESSAGES
 from ddtrace.llmobs._constants import METADATA
@@ -45,7 +46,11 @@ def _llmobs_set_tags(
         metadata = llmobs_get_metadata_google(kwargs, instance)
 
         system_instruction = get_system_instructions_from_google_model(instance)
-        input_contents = get_argument_value(args, kwargs, 0, "contents")
+        input_contents = None
+        try:
+            input_contents = get_argument_value(args, kwargs, 0, "content")
+        except ArgumentError:
+            input_contents = get_argument_value(args, kwargs, 0, "contents")
         input_messages = self._extract_input_message(input_contents, history, system_instruction)
 
         output_messages = [{"content": ""}]
```
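Here the chat/completion distinction is resolved with a try/except fallback rather than a flag (presumably because `is_chat` is not threaded through to this LLM Observability hook). A minimal sketch of the pattern, reusing the illustrative stand-in defined earlier; the `extract_input_contents` helper is hypothetical, for illustration only:

```python
def extract_input_contents(args, kwargs):
    # Hypothetical helper for illustration: try the chat keyword first,
    # then fall back to the completion keyword.
    try:
        return get_argument_value(args, kwargs, 0, "content")    # send_message
    except ArgumentError:
        return get_argument_value(args, kwargs, 0, "contents")   # generate_content


assert extract_input_contents((), {"content": "hi"}) == "hi"
assert extract_input_contents((), {"contents": "hi"}) == "hi"
```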
Lines changed: 4 additions & 0 deletions

```diff
@@ -0,0 +1,4 @@
+---
+fixes:
+  - |
+    vertexai: Resolves an issue with ``chat.send_message()`` where the content keyword argument was not parsed correctly.
```

tests/contrib/vertexai/test_vertexai.py

Lines changed: 4 additions & 4 deletions

```diff
@@ -42,7 +42,7 @@ def test_vertexai_completion(vertexai):
     llm = vertexai.generative_models.GenerativeModel("gemini-1.5-flash")
     llm._prediction_client.responses["generate_content"].append(_mock_completion_response(MOCK_COMPLETION_SIMPLE_1))
     llm.generate_content(
-        "Why do bears hibernate?",
+        contents="Why do bears hibernate?",
         generation_config=vertexai.generative_models.GenerationConfig(
             stop_sequences=["x"], max_output_tokens=30, temperature=1.0
         ),
@@ -118,7 +118,7 @@ def test_vertexai_completion_stream(vertexai):
         (_mock_completion_stream_chunk(chunk) for chunk in MOCK_COMPLETION_STREAM_CHUNKS)
     ]
     response = llm.generate_content(
-        "How big is the solar system?",
+        contents="How big is the solar system?",
         generation_config=vertexai.generative_models.GenerationConfig(
             stop_sequences=["x"], max_output_tokens=30, temperature=1.0
         ),
@@ -278,7 +278,7 @@ def test_vertexai_chat(vertexai):
     llm._prediction_client.responses["generate_content"].append(_mock_completion_response(MOCK_COMPLETION_SIMPLE_1))
     chat = llm.start_chat()
     chat.send_message(
-        "Why do bears hibernate?",
+        content="Why do bears hibernate?",
         generation_config=vertexai.generative_models.GenerationConfig(
             stop_sequences=["x"], max_output_tokens=30, temperature=1.0
         ),
@@ -371,7 +371,7 @@ def test_vertexai_chat_stream(vertexai):
     ]
     chat = llm.start_chat()
     response = chat.send_message(
-        "How big is the solar system?",
+        content="How big is the solar system?",
         generation_config=vertexai.generative_models.GenerationConfig(
             stop_sequences=["x"], max_output_tokens=30, temperature=1.0
         ),
```

tests/contrib/vertexai/test_vertexai_llmobs.py

Lines changed: 4 additions & 4 deletions

```diff
@@ -21,7 +21,7 @@ def test_completion(self, vertexai, mock_llmobs_writer, mock_tracer):
         llm = vertexai.generative_models.GenerativeModel("gemini-1.5-flash")
         llm._prediction_client.responses["generate_content"].append(_mock_completion_response(MOCK_COMPLETION_SIMPLE_1))
         llm.generate_content(
-            "Why do bears hibernate?",
+            contents="Why do bears hibernate?",
             generation_config=vertexai.generative_models.GenerationConfig(
                 stop_sequences=["x"], max_output_tokens=30, temperature=1.0
             ),
@@ -126,7 +126,7 @@ def test_completion_stream(self, vertexai, mock_llmobs_writer, mock_tracer):
             (_mock_completion_stream_chunk(chunk) for chunk in MOCK_COMPLETION_STREAM_CHUNKS)
         ]
         response = llm.generate_content(
-            "How big is the solar system?",
+            contents="How big is the solar system?",
             generation_config=vertexai.generative_models.GenerationConfig(
                 stop_sequences=["x"], max_output_tokens=30, temperature=1.0
             ),
@@ -293,7 +293,7 @@ def test_chat(self, vertexai, mock_llmobs_writer, mock_tracer):
         llm._prediction_client.responses["generate_content"].append(_mock_completion_response(MOCK_COMPLETION_SIMPLE_1))
         chat = llm.start_chat()
         chat.send_message(
-            "Why do bears hibernate?",
+            content="Why do bears hibernate?",
             generation_config=vertexai.generative_models.GenerationConfig(
                 stop_sequences=["x"], max_output_tokens=30, temperature=1.0
             ),
@@ -389,7 +389,7 @@ def test_chat_stream(self, vertexai, mock_llmobs_writer, mock_tracer):
         ]
         chat = llm.start_chat()
         response = chat.send_message(
-            "How big is the solar system?",
+            content="How big is the solar system?",
             generation_config=vertexai.generative_models.GenerationConfig(
                 stop_sequences=["x"], max_output_tokens=30, temperature=1.0
             ),
```
