feat: include complete turn response in Agent.create_turn #141
Merged
Conversation
hardikjshah reviewed Feb 13, 2025
ehhuang added a commit to meta-llama/llama-stack that referenced this pull request Feb 13, 2025
Summary: This tests the fix to the SDK in meta-llama/llama-stack-client-python#141
Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
ehhuang added a commit to meta-llama/llama-stack that referenced this pull request Feb 13, 2025
Summary: This tests the fix to the SDK in meta-llama/llama-stack-client-python#141
Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
777b11e to bc27ab3
hardikjshah approved these changes Feb 14, 2025
Awesome, this makes it super consistent on server / client.
hardikjshah pushed a commit to meta-llama/llama-stack that referenced this pull request Feb 14, 2025
Summary: This tests the fix to the SDK in meta-llama/llama-stack-client-python#141
Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
franciscojavierarceo pushed a commit to franciscojavierarceo/llama-stack that referenced this pull request Feb 14, 2025
…ma#1078)
Summary: This tests the fix to the SDK in meta-llama/llama-stack-client-python#141
Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
Summary:
In #102, we made a turn's behavior more complete by automatically passing the tool response back and creating another turn when a client tool is used.
However, this creates a problem with the non-streaming API, where the response object only contains information generated since the last tool call.
This PR is a hacky attempt to address this by combining the Turn responses into one. I think ideally we should move all the loop logic to the server side, where a turn would pause and the client SDK would pass tool responses back to resume it.
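Conceptually, the combination amounts to concatenating the steps of every turn produced by the client-tool loop and keeping the last turn's output message. The sketch below is only an illustration of that idea, not the code in this PR; the `combine_turns` helper is hypothetical, and it assumes `Turn` is a pydantic model exposing the `steps`, `output_message`, and `completed_at` fields shown in the example output under Test Plan.

```python
from llama_stack_client.types.agents.turn import Turn  # import path may differ by SDK version


def combine_turns(turns: list[Turn]) -> Turn:
    """Hypothetical helper: merge the turns created by a client-tool loop into one.

    Keeps the identity and start time of the first turn, concatenates all steps
    in order, and takes the output message and completion time from the last turn.
    """
    first, last = turns[0], turns[-1]
    combined = first.model_copy(deep=True)  # Turn is a pydantic model in the generated SDK
    combined.steps = [step for turn in turns for step in turn.steps]
    combined.output_message = last.output_message
    combined.completed_at = last.completed_at
    return combined
```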
I also changed it to yield a proper ToolExecutionStep event instead of a ToolResponseMessage, so that client tool execution is treated the same as server-side tool execution in terms of logging. I.e. it now outputs:
"tool_execution> Tool:load_url Response:{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our" instead of "CustomTool> {"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to "
Test Plan:
Added a test in meta-llama/llama-stack#1078.
Run a simple script with Agent and a client tool. Observe that the returned response has steps from both created turns.
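A sketch of such a script follows. The Agent constructor arguments, the `client_tool` decorator, the model id, and the base URL are assumptions that may vary across llama-stack-client versions; `load_url` is a stand-in client tool. The non-streaming call (`stream=False`) is where the combined Turn matters.

```python
import httpx

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.client_tool import client_tool  # decorator name assumed


@client_tool
def load_url(url: str) -> dict:
    """Load a web page and return its text content.

    :param url: the page to fetch
    :returns: the page body wrapped in a dict
    """
    return {"content": httpx.get(url).text}


client = LlamaStackClient(base_url="http://localhost:8321")  # adjust to your server
agent = Agent(  # constructor signature varies across llama-stack-client versions
    client,
    model="meta-llama/Llama-3.3-70B-Instruct",  # any tool-capable model id
    instructions="You are a helpful assistant.",
    tools=[load_url],
)
session_id = agent.create_session("client-tool-test")

# Non-streaming: with this change the returned Turn carries steps from every
# turn created by the client-tool loop, not just the last one.
turn = agent.create_turn(
    messages=[
        {
            "role": "user",
            "content": "load https://llama-stack.readthedocs.io/en/latest/introduction/index.html and summarize it",
        }
    ],
    session_id=session_id,
    stream=False,
)
for step in turn.steps:
    print(step.step_type, step.started_at, step.completed_at)
print(turn.output_message.content)
```

The pretty-printed Turn returned by this call looks like: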
```
Turn(
│ input_messages=[
│ │ UserMessage(
│ │ │ content='load https://llama-stack.readthedocs.io/en/latest/introduction/index.html and summarize it',
│ │ │ role='user',
│ │ │ context=None
│ │ )
│ ],
│ output_message=CompletionMessage(
│ │ content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│ │ role='assistant',
│ │ stop_reason='end_of_turn',
│ │ tool_calls=[]
│ ),
│ session_id='dec1c6c0-ed9b-42c1-97d7-906871acd5ba',
│ started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 643186),
│ steps=[
│ │ InferenceStep(
│ │ │ api_model_response=CompletionMessage(
│ │ │ │ content='',
│ │ │ │ role='assistant',
│ │ │ │ stop_reason='end_of_turn',
│ │ │ │ tool_calls=[
│ │ │ │ │ ToolCall(
│ │ │ │ │ │ arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│ │ │ │ │ │ call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│ │ │ │ │ │ tool_name='load_url'
│ │ │ │ │ )
│ │ │ │ ]
│ │ │ ),
│ │ │ step_id='d724a238-d02b-4d77-a4bc-a978a54979c6',
│ │ │ step_type='inference',
│ │ │ turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│ │ │ completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 523310),
│ │ │ started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 654535)
│ │ ),
│ │ ToolExecutionStep(
│ │ │ step_id='49f19a5e-6a1e-4b1c-9232-fbafb82f2f89',
│ │ │ step_type='tool_execution',
│ │ │ tool_calls=[
│ │ │ │ ToolCall(
│ │ │ │ │ arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│ │ │ │ │ call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│ │ │ │ │ tool_name='load_url'
│ │ │ │ )
│ │ │ ],
│ │ │ tool_responses=[
│ │ │ │ ToolResponse(
│ │ │ │ │ call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│ │ │ │ │ content='{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to produce new hardware.\n\nPreviously, we have been working on our own replacement firmware: RebbleOS. As you can see by the commit history though, progress was slow. Building a production-ready realtime OS for the Pebble is no small feat, and although we were confident we’d get there given enough time, it was never our ideal path. Thanks to the hard work of many people both within Google and not, we finally have our hands on the original source code for PebbleOS. You can read Google’s blog post on this for even more information.\n\nThis does not mean we instantly have the ability to start developing updates for PebbleOS though, we first will need to spend some concentrated time getting it to build. But before we talk about that, let’s talk about Rebble itself.\n"}',
│ │ │ │ │ tool_name='load_url'
│ │ │ │ )
│ │ │ ],
│ │ │ turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│ │ │ completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534830),
│ │ │ started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534756)
│ │ ),
│ │ InferenceStep(
│ │ │ api_model_response=CompletionMessage(
│ │ │ │ content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│ │ │ │ role='assistant',
│ │ │ │ stop_reason='end_of_turn',
│ │ │ │ tool_calls=[]
│ │ │ ),
│ │ │ step_id='5e6daa91-e689-4d7a-a7f9-d7c3da2eca5a',
│ │ │ step_type='inference',
│ │ │ turn_id='8f65d88d-7643-4dd7-acc7-48cd9e8aa449',
│ │ │ completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 179107),
│ │ │ started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 561449)
│ │ )
│ ],
│ turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│ completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 191199),
│ output_attachments=[]
)
```