feat: include complete turn response in Agent.create_turn #141
Merged
Conversation
hardikjshah reviewed Feb 13, 2025
ehhuang added a commit to meta-llama/llama-stack that referenced this pull request Feb 13, 2025
Summary: This tests the fix to the SDK in meta-llama/llama-stack-client-python#141
Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
ehhuang added a commit to meta-llama/llama-stack that referenced this pull request Feb 13, 2025
Summary: This tests the fix to the SDK in meta-llama/llama-stack-client-python#141
Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
777b11e to bc27ab3
hardikjshah approved these changes Feb 14, 2025
Awesome, this makes it super consistent on server / client.
hardikjshah pushed a commit to meta-llama/llama-stack that referenced this pull request Feb 14, 2025
Summary: This tests the fix to the SDK in meta-llama/llama-stack-client-python#141
Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
franciscojavierarceo pushed a commit to franciscojavierarceo/llama-stack that referenced this pull request Feb 14, 2025
…ma#1078)
Summary: This tests the fix to the SDK in meta-llama/llama-stack-client-python#141
Test Plan: LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
Summary:
In #102, we made a turn's behavior more complete by automatically passing the tool response back and creating another turn when a client tool is used.
However, this creates a problem with the non-streaming API, where the response object only contains information generated since the last tool call.
This PR is a hacky attempt to address this by combining the Turn responses into one. I think ideally we should move all the loop logic to the server side, where a turn would pause and the client SDK would pass tool responses back to resume it.
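Conceptually, the combination amounts to concatenating the steps of every turn produced by the client-tool loop and keeping the last turn's output message. The sketch below is only an illustration of that idea, not the code in this PR; the `combine_turns` helper is hypothetical, and it assumes `Turn` is a pydantic model exposing the `steps`, `output_message`, and `completed_at` fields shown in the example output under Test Plan.

```python
from llama_stack_client.types.agents.turn import Turn  # import path may differ by SDK version


def combine_turns(turns: list[Turn]) -> Turn:
    """Hypothetical helper: merge the turns created by a client-tool loop into one.

    Keeps the identity and start time of the first turn, concatenates all steps
    in order, and takes the output message and completion time from the last turn.
    """
    first, last = turns[0], turns[-1]
    combined = first.model_copy(deep=True)  # Turn is a pydantic model in the generated SDK
    combined.steps = [step for turn in turns for step in turn.steps]
    combined.output_message = last.output_message
    combined.completed_at = last.completed_at
    return combined
```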
I also changed it to yield a proper ToolExecutionStep event instead of a ToolResponseMessage, so that client tool execution is treated the same as server-side tool execution in terms of logging. I.e. it now outputs:
"tool_execution> Tool:load_url Response:{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our" instead of "CustomTool> {"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to "
Test Plan:
Added a test in meta-llama/llama-stack#1078.
Run a simple script with Agent and a client tool. Observe that the returned response has steps from both created turns.
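A sketch of such a script follows. The Agent constructor arguments, the `client_tool` decorator, the model id, and the base URL are assumptions that may vary across llama-stack-client versions; `load_url` is a stand-in client tool. The non-streaming call (`stream=False`) is where the combined Turn matters.

```python
import httpx

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.client_tool import client_tool  # decorator name assumed


@client_tool
def load_url(url: str) -> dict:
    """Load a web page and return its text content.

    :param url: the page to fetch
    :returns: the page body wrapped in a dict
    """
    return {"content": httpx.get(url).text}


client = LlamaStackClient(base_url="http://localhost:8321")  # adjust to your server
agent = Agent(  # constructor signature varies across llama-stack-client versions
    client,
    model="meta-llama/Llama-3.3-70B-Instruct",  # any tool-capable model id
    instructions="You are a helpful assistant.",
    tools=[load_url],
)
session_id = agent.create_session("client-tool-test")

# Non-streaming: with this change the returned Turn carries steps from every
# turn created by the client-tool loop, not just the last one.
turn = agent.create_turn(
    messages=[
        {
            "role": "user",
            "content": "load https://llama-stack.readthedocs.io/en/latest/introduction/index.html and summarize it",
        }
    ],
    session_id=session_id,
    stream=False,
)
for step in turn.steps:
    print(step.step_type, step.started_at, step.completed_at)
print(turn.output_message.content)
```

The pretty-printed Turn returned by this call looks like: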
```
Turn(
│ input_messages=[
│ │ UserMessage(
│ │ │ content='load https://llama-stack.readthedocs.io/en/latest/introduction/index.html and summarize it',
│ │ │ role='user',
│ │ │ context=None
│ │ )
│ ],
│ output_message=CompletionMessage(
│ │ content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│ │ role='assistant',
│ │ stop_reason='end_of_turn',
│ │ tool_calls=[]
│ ),
│ session_id='dec1c6c0-ed9b-42c1-97d7-906871acd5ba',
│ started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 643186),
│ steps=[
│ │ InferenceStep(
│ │ │ api_model_response=CompletionMessage(
│ │ │ │ content='',
│ │ │ │ role='assistant',
│ │ │ │ stop_reason='end_of_turn',
│ │ │ │ tool_calls=[
│ │ │ │ │ ToolCall(
│ │ │ │ │ │ arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│ │ │ │ │ │ call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│ │ │ │ │ │ tool_name='load_url'
│ │ │ │ │ )
│ │ │ │ ]
│ │ │ ),
│ │ │ step_id='d724a238-d02b-4d77-a4bc-a978a54979c6',
│ │ │ step_type='inference',
│ │ │ turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│ │ │ completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 523310),
│ │ │ started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 654535)
│ │ ),
│ │ ToolExecutionStep(
│ │ │ step_id='49f19a5e-6a1e-4b1c-9232-fbafb82f2f89',
│ │ │ step_type='tool_execution',
│ │ │ tool_calls=[
│ │ │ │ ToolCall(
│ │ │ │ │ arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│ │ │ │ │ call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│ │ │ │ │ tool_name='load_url'
│ │ │ │ )
│ │ │ ],
│ │ │ tool_responses=[
│ │ │ │ ToolResponse(
│ │ │ │ │ call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│ │ │ │ │ content='{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to produce new hardware.\n\nPreviously, we have been working on our own replacement firmware: RebbleOS. As you can see by the commit history though, progress was slow. Building a production-ready realtime OS for the Pebble is no small feat, and although we were confident we’d get there given enough time, it was never our ideal path. Thanks to the hard work of many people both within Google and not, we finally have our hands on the original source code for PebbleOS. You can read Google’s blog post on this for even more information.\n\nThis does not mean we instantly have the ability to start developing updates for PebbleOS though, we first will need to spend some concentrated time getting it to build. But before we talk about that, let’s talk about Rebble itself.\n"}',
│ │ │ │ │ tool_name='load_url'
│ │ │ │ )
│ │ │ ],
│ │ │ turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│ │ │ completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534830),
│ │ │ started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534756)
│ │ ),
│ │ InferenceStep(
│ │ │ api_model_response=CompletionMessage(
│ │ │ │ content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│ │ │ │ role='assistant',
│ │ │ │ stop_reason='end_of_turn',
│ │ │ │ tool_calls=[]
│ │ │ ),
│ │ │ step_id='5e6daa91-e689-4d7a-a7f9-d7c3da2eca5a',
│ │ │ step_type='inference',
│ │ │ turn_id='8f65d88d-7643-4dd7-acc7-48cd9e8aa449',
│ │ │ completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 179107),
│ │ │ started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 561449)
│ │ )
│ ],
│ turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│ completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 191199),
│ output_attachments=[]
)
```