
Commit bc27ab3

feat: include complete turn response in Agent.create_turn
Summary: In #102, we made a turn's behavior more complete by automatically passing the tool response back and creating another turn when a client tool is used. However, this creates a problem with the non-streaming API, where the response object only contains information from the last tool call onward. This PR is a hacky attempt to address that by combining the Turn responses into one. Ideally, we should move all the loop logic to the server side, where a turn would pause and the client SDK would pass tool responses back to resume it.

I also changed the code to yield a proper ToolExecutionStep event instead of a ToolResponseMessage, so that client tool execution is logged the same way as server-side tool execution. I.e. it now outputs:

```
tool_execution> Tool:load_url Response:{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our
```

instead of:

```
CustomTool> {"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to 
```

Test Plan: Added a test in meta-llama/llama-stack#1078. Run a simple script with an Agent and a client tool, and observe that the returned response has steps from both created turns:

```
Turn(
    input_messages=[
        UserMessage(
            content='load https://llama-stack.readthedocs.io/en/latest/introduction/index.html and summarize it',
            role='user',
            context=None
        )
    ],
    output_message=CompletionMessage(
        content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
        role='assistant',
        stop_reason='end_of_turn',
        tool_calls=[]
    ),
    session_id='dec1c6c0-ed9b-42c1-97d7-906871acd5ba',
    started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 643186),
    steps=[
        InferenceStep(
            api_model_response=CompletionMessage(
                content='',
                role='assistant',
                stop_reason='end_of_turn',
                tool_calls=[
                    ToolCall(
                        arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
                        call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
                        tool_name='load_url'
                    )
                ]
            ),
            step_id='d724a238-d02b-4d77-a4bc-a978a54979c6',
            step_type='inference',
            turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
            completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 523310),
            started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 654535)
        ),
        ToolExecutionStep(
            step_id='49f19a5e-6a1e-4b1c-9232-fbafb82f2f89',
            step_type='tool_execution',
            tool_calls=[
                ToolCall(
                    arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
                    call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
                    tool_name='load_url'
                )
            ],
            tool_responses=[
                ToolResponse(
                    call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
                    content='{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to produce new hardware.\n\nPreviously, we have been working on our own replacement firmware: RebbleOS. As you can see by the commit history though, progress was slow. Building a production-ready realtime OS for the Pebble is no small feat, and although we were confident we’d get there given enough time, it was never our ideal path. Thanks to the hard work of many people both within Google and not, we finally have our hands on the original source code for PebbleOS. You can read Google’s blog post on this for even more information.\n\nThis does not mean we instantly have the ability to start developing updates for PebbleOS though, we first will need to spend some concentrated time getting it to build. But before we talk about that, let’s talk about Rebble itself.\n"}',
                    tool_name='load_url'
                )
            ],
            turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
            completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534830),
            started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534756)
        ),
        InferenceStep(
            api_model_response=CompletionMessage(
                content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
                role='assistant',
                stop_reason='end_of_turn',
                tool_calls=[]
            ),
            step_id='5e6daa91-e689-4d7a-a7f9-d7c3da2eca5a',
            step_type='inference',
            turn_id='8f65d88d-7643-4dd7-acc7-48cd9e8aa449',
            completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 179107),
            started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 561449)
        )
    ],
    turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
    completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 191199),
    output_attachments=[]
)
```
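The "merge chunks" behavior can be modeled in isolation. This is an illustrative sketch only, using plain dicts in place of the SDK's real Turn and stream-chunk types (the dict shapes are assumptions): it shows how the steps of every `turn_complete` chunk are concatenated, while `input_messages` and `turn_id` come from the first turn and `output_message` from the last.

```python
# Sketch of the non-streaming merge: plain dicts stand in for the
# SDK's Turn / AgentTurnResponseStreamChunk models (an assumption).

def merge_turn_chunks(chunks):
    """Combine multiple turn_complete payloads into one turn-like dict."""
    turns = [c["event"]["payload"]["turn"] for c in chunks]
    return {
        "input_messages": turns[0]["input_messages"],  # from the first turn
        "output_message": turns[-1]["output_message"],  # final answer wins
        "steps": [step for t in turns for step in t["steps"]],  # all steps, in order
        "turn_id": turns[0]["turn_id"],
    }

def turn_complete_chunk(turn):
    """Build a minimal chunk wrapper around a turn dict."""
    return {"event": {"payload": {"event_type": "turn_complete", "turn": turn}}}

first = turn_complete_chunk({
    "input_messages": ["load ... and summarize it"],
    "output_message": "",
    "steps": ["inference", "tool_execution"],
    "turn_id": "turn-a",
})
second = turn_complete_chunk({
    "input_messages": [],
    "output_message": "final summary",
    "steps": ["inference"],
    "turn_id": "turn-b",
})
merged = merge_turn_chunks([first, second])
# merged["steps"] now contains steps from both created turns
```

This mirrors why the example above shows three steps under a single turn even though two turns were created internally.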
1 parent b5dce10 commit bc27ab3

File tree

2 files changed: +63 −20 lines


src/llama_stack_client/lib/agents/agent.py

Lines changed: 63 additions & 12 deletions
```diff
@@ -10,11 +10,21 @@
 from llama_stack_client.types.agent_create_params import AgentConfig
 from llama_stack_client.types.agents.turn import Turn
 from llama_stack_client.types.agents.turn_create_params import Document, Toolgroup
-from llama_stack_client.types.agents.turn_create_response import AgentTurnResponseStreamChunk
+from llama_stack_client.types.agents.turn_create_response import (
+    AgentTurnResponseStreamChunk,
+)
+from llama_stack_client.types.agents.turn_response_event import TurnResponseEvent
+from llama_stack_client.types.agents.turn_response_event_payload import (
+    AgentTurnResponseStepCompletePayload,
+)
 from llama_stack_client.types.shared.tool_call import ToolCall
 from llama_stack_client.types.agents.turn import CompletionMessage
 from .client_tool import ClientTool
 from .tool_parser import ToolParser
+from datetime import datetime
+import uuid
+from llama_stack_client.types.tool_execution_step import ToolExecutionStep
+from llama_stack_client.types.tool_response import ToolResponse


 DEFAULT_MAX_ITER = 10

@@ -119,24 +129,36 @@ def create_turn(
         stream: bool = True,
     ) -> Iterator[AgentTurnResponseStreamChunk] | Turn:
         if stream:
-            return self._create_turn_streaming(messages, session_id, toolgroups, documents, stream)
+            return self._create_turn_streaming(messages, session_id, toolgroups, documents)
         else:
-            chunk = None
-            for chunk in self._create_turn_streaming(messages, session_id, toolgroups, documents, stream):
+            chunks = []
+            for chunk in self._create_turn_streaming(messages, session_id, toolgroups, documents):
+                if chunk.event.payload.event_type == "turn_complete":
+                    chunks.append(chunk)
                 pass
-            if not chunk:
-                raise Exception("No chunk returned")
-            if chunk.event.payload.event_type != "turn_complete":
+            if not chunks:
                 raise Exception("Turn did not complete")
-            return chunk.event.payload.turn
+
+            # merge chunks
+            return Turn(
+                input_messages=chunks[0].event.payload.turn.input_messages,
+                output_message=chunks[-1].event.payload.turn.output_message,
+                session_id=chunks[0].event.payload.turn.session_id,
+                steps=[step for chunk in chunks for step in chunk.event.payload.turn.steps],
+                turn_id=chunks[0].event.payload.turn.turn_id,
+                started_at=chunks[0].event.payload.turn.started_at,
+                completed_at=chunks[-1].event.payload.turn.completed_at,
+                output_attachments=[
+                    attachment for chunk in chunks for attachment in chunk.event.payload.turn.output_attachments
+                ],
+            )

     def _create_turn_streaming(
         self,
         messages: List[Union[UserMessage, ToolResponseMessage]],
         session_id: Optional[str] = None,
         toolgroups: Optional[List[Toolgroup]] = None,
         documents: Optional[List[Document]] = None,
-        stream: bool = True,
     ) -> Iterator[AgentTurnResponseStreamChunk]:
         stop = False
         n_iter = 0

@@ -161,10 +183,39 @@ def _create_turn_streaming(
             elif not tool_calls:
                 yield chunk
             else:
-                next_message = self._run_tool(tool_calls)
-                yield next_message
+                tool_execution_start_time = datetime.now()
+                tool_response_message = self._run_tool(tool_calls)
+                tool_execution_step = ToolExecutionStep(
+                    step_type="tool_execution",
+                    step_id=str(uuid.uuid4()),
+                    tool_calls=tool_calls,
+                    tool_responses=[
+                        ToolResponse(
+                            tool_name=tool_response_message.tool_name,
+                            content=tool_response_message.content,
+                            call_id=tool_response_message.call_id,
+                        )
+                    ],
+                    turn_id=chunk.event.payload.turn.turn_id,
+                    completed_at=datetime.now(),
+                    started_at=tool_execution_start_time,
+                )
+                yield AgentTurnResponseStreamChunk(
+                    event=TurnResponseEvent(
+                        payload=AgentTurnResponseStepCompletePayload(
+                            event_type="step_complete",
+                            step_id=tool_execution_step.step_id,
+                            step_type="tool_execution",
+                            step_details=tool_execution_step,
+                        )
+                    )
+                )
+
+                # HACK: append the tool execution step to the turn
+                chunk.event.payload.turn.steps.append(tool_execution_step)
+                yield chunk

             # continue the turn when there's a tool call
             stop = False
-            messages = [next_message]
+            messages = [tool_response_message]
             n_iter += 1
```
src/llama_stack_client/lib/agents/event_logger.py

Lines changed: 0 additions & 8 deletions
```diff
@@ -70,14 +70,6 @@ def _yield_printable_events(self, chunk, previous_event_type=None, previous_step
             yield TurnStreamPrintableEvent(role=None, content=chunk.error["message"], color="red")
             return

-        if not hasattr(chunk, "event"):
-            # Need to check for custom tool first
-            # since it does not produce event but instead
-            # a Message
-            if isinstance(chunk, ToolResponseMessage):
-                yield TurnStreamPrintableEvent(role="CustomTool", content=chunk.content, color="green")
-                return
-
         event = chunk.event
         event_type = event.payload.event_type
```
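With the `CustomTool` special case removed, client tool results flow through the normal `step_complete` path and are rendered like any server-side tool execution. A sketch of the resulting log line, inferred from the example output in the commit message (the dict shape and helper name are assumptions, not the logger's actual code):

```python
def format_tool_execution(step):
    """Render a tool_execution step in the new unified log format."""
    resp = step["tool_responses"][0]
    return f"tool_execution> Tool:{resp['tool_name']} Response:{resp['content']}"

line = format_tool_execution({
    "tool_responses": [{"tool_name": "load_url", "content": '{"content": "..."}'}]
})
# e.g. 'tool_execution> Tool:load_url Response:{"content": "..."}'
```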
