anthropic: add improved streaming thinking/reasoning token support #1418
Merged
Implement StreamingReasoningFunc support in the Anthropic client to enable real-time streaming of thinking tokens during extended thinking responses.

Changes:
- Add StreamingReasoningFunc field to messagePayload, MessageRequest structs
- Modify handleThinkingDelta() to call StreamingReasoningFunc when thinking chunks arrive during streaming
- Wire up StreamingReasoningFunc from llms.CallOptions through to the Anthropic client payload
- Update setMessageDefaults to enable streaming when StreamingReasoningFunc is provided

This follows the same pattern as the OpenAI client (chat.go:638-663) and enables thinking tokens to stream in real-time at the BEGINNING of the response, rather than appearing after the response completes.

Fixes issue where thinking_delta events were not calling the streaming reasoning callback, causing thinking content to only be available after response completion.
0d3e668 to 5435f15
Summary
Adds improved real-time streaming support for thinking/reasoning tokens in the Anthropic client, so that thinking content streams before the response text instead of becoming available only after the response completes.
Changes
Implementation Details
Follows the same pattern as the OpenAI client (llms/openai/internal/openaiclient/chat.go:638-663). When thinking_delta events arrive from the Anthropic API, they are passed immediately to the StreamingReasoningFunc callback instead of being buffered until response completion.
Testing