Skip to content

Conversation

HugoChaan
Copy link
Collaborator

@HugoChaan HugoChaan commented Oct 16, 2025

Summary by CodeRabbit

  • Refactor

    • Removed per-turn state tracking across TTS extensions; signals and events now track by request only.
    • Streamlined start/end/error signaling and improved metrics to include model and voice metadata.
  • Chores

    • Bumped manifest versions for updated TTS extensions.

Copy link

coderabbitai bot commented Oct 16, 2025

Walkthrough

Removed per-turn state (current_turn_id) from three TTS extensions and moved signal/metric reporting to request-based identifiers, adding extra_metadata (model/voice) for TTFB metrics; manifest versions updated for each extension.

Changes

Cohort / File(s) Summary
Cosy TTS extension
ai_agents/agents/ten_packages/extension/cosy_tts_python/extension.py
Removed current_turn_id attribute/usages; updated calls to send_tts_audio_start, send_tts_audio_end, send_tts_ttfb_metrics, and send_tts_error to use request_id and extra_metadata (model/voice) where applicable; updated logging and metrics paths.
HumeAI TTS extension
ai_agents/agents/ten_packages/extension/humeai_tts_python/extension.py
Removed current_turn_id; replaced turn-based payloads with request_id and extra_metadata (voice_id/voice_name) for metrics and signals; adjusted PCMWriter/session handling to not rely on turn ID.
StepFun TTS extension
ai_agents/agents/ten_packages/extension/stepfun_tts_python/extension.py
Removed current_turn_id; updated TTS start/end and TTFB metric calls to use request_id, request_event_interval_ms, request_total_audio_duration_ms, reason, and extra_metadata (model/voice).
Manifests (version bumps)
ai_agents/agents/ten_packages/extension/cosy_tts_python/manifest.json, ai_agents/agents/ten_packages/extension/humeai_tts_python/manifest.json, ai_agents/agents/ten_packages/extension/stepfun_tts_python/manifest.json
Bumped manifest versions: cosy 0.3.0 → 0.3.1, humeai 0.2.4 → 0.3.0, stepfun 0.2.0 → 0.3.0. No API surface changes beyond internal state removal.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client as Client
    participant Ext as TTS Extension
    participant Signaler as Signaling/Telemetry

    Note over Client,Ext: New request-based flow (no current_turn_id)
    Client->>Ext: request_tts(request_id, t, config)
    Ext->>Signaler: send_tts_audio_start(request_id)
    Note right of Signaler: start event recorded
    Ext->>Ext: stream audio chunks
    Ext->>Signaler: send_tts_ttfb_metrics(request_id, ttfb_ms, extra_metadata)
    Note right of Signaler: include model/voice in extra_metadata
    Ext->>Signaler: send_tts_audio_end(request_id, request_event_interval_ms, request_total_audio_duration_ms, reason)
    Note right of Signaler: end event recorded
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • halajohn
  • plutoless

Poem

🐇 I hopped through code to clear the trail,

No turns to chase, just one request's tale.
Voices bundled in metadata bright,
Signals flow clean, from start to night.
A tiny rabbit cheers: "Simplicity's right!"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "feat: opt tts remove turnid" is related to the main changes in the changeset. The PR modifies multiple TTS extensions (CosyTTSExtension, HumeaiTTSExtension, and StepFunTTSExtension) by removing the current_turn_id attribute and refactoring the code to use request_id-based tracking with enhanced metadata instead. The title accurately captures this primary change, though it uses abbreviated and informal style ("opt" for optimization, "turnid" concatenated without spacing). Despite these stylistic choices, a developer scanning the history would understand that the change involves removing turn_id tracking from TTS extensions, which is indeed the core objective of this PR.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/tts_opt

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f61598 and 77d1eb7.

📒 Files selected for processing (3)
  • ai_agents/agents/ten_packages/extension/cosy_tts_python/extension.py (6 hunks)
  • ai_agents/agents/ten_packages/extension/humeai_tts_python/extension.py (3 hunks)
  • ai_agents/agents/ten_packages/extension/stepfun_tts_python/extension.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: build (speechmatics-diarization, ./agents/examples/speechmatics-diarization/Dockerfile, ten_agent...
  • GitHub Check: build (voice-assistant-sip-twilio, ./agents/examples/voice-assistant-sip-twilio/Dockerfile, ten_a...
  • GitHub Check: build (transcription, ./agents/examples/transcription/Dockerfile, ten_agent_example_transcription)
  • GitHub Check: build (voice-assistant-nodejs, ./agents/examples/voice-assistant-nodejs/Dockerfile, ten_agent_exa...
  • GitHub Check: build (stepfun-demo, ./agents/examples/stepfun-demo/Dockerfile, ten_agent_example_stepfun_demo)
  • GitHub Check: build (voice-assistant-realtime, ./agents/examples/voice-assistant-realtime/Dockerfile, ten_agent...
  • GitHub Check: build (demo, ./agents/examples/demo/Dockerfile, ten_agent_example_demo)
  • GitHub Check: claude-review
  • GitHub Check: build (voice-assistant-with-ten-vad, ./agents/examples/voice-assistant-with-ten-vad/Dockerfile, t...
  • GitHub Check: build (voice-assistant-nodejs, ./agents/examples/voice-assistant-nodejs/Dockerfile, ten_agent_exa...
  • GitHub Check: build (voice-assistant-live2d, ./agents/examples/voice-assistant-live2d/Dockerfile, ten_agent_exa...
  • GitHub Check: build (speechmatics-diarization, ./agents/examples/speechmatics-diarization/Dockerfile, ten_agent...
  • GitHub Check: build (voice-assistant-video, ./agents/examples/voice-assistant-video/Dockerfile, ten_agent_examp...
  • GitHub Check: build (voice-assistant-sip-twilio, ./agents/examples/voice-assistant-sip-twilio/Dockerfile, ten_a...
  • GitHub Check: build (stepfun-demo, ./agents/examples/stepfun-demo/Dockerfile, ten_agent_example_stepfun_demo)
  • GitHub Check: build (voice-assistant-with-memU, ./agents/examples/voice-assistant-with-memU/Dockerfile, ten_age...
  • GitHub Check: build (transcription, ./agents/examples/transcription/Dockerfile, ten_agent_example_transcription)
  • GitHub Check: build (voice-assistant-realtime, ./agents/examples/voice-assistant-realtime/Dockerfile, ten_agent...
  • GitHub Check: build (demo, ./agents/examples/demo/Dockerfile, ten_agent_example_demo)
  • GitHub Check: build (voice-assistant, ./agents/examples/voice-assistant/Dockerfile, ten_agent_example_voice_ass...
🔇 Additional comments (3)
ai_agents/agents/ten_packages/extension/stepfun_tts_python/extension.py (1)

124-128: LGTM! Clean migration to request-based tracking.

The removal of current_turn_id and migration to request-based identifiers simplifies the state management. Key improvements:

  • All send_tts_audio_end and send_tts_ttfb_metrics calls now use explicit keyword arguments, improving clarity
  • Addition of extra_metadata with model and voice_id enhances observability
  • Null safety maintained through conditional checks on current_request_id

Also applies to: 320-342, 387-390

ai_agents/agents/ten_packages/extension/humeai_tts_python/extension.py (1)

138-142: LGTM! Consistent migration pattern.

The changes align with the other TTS extensions in this PR. Notable improvements:

  • Migration to keyword arguments for audio start/end and TTFB metrics calls
  • extra_metadata includes voice_id and voice_name for better tracking
  • Proper null safety through conditional checks at lines 215 and 254

Also applies to: 221-235, 262-265

ai_agents/agents/ten_packages/extension/cosy_tts_python/extension.py (1)

260-260: LGTM! Thorough migration with improved logging.

The changes are consistent with the other extensions and include comprehensive log message updates. Strengths:

  • All method calls updated to use keyword arguments
  • extra_metadata properly includes model and voice information
  • Log messages consistently updated to remove current_turn_id references
  • State transitions properly handled through request_start_ts and current_request_id checks

Also applies to: 297-297, 323-323, 456-467, 498-502, 470-470, 510-510


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

claude bot commented Oct 16, 2025

Pull Request Review: feat: opt tts remove turnid

Summary

This PR removes the turn_id parameter from three TTS extensions. The change simplifies the TTS interface by removing conversation turn tracking and adds provider-specific metadata to TTFB metrics.

Positive Aspects

  1. Consistent Refactoring across all three TTS extensions
  2. Simplified Interface by removing turn_id parameter
  3. Enhanced Metrics with extra_metadata for provider-specific info
  4. Version Bumping in all manifest.json files
  5. Code Cleanup removing unused state variables

Code Quality

Good practices: Clean removal of turn_id references, updated log messages, no unnecessary code.

Areas for consideration:

  1. Method signature changes suggest base class interface changed - are other TTS extensions updated too?
  2. Extra metadata structure varies across providers (model/voice vs voice_id/voice_name) - is this intentional?

Potential Issues

  1. Breaking change if downstream consumers depend on turn_id
  2. Missing context in logs - is request_id sufficient for debugging?
  3. Scope completeness - found 17+ other TTS extensions that may need updates

Security: No issues identified

Performance: Positive impact from removing unused state tracking

Test Coverage

Test files exist. Recommendation: Verify tests cover new method signatures and extra_metadata.

Action Items Before Merging

  1. Verify base class AsyncTTS2BaseExtension updated
  2. Check all other TTS extensions for turn_id usage
  3. Run test suites for all three extensions
  4. Update API documentation if it exists
  5. Verify no downstream systems depend on turn_id

Overall Assessment

Quality: 4/5 stars

Clean refactoring with good execution. Main concern is ensuring consistent application across ALL TTS extensions in the codebase. Please confirm if only these three needed updates or if this is part 1 of a larger refactoring.

Great work on the consistent implementation!

Copy link

claude bot commented Oct 17, 2025

PR Review: feat: opt tts remove turnid

Summary

This PR removes the turn_id tracking mechanism from three TTS extensions (cosy_tts, humeai_tts, and stepfun_tts) and enhances metrics collection with additional voice/model metadata. The changes are well-structured and improve code maintainability.

✅ Strengths

  1. Consistent Refactoring: The removal of turn_id is applied consistently across all three extensions, showing good attention to detail.

  2. Enhanced Metrics: The addition of voice/model metadata to TTFB metrics is valuable for debugging and monitoring:

    • cosy_tts: Adds model and voice
    • humeai_tts: Adds voice_id and voice_name
    • stepfun_tts: Adds model and voice_id
  3. Improved API: Transition from positional to named parameters (kwargs) for method calls like send_tts_audio_start(), send_tts_ttfb_metrics(), and send_tts_audio_end() improves readability and reduces errors.

  4. Cleaner Logging: Removed unnecessary turn_id references from log messages, making logs more concise.

  5. Proper Version Bumps: All three extensions have appropriate version increments in their manifest files.

⚠️ Concerns & Recommendations

1. Inconsistency Across TTS Extensions

Issue: There are 14+ other TTS extensions in the codebase that still use self.current_turn_id:

  • tencent_tts_python
  • rime_tts
  • playht_tts_python
  • polly_tts
  • openai_tts2_python
  • elevenlabs_tts2_python
  • cartesia_tts2
  • azure_tts_python
  • and 6 more...

Recommendation:

  • Consider creating follow-up issues/PRs to remove turn_id from remaining TTS extensions for consistency
  • Document why only these three extensions were updated in this PR
  • Add a TODO comment or tracking issue for the remaining extensions

2. Breaking Changes Not Documented

Issue: The signature changes to send_tts_* methods appear to be breaking changes:

# Before
await self.send_tts_audio_start(self.current_request_id, self.current_turn_id)

# After  
await self.send_tts_audio_start(self.current_request_id)

Recommendation:

  • Verify if the base class methods have been updated to match these new signatures
  • Document any migration requirements in the PR description or commit message
  • Ensure backward compatibility or document the breaking change

3. Metadata Null Safety

Issue: In stepfun_tts_python/extension.py:207-212 (removed code), there was metadata extraction:

if t.metadata is not None:
    self.session_id = t.metadata.get("session_id", "")
    self.current_turn_id = t.metadata.get("turn_id", -1)

The session_id extraction was also removed, but it's unclear if this was intentional.

Recommendation:

  • Verify that session_id removal doesn't break any functionality
  • If session_id is still needed elsewhere, this should be restored

4. Error Handling Consistency

Issue: The _send_tts_error() method in cosy_tts_python/extension.py:573 now has one fewer parameter, but we can't verify if the base implementation has been updated.

Recommendation:

  • Add integration tests to verify error handling still works correctly
  • Ensure all error paths are covered

5. Test Coverage

Issue: While tests exist for these extensions, there's no evidence in the PR that tests were updated to reflect the API changes.

Recommendation:

  • Update test files to verify the new metric metadata is correctly sent
  • Add tests for the new named parameter API
  • Verify all existing tests pass with the changes

🔒 Security

No security concerns identified. The changes are internal refactoring with no exposure of sensitive data.

⚡ Performance

Positive Impact: Removing unused turn_id tracking reduces memory overhead and simplifies state management. The additional metadata in metrics has negligible performance impact.

📋 Minor Issues

  1. stepfun_tts_python Line 207: Commented code shows session_id was also being extracted from metadata - verify this isn't needed
  2. Code Formatting: All changes follow Python style conventions properly
  3. Type Hints: Existing type hints are maintained correctly

✅ Recommendations Summary

Before Merge:

  1. Verify base class method signatures have been updated
  2. Confirm session_id removal in stepfun_tts is intentional
  3. Run all tests and ensure they pass

After Merge:

  1. Create tracking issues for remaining TTS extensions
  2. Update documentation to reflect the API changes
  3. Consider adding migration guide if this is a breaking change

Overall Assessment

This is a well-executed refactoring that improves code quality and observability. The primary concern is ensuring consistency across all TTS extensions and verifying that dependent code has been properly updated. With the recommended verifications, this PR should be safe to merge.

Recommendation: ✅ Approve with minor verification of the points listed above.

@plutoless plutoless merged commit 5bb2def into main Oct 17, 2025
38 of 47 checks passed
@plutoless plutoless deleted the feat/tts_opt branch October 17, 2025 04:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants