Add runtime stream enable/disable/status commands#951

Open
ThomasK33 wants to merge 3 commits into vercel-labs:main from coder:runtime-stream-management

@ThomasK33 ThomasK33 commented Mar 20, 2026

Summary

  • add runtime stream enable, stream status, and stream disable commands for already-running daemon sessions
  • add runtime stream server lifecycle management, .stream metadata handling, and explicit shutdown behavior
  • update help/docs and add unit + e2e coverage for runtime streaming flows

Validation

  • cd cli && cargo test stream_ -- --nocapture
  • cd cli && cargo test e2e_runtime_stream_enable_before_launch_attaches_and_disables -- --ignored --test-threads=1
  • cd cli && cargo test

📋 Implementation Plan

Implementation Plan: Runtime stream enablement for an already-running session

Goal

Add a supported runtime path to enable WebSocket streaming on an already-running agent-browser daemon/session without requiring a daemon restart, plus companion status/disable flows so the lifecycle is inspectable and reversible.

Verified repo context

  • The stream server is only created during daemon startup from AGENT_BROWSER_STREAM_PORT in cli/src/native/daemon.rs.
  • DaemonState already models streaming as optional runtime state via stream_client and stream_server, and update_stream_client() already re-wires the browser/client relationship when the browser launches or closes in cli/src/native/actions.rs.
  • cli/src/native/stream.rs already supports starting a stream server with or without an attached CDP client via StreamServer::start() and StreamServer::start_without_client().
  • The daemon already exposes screencast_start / screencast_stop, but those only control frame production; they do not create the WebSocket server.
  • There is currently no runtime daemon action or CLI command to enable, disable, or inspect streaming after startup, and .stream file management currently lives in daemon startup/shutdown code.

Implementation constraints inferred from the current code

  • .stream file lifecycle must stay accurate for discovery scripts and existing tooling.
  • Runtime enablement must work whether a browser is already attached or not.
  • Repeated enable/disable calls need deterministic behavior and clear user-facing errors.
  • The implementation should prefer explicit cleanup over relying on Arc drops alone if the current StreamServer lifecycle is ambiguous.
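
The .stream lifecycle constraint above could be captured in a small helper set. This is a minimal sketch, not the repo's code: it assumes a file named `<session>.stream` inside the socket directory that holds the port as decimal text, and the function signatures are hypothetical (only the name remove_stream_file appears in this PR's review).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Assumed layout: `<socket_dir>/<session>.stream` containing the port as text.
fn stream_file_path(socket_dir: &Path, session: &str) -> PathBuf {
    socket_dir.join(format!("{session}.stream"))
}

/// Records the port actually bound by the stream server.
fn write_stream_file(socket_dir: &Path, session: &str, port: u16) -> io::Result<()> {
    fs::write(stream_file_path(socket_dir, session), port.to_string())
}

/// Returns `Ok(None)` when no stream file exists, i.e. streaming is not advertised.
fn read_stream_file(socket_dir: &Path, session: &str) -> io::Result<Option<u16>> {
    match fs::read_to_string(stream_file_path(socket_dir, session)) {
        Ok(text) => text
            .trim()
            .parse::<u16>()
            .map(Some)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Removes the stream file; a file that is already gone is not treated as an error.
fn remove_stream_file(socket_dir: &Path, session: &str) -> io::Result<()> {
    match fs::remove_file(stream_file_path(socket_dir, session)) {
        Ok(()) => Ok(()),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(e) => Err(e),
    }
}
```

Treating a missing file as success on removal keeps repeated cleanup calls harmless, while any other I/O error still surfaces to the caller.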

Proposed product surface

User-facing CLI

Add a stream command group:

  • agent-browser stream enable [--port <port>]
  • agent-browser stream disable
  • agent-browser stream status

Underlying daemon actions

  • stream_enable
  • stream_disable
  • stream_status

Recommended response shape

  • stream_enable: { enabled: true, port, connected, screencasting }
  • stream_disable: { disabled: true }
  • stream_status: { enabled, port: number|null, connected, screencasting }
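
The stream_status shape above could map onto a plain Rust struct, with the nullable port modeled as an Option. This is a sketch under assumptions: the struct name StreamStatus and the hand-rolled to_json method are illustrative only, and the real daemon may well build its responses differently (for example with a JSON library).

```rust
/// Status payload mirroring the recommended shape; `port` is `None` when disabled.
#[derive(Debug, Clone, PartialEq)]
struct StreamStatus {
    enabled: bool,
    port: Option<u16>,
    connected: bool,
    screencasting: bool,
}

impl StreamStatus {
    /// Hand-rolled JSON rendering to keep the sketch dependency-free.
    fn to_json(&self) -> String {
        let port = match self.port {
            Some(p) => p.to_string(),
            None => "null".to_string(),
        };
        format!(
            "{{\"enabled\":{},\"port\":{},\"connected\":{},\"screencasting\":{}}}",
            self.enabled, port, self.connected, self.screencasting
        )
    }
}
```

Using Option<u16> for the port means a disabled stream renders as `"port":null` rather than a sentinel such as 0, which matches the `number|null` shape.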

Port behavior

Prefer allowing omitted/auto-assigned ports for the runtime command, because StreamServer already returns the actual bound port. If that adds unnecessary CLI complexity, MVP can require an explicit port and defer auto-assignment to a follow-up.
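
Auto-assignment relies on a standard OS behavior: binding to port 0 asks the kernel for a free port, and the listener then reports what it received. The sketch below shows that with std::net only; the function name bind_stream_listener is hypothetical, and StreamServer's own binding code may differ.

```rust
use std::io;
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Binds to `requested` on loopback, or lets the OS choose a port when `None`.
/// Returns the listener together with the port that was actually bound.
fn bind_stream_listener(requested: Option<u16>) -> io::Result<(TcpListener, u16)> {
    let addr = SocketAddr::from((Ipv4Addr::LOCALHOST, requested.unwrap_or(0)));
    let listener = TcpListener::bind(addr)?;
    // Ask the socket for its real address instead of echoing the request.
    let port = listener.local_addr()?.port();
    Ok((listener, port))
}
```

Reporting the port read back from the socket, rather than the requested value, is what keeps the response and the .stream file correct in the auto-assigned case.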

Workstream 1: Runtime daemon lifecycle

Files / symbols

  • cli/src/native/actions.rs
  • cli/src/native/stream.rs
  • cli/src/native/daemon.rs (reference/consistency, likely light or no edits)
  • shared socket-dir helpers if needed (cli/src/connection.rs or a small helper extraction)

Changes

  1. Add runtime handlers in cli/src/native/actions.rs

    • handle_stream_enable(cmd, state)
    • handle_stream_disable(state)
    • handle_stream_status(state)
    • Register them in execute_command().
  2. Implement handle_stream_enable

    • Validate requested port semantics.
    • Reject duplicate enable attempts with a clear error if streaming is already enabled.
    • Start a StreamServer at runtime using StreamServer::start_without_client(...).
    • Store the returned stream_server and stream_client in DaemonState.
    • Call state.update_stream_client().await so an already-running browser is attached immediately.
    • Write/update the session .stream file with the actual bound port.
    • Return a structured status payload.
  3. Implement handle_stream_disable

    • If screencasting is active, stop screencasting first so runtime state stays consistent.
    • Shut down the runtime stream server deterministically.
    • Clear state.stream_server and state.stream_client.
    • Remove the .stream file.
    • Return a structured success payload.
  4. Implement handle_stream_status

    • Report whether a stream server exists.
    • Report actual port (from the active server, not a cached request value).
    • Report whether a browser is connected and whether screencasting is active.
  5. Add deterministic stream shutdown support if needed

    • If StreamServer does not already guarantee clean task teardown on drop, add an explicit shutdown mechanism in cli/src/native/stream.rs and use it from handle_stream_disable.
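
The state transitions behind the three handlers can be sketched without any networking. Everything here is a stand-in: SketchState, FakeServer, and StreamError are hypothetical names, the real handlers are async and operate on DaemonState, and screencast handling is deliberately left out. Double-disable is modeled as an error, which is only one of the two options the plan leaves open.

```rust
#[derive(Debug, PartialEq)]
enum StreamError {
    AlreadyEnabled { port: u16 },
    NotEnabled,
}

/// Stand-in for the real server handle; only the bound port matters here.
struct FakeServer {
    port: u16,
}

#[derive(Default)]
struct SketchState {
    stream_server: Option<FakeServer>,
    browser_connected: bool,
    screencasting: bool,
}

impl SketchState {
    /// Rejects a second enable instead of silently replacing the running server.
    fn enable(&mut self, port: u16) -> Result<u16, StreamError> {
        if let Some(server) = &self.stream_server {
            return Err(StreamError::AlreadyEnabled { port: server.port });
        }
        self.stream_server = Some(FakeServer { port });
        Ok(port)
    }

    /// Takes the server out of the state; nothing stale is left behind.
    fn disable(&mut self) -> Result<(), StreamError> {
        match self.stream_server.take() {
            Some(_server) => Ok(()),
            None => Err(StreamError::NotEnabled),
        }
    }

    /// Returns (enabled, port, connected, screencasting), with the port read
    /// from the active server rather than from a cached request value.
    fn status(&self) -> (bool, Option<u16>, bool, bool) {
        let port = self.stream_server.as_ref().map(|s| s.port);
        (port.is_some(), port, self.browser_connected, self.screencasting)
    }
}
```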

Defensive-programming expectations

  • Validate port input and reject impossible values.
  • Assert or explicitly guard invariants around stream_server/stream_client pairing.
  • Fail loudly on .stream write/remove errors where user-visible correctness depends on them.
  • Keep repeated enable/disable/status flows idempotent or clearly errored; do not silently leave stale state behind.
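
As an illustration of the port-validation expectation, a parser for an explicit port value might look like the sketch below. The function name parse_stream_port and the error wording are hypothetical. Rejecting an explicit 0 is an assumption made here so that auto-assignment, if supported, is requested only by omitting the port.

```rust
/// Parses an explicit port value, rejecting anything outside 1..=65535.
fn parse_stream_port(raw: &str) -> Result<u16, String> {
    let port: u16 = raw
        .trim()
        .parse()
        .map_err(|_| format!("invalid port '{raw}': expected an integer in 1..=65535"))?;
    if port == 0 {
        // Assumption: an explicit 0 is treated as a user error in this sketch.
        return Err("invalid port '0': expected an integer in 1..=65535".to_string());
    }
    Ok(port)
}
```

Parsing straight into u16 makes negative and oversized values fail at the type boundary, so only the zero case needs an explicit guard.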

Quality gate after Workstream 1

  • Targeted tests for action-level behavior pass.
  • Manual raw-daemon smoke check succeeds: enable → status → disable without restarting the session.

Workstream 2: CLI wiring and help output

Files / symbols

  • cli/src/commands.rs and/or cli/src/main.rs (depending on current parser split)
  • cli/src/output.rs

Changes

  1. Wire the new CLI surface to the new daemon actions.
  2. Add --json support and machine-readable output examples.
  3. Update help text in cli/src/output.rs:
    • command list
    • command-specific help
    • examples
    • any environment-variable notes clarifying startup vs runtime streaming

Quality gate after Workstream 2

  • agent-browser stream enable, disable, and status parse correctly.
  • Human-readable and JSON output stay consistent with existing conventions.

Workstream 3: Tests

Files / symbols

  • cli/src/native/e2e_tests.rs
  • targeted unit tests near new runtime handlers if appropriate
  • any existing parity/dispatch tests if command routing coverage exists

Minimum test matrix

  1. Enable streaming on a session that started without AGENT_BROWSER_STREAM_PORT.
  2. stream status reports disabled before enable, then enabled with the bound port after enable.
  3. Enable streaming while a browser is already open; verify the runtime server is immediately usable.
  4. Enable streaming before browser launch; verify later browser launch attaches cleanly.
  5. Disable streaming removes the .stream file and updates status.
  6. Double-enable returns a clear error.
  7. Double-disable returns a clear error or explicit no-op response, whichever product behavior is chosen.
  8. Port conflict path returns a useful error.
  9. If auto-port is supported, verify returned port is non-zero and connectable.
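
The port-conflict case (item 8) can be reproduced in a test by holding a listener on a port and attempting a second bind. The sketch below shows only the underlying socket behavior with std::net; the helper name is hypothetical, and a real test would drive the stream_enable action against the occupied port and assert on its error message.

```rust
use std::net::TcpListener;

/// Returns true if a second bind to an already-listening loopback port fails.
fn second_bind_fails() -> bool {
    // Let the OS pick a free port so the check does not depend on a fixed number.
    let first = TcpListener::bind(("127.0.0.1", 0)).expect("bind first listener");
    let port = first.local_addr().expect("read local address").port();
    // `first` is still alive here, so the port is occupied.
    let second = TcpListener::bind(("127.0.0.1", port));
    second.is_err()
}
```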

Validation commands

cd cli && cargo test
cd cli && cargo fmt -- --check
cd cli && cargo clippy
cd cli && cargo test e2e -- --ignored --test-threads=1

If the full ignored e2e suite is too slow during iteration, run a focused subset first, but finish with the full required validation before claiming success.

Workstream 4: Documentation and agent-facing docs

Required documentation updates for this user-facing feature

Per repo guidance, update all of the following:

  • cli/src/output.rs
  • README.md
  • skills/agent-browser/SKILL.md
  • docs/src/app/streaming/page.mdx
  • docs/src/app/commands/page.mdx
  • relevant inline doc comments in touched source files

Additional docs to consider

  • docs/src/app/configuration/page.mdx to clarify the difference between startup env-var streaming and runtime CLI streaming.

Documentation content to add

  • What runtime enablement does and does not do.
  • Relationship between stream enable and screencast_start.
  • How to discover the active stream port.
  • How to disable streaming cleanly.
  • Session-scoped examples using --session.

Quality gate after Workstream 4

  • Help text, README, docs site, and skill docs all describe the same command names and semantics.
  • MDX table formatting follows repo conventions.

Dogfooding and review evidence

Setup

  • Use a local Chrome-capable development environment.
  • Start one session without AGENT_BROWSER_STREAM_PORT so the new runtime path is exercised.
  • Prepare a lightweight WebSocket client/viewer to verify connection to the returned stream port.

Dogfooding flow

  1. Start a session without startup streaming.
  2. Open a simple page in that session.
  3. Run agent-browser --session <name> stream status and verify it reports disabled.
  4. Run agent-browser --session <name> stream enable [--port ...].
  5. Connect a WebSocket client/viewer to the returned port and verify frames/status arrive.
  6. Run agent-browser --session <name> stream disable.
  7. Verify the WebSocket connection drops or refuses reconnects, .stream is gone, and stream status reports disabled.
  8. Repeat once with the browser initially absent to verify later attach behavior.

Required evidence artifacts

Capture all of the following during implementation review:

  • Terminal screenshots showing:
    • status before enable
    • enable output with bound port
    • status after enable
    • disable output and final status
  • A short screen recording/video showing the end-to-end flow:
    • session running without streaming
    • runtime enable
    • successful live preview connection
    • runtime disable
  • If a visual preview page is used, capture at least one screenshot of the live preview.

If implementation happens in an environment that supports generated artifacts, attach the screenshots and video to the work report so a reviewer can verify the behavior without replaying the steps manually.

Risks and decisions to resolve during implementation

  1. Command naming

    • Default plan assumes a stream command group.
    • If the current parser architecture strongly prefers top-level verbs, keep the runtime action names stable and adjust only the CLI surface.
  2. Auto-port behavior

    • Decide whether omitted port means auto-select now or in a follow-up.
  3. Disable semantics

    • Decide whether disabling while screencasting should implicitly stop screencasting or error until the user stops it first. The recommended behavior is to stop screencasting as part of disable for a simpler UX.
  4. Shutdown mechanics

    • Confirm whether dropping the runtime StreamServer is sufficient; if not, add explicit shutdown for deterministic cleanup.
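
To make the shutdown question concrete, here is one way an explicit shutdown can guarantee the port is released before the call returns. This is a sketch with std threads and a stop flag, not the repo's implementation: the real StreamServer presumably runs async tasks, and SketchServer and can_connect are hypothetical names.

```rust
use std::io;
use std::net::{TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

struct SketchServer {
    port: u16,
    stop: Arc<AtomicBool>,
    worker: Option<JoinHandle<()>>,
}

impl SketchServer {
    fn start() -> io::Result<Self> {
        let listener = TcpListener::bind(("127.0.0.1", 0))?;
        listener.set_nonblocking(true)?;
        let port = listener.local_addr()?.port();
        let stop = Arc::new(AtomicBool::new(false));
        let stop_flag = Arc::clone(&stop);
        let worker = thread::spawn(move || {
            // Poll for connections until asked to stop; the listener is
            // dropped when this closure returns.
            while !stop_flag.load(Ordering::Acquire) {
                match listener.accept() {
                    Ok((conn, _)) => drop(conn),
                    Err(_) => thread::sleep(Duration::from_millis(5)),
                }
            }
        });
        Ok(Self { port, stop, worker: Some(worker) })
    }

    /// Signals the accept loop and waits for it to exit, so the listening
    /// socket is closed by the time this returns.
    fn shutdown(mut self) {
        self.stop.store(true, Ordering::Release);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

/// Probe used to check whether anything is still listening on `port`.
fn can_connect(port: u16) -> bool {
    TcpStream::connect(("127.0.0.1", port)).is_ok()
}
```

The join is the important part. Dropping a handle only requests teardown, whereas waiting for the worker to finish gives the caller a point after which reconnects are refused and it is safe to remove .stream.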

Acceptance criteria

  • A session started without AGENT_BROWSER_STREAM_PORT can enable streaming at runtime without restarting the daemon.
  • The returned/runtime-discovered port is connectable and reflected in .stream.
  • stream status accurately reports enabled state, port, browser connectivity, and screencasting status.
  • stream disable tears down runtime streaming cleanly and removes .stream.
  • Repeated enable/disable and port-conflict paths produce predictable, documented behavior.
  • Tests, formatting, linting, docs, and dogfooding evidence all pass/review cleanly.

Generated with mux • Model: openai:gpt-5.4 • Thinking: high

Contributor

vercel bot commented Mar 20, 2026

@ThomasK33 is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

Collaborator

@ctate ctate left a comment

Thanks for this contribution! I found two issues to address before merging:

  1. handle_stream_disable unconditionally resets state.screencasting = false, which can orphan a user-initiated screencast that was started independently. Consider only resetting this if the stream server was responsible for starting it, or leaving it untouched entirely.
  2. In handle_stream_disable, if remove_stream_file fails, the ? early-return leaves stream_server/stream_client as Some pointing at a dead server. The state cleanup should be unconditional.

@ThomasK33
Author

Hey Chris, thanks for the quick turnaround on the review. I resolved both issues by having handle_stream_disable leave unrelated state.screencasting values unchanged and by clearing stream_server/stream_client before attempting .stream file removal. I’ve also added regression tests for both cases.
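
For readers following the thread, the ordering described in this reply can be sketched as follows. This is not the PR's code: DisableState and disable_stream are hypothetical stand-ins for the state struct and for handle_stream_disable, and the server and client slots are reduced to placeholder types.

```rust
use std::fs;
use std::io;
use std::path::Path;

struct DisableState {
    stream_server: Option<u16>, // stand-in: the bound port
    stream_client: Option<()>,  // stand-in for the client slot
    screencasting: bool,
}

fn disable_stream(state: &mut DisableState, stream_file: &Path) -> io::Result<()> {
    // Clear in-memory state first, so a failed file removal cannot leave
    // `Some(..)` handles pointing at a server that is already gone.
    state.stream_server = None;
    state.stream_client = None;
    // `screencasting` is deliberately left untouched: a screencast started
    // independently keeps its own state.
    fs::remove_file(stream_file)
}
```

With this ordering, a removal error is still reported to the caller, but the in-memory state is consistent whether or not the file could be deleted.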

@ThomasK33 ThomasK33 force-pushed the runtime-stream-management branch from af087cf to 7c5ac85 Compare March 21, 2026 10:04