Add runtime stream enable/disable/status commands by ThomasK33 · Pull Request #951 · vercel-labs/agent-browser

ThomasK33 · 2026-03-20T19:58:11Z

Summary

add runtime stream enable, stream status, and stream disable commands for already-running daemon sessions
add runtime stream server lifecycle management, .stream metadata handling, and explicit shutdown behavior
update help/docs and add unit + e2e coverage for runtime streaming flows

Validation

cd cli && cargo test stream_ -- --nocapture
cd cli && cargo test e2e_runtime_stream_enable_before_launch_attaches_and_disables -- --ignored --test-threads=1
cd cli && cargo test

📋 Implementation Plan

Implementation Plan: Runtime stream enablement for an already-running session

Goal

Add a supported runtime path to enable WebSocket streaming on an already-running agent-browser daemon/session without requiring a daemon restart, plus companion status/disable flows so the lifecycle is inspectable and reversible.

Verified repo context

The stream server is created only during daemon startup, from AGENT_BROWSER_STREAM_PORT, in cli/src/native/daemon.rs.
In cli/src/native/actions.rs, DaemonState already models streaming as optional runtime state via stream_client and stream_server, and update_stream_client() already re-wires the browser/client relationship when the browser launches or closes.
cli/src/native/stream.rs already supports starting a stream server with or without an attached CDP client via StreamServer::start() and StreamServer::start_without_client().
The daemon already exposes screencast_start / screencast_stop, but those only control frame production; they do not create the WebSocket server.
There is currently no runtime daemon action or CLI command to enable, disable, or inspect streaming after startup, and .stream file management currently lives in daemon startup/shutdown code.

Implementation constraints inferred from the current code

.stream file lifecycle must stay accurate for discovery scripts and existing tooling.
Runtime enablement must work whether a browser is already attached or not.
Repeated enable/disable calls need deterministic behavior and clear user-facing errors.
The implementation should prefer explicit cleanup over relying on Arc drops alone if the current StreamServer lifecycle is ambiguous.
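
To make the .stream lifecycle constraint concrete, here is a minimal sketch of write, read, and remove helpers. The file layout (`<session>.stream` in a socket directory), the plain-text port content, and all function signatures are assumptions for illustration; the real helpers in the repo may differ.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical layout: one `<session>.stream` file per session in the socket directory.
fn stream_file_path(socket_dir: &Path, session: &str) -> PathBuf {
    socket_dir.join(format!("{}.stream", session))
}

/// Record the port that the server actually bound, not the requested one.
fn write_stream_file(socket_dir: &Path, session: &str, port: u16) -> io::Result<()> {
    fs::write(stream_file_path(socket_dir, session), port.to_string())
}

/// Discovery side: `Ok(None)` means no .stream file exists for this session.
fn read_stream_port(socket_dir: &Path, session: &str) -> io::Result<Option<u16>> {
    match fs::read_to_string(stream_file_path(socket_dir, session)) {
        Ok(text) => text
            .trim()
            .parse::<u16>()
            .map(Some)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Removal returns errors to the caller so a stale file is never silently ignored.
fn remove_stream_file(socket_dir: &Path, session: &str) -> io::Result<()> {
    fs::remove_file(stream_file_path(socket_dir, session))
}
```

A discovery script would only ever call the read side; the daemon owns write and remove, which is why failures there are returned rather than swallowed.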

Proposed product surface

User-facing CLI

Add a stream command group:

agent-browser stream enable [--port <port>]
agent-browser stream disable
agent-browser stream status
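
As a rough illustration of this command group, the sketch below parses the arguments that follow `agent-browser stream` into an enum. The `StreamCommand` type, the `parse_stream_command` function, and the decision to reject an explicit `--port 0` are hypothetical; the actual parser in cli/src/commands.rs may be organized differently.

```rust
/// Hypothetical parsed form of the `stream` command group.
#[derive(Debug, PartialEq)]
enum StreamCommand {
    /// `None` means the port was omitted (auto-assign, if supported).
    Enable { port: Option<u16> },
    Disable,
    Status,
}

/// Parse the arguments that follow `agent-browser stream`.
fn parse_stream_command(args: &[&str]) -> Result<StreamCommand, String> {
    match args {
        ["enable"] => Ok(StreamCommand::Enable { port: None }),
        ["enable", "--port", raw] => {
            // Parsing into u16 rejects non-numeric and out-of-range values.
            let port: u16 = raw
                .parse()
                .map_err(|_| format!("invalid port '{}': expected 1-65535", raw))?;
            if port == 0 {
                return Err("invalid port '0': omit --port to auto-assign".to_string());
            }
            Ok(StreamCommand::Enable { port: Some(port) })
        }
        ["enable", "--port"] => Err("--port requires a value".to_string()),
        ["disable"] => Ok(StreamCommand::Disable),
        ["status"] => Ok(StreamCommand::Status),
        [] => Err("usage: stream <enable|disable|status>".to_string()),
        other => Err(format!("unrecognized stream arguments: {}", other.join(" "))),
    }
}
```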

Underlying daemon actions

stream_enable
stream_disable
stream_status

Recommended response shape

stream_enable → { enabled: true, port, connected, screencasting }
stream_disable → { disabled: true }
stream_status → { enabled, port: number|null, connected, screencasting }
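
The status shape above could be modelled roughly as follows. The `StreamStatus` struct and its hand-written `to_json` method are illustrative assumptions that keep the sketch dependency-free; the daemon more likely builds its responses with a JSON library.

```rust
/// Hypothetical in-memory form of the `stream_status` payload.
struct StreamStatus {
    enabled: bool,
    port: Option<u16>,
    connected: bool,
    screencasting: bool,
}

impl StreamStatus {
    /// Render as JSON; `port` becomes `null` when no server is bound.
    fn to_json(&self) -> String {
        let port = match self.port {
            Some(p) => p.to_string(),
            None => "null".to_string(),
        };
        format!(
            "{{\"enabled\":{},\"port\":{},\"connected\":{},\"screencasting\":{}}}",
            self.enabled, port, self.connected, self.screencasting
        )
    }
}
```

Using `Option<u16>` for the port keeps the `number|null` distinction in the type, so a disabled server cannot accidentally report a stale port.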

Port behavior

Prefer allowing omitted/auto-assigned ports for the runtime command, because StreamServer already returns the actual bound port. If that adds unnecessary CLI complexity, MVP can require an explicit port and defer auto-assignment to a follow-up.
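
For reference, auto-assignment at the socket level can be sketched with the standard library alone: binding to port 0 lets the operating system pick a free port, and `local_addr()` reports which one was chosen. The `bind_stream_port` helper is hypothetical and is not the StreamServer API.

```rust
use std::net::{Ipv4Addr, TcpListener};

/// Bind to the requested port, or let the OS choose one when `requested` is `None`.
/// Returns the listener together with the port that was actually bound.
fn bind_stream_port(requested: Option<u16>) -> std::io::Result<(TcpListener, u16)> {
    // Port 0 asks the operating system for any free port.
    let listener = TcpListener::bind((Ipv4Addr::LOCALHOST, requested.unwrap_or(0)))?;
    let port = listener.local_addr()?.port();
    Ok((listener, port))
}
```

Requesting a port that is already bound returns an `io::Error` from `bind`, which is the natural place to surface a port-conflict message to the user.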

Workstream 1: Runtime daemon lifecycle

Files / symbols

cli/src/native/actions.rs
cli/src/native/stream.rs
cli/src/native/daemon.rs (reference/consistency, likely light or no edits)
shared socket-dir helpers if needed (cli/src/connection.rs or a small helper extraction)

Changes

Add runtime handlers in cli/src/native/actions.rs
- handle_stream_enable(cmd, state)
- handle_stream_disable(state)
- handle_stream_status(state)
- Register them in execute_command().
Implement handle_stream_enable
- Validate requested port semantics.
- Reject duplicate enable attempts with a clear error if streaming is already enabled.
- Start a StreamServer at runtime using StreamServer::start_without_client(...).
- Store the returned stream_server and stream_client in DaemonState.
- Call state.update_stream_client().await so an already-running browser is attached immediately.
- Write/update the session .stream file with the actual bound port.
- Return a structured status payload.
Implement handle_stream_disable
- If screencasting is active, stop screencasting first so runtime state stays consistent.
- Shut down the runtime stream server deterministically.
- Clear state.stream_server and state.stream_client.
- Remove the .stream file.
- Return a structured success payload.
Implement handle_stream_status
- Report whether a stream server exists.
- Report actual port (from the active server, not a cached request value).
- Report whether a browser is connected and whether screencasting is active.
Add deterministic stream shutdown support if needed
- If StreamServer does not already guarantee clean task teardown on drop, add an explicit shutdown mechanism in cli/src/native/stream.rs and use it from handle_stream_disable.
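
The handler semantics listed above can be sketched as a small state model. Everything here is a simplified stand-in: `FakeServer`, `StreamState`, and `Status` are hypothetical types, the real handlers are async and operate on DaemonState, and treating a second disable as an error is only one of the two behaviors the plan allows.

```rust
/// Stand-in for the real stream server; only tracks its bound port.
struct FakeServer {
    port: u16,
}

/// Simplified model of the streaming-related fields in the daemon state.
#[derive(Default)]
struct StreamState {
    stream_server: Option<FakeServer>,
    browser_attached: bool,
    screencasting: bool,
}

#[derive(Debug, PartialEq)]
struct Status {
    enabled: bool,
    port: Option<u16>,
    connected: bool,
    screencasting: bool,
}

impl StreamState {
    fn status(&self) -> Status {
        Status {
            enabled: self.stream_server.is_some(),
            // Report the port from the live server, not from the original request.
            port: self.stream_server.as_ref().map(|s| s.port),
            connected: self.browser_attached,
            screencasting: self.screencasting,
        }
    }

    /// `bound_port` is the port the server actually bound.
    fn enable(&mut self, bound_port: u16) -> Result<Status, String> {
        if let Some(server) = &self.stream_server {
            return Err(format!("streaming is already enabled on port {}", server.port));
        }
        self.stream_server = Some(FakeServer { port: bound_port });
        Ok(self.status())
    }

    /// Assumption: disabling when nothing is enabled is reported as an error.
    fn disable(&mut self) -> Result<(), String> {
        match self.stream_server.take() {
            Some(_server) => Ok(()), // dropping the stand-in represents shutdown
            None => Err("streaming is not enabled".to_string()),
        }
    }
}
```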

Defensive-programming expectations

Validate port input and reject impossible values.
Assert or explicitly guard invariants around stream_server/stream_client pairing.
Fail loudly on .stream write/remove errors where user-visible correctness depends on them.
Keep repeated enable/disable/status flows idempotent, or have them fail with a clear error; do not silently leave stale state behind.

Quality gate after Workstream 1

Targeted tests for action-level behavior pass.
Manual raw-daemon smoke check succeeds: enable → status → disable without restarting the session.

Workstream 2: CLI wiring and help output

Files / symbols

cli/src/commands.rs and/or cli/src/main.rs (depending on current parser split)
cli/src/output.rs

Changes

Wire the new CLI surface to the new daemon actions.
Add --json support and machine-readable output examples.
Update help text in cli/src/output.rs:
- command list
- command-specific help
- examples
- any environment-variable notes clarifying startup vs runtime streaming

Quality gate after Workstream 2

agent-browser stream enable, disable, and status parse correctly.
Human-readable and JSON output stay consistent with existing conventions.

Workstream 3: Tests

Files / symbols

cli/src/native/e2e_tests.rs
targeted unit tests near new runtime handlers if appropriate
any existing parity/dispatch tests if command routing coverage exists

Minimum test matrix

Enable streaming on a session that started without AGENT_BROWSER_STREAM_PORT.
stream status reports disabled before enable, then enabled with the bound port after enable.
Enable streaming while a browser is already open; verify the runtime server is immediately usable.
Enable streaming before browser launch; verify later browser launch attaches cleanly.
Disable streaming removes the .stream file and updates status.
Double-enable returns a clear error.
Double-disable returns a clear error or explicit no-op response, whichever product behavior is chosen.
Port conflict path returns a useful error.
If auto-port is supported, verify returned port is non-zero and connectable.

Validation commands

cd cli && cargo test
cd cli && cargo fmt -- --check
cd cli && cargo clippy
cd cli && cargo test e2e -- --ignored --test-threads=1

If the full ignored e2e suite is too slow during iteration, run a focused subset first, but finish with the full required validation before claiming success.

Workstream 4: Documentation and agent-facing docs

Required documentation updates for this user-facing feature

Per repo guidance, update all of the following:

cli/src/output.rs
README.md
skills/agent-browser/SKILL.md
docs/src/app/streaming/page.mdx
docs/src/app/commands/page.mdx
relevant inline doc comments in touched source files

Additional docs to consider

docs/src/app/configuration/page.mdx to clarify the difference between startup env-var streaming and runtime CLI streaming.

Documentation content to add

What runtime enablement does and does not do.
Relationship between stream enable and screencast_start.
How to discover the active stream port.
How to disable streaming cleanly.
Session-scoped examples using --session.

Quality gate after Workstream 4

Help text, README, docs site, and skill docs all describe the same command names and semantics.
MDX table formatting follows repo conventions.

Dogfooding and review evidence

Setup

Use a local Chrome-capable development environment.
Start one session without AGENT_BROWSER_STREAM_PORT so the new runtime path is exercised.
Prepare a lightweight WebSocket client/viewer to verify connection to the returned stream port.

Dogfooding flow

Start a session without startup streaming.
Open a simple page in that session.
Run agent-browser --session <name> stream status and verify it reports disabled.
Run agent-browser --session <name> stream enable [--port ...].
Connect a WebSocket client/viewer to the returned port and verify frames/status arrive.
Run agent-browser --session <name> stream disable.
Verify that the WebSocket connection drops or that reconnects are refused, that .stream is gone, and that stream status reports disabled.
Repeat once with the browser initially absent to verify later attach behavior.

Required evidence artifacts

Capture all of the following during implementation review:

Terminal screenshots showing:
- status before enable
- enable output with bound port
- status after enable
- disable output and final status
A short screen recording/video showing the end-to-end flow:
- session running without streaming
- runtime enable
- successful live preview connection
- runtime disable
If a visual preview page is used, capture at least one screenshot of the live preview.

If implementation happens in an environment that supports generated artifacts, attach the screenshots and video to the work report so a reviewer can verify the behavior without replaying the steps manually.

Risks and decisions to resolve during implementation

Command naming
- Default plan assumes a stream command group.
- If the current parser architecture strongly prefers top-level verbs, keep the runtime action names stable and adjust only the CLI surface.
Auto-port behavior
- Decide whether omitted port means auto-select now or in a follow-up.
Disable semantics
- Decide whether disabling while screencasting should implicitly stop screencasting or error until the user stops it first. The recommended behavior is to stop screencasting as part of disable for a simpler UX.
Shutdown mechanics
- Confirm whether dropping the runtime StreamServer is sufficient; if not, add explicit shutdown for deterministic cleanup.
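
To illustrate the shutdown-mechanics question, here is one way an explicit, deterministic shutdown could look. It uses threads and a stop flag from the standard library purely for illustration; the `RuntimeServer` type is hypothetical, and the real StreamServer may use a different runtime and a different signalling mechanism.

```rust
use std::io::ErrorKind;
use std::net::{Ipv4Addr, TcpListener};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical server handle with an explicit, blocking shutdown.
struct RuntimeServer {
    port: u16,
    stop: Arc<AtomicBool>,
    worker: Option<JoinHandle<()>>,
}

impl RuntimeServer {
    fn start(port: u16) -> std::io::Result<Self> {
        let listener = TcpListener::bind((Ipv4Addr::LOCALHOST, port))?;
        listener.set_nonblocking(true)?;
        let port = listener.local_addr()?.port();
        let stop = Arc::new(AtomicBool::new(false));
        let stop_flag = Arc::clone(&stop);
        let worker = thread::spawn(move || {
            // The listener is owned by this thread and closes when the loop exits.
            while !stop_flag.load(Ordering::Acquire) {
                match listener.accept() {
                    Ok((_conn, _addr)) => {} // a real server would hand the client off here
                    Err(e) if e.kind() == ErrorKind::WouldBlock => {
                        thread::sleep(Duration::from_millis(10));
                    }
                    Err(_) => break,
                }
            }
        });
        Ok(Self { port, stop, worker: Some(worker) })
    }

    /// Signal the accept loop and wait until the listener has been dropped.
    fn shutdown(mut self) {
        self.stop.store(true, Ordering::Release);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

Because `shutdown` joins the worker, the port is released by the time it returns, which is the property a disable handler needs before it reports success.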

Acceptance criteria

A session started without AGENT_BROWSER_STREAM_PORT can enable streaming at runtime without restarting the daemon.
The returned/runtime-discovered port is connectable and reflected in .stream.
stream status accurately reports enabled state, port, browser connectivity, and screencasting status.
stream disable tears down runtime streaming cleanly and removes .stream.
Repeated enable/disable and port-conflict paths produce predictable, documented behavior.
Tests, formatting, and linting all pass, and docs and dogfooding evidence review cleanly.

Generated with mux • Model: openai:gpt-5.4 • Thinking: high

vercel · 2026-03-20T19:58:17Z

@ThomasK33 is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

ctate

Thanks for this contribution! I found two issues to address before merging:

handle_stream_disable unconditionally resets state.screencasting = false, which can orphan a user-initiated screencast that was started independently. Consider only resetting this if the stream server was responsible for starting it, or leaving it untouched entirely.
In handle_stream_disable, if remove_stream_file fails, the ? early-return leaves stream_server/stream_client as Some pointing at a dead server. The state cleanup should be unconditional.

ThomasK33 · 2026-03-20T21:57:52Z

Hey Chris, thanks for the quick turnaround on the review. I resolved both issues by having handle_stream_disable leave unrelated state.screencasting values unchanged and by clearing stream_server/stream_client before attempting .stream file removal. I’ve also added regression tests for both cases.
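
The ordering described in this reply can be sketched as follows. This is not the code from the PR: `DaemonStateSketch`, `Handle`, and the closure-based `remove_stream_file` parameter are hypothetical stand-ins, and the sketch simply never touches `screencasting`, whereas the actual fix may still reset screencasts that the stream server itself started.

```rust
/// Stand-in for the real server and client handles.
struct Handle;

struct DaemonStateSketch {
    stream_server: Option<Handle>,
    stream_client: Option<Handle>,
    /// May have been set by an independent `screencast_start`.
    screencasting: bool,
}

/// Clear in-memory state first, then attempt the fallible file removal.
fn disable_stream(
    state: &mut DaemonStateSketch,
    remove_stream_file: impl FnOnce() -> Result<(), String>,
) -> Result<(), String> {
    if state.stream_server.is_none() {
        return Err("streaming is not enabled".to_string());
    }
    // Unconditional cleanup: this runs before any error can return early.
    state.stream_server = None;
    state.stream_client = None;
    // `screencasting` is intentionally left untouched here.
    remove_stream_file()
}
```

With this ordering, a failed file removal still reports an error to the caller, but the daemon no longer holds `Some` handles pointing at a dead server.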

ctate requested changes Mar 20, 2026

ThomasK33 added 3 commits March 21, 2026 10:03

Add runtime stream management commands

a481408

Run rustfmt and satisfy clippy

8332129

Fix stream disable cleanup semantics

7c5ac85

ThomasK33 force-pushed the runtime-stream-management branch from af087cf to 7c5ac85 · March 21, 2026 10:04

ThomasK33 wants to merge 3 commits into vercel-labs:main from coder:runtime-stream-management
