perf: use aiofiles for async browser artifact reads #4550

Open
CodeDotJS wants to merge 1 commit into Skyvern-AI:main from CodeDotJS:feature/async-artifact

@CodeDotJS CodeDotJS commented Jan 26, 2026


⚡ This PR improves performance by replacing synchronous file I/O operations with asynchronous ones using the aiofiles library for reading browser artifacts (video files and HAR data). The change prevents blocking the event loop during file operations in async functions, leading to better concurrency and responsiveness.

🔍 Detailed Analysis

Key Changes

  • File I/O Operations: Replaced synchronous open() calls with aiofiles.open() for async file reading
  • Video Artifacts: Updated get_video_artifacts() method to use async file operations when reading video data
  • HAR Data: Modified get_har_data() method to use async file operations when reading HAR files
  • Dependencies: Added aiofiles import to support async file operations
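
The swap described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `read_bytes_sync`, `read_bytes_async`, and the `recording.webm` file name are hypothetical, and the sketch falls back to `asyncio.to_thread` only so that it still runs where aiofiles is not installed.

```python
import asyncio
import os
import tempfile

try:
    import aiofiles  # third-party; assumed to be installed, as in the PR
except ImportError:  # fallback so the sketch still runs without it
    aiofiles = None


def read_bytes_sync(path: str) -> bytes:
    """Before: a plain blocking read; the event loop stalls until it returns."""
    with open(path, "rb") as f:
        return f.read()


async def read_bytes_async(path: str) -> bytes:
    """After: the read is awaited, so other tasks can run in the meantime."""
    if aiofiles is None:
        # Stdlib stand-in: run the blocking read in a worker thread.
        return await asyncio.to_thread(read_bytes_sync, path)
    async with aiofiles.open(path, "rb") as f:
        return await f.read()


async def main() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "recording.webm")  # hypothetical artifact file
        with open(path, "wb") as f:
            f.write(b"fake video bytes")
        return await read_bytes_async(path)


data = asyncio.run(main())
```

Both functions return the same bytes; the difference is only in whether the calling coroutine yields control while the read is in flight.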

Technical Implementation

sequenceDiagram
    participant Client
    participant RealBrowserManager
    participant FileSystem
    
    Client->>RealBrowserManager: get_video_artifacts()
    Note over RealBrowserManager: Before: Blocking file read
    RealBrowserManager->>FileSystem: open(path, "rb")
    FileSystem-->>RealBrowserManager: file.read() [BLOCKS]
    
    Note over RealBrowserManager: After: Non-blocking async read
    RealBrowserManager->>FileSystem: aiofiles.open(path, "rb")
    FileSystem-->>RealBrowserManager: await file.read() [NON-BLOCKING]
    RealBrowserManager-->>Client: video_artifacts

Impact

  • Performance: Eliminates blocking I/O operations that could freeze the event loop during large file reads
  • Concurrency: Allows other async operations to continue while file I/O is in progress
  • Scalability: Better resource utilization when handling multiple concurrent requests for browser artifacts
  • Code Consistency: Aligns file operations with the async nature of the surrounding codebase
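
To make the event-loop claim concrete, here is a small stdlib-only sketch. It assumes a hypothetical 0.2 s blocking call (`slow_read`) as a stand-in for a large file read, and uses `asyncio.to_thread` as a stand-in for aiofiles, which likewise hands file operations to a thread pool. A background task counts how often the loop gets to run while the read is in progress.

```python
import asyncio
import time


async def count_ticks(stop: asyncio.Event) -> int:
    """Count how often the event loop gets to run this task."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


def slow_read() -> bytes:
    """Stand-in for a large blocking file read (hypothetical 0.2 s)."""
    time.sleep(0.2)
    return b"data"


async def measure(offload: bool) -> int:
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # let the ticker start
    if offload:
        await asyncio.to_thread(slow_read)  # loop stays free to run the ticker
    else:
        slow_read()  # loop is blocked for the whole call
    stop.set()
    return await ticker


ticks_blocking = asyncio.run(measure(offload=False))
ticks_offloaded = asyncio.run(measure(offload=True))
```

In the blocking case the ticker barely advances, while in the offloaded case it keeps running throughout the read. Exact counts depend on the machine and scheduler.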

Created with Palmier

Summary by CodeRabbit

Release Notes

  • Refactor
    • Enhanced system performance by converting file I/O operations to use asynchronous patterns, improving responsiveness when retrieving video artifacts and browser session data.


coderabbitai bot commented Jan 26, 2026

Walkthrough

The pull request converts synchronous file I/O operations to asynchronous ones using the aiofiles library in the real browser manager. The get_video_artifacts and get_har_data methods now use await with aiofiles.open() instead of blocking file reads.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Async File I/O Migration: skyvern/webeye/real_browser_manager.py | Replaced synchronous open() and read() calls with aiofiles.open() and await in get_video_artifacts and get_har_data methods. Added aiofiles import. No signature or error handling changes. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~5 minutes

Suggested reviewers

  • wintonzheng

Poem

🐰 With whiskers twitching, files now flow,
Async await, no more too slow,
aiofiles hops through I/O streams,
Fulfilling our concurrency dreams!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title accurately reflects the main change: introducing aiofiles for asynchronous file I/O operations in browser artifact reads, which directly matches the modifications to get_video_artifacts and get_har_data functions. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@skyvern/webeye/real_browser_manager.py`:
- Around line 263-268: The async file read of each video artifact can raise
OSError and currently aborts the loop; update the loop that iterates
browser_state.browser_artifacts.video_artifacts to wrap the aiofiles.open(...)
and await f.read() calls in a try/except (catch OSError and Exception), log the
failure (e.g., logger.exception or logger.error with context including
video_artifact.video_path and index i) and continue the loop so remaining
artifacts are processed; keep the existing os.path.exists(path) check and only
assign video_data on successful read.
♻️ Duplicate comments (1)
skyvern/webeye/real_browser_manager.py (1)

279-282: Apply the same async error handling here.

Same aiofiles.open/read error-handling gap as above; please mirror the try/except logging pattern. As per coding guidelines, always handle exceptions in async code.

Comment on lines 263 to 268
         for i, video_artifact in enumerate(browser_state.browser_artifacts.video_artifacts):
             path = video_artifact.video_path
             if path and os.path.exists(path=path):
-                with open(path, "rb") as f:
-                    browser_state.browser_artifacts.video_artifacts[i].video_data = f.read()
+                async with aiofiles.open(path, "rb") as f:
+                    browser_state.browser_artifacts.video_artifacts[i].video_data = await f.read()

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check the file structure and the specific lines mentioned
cat -n skyvern/webeye/real_browser_manager.py | sed -n '255,280p'

Repository: Skyvern-AI/skyvern

Length of output: 1214


🏁 Script executed:

# Check the function definition and broader context
cat -n skyvern/webeye/real_browser_manager.py | sed -n '240,290p'

Repository: Skyvern-AI/skyvern

Length of output: 2192


🏁 Script executed:

# Check for LOG usage in the file
rg "LOG\." skyvern/webeye/real_browser_manager.py | head -20

Repository: Skyvern-AI/skyvern

Length of output: 1099


🏁 Script executed:

# Check imports at the top of the file
head -50 skyvern/webeye/real_browser_manager.py

Repository: Skyvern-AI/skyvern

Length of output: 1779


🏁 Script executed:

# Check if there are similar patterns in get_har_data or other async file operations
rg "aiofiles\.open" skyvern/webeye/real_browser_manager.py -A 2

Repository: Skyvern-AI/skyvern

Length of output: 345


🏁 Script executed:

# Check if there's any broader exception handling in the get_video_artifacts function
sed -n '247,270p' skyvern/webeye/real_browser_manager.py

Repository: Skyvern-AI/skyvern

Length of output: 986


Add exception handling for async file read operations.

Lines 266-267 lack error handling for aiofiles.open() and await f.read(), which can raise OSError. If either fails, the entire artifact retrieval aborts. Per coding guidelines, always handle exceptions in async code. Wrap with try/except and log failures to allow the loop to continue processing remaining artifacts.

Suggested fix
             if path and os.path.exists(path=path):
-                async with aiofiles.open(path, "rb") as f:
-                    browser_state.browser_artifacts.video_artifacts[i].video_data = await f.read()
+                try:
+                    async with aiofiles.open(path, "rb") as f:
+                        browser_state.browser_artifacts.video_artifacts[i].video_data = await f.read()
+                except OSError:
+                    LOG.warning(
+                        "Failed to read video artifact",
+                        task_id=task_id,
+                        workflow_id=workflow_id,
+                        workflow_run_id=workflow_run_id,
+                        exc_info=True,
+                    )
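
A self-contained sketch of the pattern suggested above: wrap each read in try/except so that one unreadable artifact does not abort the loop. The `VideoArtifact` dataclass, `load_video_data`, and `_read_bytes` are hypothetical stand-ins rather than the project's real types, and stdlib `logging` replaces the project's `LOG` object. The `asyncio.to_thread` fallback is there only so the sketch runs without aiofiles installed.

```python
import asyncio
import logging
import os
import tempfile
from dataclasses import dataclass
from typing import List, Optional

try:
    import aiofiles  # third-party; assumed to be installed, as in the PR
except ImportError:  # fallback so the sketch still runs without it
    aiofiles = None

LOG = logging.getLogger(__name__)


@dataclass
class VideoArtifact:  # hypothetical stand-in for the project's artifact type
    video_path: Optional[str] = None
    video_data: Optional[bytes] = None


def _read_bytes_sync(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()


async def _read_bytes(path: str) -> bytes:
    if aiofiles is None:
        return await asyncio.to_thread(_read_bytes_sync, path)
    async with aiofiles.open(path, "rb") as f:
        return await f.read()


async def load_video_data(artifacts: List[VideoArtifact]) -> None:
    for i, artifact in enumerate(artifacts):
        path = artifact.video_path
        if not path or not os.path.exists(path):
            continue
        try:
            # Assign only after a successful read.
            artifact.video_data = await _read_bytes(path)
        except OSError:
            # Log and move on so the remaining artifacts are still processed.
            LOG.warning("Failed to read video artifact %d at %s", i, path, exc_info=True)


async def main() -> List[VideoArtifact]:
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "ok.webm")
        with open(good, "wb") as f:
            f.write(b"video")
        # A directory passes os.path.exists() but cannot be opened as a file.
        artifacts = [VideoArtifact(tmp), VideoArtifact(good), VideoArtifact(None)]
        await load_video_data(artifacts)
        return artifacts


artifacts = asyncio.run(main())
```

The first artifact fails to open and is logged, and the second is still read. The try/except also covers the case where a file disappears between the `os.path.exists` check and the open.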