feat: fetch add wait_until parameter for page loads options#1896
feat: fetch add wait_until parameter for page loads options#1896karlseguin merged 3 commits intolightpanda-io:mainfrom
Conversation
Add `--wait_until` and `--wait_ms` CLI arguments to configure session wait behavior. Updates `Session.wait` to evaluate specific page load states (`load`, `domcontentloaded`, `networkidle`, `fixed`) before completing the wait loop.
|
All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
src/Config.zig
Outdated
| networkidle, | ||
| fixed, | ||
|
|
||
| pub const js_enum_from_string = true; |
There was a problem hiding this comment.
| pub const js_enum_from_string = true; |
This is only needed for enums that are serialized to v8/js.
src/browser/Session.zig
Outdated
| var page = &(self.page orelse return .no_page); | ||
| while (true) { | ||
| const wait_result = self._wait(page, wait_ms) catch |err| { | ||
| const wait_result = self._wait(&page, wait_ms, wait_until) catch |err| { |
There was a problem hiding this comment.
There's no reason to pass a pointer to the pointer of the page.
| const wait_result = self._wait(&page, wait_ms, wait_until) catch |err| { | |
| const wait_result = self._wait(page, wait_ms, wait_until) catch |err| { |
src/browser/Session.zig
Outdated
| } | ||
|
|
||
| fn _wait(self: *Session, page: *Page, wait_ms: u32) !WaitResult { | ||
| fn _wait(self: *Session, page: **Page, wait_ms: u32, wait_until: lp.Config.WaitUntil) !WaitResult { |
There was a problem hiding this comment.
| fn _wait(self: *Session, page: **Page, wait_ms: u32, wait_until: lp.Config.WaitUntil) !WaitResult { | |
| fn _wait(self: *Session, page: *Page, wait_ms: u32, wait_until: lp.Config.WaitUntil) !WaitResult { |
related to the above comment. All of the page.X that were changed to page.*.X will have to be updated.
src/browser/Session.zig
Outdated
| .fixed => false, | ||
| .domcontentloaded => (page.*._load_state == .load or page.*._load_state == .complete), | ||
| .load => (page.*._load_state == .complete), | ||
| .networkidle => (page.*._load_state == .complete and http_active == 0), |
There was a problem hiding this comment.
What about using page._notified_network_idle == .done here? It would introduce a 500ms delay to allow pending work (e.g. a setTimeout) to start new connections. On the flip, that could result in some telemetry/beacon keep the page alive longer than desired.
| with_base: bool = false, | ||
| with_frames: bool = false, | ||
| strip: dump.Opts.Strip = .{}, | ||
| wait_ms: u32 = 5000, |
There was a problem hiding this comment.
Could you add the help text for these in the fetch section of the printUsageAndExit function?
- Refactor `wait` and `_wait` to handle `page` as `*Page` instead of `**Page`, preventing stale references during navigations. - Update `networkidle` wait condition to use `_notified_network_idle == .done`. - Document `--wait_ms` and `--wait_until` options in `Config.zig` help text.
|
Thank you for your suggestion. I have made the modifications. |
Small tweaks to #1896 Improve the wait ergonomics with an Option with default parameter. Revert page pointer logic to original (don't think that change was necessary).
Overview
Currently, the
fetchcommand uses a hardcoded 5000ms timeout and a fixed "wait until idle" strategy. This limits the tool's effectiveness when dealing with slow-loading pages, heavy Single Page Applications (SPAs), or scenarios requiring specific lifecycle events (e.g., only waiting for the DOM to become interactive).This PR introduces two new command-line arguments to the
fetchcommand:--wait_msand--wait_until.Key Changes
--wait_msflag (defaults to 5000ms to maintain backward compatibility).--wait_untilflag, allowing users to choose the condition for returning the DOM:load(default): Waits for thewindow.onloadevent to fire.domcontentloaded: Returns immediately after the HTML is parsed and scripts have finished running, without waiting for subresources like images to load.networkidle: Waits until there are no active HTTP requests and the page is in a stable state.fixed: Forces the engine to wait for the exact duration specified by--wait_ms, regardless of what events the page triggers.src/browser/Session.zigto support these strategies in the core wait loop, ensuring the engine can intelligently yield or return based on the requested events and pending JavaScript macrotasks.Session.waitfunction signature.Benefits:
--http_timeout).--wait_until domcontentloadedskips waiting for unnecessary static resources, significantly speeding up the dump process.fixedstrategy is highly valuable for debugging pages with complex, non-deterministic background loading logic.Usage Examples: