feat: support alerts across all drivers and languages #295
Thank you for the PR. A few questions came up as I was looking through it:
al.do("click on the button that triggers alert")
# accessibility tree now has
# <alert id="1">
#   <button id="2" text="Accept" />
#   <button id="3" text="Dismiss" />
# </alert>
al.check("alert is shown")
al.do("accept alert") # clicks button with id=2 -> accepts alert
@p0deje Thanks for the feedback! Here's what I've changed:
- Selenium: CDP commands work even with an alert open, so we now return the full page tree with the alert nodes appended at the end. The alert stays open for the user to accept/dismiss.
- Playwright: Auto-accept is unavoidable (Playwright freezes if the dialog handler doesn't resolve immediately). Added clear docstrings explaining this constraint. Dialog info is still exposed in the tree for check()/get().
- Node structure: Buttons are now children of alertdialog via childIds:
We've just merged a rewrite of the core into TypeScript. Can you please rebase your PR? Let's also add some tests in both Python and TypeScript. Something simple where Alumnium clicks a button to trigger an alert, then checks for the alert's presence, and finally closes it.
- Selenium: detect alerts via WebDriver, get full page tree via CDP (which works even with alert open), and append synthetic alertdialog nodes with Accept/Dismiss buttons as children
- Playwright: auto-accept dialogs (required by Playwright API to prevent page freeze), capture dialog info, and expose as synthetic nodes for check()/get() verification
- Add alert_action/alertAction to AccessibilityElement and handle in ChromiumAccessibilityTree for both Python and TypeScript
- Route click() on alert buttons to accept/dismiss the alert
- Add test HTML fixture and test cases for both Python and TypeScript
Closes alumnium-hq#209
Made-with: Cursor
5ca2063 to 3994e60
- current_handles = self.driver.window_handles
+ try:
+     current_handles = self.driver.window_handles
+ except Exception:
Can we use a concrete exception (e.g. UnexpectedAlertPresentException)?
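Something along these lines is what a narrower handler could look like. This is only a sketch: it assumes that reading `window_handles` raises `UnexpectedAlertPresentException` when an alert is open, as the comment above implies, and `safe_window_handles` is a hypothetical helper name. The `ImportError` fallback exists only so the snippet runs without Selenium installed.

```python
try:
    from selenium.common.exceptions import UnexpectedAlertPresentException
except ImportError:  # fallback so the sketch runs without Selenium installed
    class UnexpectedAlertPresentException(Exception):
        """Stand-in for Selenium's exception of the same name."""


def safe_window_handles(driver, fallback):
    """Return driver.window_handles, or `fallback` if an open alert blocks the call."""
    try:
        return driver.window_handles
    except UnexpectedAlertPresentException:
        # Only the alert case is swallowed; any other error still propagates.
        return fallback
```

Compared with `except Exception`, unrelated failures (a dead session, for example) surface instead of being silently treated as "no new tab".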
- new_handles = self.driver.window_handles
+ try:
+     new_handles = self.driver.window_handles
+ except Exception:
Ditto about UnexpectedAlertPresentException.
try:
    ActionChains(self.driver).move_to_element(element).click().perform()
except ElementNotInteractableException:
    # Fallback to direct click if ActionChains fails (e.g. for <option> elements)
Please revert formatting changes!
The tests need to be disabled for Appium; please see the other tests for how this can be done.
self.autoswitch_to_new_tab = True
self.full_page_screenshot = FULL_PAGE_SCREENSHOT
self._last_dialog_info: dict | None = None
self.page.on("dialog", self._on_dialog)
This needs to be applied to all new pages, so let's move it to _setup_page_tracking.
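For illustration, registering the handler per page could look roughly like the sketch below. The `DialogTracker` class is hypothetical, and the real `_setup_page_tracking` in the driver may do more than this; the only Playwright calls assumed are `context.pages`, `context.on("page", ...)`, `page.on("dialog", ...)`, and `dialog.type` / `dialog.message` / `dialog.accept()`.

```python
class DialogTracker:
    """Hypothetical helper: records the last dialog seen on any page in a context."""

    def __init__(self, context):
        self.last_dialog_info = None
        # Pages that already exist need the handler too, not only future ones.
        for page in context.pages:
            self._setup_page_tracking(page)
        # Every page opened later (new tab, popup) gets the same handler.
        context.on("page", self._setup_page_tracking)

    def _setup_page_tracking(self, page):
        page.on("dialog", self._on_dialog)

    def _on_dialog(self, dialog):
        self.last_dialog_info = {"type": dialog.type, "message": dialog.message}
        # Once a dialog listener is registered it has to accept or dismiss,
        # otherwise the page stays blocked on the dialog.
        dialog.accept()
```

With this shape, a dialog raised in a tab opened after the driver was created is captured the same way as one on the initial page.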
Summary
Resolves #209
- switch_to.alert; Playwright captures dialogs via page.on('dialog'); Appium handles WebView alerts via switch_to.alert
- alertdialog, OK (accept), and Cancel (dismiss) button nodes are added to the accessibility tree so the LLM can see and interact with them
- alert.accept() or alert.dismiss() instead of trying to find a DOM element
Driver-specific behavior
switch_to.alert.
Files changed (11)
Python (6 files):
- accessibility_element.py: added alert_action field
- chromium_accessibility_tree.py: store/read _alert_action in XML tree
- selenium_driver.py: alert detection, synthetic nodes, click routing, resilient tab-switch decorator
- playwright_driver.py: dialog handler, captured dialog nodes, click acknowledgement
- playwright_async_driver.py: same as sync, with async-safe dialog acceptance
- appium_driver.py: basic alert routing in click
TypeScript (5 files):
- AccessibilityElement.ts: added alertAction field
- ChromiumAccessibilityTree.ts: store/read _alert_action in XML tree
- SeleniumDriver.ts: alert detection, synthetic nodes, click routing, resilient decorator
- PlaywrightDriver.ts: dialog handler, captured dialog nodes, click acknowledgement
- AppiumDriver.ts: basic alert routing in click
Test plan
- alert() → verify LLM sees alertdialog + OK/Cancel → click OK → alert dismissed
- confirm() → click Cancel → confirm returns false
- alert() → verify dialog info appears in tree → LLM can assert on text
- confirm() → verify auto-accept behavior
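The click routing that the OK/Cancel steps above exercise can be sketched as follows, assuming a Selenium-style driver exposing `switch_to.alert`. The trimmed-down `AccessibilityElement`, the free function `click`, the `click_dom_element` callback, and the returned strings are all illustrative, not the PR's actual signatures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessibilityElement:
    """Trimmed-down stand-in; only the fields this sketch needs."""

    id: int
    alert_action: Optional[str] = None  # "accept", "dismiss", or None


def click(driver, element: AccessibilityElement, click_dom_element) -> str:
    """Route a click to the open alert when the element is a synthetic alert button."""
    if element.alert_action == "accept":
        driver.switch_to.alert.accept()
        return "accepted"
    if element.alert_action == "dismiss":
        driver.switch_to.alert.dismiss()
        return "dismissed"
    # Regular elements go through the normal DOM click path (passed in here).
    click_dom_element(element)
    return "clicked"
```

Checking `alert_action` before any DOM lookup matters because the synthetic buttons have no backing DOM element to find.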