Commit c6eace1

Merge pull request #3217 from seleniumbase/cdp-mode-patch-1
CDP Mode - Patch 1
2 parents c92181b + 2d7a184 commit c6eace1

10 files changed (+78 -23 lines changed)

examples/cdp_mode/ReadMe.md

Lines changed: 43 additions & 7 deletions
@@ -6,7 +6,7 @@
 
 👤 <b translate="no">UC Mode</b> avoids bot-detection by first disconnecting WebDriver from the browser at strategic times, calling special <code>PyAutoGUI</code> methods to bypass CAPTCHAs (as needed), and finally reconnecting the <code>driver</code> afterwards so that WebDriver actions can be performed again. Although this approach works for bypassing simple CAPTCHAs, more flexibility is needed for bypassing bot-detection on websites with advanced protection. (That's where <b translate="no">CDP Mode</b> comes in.)
 
-🐙 <b translate="no">CDP Mode</b> is based on <a href="https://github.com/HyperionGray/python-chrome-devtools-protocol" translate="no">python-cdp</a>, <a href="https://github.com/HyperionGray/trio-chrome-devtools-protocol" translate="no">trio-cdp</a>, and <a href="https://github.com/ultrafunkamsterdam/nodriver" translate="no">nodriver</a>. <code>trio-cdp</code> was an early implementation of <code>python-cdp</code>, whereas <code>nodriver</code> is a modern implementation of <code>python-cdp</code>. (Refactored CDP code is imported from <a href="https://github.com/mdmintz/MyCDP" translate="no">MyCDP</a>.)
+🐙 <b translate="no">CDP Mode</b> is based on <a href="https://github.com/HyperionGray/python-chrome-devtools-protocol" translate="no">python-cdp</a>, <a href="https://github.com/HyperionGray/trio-chrome-devtools-protocol" translate="no">trio-cdp</a>, and <a href="https://github.com/ultrafunkamsterdam/nodriver" translate="no">nodriver</a>. <code>trio-cdp</code> is an early implementation of <code>python-cdp</code>, and <code>nodriver</code> is a modern implementation of <code>python-cdp</code>. (Refactored Python-CDP code is imported from <a href="https://github.com/mdmintz/MyCDP" translate="no">MyCDP</a>.)
 
 🐙 <b translate="no">CDP Mode</b> includes multiple updates to the above, such as:
 
@@ -19,12 +19,41 @@
 
 --------
 
-### 🐙 <b translate="no">CDP Mode</b> initialization:
+### 🐙 <b translate="no">CDP Mode</b> usage:
 
-* `sb.activate_cdp_mode(url)`
+* **`sb.activate_cdp_mode(url)`**
 
 > (Call that from a **UC Mode** script)
 
+That disconnects WebDriver from Chrome (which prevents detection), and gives you access to `sb.cdp` methods (which don't trigger anti-bot checks).
+
+### 🐙 Here are some common `sb.cdp` methods:
+
+* `sb.cdp.click(selector)`
+* `sb.cdp.click_if_visible(selector)`
+* `sb.cdp.type(selector, text)`
+* `sb.cdp.press_keys(selector, text)`
+* `sb.cdp.select_all(selector)`
+* `sb.cdp.get_text(selector)`
+
+When `type()` is too fast, use the slower `press_keys()` to avoid detection. You can also use `sb.sleep(seconds)` to slow things down.
+
+To use WebDriver methods again, call:
+
+* **`sb.reconnect()`** or **`sb.connect()`**
+
+(Note that reconnecting allows anti-bots to detect you, so only reconnect if it is safe to do so.)
+
+To disconnect again, call:
+
+* **`sb.disconnect()`**
+
+While disconnected, if you accidentally call a WebDriver method, then SeleniumBase will attempt to use the CDP Mode version of that method (if available). For example, if you accidentally call `sb.click(selector)` instead of `sb.cdp.click(selector)`, then your WebDriver call will automatically be redirected to the CDP Mode version. Not all WebDriver methods have a matching CDP Mode method. In that scenario, calling a WebDriver method while disconnected could raise an error, or make WebDriver automatically reconnect first.
+
+To find out if WebDriver is connected or disconnected, call:
+
+* **`sb.is_connected()`**
+
 --------
 
 ### 🐙 <b translate="no">CDP Mode</b> examples:
@@ -45,13 +74,15 @@ from seleniumbase import SB
 with SB(uc=True, test=True, locale_code="en") as sb:
     url = "https://www.pokemon.com/us"
     sb.activate_cdp_mode(url)
-    sb.sleep(1)
+    sb.sleep(1.5)
     sb.cdp.click_if_visible("button#onetrust-reject-all-handler")
+    sb.sleep(0.5)
     sb.cdp.click('a[href="https://www.pokemon.com/us/pokedex/"]')
     sb.sleep(1)
     sb.cdp.click('b:contains("Show Advanced Search")')
     sb.sleep(1)
     sb.cdp.click('span[data-type="type"][data-value="electric"]')
+    sb.sleep(0.5)
     sb.cdp.click("a#advSearch")
     sb.sleep(1)
     sb.cdp.click('img[src*="img/pokedex/detail/025.png"]')
@@ -99,7 +130,7 @@ from seleniumbase import SB
 with SB(uc=True, test=True, locale_code="en") as sb:
     url = "https://www.hyatt.com/"
     sb.activate_cdp_mode(url)
-    sb.sleep(1)
+    sb.sleep(1.5)
     sb.cdp.click_if_visible('button[aria-label="Close"]')
     sb.sleep(0.5)
     sb.cdp.click('span:contains("Explore")')
@@ -188,10 +219,14 @@ with SB(uc=True, test=True, locale_code="en") as sb:
 
 ```python
 sb.cdp.get(url)
-sb.cdp.reload()
+sb.cdp.open(url)
+sb.cdp.reload(ignore_cache=True, script_to_evaluate_on_load=None)
 sb.cdp.refresh()
+sb.cdp.get_event_loop()
 sb.cdp.add_handler(event, handler)
 sb.cdp.find_element(selector)
+sb.cdp.find(selector)
+sb.cdp.locator(selector)
 sb.cdp.find_all(selector)
 sb.cdp.find_elements_by_text(text, tag_name=None)
 sb.cdp.select(selector)
@@ -205,6 +240,7 @@ sb.cdp.load_cookies(*args, **kwargs)
 sb.cdp.clear_cookies(*args, **kwargs)
 sb.cdp.sleep(seconds)
 sb.cdp.bring_active_window_to_front()
+sb.cdp.bring_to_front()
 sb.cdp.get_active_element()
 sb.cdp.get_active_element_css()
 sb.cdp.click(selector)
@@ -231,7 +267,7 @@ sb.cdp.medimize()
 sb.cdp.set_window_rect()
 sb.cdp.reset_window_size()
 sb.cdp.get_window()
-sb.cdp.get_text()
+sb.cdp.get_text(selector)
 sb.cdp.get_title()
 sb.cdp.get_current_url()
 sb.cdp.get_origin()
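
The usage notes added to the ReadMe describe a flow: activate CDP Mode, call `sb.cdp` methods, and reconnect WebDriver only when it is safe. A minimal sketch of that flow follows, using only method names that appear in the diff. `FakeSB` is a hypothetical recording stand-in (not part of SeleniumBase) so the flow can be exercised without a browser; with the real library, the `sb` object would presumably come from `with SB(uc=True) as sb:`.

```python
from types import SimpleNamespace


def cdp_flow(sb, url, selector):
    """Use CDP Mode first, then reconnect WebDriver once it is safe."""
    sb.activate_cdp_mode(url)  # disconnects WebDriver from Chrome
    sb.cdp.click_if_visible(selector)
    sb.sleep(0.5)  # brief pause between actions
    text = sb.cdp.get_text(selector)
    if not sb.is_connected():
        sb.reconnect()  # WebDriver methods can be used again
    return text


class FakeSB:
    """Hypothetical stand-in that records calls (not part of SeleniumBase)."""

    def __init__(self):
        self.calls = []
        self.connected = True
        self.cdp = SimpleNamespace(
            click_if_visible=self._click_if_visible,
            get_text=self._get_text,
        )

    def activate_cdp_mode(self, url):
        self.connected = False
        self.calls.append(("activate_cdp_mode", url))

    def _click_if_visible(self, selector):
        self.calls.append(("cdp.click_if_visible", selector))

    def _get_text(self, selector):
        self.calls.append(("cdp.get_text", selector))
        return "placeholder text"

    def sleep(self, seconds):
        self.calls.append(("sleep", seconds))

    def is_connected(self):
        return self.connected

    def reconnect(self):
        self.connected = True
        self.calls.append(("reconnect",))


fake = FakeSB()
result = cdp_flow(fake, "https://example.com", "h1")
print(result)
print(fake.is_connected())
```

Because `cdp_flow` only depends on method names, the same function could be passed a real `sb` object; whether reconnecting is actually safe on a given site remains a judgment call for the script author.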

examples/cdp_mode/raw_async.py

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@
 
 
 async def main():
-    driver = await cdp_driver.cdp_util.start()
+    driver = await cdp_driver.cdp_util.start_async()
     page = await driver.get("https://www.priceline.com/")
     time.sleep(3)
     print(await page.evaluate("document.title"))
@@ -21,7 +21,7 @@ async def main():
 loop.run_until_complete(main())
 
 # Call everything without using async / await
-driver = loop.run_until_complete(cdp_driver.cdp_util.start())
+driver = cdp_driver.cdp_util.start_sync()
 page = loop.run_until_complete(driver.get("https://www.pokemon.com/us"))
 time.sleep(3)
 print(loop.run_until_complete(page.evaluate("document.title")))
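
The two call styles in `raw_async.py` (awaiting inside a coroutine versus driving each coroutine with `loop.run_until_complete`) can be sketched with the standard library alone. `fake_get_title` is a hypothetical stand-in for the `driver.get(...)` and `page.evaluate(...)` calls, so nothing here touches a browser or SeleniumBase.

```python
import asyncio


async def fake_get_title(url):
    """Hypothetical stand-in for `driver.get(url)` plus `page.evaluate(...)`."""
    await asyncio.sleep(0)  # yield control once, as real I/O would
    return f"Title of {url}"


async def main():
    # Style 1: inside a coroutine, use `await` directly.
    return await fake_get_title("https://example.com/a")


loop = asyncio.new_event_loop()
try:
    title_a = loop.run_until_complete(main())
    # Style 2: no async/await at the call site; the loop drives each coroutine.
    title_b = loop.run_until_complete(fake_get_title("https://example.com/b"))
finally:
    loop.close()

print(title_a)
print(title_b)
```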

examples/cdp_mode/raw_footlocker.py

Lines changed: 5 additions & 5 deletions
@@ -4,14 +4,14 @@
 url = "https://www.footlocker.com/"
 sb.activate_cdp_mode(url)
 sb.sleep(3)
-sb.cdp.click_if_visible("button#touAgreeBtn")
-sb.sleep(1)
+sb.cdp.click_if_visible('button[id*="Agree"]')
+sb.sleep(1.5)
+sb.cdp.mouse_click('input[aria-label="Search"]')
+sb.sleep(1.5)
 search = "Nike Shoes"
-sb.cdp.click('input[aria-label="Search"]')
-sb.sleep(1)
 sb.cdp.press_keys('input[aria-label="Search"]', search)
 sb.sleep(2)
-sb.cdp.click('ul[id*="typeahead"] li div')
+sb.cdp.mouse_click('ul[id*="typeahead"] li div')
 sb.sleep(2)
 elements = sb.cdp.select_all("a.ProductCard-link")
 if elements:

examples/cdp_mode/raw_hyatt.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 with SB(uc=True, test=True, locale_code="en") as sb:
     url = "https://www.hyatt.com/"
     sb.activate_cdp_mode(url)
-    sb.sleep(1)
+    sb.sleep(1.5)
     sb.cdp.click_if_visible('button[aria-label="Close"]')
     sb.sleep(0.5)
     sb.cdp.click('span:contains("Explore")')

examples/cdp_mode/raw_pokemon.py

Lines changed: 3 additions & 1 deletion
@@ -3,13 +3,15 @@
 with SB(uc=True, test=True, locale_code="en") as sb:
     url = "https://www.pokemon.com/us"
     sb.activate_cdp_mode(url)
-    sb.sleep(1)
+    sb.sleep(1.5)
     sb.cdp.click_if_visible("button#onetrust-reject-all-handler")
+    sb.sleep(0.5)
     sb.cdp.click('a[href="https://www.pokemon.com/us/pokedex/"]')
     sb.sleep(1)
     sb.cdp.click('b:contains("Show Advanced Search")')
     sb.sleep(1)
     sb.cdp.click('span[data-type="type"][data-value="electric"]')
+    sb.sleep(0.5)
     sb.cdp.click("a#advSearch")
     sb.sleep(1)
     sb.cdp.click('img[src*="img/pokedex/detail/025.png"]')

examples/cdp_mode/raw_req_async.py

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ async def request_paused_handler(self, event, tab):
 )
 
 async def start_test(self):
-    driver = await cdp_driver.cdp_util.start(incognito=True)
+    driver = await cdp_driver.cdp_util.start_async(incognito=True)
     tab = await driver.get("about:blank")
     tab.add_handler(mycdp.fetch.RequestPaused, self.request_paused_handler)
     url = "https://gettyimages.com/photos/firefly-2003-nathan"

seleniumbase/__version__.py

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 # seleniumbase package
-__version__ = "4.32.0"
+__version__ = "4.32.1"

seleniumbase/core/browser_launcher.py

Lines changed: 1 addition & 3 deletions
@@ -525,9 +525,6 @@ def uc_open_with_cdp_mode(driver, url=None):
 js_utils.call_me_later(driver, script, 3)
 time.sleep(0.012)
 driver.close()
-driver.clear_cdp_listeners()
-driver.delete_all_cookies()
-driver.delete_network_conditions()
 driver.disconnect()
 
 cdp_details = driver._get_cdp_details()
@@ -546,6 +543,7 @@ def uc_open_with_cdp_mode(driver, url=None):
     cdp_util.start(host=cdp_host, port=cdp_port)
 )
 page = loop.run_until_complete(driver.cdp_base.get(url))
+loop.run_until_complete(page.activate())
 if not safe_url:
     time.sleep(constants.UC.CDP_MODE_OPEN_WAIT)
 cdp = types.SimpleNamespace()

seleniumbase/undetected/__init__.py

Lines changed: 5 additions & 2 deletions
@@ -133,8 +133,11 @@ def __init__(
 options = ChromeOptions()
 try:
     if hasattr(options, "_session") and options._session is not None:
-        # Prevent reuse of options
-        raise RuntimeError("You cannot reuse the ChromeOptions object")
+        # Prevent reuse of options.
+        # (Probably a port overlap. Quit existing driver and continue.)
+        logger.debug("You cannot reuse the ChromeOptions object")
+        with suppress(Exception):
+            options._session.quit()
 except AttributeError:
     pass
 options._session = self
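
The change above replaces a hard `RuntimeError` with a debug log plus a best-effort `quit()` of the session that previously owned the options object. A minimal standalone sketch of that pattern follows. `FakeOptions`, `FakeDriver`, and `claim_options` are hypothetical names for illustration only, and the sketch uses `getattr` rather than the `hasattr` check and `try`/`except AttributeError` wrapper in the real code.

```python
import logging
from contextlib import suppress

logger = logging.getLogger(__name__)


class FakeOptions:
    """Hypothetical options object; `_session` tracks the owning driver."""
    _session = None


class FakeDriver:
    """Hypothetical driver whose quit() may fail if the browser is gone."""

    def __init__(self, quit_fails=False):
        self.quit_fails = quit_fails
        self.quit_called = False

    def quit(self):
        self.quit_called = True
        if self.quit_fails:
            raise RuntimeError("browser already closed")


def claim_options(options, new_driver):
    """Attach options to new_driver, quitting any previous owner first."""
    previous = getattr(options, "_session", None)
    if previous is not None:
        logger.debug("You cannot reuse the ChromeOptions object")
        with suppress(Exception):
            previous.quit()  # errors from a dead session are ignored
    options._session = new_driver


options = FakeOptions()
first = FakeDriver(quit_fails=True)
second = FakeDriver()
claim_options(options, first)
claim_options(options, second)  # does not raise, even though quit() fails
```

The trade-off is that a genuine misuse of a shared options object is now only visible at debug log level, in exchange for not aborting when the likely cause is a port overlap.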

seleniumbase/undetected/cdp_driver/cdp_util.py

Lines changed: 16 additions & 0 deletions
@@ -37,6 +37,8 @@ async def start(
 Helper function to launch a browser. It accepts several keyword parameters.
 Conveniently, you can just call it bare (no parameters) to quickly launch
 an instance with best practice defaults.
+Note: Due to a Chrome-130 bug, use start_async or start_sync instead.
+(Calling this method directly could lead to an unresponsive browser)
 Note: New args are expected: Use kwargs only!
 Note: This should be called ``await start()``
 :param user_data_dir:
@@ -88,6 +90,20 @@ async def start(
     return await Browser.create(config)
 
 
+async def start_async(*args, **kwargs) -> Browser:
+    headless = False
+    if "headless" in kwargs:
+        headless = kwargs["headless"]
+    decoy_args = kwargs
+    decoy_args["headless"] = True
+    driver = await start(**decoy_args)
+    kwargs["headless"] = headless
+    kwargs["user_data_dir"] = driver.config.user_data_dir
+    driver.stop()  # Due to Chrome-130, must stop & start
+    time.sleep(0.15)
+    return await start(*args, **kwargs)
+
+
 def start_sync(*args, **kwargs) -> Browser:
     loop = asyncio.get_event_loop()
     headless = False
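
The `start_async` helper added above launches a headless decoy browser, stops it, and then launches again with the caller's settings while reusing the decoy's `user_data_dir`. A minimal sketch of that stop-and-restart pattern follows, using only the standard library. `FakeBrowser`, `fake_start`, and `start_with_decoy` are hypothetical names, the profile path is a placeholder, and the sketch differs from the commit in two labeled ways: it copies `kwargs` instead of aliasing them, and it pauses with `asyncio.sleep` instead of `time.sleep`.

```python
import asyncio

launches = []  # records the kwargs of every simulated launch


class FakeBrowser:
    """Hypothetical stand-in for the Browser returned by `start()`."""

    def __init__(self, **kwargs):
        self.kwargs = dict(kwargs)
        # Placeholder profile path, used when the caller supplies none.
        self.user_data_dir = kwargs.get("user_data_dir", "/tmp/fake-profile")
        self.stopped = False

    def stop(self):
        self.stopped = True


async def fake_start(**kwargs):
    """Hypothetical launcher; records the kwargs it was called with."""
    launches.append(dict(kwargs))
    return FakeBrowser(**kwargs)


async def start_with_decoy(**kwargs):
    """Launch a headless decoy, stop it, then launch with the real settings."""
    decoy_kwargs = dict(kwargs)  # copy, so the caller's kwargs stay untouched
    decoy_kwargs["headless"] = True
    decoy = await fake_start(**decoy_kwargs)
    real_kwargs = dict(kwargs)
    real_kwargs.setdefault("headless", False)
    real_kwargs["user_data_dir"] = decoy.user_data_dir  # reuse the profile
    decoy.stop()
    await asyncio.sleep(0.15)  # short pause before relaunching
    return await fake_start(**real_kwargs)


browser = asyncio.run(start_with_decoy(incognito=True))
print(len(launches))
```

Copying the keyword arguments avoids mutating the caller's dictionary, which the aliasing in the committed code works around by restoring `headless` afterwards.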
