Save page as MHTML using Chrome - strange, try automating everything visually #314

Closed
weihenglim opened this issue Oct 9, 2021 · 5 comments
@weihenglim

I am trying to save a web page as a single MHTML file using Chrome on Windows.

Normally this is achieved by right-clicking the web page, clicking "Save as..." in the context menu, and then selecting "Save as type *.mhtml" in the Windows dialogue box. While I am able to perform the first two steps using visual automation, nothing happens after the "Save as..." option is clicked (the Windows dialogue box doesn't appear).

Help with resolving the issue would be much appreciated. Any alternative solutions are also welcome.

@kensoh
Member

kensoh commented Oct 10, 2021

This seems strange because visual automation methods simulate the normal user keyboard entries and mouse clicks.

Can you try running with r.init(visual_automation = True, chrome_browser = False), and automate the process of opening your normal Chrome window, entering the URL, and so on before doing this save-as step, to see if that helps?
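A minimal sketch of the all-visual approach suggested above, assuming Windows and a normal Chrome installation. The function name `save_page_visually` and the image snippets `save_as_type.png` and `mhtml_option.png` are hypothetical (you would capture your own screenshots of the dialogue box), and the wait times are guesses. The `r` module is passed in as a parameter so the steps can be tried against a stand-in object without launching anything.

```python
def save_page_visually(r, page_url, wait_seconds=5):
    """Open page_url in the normal (non-automated) Chrome and save it as MHTML.

    r is the RPA for Python module, e.g. `import rpa as r`.
    """
    # Visual automation only; do not launch the automated Chrome browser.
    r.init(visual_automation=True, chrome_browser=False)

    # Assumption: on Windows, `start chrome <url>` opens the user's own Chrome.
    r.run(f'start chrome "{page_url}"')
    r.wait(wait_seconds)  # give the page time to load

    # Ctrl+S opens the "Save as" dialogue box while Chrome is in focus.
    r.keyboard('[ctrl]s')
    r.wait(2)

    # Hypothetical image snippets of the "Save as type" dropdown and
    # the "Webpage, Single File (*.mhtml)" entry in that dropdown.
    r.click('save_as_type.png')
    r.click('mhtml_option.png')
    r.keyboard('[enter]')  # confirm the save

    r.close()

# Usage (requires the rpa package and a desktop session):
#   import rpa as r
#   save_page_visually(r, 'https://example.com')
```

Using the keyboard shortcut instead of the right-click context menu is a design choice for this sketch, not something confirmed in the thread; the context-menu route should work the same way with `r.rclick()` and an image of the menu entry.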

I suspect that an automated browser may not be allowed to do this step. Or perhaps this Chrome feature does not like that the automation changes the default download location to the current working directory.

@kensoh kensoh changed the title Save page as MHTML using Chrome Save page as MHTML using Chrome - strange issue, try automating everything visually Oct 10, 2021
@kensoh kensoh added the query label Oct 10, 2021
@kensoh kensoh changed the title Save page as MHTML using Chrome - strange issue, try automating everything visually Save page as MHTML using Chrome - strange, try automating everything visually Oct 10, 2021
@weihenglim
Author

weihenglim commented Oct 10, 2021

I ran a simple test script and can confirm that it works with chrome_browser = False and automating everything visually. I also suspect that the issue is caused by the automated Chrome browser having a default download location, which prevents the Windows dialogue box from appearing.

Is it possible to disable the default download location in the automated Chrome browser? While it is possible to perform the download with chrome_browser = False, it is not practical for my use case as I need access to the HTML code for certain tasks that can't be done through pure visual automation (e.g. iterating through a list of elements using XPath)

@kensoh
Member

kensoh commented Oct 10, 2021

I see. I'm afraid that is hard to do, because the code that sets the download location is embedded quite deeply in the upstream TagUI engine used by RPA for Python. One idea is to use r.download_location() to set it to the same folder as your default download location and see if that helps, though I'm not sure it will have any effect in this scenario.

More on this - #279 (comment)

@weihenglim
Author

Unfortunately, setting r.download_location() does not seem to resolve the issue.

It seems like I will need to have 2 separate RPA processes, the first to collate all the URLs via browser automation and the second to download the pages using visual automation with chrome_browser = False.
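The two-stage workaround described here could be sketched roughly as follows. All function names, the JSON file used to hand URLs from one stage to the next, and the `save_one` callback are hypothetical. Reading an attribute with an `/@href` XPath suffix is assumed to be supported by `r.read()`; if it is not in your version, another way of extracting the link target would be needed. The `r` module is passed in as a parameter so each stage can be exercised separately.

```python
import json
from pathlib import Path

def collect_urls(r, list_url, link_xpath):
    """Stage 1: use the automated Chrome browser to gather link targets."""
    r.init()  # normal browser automation, so XPath access is available
    r.url(list_url)
    total = r.count(link_xpath)
    # Assumption: r.read() accepts an XPath ending in /@href to read an attribute.
    urls = [r.read(f'({link_xpath})[{i}]/@href') for i in range(1, total + 1)]
    r.close()
    return urls

def save_url_list(urls, path):
    """Persist the collected URLs so stage 2 can run as a separate process."""
    Path(path).write_text(json.dumps(urls, indent=2), encoding='utf-8')

def load_url_list(path):
    """Load the URLs written by stage 1."""
    return json.loads(Path(path).read_text(encoding='utf-8'))

def download_pages(r, urls, save_one):
    """Stage 2: visual automation only, against the normal Chrome window.

    save_one(r, url) is a hypothetical callback that performs the visual
    "Save as" steps for a single page.
    """
    r.init(visual_automation=True, chrome_browser=False)
    for url in urls:
        save_one(r, url)
    r.close()

# Usage (requires the rpa package):
#   import rpa as r
#   save_url_list(collect_urls(r, 'https://example.com/list', '//a'), 'urls.json')
#   download_pages(r, load_url_list('urls.json'), my_save_one)
```

Writing the URL list to a file between the stages keeps the two RPA sessions independent, so the second stage can be rerun on its own if a download fails.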

I appreciate the assistance, thanks for your help!

@kensoh
Member

kensoh commented Oct 10, 2021

Oh I see, no problem! Yes, batching the work into 2 stages sounds like a good workaround.

There are ways to hack TagUI to prevent it from setting the download location, but that is a bad idea because the hacks would be overwritten whenever the package is updated. The hacks are also not straightforward and would probably be buggy.
