Skip to content

Remote library fails to connect in some scenarios #3300

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
TChatzigiannakis opened this issue Sep 9, 2019 · 6 comments
Closed

Remote library fails to connect in some scenarios #3300

TChatzigiannakis opened this issue Sep 9, 2019 · 6 comments
Labels
acknowledge To be acknowledged in release notes beta 1 bug priority: medium
Milestone

Comments

@TChatzigiannakis
Copy link
Contributor

TChatzigiannakis commented Sep 9, 2019

Hello,
I tried to run this example with Robot Framework but ran into some issues. To reproduce what I'm about to describe:

  • Make sure you have Chrome installed
  • Clone the linked repository and navigate into the newly created local directory
  • npm install in the base directory
  • node webdriver-manager update in node_modules\webdriver-manager\bin
  • npm run start in the base directory
  • npm run robotremote in the base directory
  • npm run re2e in the base directory

You should get errors like the following:

==============================================================================
E2E
==============================================================================
[ ERROR ] Adding keyword 'startProtractor' to library 'Remote' failed: Calling dynamic method 'get_keyword_arguments' failed: ResponseNotReady: Request-sent
[ ERROR ] Adding keyword 'navigateToPage' to library 'Remote' failed: Calling dynamic method 'get_keyword_arguments' failed: ResponseNotReady: Request-sent
E2E.App.E2E-Spec
==============================================================================
Load App                                                              | FAIL |
No keyword with name 'Start Protractor' found.
------------------------------------------------------------------------------
Shutdown                                                              | PASS |
------------------------------------------------------------------------------
E2E.App.E2E-Spec                                                      | FAIL |
2 critical tests, 1 passed, 1 failed
2 tests total, 1 passed, 1 failed
==============================================================================
E2E                                                                   | FAIL |
2 critical tests, 1 passed, 1 failed
2 tests total, 1 passed, 1 failed
==============================================================================

I've tried to debug this issue and the source seems to be that a Transport object is shared between method calls in the XmlRpcRemoteClient class in Remote.py. I've attempted a change to have a new Transport and ServerProxy object every time, which seems to resolve the issue:

==============================================================================
E2E
==============================================================================
E2E.App.E2E-Spec
==============================================================================
Load App                                                              | PASS |
------------------------------------------------------------------------------
Shutdown                                                              | PASS |
------------------------------------------------------------------------------
E2E.App.E2E-Spec                                                      | PASS |
2 critical tests, 2 passed, 0 failed
2 tests total, 2 passed, 0 failed
==============================================================================
E2E                                                                   | PASS |
2 critical tests, 2 passed, 0 failed
2 tests total, 2 passed, 0 failed
==============================================================================

I'd like to hear your thoughts on whether this is a good approach, or if there are good reasons to keep the Transport and ServerProxy objects alive along with the XmlRpcRemoteClient. (If the latter is the case, it would be very appreciated if you could point me to a workaround to get the example working.)

@pekkaklarck
Copy link
Member

pekkaklarck commented Sep 9, 2019

This sounds pretty strange. Some questions related to the problem below:

  1. Do you always get this problem or only sporadically? Are you running the demo according to the demo project instructions? I don't have time to try it myself at least right now, but it sounds strange if the demo doesn't work at all. I would assume it has worked when it has been created and it would be good to know what has changed.

  2. There's no design reason to create only one ServerProxy object per library instance. Do you have more details why that approach causes problems for you?

  3. It's great that you have found a way to avoid the problem. One a quick look it seems changing _server to a property would make the fix simpler but we can discuss this more in PR Remove state sharing between calls in XML-RPC client #3299. Anyway, before merging the PR I'd like to understand a bit better what's going on and why the fix is needed.

@TChatzigiannakis
Copy link
Contributor Author

TChatzigiannakis commented Sep 9, 2019

Hi @pekkaklarck,
Thank you for the quick response!

  1. Do you always get this problem or only sporadically? Are you running the demo according to the demo project instructions? I don't have time to try it myself at least right now, but it sounds strange if the demo doesn't work at all. I would assume it has worked when it has been created and it would be good to know what has changed.

Always, and so far it's reproducible on all machines I've tried. I don't know if the example worked in the past.

  1. There's not design reason to create only one ServerProxy object per library instance. Do you have more details why that approach causes problems for you?

I'm not quite certain why it doesn't work as it is, but I'm inclined to think that something leaves the Transport object in a state where it's not ready to perform the next request. The clue for me was this (at the end of the "ERROR" lines): ResponseNotReady: Request-sent

It could well be the case that the mistake is in the example (but I couldn't figure out where), which is why I wanted to run this change by the Robot Framework maintainers, to see if my change is a red herring covering up the root cause instead of addressing it.

  1. It's great that you have found a way to avoid the problem. One a quick look it seems changing _server to a property would make the fix simpler but we can discuss this more in PR Remove state sharing between calls in XML-RPC client #3299. Anyway, before merging the PR I'd like to understand a bit better what's going on and why the fix is needed.

Sure, let's discuss this in the PR.

@pekkaklarck pekkaklarck added this to the v3.2 milestone Oct 2, 2019
@pekkaklarck
Copy link
Member

The PR #3299 looks great. It would be great to know why the fix is needed but investigating it can be a somewhat big task. The main reason I'd like to understand this is that I'm slightly worried does the change affect performance. We recently got #3362 about performance issues with the Remote library and making performance worse is definitely not a good idea. I think I'll test the performance locally before merging the PR.

@pekkaklarck pekkaklarck changed the title Seemingly simple use case of XmlRpcRemoteRemote fails Remote library fails to connect in some scenarios Nov 22, 2019
@pekkaklarck pekkaklarck added the acknowledge To be acknowledged in release notes label Nov 26, 2019
@pekkaklarck
Copy link
Member

I tested PR #3300 locally and found no performance problems. It has now been merged. Thanks @TChatzigiannakis!

@lmartorella
Copy link

Hi,
Actually this change is introducing important performance degradation in my case. It is depending on how the HTTP remote server is implemented.

For instance, for .NET Remote Libraries, it is possible that the XMLRPC is sitting on top the standard .NET Microsoft remoting channel (HttpChannel), that takes something like ~1 seconds to open and validate a new HTTP connection (not sure what cause such huge delay... perhaps to prevent DDOS attacks? or because it is obsolete technology?).

With this #3300 change, simply loading a remote library from .NET will now takes roughly 1+1+1 seconds for each keyword to load. Such 3 seconds are due to the three operations _get_kw_args, _get_kw_tags and _get_kw_doc done in stack in _create_handler(). In turn, the create_handler is called in stack for the whole library.
This makes a medium-sized library made of - say - 100 keywords several minutes to load.

Without this fix (e.g. RF 3.1) loading the same library was super-fast, since all the calls was going on the same context.

Would it be possible to refine the current fix, and at least register the whole library using the same context?

Thanks, L

@lmartorella
Copy link

Update: the performance degradation is due to a slow DNS resolution of 'localhost' to 127.0.0.1 used to connect the Remote library.
It seems that this is a common issue: https://stackoverflow.com/search?q=slow+localhost
The suggested solutions are ranging to switch from localhost to 127.0.0.1, to update the /etc/host file, to disable IPV6, etc..

If you are affected by this issue, the address resolution of localhost takes approx 1 second.

Now, using robotframework 3.1 we never noticed it, but the OS issue was still there. The fix #3300 that closes and reopen the xmlrpc channel for each and every function is letting this issue to surface, with the effect of super-slow library load when using localhost.
Obviously, switching to 127.0.0.1 will fix the problem, but I'm wondering of how many people will be affected by that.

Given that the xmlrpc is a protocol that natively supports multi-part requests (HTTP streams), this fix is essentially degrading performances at startup.
Perhaps it is appropriate to find another way to fix this issue, ensuring that the xmlrpc channel is reused, but the state is not shared.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledge To be acknowledged in release notes beta 1 bug priority: medium
Projects
None yet
Development

No branches or pull requests

3 participants