-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improper stream disconnection on Windows (Hub leaks streams) #14
Comments
Please indicate the version of Windows you are using. Moreover, if you manage to create a minimal reproducible test case (i.e. pure Dashel, without Aseba), it would be perfect. Thank you! |
I already indicated the OS version. However, I've never used Dashel on its own, I'm always using the Switch as an intermediate platform to use Dashel Hub ... If I manage to use Dahshel independently ill post the test case :) |
I am seeing what I think is the same problem on Windows 7 and Windows XP, during my end-to-end testing of asebahttp. These tests involve connecting 4 asebadummy nodes that talk to each other, and communicating with them through the HTTP interface. It is not intended as a stress test, but it easily overruns the size 64 event buffer in Dashel::Hub::step. What is curious is that, for a given stream, I see events (EvPotentialData, EvClosed, EvData) in that order. On the surface it looks like the events are arriving in the wrong order, although possibly I just don't understand how Dashel::Hub::step works. If the streams aren't being closed because data is still arriving, then this might explain the buffer overrun. The attached file is a sample of 400 passes through the loop on In this sample, 22 streams are used, most if them 20 times. I would have expected fewer than 10 streams: 4 aseba streams for the 4 dummy nodes, 1 incoming http steam, and after that one for each concurrent connection (granted the e2e testing is asynchronous but I don't think that much). For reference, the tests being run are from tests/e2e-http/1-http_spec.js in my fork. I only see this problem with this code, dashel-win32. The dashel-posix code works well on MacOSX and Ubuntu/Debian/Fedora. |
I think I had a similar problema to yours David Sherman. The way I |
npm install uses cmd //c when WIN32 run-e2e also runs on WIN32
I have a few more tests to run, but I'm not confident that there is a leak. I will test further that there are no actual leaks and raise that limit. |
I think that proper end-to-end testing could be useful to formally assess this problem (see #27). |
One way to test this is the following:
...while dumping the number of streams and events open. |
What about opening 64 simultaneous streams with
|
* Avoid waiting infinitively on any single socket. * Listen to FD_CLOSE * Handle EvClosed
* Avoid waiting infinitively on any single socket. * Listen to FD_CLOSE * Handle EvClosed
* Avoid waiting infinitively on any single socket. * Listen to FD_CLOSE * Handle EvClosed
* Avoid waiting infinitively on any single socket. * Listen to FD_CLOSE * Handle EvClosed
* Avoid waiting infinitively on any single socket. * Listen to FD_CLOSE * Handle EvClosed
* Avoid waiting infinitively on any single socket. * Listen to FD_CLOSE * Handle EvClosed
USING: Windows 10 Pro 64-bit OS
On the file 'dashel/dashel/dashel-win32.cpp ', there is a bug on the method 'bool Hub::step(const)' which incurs in not closing appropriately clients connections, and therefore keeps listening to the past clients whose connection should had been destroyed.
This generates either a collapse of the hub method (since a 64 position vector gets filled up), or a timeout on the connection.
Our hypothesis is that there's a function called 'notifyEvent' which changes the vector ets[ ], thus changing the behavior of the conditionals that decide when to connect,receive data and disconnect. We changed the position in which this function is called, an managed to make it "kind" of work. A client can indeed connect and disconnect from the hub but the first time it connects it produces a timeout error. From here on it continues working.
This happens because when Client 1 connects at event = 0 and Client 2 connects at event = 1 and so on, each client should disconnect on the same event in which it made the connection. However Client 1 is disconnecting at event = 1 and Client 2 at event = 2; so i think this is the mistake
The text was updated successfully, but these errors were encountered: