
Improper stream disconnection on Windows (Hub leaks streams) #14

Closed
david1309 opened this issue Jan 11, 2016 · 8 comments

@david1309

Using: Windows 10 Pro, 64-bit OS

In the file dashel/dashel/dashel-win32.cpp, there is a bug in the method bool Hub::step() that prevents client connections from being closed properly, so the hub keeps listening to past clients whose connections should have been destroyed.

This causes either a failure of the hub's step method (since a 64-position vector fills up) or a timeout on the connection.

Our hypothesis is that there's a function called notifyEvent which changes the vector ets[], thus changing the behavior of the conditionals that decide when to connect, receive data, and disconnect. We changed the position at which this function is called and managed to make it "kind of" work: a client can indeed connect to and disconnect from the hub, but the first time it connects it produces a timeout error. From then on it continues working.

This happens because when Client 1 connects at event = 0, Client 2 connects at event = 1, and so on, each client should disconnect on the same event index at which it made the connection. However, Client 1 is disconnecting at event = 1 and Client 2 at event = 2, so I think this is the mistake.

@stephanemagnenat stephanemagnenat changed the title Dashel Hub Errror for ASEBA Switch on Windows Improper disconnection support on Windows Jan 11, 2016
@stephanemagnenat
Member

Please indicate the version of Windows you are using. Moreover, if you manage to create a minimal reproducible test case (i.e. pure Dashel, without Aseba), it would be perfect. Thank you!

@david1309
Author

I already indicated the OS version. However, I've never used Dashel on its own; I'm always using the Switch as an intermediate platform to use the Dashel Hub ... If I manage to use Dashel independently, I'll post the test case :)

@davidjsherman
Collaborator

I am seeing what I think is the same problem on Windows 7 and Windows XP, during my end-to-end testing of asebahttp. These tests connect 4 asebadummy nodes that talk to each other, and communicate with them through the HTTP interface. It is not intended as a stress test, but it easily overruns the size-64 event buffer in Dashel::Hub::step.

What is curious is that, for a given stream, I see events (EvPotentialData, EvClosed, EvData) in that order. On the surface it looks like the events are arriving in the wrong order, although possibly I just don't understand how Dashel::Hub::step works. If the streams aren't being closed because data is still arriving, then this might explain the buffer overrun.

The attached file is a sample of 400 passes through the loop on stream->hEvents in line 1286, made from a trace using gdb with a breakpoint at line 1289, displaying the values of stream, hc, and ei->first. I've compressed the display to show the subsequent values for the same stream: each line is stream hc event [hc event ...]. Only very few lines end with an EvClosed for the stream.

In this sample, 22 streams are used, most of them 20 times. I would have expected fewer than 10 streams: 4 aseba streams for the 4 dummy nodes, 1 incoming HTTP stream, and after that one for each concurrent connection (granted, the e2e testing is asynchronous, but I don't think it accounts for that many). For reference, the tests being run are from tests/e2e-http/1-http_spec.js in my fork.

I only see this problem with this code (dashel-win32). The dashel-posix code works well on Mac OS X and Ubuntu/Debian/Fedora.
hub-step-event-handler-trace.txt

@david1309
Author

david1309 commented Apr 20, 2016

I think I had a similar problem to yours, David Sherman. The way I partially solved it is explained in the Issues section of Aseba's GitHub repository.

davidjsherman referenced this issue in davidjsherman/aseba Apr 22, 2016
npm install uses cmd //c when WIN32

run-e2e also runs on WIN32
@stephanemagnenat stephanemagnenat changed the title Improper disconnection support on Windows Improper stream disconnection on Windows (Hub leaks streams) Apr 28, 2017
@cor3ntin
Contributor

cor3ntin commented Dec 4, 2017

I have a few more tests to run, but I'm not convinced that there is a leak.
The system can't handle more than 64 events: under the right conditions, it can crash with as few as 17 streams connected (that's a theoretical worst-case scenario).
A closed connection takes a few seconds to be disposed of and fills up to 4 of the 64 slots.
If your scenario involves as many as 20 streams in a short period of time, you will run into that limit.

I will test further to confirm that there are no actual leaks, and raise that limit.

@stephanemagnenat
Member

I think that proper end-to-end testing could be useful to formally assess this problem (see #27).

@stephanemagnenat
Member

stephanemagnenat commented Dec 5, 2017

One way to test this is the following:

  1. Start examples/reconnect "tcp:localhost;8765"
  2. Start and stop repeatedly a chat server: examples/chat

...while dumping the number of streams and events open.

@davidjsherman
Collaborator

What about opening 64 simultaneous streams with

for i in {0,1,2,3}{0,1,2,3}{0,1,2,3}; do
  examples/dws http://localhost:8080/etc/hosts > /dev/null &
done

cor3ntin added a commit to cor3ntin/dashel that referenced this issue Dec 5, 2017
 * Avoid waiting infinitively on any single socket.
 * Listen to FD_CLOSE
 * Handle EvClosed
cor3ntin added a commit to cor3ntin/dashel that referenced this issue Dec 6, 2017
 * Avoid waiting infinitively on any single socket.
 * Listen to FD_CLOSE
 * Handle EvClosed

No branches or pull requests

4 participants