Fix CPU spin in the JavaScript implementation side #567
@lifengl I converted your PR to be a 'draft' to match your PR's title, and then I stripped 'draft: ' from the title since those tend to be accidentally left there when we merge.
@AArnott: I think the PR is now ready for review. I'm not sure why the C# unit tests failed; I didn't touch any code there.
When I look at the pipeline failures, there are no C# unit test failures. The failures are in the tslint step, presumably due to your code changes.
I'd like to chat with you about how we can write a test that verifies the fix. Once we have that, I may try my hand at authoring a fix that doesn't triple the size of the function (and make it complicated enough that I am struggling to understand it).
src/nerdbank-streams/src/tests/MultiplexingStream.SeededChannels.spec.ts
Replacing `??` with `||` will break the function here, because `0 ?? 5` evaluates to `0`, whereas `0 || 5` evaluates to `5`.
That would mean the check for `availableSize === 0` never returns true, and I believe this could leave the function spinning when the stream is closed.
Your statement that the `??` operator behaves differently in TypeScript than in JavaScript doesn't match the documentation I read earlier. It would be a real problem if TypeScript broke JavaScript behavior. `void 0` means `undefined` in JavaScript. It is written that way because `undefined` can be used as a variable name and assigned a different value.
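For readers following along, here is a minimal sketch of the difference between the two operators when the left-hand value is `0`. The function names `withNullish` and `withOr` and the `fallback` parameter are hypothetical and exist only for this illustration; they are not taken from the PR.

```typescript
// Hypothetical helpers, for illustration only (not code from this PR).

// `??` falls back only when the left side is null or undefined.
function withNullish(availableSize: number | null | undefined, fallback: number): number {
    return availableSize ?? fallback;
}

// `||` falls back for every falsy value, including 0.
function withOr(availableSize: number | null | undefined, fallback: number): number {
    return availableSize || fallback;
}

console.log(withNullish(0, 5));         // 0: a legitimate zero is preserved
console.log(withOr(0, 5));              // 5: zero is silently replaced
console.log(withNullish(undefined, 5)); // 5
console.log(withOr(undefined, 5));      // 5
```

With `||`, a later `availableSize === 0` check can never observe the zero, which is the concern raised above.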
Thanks for educating me on |
Also export new `sliceStream` and `readAsync` functions.
@AArnott: I added a unit test to cover the fix. Somehow I could not reproduce backpressure in the unit test, which might be due to the behavior of the PassThrough stream used there. But it turned out to be more straightforward to write a test for the CPU spin issue. The write side just needs to write a block of data in two pieces with an awaited delay in between. If the read side spins the CPU, the write side never gets to finish its work, so the test hangs without the fix (I verified this by running the unit test against the main branch). With the fix, the same test passes. I also did some cleanup to remove redundant condition checks.
Thanks. I noticed your additional commits. Thank you very much for the test. I spent a good portion of Saturday working in and around your PR, with the goal to both understand and simplify the getBufferFrom method. I ended up adding a few more generally useful helper methods. I have almost everything working, but I discovered that Duplex.unshift doesn't work, which my new design depends on, so I'm looking for a solution to that before updating this PR and merging it. |
Force-pushed from 3121df7 to a075a7e
I force-pushed to your branch to keep your test changes but rewrite the product changes (and add features too).
Can you confirm this still works for your scenario (as the changes are substantial) and then I'll merge the PR.
It doesn't work. The product would just hang. |
Rats. Any chance you can debug into it and at least write a test that fails? |
BTW, using `unshift` is tricky because it fails once the 'end' event has fired; after that, a block of data cannot be unshifted back for another reader to pick up.
oooh. That's a very interesting point. I wonder when the |
Yes. With or without the later fix, there are still other problems. Basically, after the connection has been in use for a while, it is suddenly closed, and it is hard to find out why.
Do you recommend I bring back your version of the fix? |
Yes. Originally I wanted to see whether I could spend more time on it, but our checkpoint is a couple of days away. I agree we need to do further cleanup here, but I think the priority is to unblock other work. The issue essentially blocks testing with slightly bigger input, and leaves no meaningful way to see anything that might be affected by scale.
This PR fixes a problem where `getBufferFrom` keeps spinning and consumes a lot of CPU, which also leads the product to hang.
This happens when the stream has only partial data ready to read. `read(size)` returns null (unless the stream is closed), but because there is unread data in the stream, the code does not block waiting on anything and immediately tries to read again, which consumes a lot of CPU. In addition, the pipeline buffer can be limited in length. When the reader wants a block larger than the pipeline buffer, it never gets the data, because the writer cannot write any more until the reader takes some bytes out of the stream.
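The partial-data situation can be observed in isolation with a small sketch. It assumes Node.js and uses a `PassThrough` stream as a stand-in for the real channel stream; the function name `partialReadDemo` is hypothetical and is not part of this PR.

```typescript
import { PassThrough } from "stream";

// Hypothetical demo (not code from this PR): shows that read(size) yields null
// while fewer than `size` bytes are buffered, even though data is available.
function partialReadDemo(): { fullRead: Buffer | null; buffered: number; partialRead: Buffer | null } {
    const stream = new PassThrough();

    // Only 3 of the 5 bytes the reader wants have arrived so far.
    stream.write(Buffer.from("abc"));

    // Asking for 5 bytes returns null because the stream has not ended
    // and fewer than 5 bytes are buffered.
    const fullRead = stream.read(5) as Buffer | null;

    // The buffered bytes are still there; readableLength reports how many.
    const buffered = stream.readableLength;

    // Reading exactly what is buffered succeeds and drains the buffer.
    const partialRead = stream.read(buffered) as Buffer | null;

    return { fullRead, buffered, partialRead };
}

const result = partialReadDemo();
console.log(result.fullRead, result.buffered, result.partialRead?.toString());
```

A loop that retries `read(5)` without waiting for more data would keep getting null here, which is the spin this PR addresses.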
This PR uses `readableLength` to read partial data out and joins the pieces on the reader side when necessary. However, the PR turned out to be more complicated because this property is not defined in the ReadableStream interface, only in the implementation (Readable). I'm not sure why an important property like this is not part of the contract. So the code keeps the old behavior unless `readableLength` is available. Also, I kept running into memory issues in unit tests. Some event handlers could leak memory, and that was fixed. But the real cause turned out to be that `readableEnded` can be false when the streamEnded event is fired. Because the earlier version of the code depended on that state, it ended up spinning in the function, which, interestingly, led to running out of memory.
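One way to express "keep the old behavior unless `readableLength` is available" is a small runtime check. The following is only a sketch under that assumption, not the code in this PR; the helper name `getReadableLength` is hypothetical.

```typescript
import { PassThrough } from "stream";

// Hypothetical helper (not code from this PR): returns the number of buffered
// bytes when the stream exposes readableLength, or undefined when it does not,
// so the caller can fall back to its previous behavior.
function getReadableLength(stream: NodeJS.ReadableStream): number | undefined {
    // readableLength is declared on the Readable implementation rather than on the
    // NodeJS.ReadableStream interface, so it is probed at runtime.
    const candidate = (stream as unknown as { readableLength?: unknown }).readableLength;
    return typeof candidate === "number" ? candidate : undefined;
}

const passThrough = new PassThrough();
passThrough.write(Buffer.from("abc"));
console.log(getReadableLength(passThrough)); // number of buffered bytes
```

Callers would treat an `undefined` result as "partial reads are not supported on this stream" and take the original code path.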