Skip to content

Conversation

@iamarkdev
Copy link
Contributor

Overview

This PR hardens WebTransport session/stream handling against late-arriving events and fixes clear bugs in send ordering and error propagation. It also adds targeted tests to lock in the fixes.

Changes (Expanded)

  • Race-condition hardening for datagrams (main/lib/session.js)
    • Added state guard in onDatagramReceived and wrapped respond/enqueue in try/catch so late datagrams after close no longer throw ERR_INVALID_STATE (UDP can arrive late).
  • Race-condition hardening for incoming streams (main/lib/session.js)
    • Early-return if session is closed/failed; wrap enqueue to incoming stream controllers so late-arriving streams are safely dropped instead of crashing.
  • Session close robustness (main/lib/session.js)
    • Use (closeInfo?.reason ?? '').substring(0, 1023) to avoid calling substring on undefined.
    • When propagating close/error to stream controllers, wrap controller.error() to avoid throwing if already closed/errored.
  • Stream read/write teardown guards (main/lib/stream.js)
    • Wrap byob.respond, readableController.enqueue/close, and writable/readable controller.error with try/catch to tolerate closures that occur between checks and operations.
    • Log-and-continue pattern prevents ERR_INVALID_STATE during concurrent close/read/write.
  • sendOrder setter bug fix (main/lib/stream.js)
    • Setter now assigns the provided value (previously mistakenly used args.sendOrder), ensuring priority updates apply.
  • sendGroup null safety (main/lib/stream.js)
    • Use optional chaining/0n fallback for _sendGroupId to avoid null deref when no send group is set.
  • Server session visitor safety (main/lib/server.js)
    • Drop sessions if server already stopped; wrap session stream enqueue to avoid crashes when the consumer closed the stream.
  • stopServer cleanup safety (main/lib/server.js)
    • Wrap controller close() calls to avoid throwing if already closed during teardown.
  • HTTP2/HTTP3 transport error propagation (main/lib/http2/node/server.js, transports/http3-quiche/lib/serversocket.js)
    • Pass real error objects to onServerError instead of empty/undefined payloads, preserving error context.

Tests

  • test/race-conditions.spec.js covers:
    • Datagram-after-close race
    • Stream-after-close race
    • Session visitor after server stop
    • sendOrder setter behavior
    • sendGroup null safety (HTTP2 test skipped due to known scheduler limitation)
  • test/session.spec.js session lifecycle/error paths continue to pass.

How to Run

cd test
SERVER_URL="https://127.0.0.1:<port>" \
CERT_HASH="<cert-hash>" \
npx mocha --timeout 30000 './race-conditions.spec.js' './session.spec.js'

this.sessionController[i].close() // inform the controller, that we are closing
} catch {
// Controller already closed - expected race condition
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will not allow this change. As it will just hide problems.

} catch {
// Controller closed, session arrived too late - expected race condition
log('session dropped, session controller already closed')
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a proper check. It just hides problems

} catch {
// Controller already errored/closed - expected during cleanup
}
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. This just hide problems. I am happy to fix problems, but not by just hiding the problems.

if (this.state === 'closed' || this.state === 'failed') {
log('stream dropped, session in wrong state')
return
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will cause a memory leak.

}
} catch {
// Stream controller closed, stream arrived too late - expected race condition
log('stream dropped, stream controller already closed')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this is not a proper check, it just hides problems.

} catch {
// Stream closed, datagram arrived too late - expected race condition with UDP
log('datagram dropped, stream controller already closed (byob)')
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above.

} catch {
// Stream closed, datagram arrived too late - expected race condition with UDP
log('datagram dropped, stream controller already closed')
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this just hides problems.

}
} catch {
// Stream closed between check and operation - expected race condition
log('commitReadBuffer: stream controller already closed')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this just hides problems.

} catch {
// Stream already closed - expected race condition
log('commitReadBuffer: stream controller already closed during fin')
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this just hides problems.

)
} catch {
// Controller already errored/closed - expected race condition
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

}
},
"main/node_modules/eslint": {
"version": "9.3.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is package.lock changed?

@@ -0,0 +1,279 @@
/* eslint-env mocha */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change the PR to just include the test?
Then I can fix the issues in a way, that does not hide other problems?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will modify myself. But most of the error do not trigger the error. At least on my machine. This can happen on race conditions, but then the tests are not functional.

@martenrichter
Copy link
Member

Thank you for testing, code changes and related work!
However, most changes just catch errors.
I do not consider this the proper way to fix issues.
Though your test is very valuable. I think just the test will do, and then I can try to include proper fixes. But it may be that I just throw other errors in some occasions (e.g. if not the internal object is calling session).

@martenrichter
Copy link
Member

Change will be land via:
#438
I hope I catched the causes of the crashes. (I can only hope as the automated test did not fail). If an issue persists please report again, perferably with a test that can reproduce the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants