Skip to content

fix: socket connection timeout #53

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 2, 2025
Merged

fix: socket connection timeout #53

merged 2 commits into from
Apr 2, 2025

Conversation

fioan89
Copy link
Collaborator

@fioan89 fioan89 commented Mar 31, 2025

Context:

  • okhttp uses an HTTP/2 connection to the coder rest api in order to resolves the workspaces.
  • HTTP/2 uses a single TCP connection for multiple requests (multiplexing). If the connection is idle, the http server can close that connection, with client side ending in a socket timeout if it doesn't detect the drop in time.
  • similarly on the client side, if the OS goes into sleep mode, the connection might have been interrupted. HTTP/2 doesn't always detect this quickly, leading to stale streams when Toolbox wakes up.

Implementation:

  • we could try to force the client to use HTTP/1 which creates a TCP connection for each request, but from my testing it seems that configuring a retry strategy when a client attempts to reuse a TCP connection that has unexpectedly closed plus detecting large gaps between the last poll time and socket timeout time allows us to reset the client and create fresh TCP connections.

  • resolves Timeout error after Toolbox remained opened overnight #13

Context:
- okhttp uses an HTTP/2 connection to the coder rest api in order to resolves
  the workspaces.
- HTTP/2 uses a single TCP connection for multiple requests (multiplexing).
  If the connection is idle, the http server can close that connection, with client side ending in a socket timeout if it doesn't detect the drop in time.
- similarly on the client side, if the OS goes into sleep mode, the
  connection might have been interrupted. HTTP/2 doesn't always detect this
  quickly, leading to stale streams when Toolbox wakes up.

Implementation:

- we could try to force the client to use HTTP/1 which creates a TCP
  connection for each request, but from my testing it seems that configuring
  a retry strategy when a client attempts to reuse a TCP connection that
  has unexpectedly closed does the job.

- resolves #13
@fioan89 fioan89 requested review from matifali and bcpeinhardt March 31, 2025 22:36
@matifali matifali requested a review from f0ssel April 1, 2025 06:34
- retryOnConnectionFailure seems to not be enough there are some residual socket timeouts that can still be reproduced
- with this patch we detect if the os went to sleep by measuring large gaps between the last poll time and the socket timeout exception.
- if sleep is detected we reset the OkHttp client and establish a fresh TCP connection.
@fioan89 fioan89 marked this pull request as ready for review April 2, 2025 13:38
@fioan89 fioan89 merged commit 4ca9190 into main Apr 2, 2025
5 checks passed
@fioan89 fioan89 deleted the fix-http-socket-timeout branch April 2, 2025 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Timeout error after Toolbox remained opened overnight
2 participants