API crashes on etag calculation during connection reset #1660

Open
CharlieC3 opened this issue May 16, 2023 · 3 comments

@CharlieC3
Member

CharlieC3 commented May 16, 2023

Describe the bug
The API crashes if a connection is terminated when an Etag is being calculated.

Logs attached:
Explore-logs-05_15_2023, 11_51_32 PM.txt

API Version: 7.1.10

@github-project-automation github-project-automation bot moved this to Recent issues in API Board May 16, 2023
@zone117x
Member

The "Unable to calculate transaction ETag" log message appears to be a red herring. The error handling paths for that are solid: I simulated throwing the same error there and it didn't crash the API. More likely, some other pg interaction isn't handling the ECONNRESET error correctly and is crashing the process. The challenging part is that the stack traces in the error log are very short -- they don't show where in the application code this is happening:

Error: read ECONNRESET
    at TCP.onStreamRead (node:internal/stream_base_commons:217:20)
    at TCP.callbackTrampoline (node:internal/async_hooks:130:17)
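
For context on why a trace like this stops at Node internals: an Error captures its stack where it is created, and this one is created inside Node's socket read callback, so no application frames are on it. One way such an error can take the process down (an assumption here, not something confirmed in this thread) is an emitter with no 'error' listener, since Node throws when 'error' is emitted and nobody is listening. A minimal sketch, using a plain EventEmitter as a stand-in for the socket; `makeConnReset` and `emitReset` are hypothetical helper names:

```typescript
import { EventEmitter } from 'node:events';

// Build an error shaped like the one in the log (hypothetical helper).
function makeConnReset(): Error & { code: string; syscall: string } {
  return Object.assign(new Error('read ECONNRESET'), {
    code: 'ECONNRESET',
    syscall: 'read',
  });
}

// Emit the error on a socket-like emitter and report whether it was
// delivered to a listener or thrown back at the caller.
function emitReset(socket: EventEmitter): 'handled' | 'thrown' {
  try {
    socket.emit('error', makeConnReset());
    return 'handled';
  } catch {
    return 'thrown';
  }
}

// No 'error' listener: emit() throws the error itself.
const unguarded = new EventEmitter();

// With an 'error' listener: the error is delivered and nothing is thrown.
const guarded = new EventEmitter();
guarded.on('error', () => {
  // a real handler would log and release or reconnect here
});

console.log(emitReset(unguarded), emitReset(guarded));
```

In a real server the emit happens from an I/O callback rather than inside a try/catch, so the throw would surface as an uncaught exception instead of being caught as it is in this sketch.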

Also, we do have ECONNRESET errors covered in the general postgres error handler:

} else if (error.code === 'CONNECTION_CLOSED') {

@CharlieC3 has this happened more than once, and/or are you able to reproduce it? Otherwise I think we'd need to test manually by injecting this error at the pg lib level, then exercising a bunch of calls to see if/what causes the crash.
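
A rough sketch of what that kind of manual injection could look like. Nothing here is the API's actual code or the pg lib's interface: the `Query` type, `failingQuery`, the two handlers, and `findLeaks` are all hypothetical stand-ins. The idea is to hand each call path a query function that always fails like a reset connection, then record which paths let the rejection escape:

```typescript
// Hypothetical shape of a query function; the real pg lib interface differs.
type Query = (sql: string) => Promise<unknown>;
type Handler = (query: Query) => Promise<unknown>;

// A query function that always fails the way a reset connection would.
const failingQuery: Query = async () => {
  throw Object.assign(new Error('read ECONNRESET'), { code: 'ECONNRESET' });
};

// Hypothetical call path that catches the failure and degrades gracefully.
async function handlerWithCatch(query: Query): Promise<string> {
  try {
    await query('SELECT 1');
    return 'ok';
  } catch {
    return 'error response';
  }
}

// Hypothetical call path that lets the rejection escape.
async function handlerWithoutCatch(query: Query): Promise<string> {
  await query('SELECT 1');
  return 'ok';
}

// Run every handler against the failing query and collect the names of
// those that let the error propagate to the caller.
async function findLeaks(handlers: Record<string, Handler>): Promise<string[]> {
  const leaks: string[] = [];
  for (const [name, handler] of Object.entries(handlers)) {
    try {
      await handler(failingQuery);
    } catch {
      leaks.push(name);
    }
  }
  return leaks;
}

findLeaks({ handlerWithCatch, handlerWithoutCatch }).then((leaks) => {
  console.log('handlers that leak the error:', leaks);
});
```

A path that shows up in the leak list is only a candidate: whether it actually crashes the process depends on whether anything further up awaits or catches the promise.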

@zone117x
Member

zone117x commented May 16, 2023

I'm also curious: could you include more logs from before the exit? I wonder if this lines up with the recent re-enabling of socket-io. Perhaps the bug is in that area. I've scanned through the pg queries performed by socket-io related code and nothing immediately stood out.

@CharlieC3
Member Author

@zone117x According to our logs this is a repeat occurrence and is becoming more frequent. Here's one example, and one more.
I do see some errors with the proxy server right before this etag error appears which I didn't notice before. It's possible this is the root cause and the etag error is a result of that.

@smcclellan smcclellan moved this from Recent issues to Backlog in API Board May 18, 2023