@ProjectsByJackHe
Contributor

ProjectsByJackHe commented Dec 9, 2025

Description

As discussed in issue #5491, the logs show that the watchdog assert is firing.
For now, let's increase the timeout by 100%.

Testing

CI

Documentation

N/A

ProjectsByJackHe requested a review from a team as a code owner December 9, 2025 21:17
ProjectsByJackHe changed the title from "increase watchdog timeout" to "increase spinquic watchdog timeout" Dec 9, 2025
@codecov

codecov bot commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.64%. Comparing base (4e84609) to head (18ddaa0).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5647      +/-   ##
==========================================
- Coverage   86.34%   85.64%   -0.71%     
==========================================
  Files          60       60              
  Lines       18663    18663              
==========================================
- Hits        16114    15983     -131     
- Misses       2549     2680     +131     

@guhetier
Collaborator

guhetier commented Dec 9, 2025

I am not that familiar with the spin test, but isn't this only going to make the spin test run for a longer time?
From a quick look at the sources, the time you changed controls the time spent spinning, and there is a WATCHDOG_WIGGLE_ROOM that gives the watchdog a bit of extra time.
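
To illustrate the point above, here is a minimal sketch. It assumes the watchdog deadline is the configured spin time plus WATCHDOG_WIGGLE_ROOM; the constant's value and the helper names below are hypothetical and are not taken from the SpinQuic sources.

```cpp
#include <chrono>

using std::chrono::milliseconds;

// Hypothetical slack value; the real WATCHDOG_WIGGLE_ROOM in SpinQuic may differ.
constexpr milliseconds kWatchdogWiggleRoom{2000};

// Assumed model: the watchdog deadline is the requested spin time plus the slack.
constexpr milliseconds WatchdogDeadline(milliseconds spinDuration) {
    return spinDuration + kWatchdogWiggleRoom;
}

// The watchdog fires only if the whole run overshoots the deadline.
constexpr bool WatchdogFires(milliseconds spinDuration, milliseconds actualRunTime) {
    return actualRunTime > WatchdogDeadline(spinDuration);
}

// Time left for teardown once spinning stops.
constexpr milliseconds TeardownMargin(milliseconds spinDuration) {
    return WatchdogDeadline(spinDuration) - spinDuration;
}

// Doubling the spin time lengthens the test but leaves the margin unchanged.
static_assert(TeardownMargin(milliseconds{10000}) == TeardownMargin(milliseconds{20000}),
              "margin does not depend on spin duration");
```

Under this model, only a larger wiggle room gives pending work more time to finish before the assert; a longer spin duration just makes the test run longer.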

@ProjectsByJackHe
Contributor Author

> I am not that familiar with the spin test, but isn't this only going to make the spin test run for a longer time? From a quick look at the sources, the time you changed controls the time spent spinning, and there is a WATCHDOG_WIGGLE_ROOM that gives the watchdog a bit of extra time.

Yes! Good catch.

@guhetier
Collaborator

Did you investigate, based on the traces, what was pending when the timeout fired? 2 or 3 seconds is already quite a lot. It is possible something was delayed on a slow VM, but it is also possible that a soft lock or deadlock was happening in MsQuic.

@ProjectsByJackHe
Contributor Author

> Did you investigate, based on the traces, what was pending when the timeout fired? 2 or 3 seconds is already quite a lot. It is possible something was delayed on a slow VM, but it is also possible that a soft lock or deadlock was happening in MsQuic.

Based on the ETL trace from the link I added in the issue, I couldn't find any deadlocks. There are comments in SpinQuic itself noting that certain code paths will lead to deadlocks, but those paths are all disabled.

@guhetier
Collaborator

OK. This might help, but I suspect going from 2 sec to 3 sec won't be a definitive fix.
We should make sure dumps are collected so that next time we can check the state of the pending threads.

3 participants