Bug: Timeouts when loading up jupyterhub singleuser images #3686
Comments
Next time this bug happens, follow this thread and enable debug logging. For now, try increasing the timeouts.
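For reference, a rough sketch of what "enable debug logging and increase timeouts" could look like as Zero to JupyterHub Helm values, built here as a Python dict so it can be dumped to a file. `debug.enabled` and `singleuser.startTimeout` are options from the Zero to JupyterHub chart; the 600-second default below is an assumed value for illustration, not what this deployment actually uses.

```python
import json


def build_debug_values(start_timeout_seconds: int = 600) -> dict:
    """Return Helm values that enable debug logging and lengthen the spawn timeout.

    The 600-second default is a hypothetical value chosen for illustration.
    """
    return {
        "debug": {"enabled": True},
        "singleuser": {"startTimeout": start_timeout_seconds},
    }


if __name__ == "__main__":
    # JSON is also valid YAML, so this output can be saved and used as a values file.
    print(json.dumps(build_debug_values(), indent=2))
```

Keeping the timeout as a parameter makes it easy to try a longer value while debugging and drop it back down afterwards.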
Might be related.
https://z2jh.jupyter.org/en/stable/administrator/debug.html Try these debug commands if things fail again.
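A minimal sketch of the kind of `kubectl` commands that debug page walks through, collected in a small Python helper so they can be printed and pasted. The namespace `jhub` and the pod name `jupyter-someuser` are assumptions for illustration; substitute whatever this cluster actually uses.

```python
import shlex


def debug_commands(pod: str, namespace: str = "jhub") -> list[list[str]]:
    """Build kubectl commands for inspecting a single-user pod that is slow to start.

    `namespace` defaults to a hypothetical "jhub"; it is not taken from this deployment.
    """
    ns = f"--namespace={namespace}"
    return [
        ["kubectl", ns, "get", "pod"],                                 # overall pod status
        ["kubectl", ns, "describe", "pod", pod],                       # events, volume attach state
        ["kubectl", ns, "logs", pod],                                  # container logs
        ["kubectl", ns, "get", "events", "--sort-by=.lastTimestamp"],  # recent events in the namespace
    ]


if __name__ == "__main__":
    # "jupyter-someuser" is a placeholder pod name.
    for cmd in debug_commands("jupyter-someuser"):
        print(shlex.join(cmd))
```

The `describe pod` output is probably the most useful one here, since its events section shows volume attach and mount failures.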
I turned on debugging, then tried to start up Amanda's pod. Two new error messages appeared, perhaps because I turned on debugging, or perhaps because I increased the start timeout to a long time.
(I'm not certain I turned on debugging correctly, but it didn't throw any errors.) Notes from @themightychris:
One idea: issues with containerd. I remember we had crazy log issues with containerd.
This bug pops up a lot for @amandaha8.
https://cloudlogging.app.goo.gl/xYoHuNueYvuemQcNA The pod logs are totally spammed with these errors. The notebook container seems to spend most of its time in a state in which it cannot be force terminated, which is undoubtedly the source of the symptom. Why it is unkillable is less clear at the moment. When the notebook container does finally restart itself, it throws tons of errors from certain extensions which fail to link. At least some of the errors are related to the runtime being incompatible, which makes me wonder if there is an upgrade we can take which may also help with this issue.
I deleted amanda8's PVC. Her PV (pvc-537c2c55-3c90-44db-a36d-6bd4a4b0ebc2) should have disappeared but stayed in a "Released" state. The error killing pod messages continued, and I also started to see additional error messages. After a while, I started amanda8's profile in JupyterHub, and it attached a new volume quite fast. The KillContainerError messages haven't stopped. I was easily able to delete amanda8's newest PVC, and the related PV disappeared as well.
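To check whether other volumes are stuck the same way, something like the following could list PVs left in the "Released" phase. This is a sketch only: it assumes the input is the parsed output of `kubectl get pv -o json`, and the function name is made up for illustration.

```python
import json


def released_volumes(pv_list: dict) -> list[dict]:
    """Return a summary of PersistentVolumes whose phase is "Released".

    `pv_list` is assumed to be the parsed JSON from `kubectl get pv -o json`.
    """
    released = []
    for pv in pv_list.get("items", []):
        if pv.get("status", {}).get("phase") != "Released":
            continue
        spec = pv.get("spec", {})
        released.append({
            "name": pv["metadata"]["name"],
            # The claim the volume was last bound to, if recorded.
            "claim": spec.get("claimRef", {}).get("name"),
            # "Delete" volumes are normally cleaned up; "Retain" volumes are kept.
            "reclaim_policy": spec.get("persistentVolumeReclaimPolicy"),
        })
    return released
```

To use it, save the `kubectl get pv -o json` output to a file, load it with `json.load`, and pass the result in. A Released PV with reclaim policy `Delete` that never goes away would match what happened here.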
Describe the bug
Some users in some Kubernetes zones are having trouble spawning their pods. The spawn times out after a long time. This typically affects @amandaha8.
I don't think it's related to spinning up a new node, but rather to taking too long to attach the user's data.
To Reproduce
Only happens sometimes and goes away.
Expected behavior
Spawns GREAAATT