Support JFR emergency dumps on out of memory #10600
Comments
Hi @christianhaeubl, do you foresee any issues this could run into?
I think you already summarized it nicely. The JFR implementation already avoids Java heap allocations most of the time but I am sure that you will encounter a few allocations that are problematic for the emergency dump (i.e., a bit more code needs to be |
Hi @christianhaeubl! Iterating the chunk repository directory and copying all the chunk file data to an emergency dump snapshot involves a lot of file I/O and other operations that are probably simpler to do in C. I think using Java would be possible, but it would make avoiding the Java heap a bit more complicated. Is there a preference for sticking to Java, or is using native code okay in cases like this?
We try to stick with Java, unless there is a clear benefit of using C code. Here are two examples:
If you are unsure, feel free to point me to the C implementation in HotSpot. Then, I can estimate more easily which approach is better.
Thanks @christianhaeubl. Then I will try to finish the implementation in Java first. In HotSpot, the process starts in |
Is your feature request related to a problem? Please describe.
Currently, it's possible to get a heap dump on out of memory (OOM), but there is no equivalent for JFR yet. OpenJDK has this feature and we should try to implement it in Native Image as well. One of JFR's primary goals is to provide insight in the event of a crash such as an OutOfMemoryError (OOME).
Describe the solution you'd like.
The JFR implementation in Native Image should support emergency dumping like in OpenJDK.
Describe who do you think will benefit the most.
GraalVM users would benefit the most. Heap dumps are probably the most important report in the event of OOM, but insights from JFR could also be very beneficial. For example, JFR's CPU and allocation profiling can help locate problem areas, and its garbage collection events and thread data can help with diagnosing problems.
Describe alternatives you've considered.
The alternative is just leaving this feature unimplemented.
Express whether you'd like to help contributing this feature
I can help contribute this.
Update #5410 when completed
Implementation details
Doing an emergency dump would require:
In order to do this, JFR flushing will have to be made fully allocation free (most of it already is). A handful of places like the JfrTypeRepository will need to be redone.
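To illustrate the chunk-copying step raised in the comments (iterating the chunk repository directory and copying all chunk file data into one emergency dump file), here is a minimal sketch in plain Java. It is not the Native Image or HotSpot implementation. The class name EmergencyDumpSketch, the method concatenateChunks, the "*.jfr" file filter, and ordering chunks by file name are all assumptions made for illustration. It also uses ordinary java.nio calls, which do allocate on the Java heap, so it only shows the data flow and not the allocation-free version the issue asks for. The one related idea it does show is allocating the copy buffer up front rather than during the dump.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: names and structure are illustrative, not taken from GraalVM.
public class EmergencyDumpSketch {

    // Off-heap buffer allocated once at class initialization, so the copy loop
    // does not need to allocate a new buffer while memory is exhausted.
    private static final ByteBuffer COPY_BUFFER = ByteBuffer.allocateDirect(64 * 1024);

    /**
     * Appends every "*.jfr" file found in the repository directory to dumpFile
     * and returns the number of bytes written. Synchronized because the copy
     * buffer is shared. dumpFile should live outside the repository directory
     * so that it is not picked up as a chunk itself.
     */
    public static synchronized long concatenateChunks(Path repository, Path dumpFile)
            throws IOException {
        // Collect chunk files. Note: this list and the directory stream DO
        // allocate on the Java heap; a real emergency dump would have to avoid that.
        List<Path> chunks = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(repository, "*.jfr")) {
            for (Path chunk : stream) {
                chunks.add(chunk);
            }
        }
        // Assumption: file-name order matches the order in which chunks were written.
        Collections.sort(chunks);

        long totalBytes = 0;
        try (FileChannel out = FileChannel.open(dumpFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (Path chunk : chunks) {
                try (FileChannel in = FileChannel.open(chunk, StandardOpenOption.READ)) {
                    COPY_BUFFER.clear();
                    // read() returns -1 once the end of the chunk file is reached.
                    while (in.read(COPY_BUFFER) != -1) {
                        COPY_BUFFER.flip();
                        while (COPY_BUFFER.hasRemaining()) {
                            totalBytes += out.write(COPY_BUFFER);
                        }
                        COPY_BUFFER.clear();
                    }
                }
            }
        }
        return totalBytes;
    }
}
```

A call such as EmergencyDumpSketch.concatenateChunks(repositoryDir, dumpPath) would then produce a single file containing the chunk data back to back. The heap-allocating parts (the directory listing and the list of paths) are exactly the kind of code that would need to be reworked for a real emergency dump.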