Memory leak/OutOfMemoryError when using 11.0.23+9 IBM Semeru Runtime Certified Edition for z/OS #20043
Comments
I would recommend opening a Salesforce case here: https://w3.ibm.com/w3publisher/ibm_runtime_technologies/support/reporting-java-issues The following link explains how to collect MustGather data: https://www.ibm.com/support/pages/mustgather-read-first-runtimes-java-technology @manqingl FYI |
If for some reason you can't or don't want to open a case with Service, you can collect system cores, javacores, and verbose GC logs from both the Java 11 and Java 8 runs and provide them to us directly (myself or @pshipton). Files for working case Java 8 can be collected at |
BTW, does it work with a slightly larger -Xmx (stabilizing there), or is there an out-of-control memory leak? |
I'll look into the Sales Force case. BTW, https://w3.ibm.com/w3publisher/ibm_runtime_technologies/support/reporting-java-issues fails with message: "Hmm. We’re having trouble finding that site." The MustGather link is okay. It is an out-of-control memory leak. It runs about twice as long if I double the heap. |
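One way to tell a footprint that stabilizes from an out-of-control leak is to sample the used heap after a requested GC and check whether it keeps climbing. The following is a minimal sketch, not something taken from this issue: the class name `HeapTrend`, the sample count, and the "strictly increasing" rule are all illustrative assumptions, and the sampling would have to run inside the affected JVM to be meaningful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper: samples used heap and applies a crude growth check.
final class HeapTrend {

    // Returns true only when every sample is strictly larger than the one before it.
    // This is a rough heuristic (an assumption), not a leak detector.
    static boolean isSteadilyGrowing(long[] usedBytes) {
        if (usedBytes.length < 2) {
            return false; // not enough data to call it a trend
        }
        for (int i = 1; i < usedBytes.length; i++) {
            if (usedBytes[i] <= usedBytes[i - 1]) {
                return false;
            }
        }
        return true;
    }

    // Requests a GC (the JVM may ignore the request) and reads the used heap size.
    static long usedHeapAfterGc() {
        System.gc();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        int sampleCount = 5;          // assumed value
        long intervalMillis = 1000L;  // assumed value
        long[] samples = new long[sampleCount];
        for (int i = 0; i < sampleCount; i++) {
            samples[i] = usedHeapAfterGc();
            System.out.println("used heap bytes: " + samples[i]);
            Thread.sleep(intervalMillis);
        }
        System.out.println("steadily growing: " + isSteadilyGrowing(samples));
    }
}
```

If the used heap after GC rises on every sample regardless of -Xmx, that is consistent with the "runs about twice as long with double the heap" observation. Verbose GC logs remain the more reliable source for this.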
@ForceRs I just opened it (https://w3.ibm.com/w3publisher/ibm_runtime_technologies/support/reporting-java-issues) and it was fine. Are you an IBM internal? |
No. Not IBM internal. We're a vendor and IBM Business Partner. |
I tried to go to https://w3.ibm.com/w3publisher/ibm_runtime_technologies/support/reporting-java-issues from all my browsers, both from work and home PCs, but all failed (even just https://w3.ibm.com fails). I also tried to access it from my phone with no Wi-Fi connection (thinking some kind of filtering was preventing the connection), but no go. I'll send the files to dmitripivkine, as suggested on Aug 22, 2024, 12:41 PM. |
Based on the call site information, this appears to be related to JNI local references (particularly JNI reference frames). I can see no reason why this would differ between JDK levels. |
I am actively gathering the requested information. We hit a speed bump with Transaction dump (TDump) generation. Working with our Systems Programmer to resolve it. I hope to provide files 26Aug2024. |
dmitripivkine, I have the requested diagnostic files. I've zipped them up into a 397MB file. How do I privately share the file with you? I'd attach it here, but it's large and there's some SDSF output we'd rather not make public. |
You can use any file sharing service and send me the link by email at [email protected]. Alternatively, if you use Slack, you can try to find me in the Eclipse OpenJ9 workspace and share the file there (I am not sure about the size limits). |
Email with download instructions sent. |
@joransiu fyi |
This is a heap space out of memory. Have you tried looking at the heapdump files with MAT (https://eclipse.dev/mat/) to see what's consuming the heap and growing? |
I used the IBM Heap Analyzer. It didn't report any leaks. It did warn about many root objects. |
jnicsup.cpp:1964 is jniLocalReferences. From EOJ-OutOfMemory/javacore.20240826.094436.16908590.0015.txt
|
That's the question... Why does this happen with Java 11 and not Java 8? The Java 8 javacore files show no such issue (Java8\Snap-after 20 minutes\javacore.20240826.124744.16908357.0015.txt): The test performed for both Java 8 and Java 11 is identical. |
Probably not it, but to rule it out you can try running Java 11 with Another difference I noticed looking at https://www.ibm.com/support/pages/semeru-runtimes-migration-guide |
I'll add -Xshareclasses:none and test. |
See https://eclipse.dev/openj9/docs/djavalangstringsubstringnocopy/ |
Just curious, is the Java 8 that you are using to test also 64-bit? |
Yes, Java 8 is also 64-bit. I'll post the exact version tomorrow morning. |
It's 8.0.7.20, it's in the diagnostic files we have. You could try a later version of IBM Java 8, such as 8.0.8.25, which |
Will do. |
pshipton: I ran with -Xshareclasses:none, but as you suspected, it failed with OutOfMemoryError. joransiu: My current Java 8 version is: Items to-do and report back on:
|
Running with my current (SR7 FP20) Java 8 version with -Djava.lang.string.substring.nocopy=true did not cause an OutOfMemoryError. |
We installed the latest Java 8. Here's the full version: I ran my standard test using it and encountered no issues. I think I've tried all the recommendations. What do you recommend next? |
Are you able to open a case with IBM support? |
I can open one, but my experience with IBM and Java bugs is not good. I opened one on November 16, 2018, and we still haven't gotten a fix. Is the jnicsup.cpp:1964 a clue? I ran with -Xcheck:jni and it didn't show any errors. But why is jnicsup.cpp:1964 so high? Or am I reading that wrong? |
It indicates there are a lot of JNI local references, but I don't know why that is. I assume you have some JNI code; if you want to share it, I can take a look to see if there is anything obvious. Except for (1) you tried mismatched versions of jdk8/jdk11, and (2) differences to support the jdk11 spec, the Virtual Machine/JNI implementation/Enable3164Interoperability (OpenJ9) is mostly the same between jdk8 and jdk11. Maybe the difference has something to do with modularity, such as new modules being continuously created each time you enter Java? |
I made an attempt to open your system core file with MAT, but it didn't open and I'm not sure how to fix it. MAT on a system core enables looking at the content of objects/char[]/byte[]. If we can figure out what the JNI local references are, that would help figure out where they come from. I think your best bet is to open an IBM Support case, they will know how to get MAT working, and have more experience looking at this type of problem. |
@ForceRs : jnicsup.cpp:1964 is a clue. It means the application has lots of JNI local references. The next question would be who was requesting all those JNI local references. Please open a SalesForce case and let me know the case number. I will make sure someone takes care of it. |
@manqingl: Will do. Going to re-inspect all JNI-related code first. |
@pshipton : You can open the provided heap dump (*.phd) files in Eclipse using MAT after installing the IBM DTFJ feature for IBM dumps. |
I installed the IBM feature, but still got an error opening the core. It could be the dtfj component is out of date. I expect service will figure it out. jmap/OpenJ9 doesn't support hprof. |
The Customer has created Case TS017177781 for this issue. |
For the record, I was trying to use the Eclipse MAT, but we need to use MAT from https://www.ibm.com/support/pages/eclipse-memory-analyzer-tool-dtfj-and-ibm-extensions |
FYI : we have identified a bug in the implementation of JNI API IBMZOS_NewStringPlatform where the references are not getting cleared before returning back to the caller in Semeru 11 but are getting cleared in 8. The JCL team is preparing a test patch to verify the fix. @yathamravali FYI. |
@pshipton , There is an issue with MAT (on Citrix). Please see my (internal) update in the Case describing how to use the fix for this. Also I've updated the Case (11th September) with the most common char[] contents, which match the string that the Customer later told me they used to create the OOM with their testcase. |
IBM has resolved this issue. It will be in the November 2024 release. See APAR. |
java version "11.0.23" 2024-04-16
IBM Semeru Runtime Certified Edition for z/OS 11.0.23.0 (build 11.0.23+9)
IBM J9 VM 11.0.23.0 (build z/OS-Release-11.0.23.0-b02, JRE 11 z/OS s390x-64-Bit Compressed References 20240528_334 (JIT enabled, AOT enabled)
OpenJ9 - aa39565b36f
OMR - e4ae704bb5e
IBM - 3c87141
JCL - b8bbe79f173 based on jdk-11.0.23+9)
Background
We have a small Java application that interacts with a Derby DB. Basically, we have some HLASM that interacts with COBOL; the COBOL interacts with Java via JNI. The COBOL JNI is native; that is, it calls CallStaticBooleanMethod, not INVOKE. One thing that may be considered unusual is that we use 64-bit Java to interact with 31-bit COBOL via -XX:+Enable3164Interoperability.
Summary of problem
When running our small Java application, it runs out of memory after about 15 minutes. This happens in Java 11, not Java 8. I'm unable to test Java 17, as our z/OS is too old to support it. I say small Java application because it requires a tiny heap of -Xms4m -Xmx8m. I am running with REGION=0M; thus, MEMLIMIT should not be a factor. Using the IBM HeapAnalyzer, I see normal stuff and then about 70,000 root objects, mostly of type byte[] and char[].
Diagnostics
I saw a somewhat similar issue posted here and followed its advice to provide some initial diagnostics. I can reproduce this issue at will, so further diagnostics can be provided.
I suspected a JNI issue, so I first ran with -Xcheck:jni. This produced no output other than the confirmation line proving it was accepted.
I next ran using -Xcheck:memory:quick,noscan,callsite=500. The file for this output is sdsf.output.callsite-J11.txt. If I read the file correctly, and I may not be, then it seems to indicate a leak in jnicsup.cpp at line 1964. Each line below is from a different point in time. The first line is the earliest sample, the last line is the final sample, and the middle line is from an arbitrary time in between.
File sdsf.output.callsite-J8.txt contains output for the exact same test run using Java 8 with -Xcheck:memory:quick,noscan,callsite=500. That run appears to be fine. sdsf.output.callsite-J8.txt's line wrapping is kind of messed up -- sorry.
One other thing to note is that two JVMs are spawned in our use case. One JVM is relatively idle. The other is busy. Since we only provide one set of exports (i.e., COBJVMINITOPTIONS=-Xms4m -Xmx8m -Xquickstart -XX:+Enable3164Interoperability -Xcheck:memory:quick,noscan,callsite=500), both the idle and the busy JVM produce diagnostics. If you look at just the output for jnicsup.cpp:1964, you'll see that it grows, but then appears to drop suddenly; that is not the case. What you're seeing is the idle JVM interspersed with the busy JVM.
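Since both JVMs share one set of options and their diagnostics are interleaved, it can help to have each JVM print an identifying line early on. The following is a minimal sketch, assuming the application can call a small helper at startup. The class name `JvmTag` and the output format are hypothetical, and this would not label the -Xcheck:memory callsite lines themselves.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper: prints one line identifying this JVM and its heap ceiling.
final class JvmTag {

    // Builds the identifying line. RuntimeMXBean.getName() often looks like
    // "pid@host", but that format is implementation-specific, so treat it as opaque.
    static String describe() {
        String jvmName = ManagementFactory.getRuntimeMXBean().getName();
        long maxHeapBytes = Runtime.getRuntime().maxMemory(); // reflects the effective -Xmx
        return "jvm=" + jvmName + " maxHeapBytes=" + maxHeapBytes;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Both calls are available on Java 8 and Java 11, so the same helper could be used in both test runs.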