Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change return type of getStringUTF8Length #21005

Merged
merged 2 commits into from
Feb 18, 2025

Conversation

hzongaro
Copy link
Member

The return type of the JVM's getStringUTF8Length method has changed from IDATA to UDATA. This change adjusts the JIT compiler's uses of that method. In particular, the return type of TR_FrontEnd::getStringUTF8Length and its overriding methods change from intptr_t to int32_t. Similarly, the bufferSize argument of TR_FrontEnd::getStringUTF8 becomes uintptr_t where it was int32_t.

The motivation was that the length of a UTF-8 encoded String could be greater than 232 bytes, so the length could overflow intptr_t on a 32-bit platform. All uses of getStringUTF8Length in the JIT involve signatures, descriptors, method names and class names, which will never be large enough to exceed the range of the int32_t type. This was something that @jpmpapin was good enough to verify. Just to be cautious, however, this change includes an assertion test that the computed length is in range for the int32_t type.

This change also introduces a method, getStringUTF8UnabbreviatedLength, that returns a length value of type uint64_t if the JIT compiler ever needs to determine the UTF-8 encoded length of an arbitrary String. The method is currently unused.

In addition, String Builder Transformer uses the result of getStringUTF8Length to estimate the StringBuilder buffer size needed to accommodate appending a constant String to a StringBuilder. That could overestimate the space required. This has been changed to use getStringLength instead, to use the actual lengths of constant String objects. A test has also been added to detect integer overflow of the capacity estimate, aborting the transformation, as StringBuilder.<init>
will throw a NegativeArraySizeException if the specified capacity is negative.

This pull request requires a coordinated merge with OMR pull request eclipse-omr/omr#7620

@hzongaro hzongaro added comp:jit depends:omr Pull request is dependent on a corresponding change in OMR labels Jan 24, 2025
@hzongaro hzongaro requested a review from jdmpapin January 24, 2025 14:58
@hzongaro hzongaro requested a review from dsouzai as a code owner January 24, 2025 14:58
@hzongaro
Copy link
Member Author

@jdmpapin, thank you for your off-line review of an earlier version of these changes. May I ask you to review this updated version?

@jdmpapin jdmpapin changed the title Change return type of Change return type of getStringUTF8Length Jan 24, 2025
@hzongaro
Copy link
Member Author

hzongaro commented Feb 3, 2025

Thanks for your review, @jdmpapin. I have addressed your review comments. May I ask you to rereview? Once you are OK with the changes, I will squash the extra commits.

Copy link
Contributor

@jdmpapin jdmpapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please squash

@hzongaro hzongaro force-pushed the long-utf8-string-length branch from a4974d4 to fb3e0f2 Compare February 5, 2025 14:40
@hzongaro
Copy link
Member Author

hzongaro commented Feb 5, 2025

Thanks, @jdmpapin! I have squashed the commits.

@jdmpapin
Copy link
Contributor

jdmpapin commented Feb 5, 2025

Jenkins test sanity.functional,sanity.openjdk all jdk8,jdk11,jdk17,jdk21 depends eclipse-omr/omr#7620

@hzongaro
Copy link
Member Author

hzongaro commented Feb 7, 2025

It looks like there is a known problem — issue #21066 — affecting JITServer tests. I will rerun PR testing once that problem has been resolved.

@jdmpapin
Copy link
Contributor

#21066 was closed with a fix this morning

Jenkins test sanity.functional,sanity.openjdk all jdk8,jdk11,jdk17,jdk21 depends eclipse-omr/omr#7620

@jdmpapin
Copy link
Contributor

Jenkins test sanity.functional,sanity.openjdk aix jdk11 depends eclipse-omr/omr#7620

@jdmpapin
Copy link
Contributor

Jenkins test sanity.functional,sanity.openjdk aix jdk11 depends eclipse-omr/omr#7620

@jdmpapin
Copy link
Contributor

Jenkins test sanity.openjdk aix jdk21 depends eclipse-omr/omr#7620

@jdmpapin
Copy link
Contributor

The AccessDeniedException on AIX keeps reoccurring, so I'm going to start ignoring those without restarts

@jdmpapin
Copy link
Contributor

The PPC64LE Linux JDK11 sanity.openjdk failure is #13756

@jdmpapin
Copy link
Contributor

This Mac test job failed to fetch origin/pr/21005/merge

Jenkins test sanity.functional xmac jdk21 depends eclipse-omr/omr#7620

@jdmpapin
Copy link
Contributor

The PPC64LE Linux JDK17 sanity.functional failure is #17474

@jdmpapin
Copy link
Contributor

Some of the AIX jobs failed with GitException instead of AccessDeniedException, but that's still infrastructure

The x86-64 Windows JDK11 sanity.functional job also hit AccessDeniedException, but first both test lists passed

The AIX JDK17 sanity.openjdk failure is #19962

The x86-32 Windows JDK8 sanity.functional failures are #19678 and #19408

The x86-64 Windows JDK21 sanity.openjdk job (reported here as still pending) had its testList_0 cancelled. It looks like it was still making progress but just running very slowly, which can happen sometimes in CI. There were instances of #21000 and #18463 in testList_2

The x86-64 Windows JDK11 sanity.openjdk job (reported here as still pending) similarly had its testList_1 cancelled. I don't see any real failures

@jdmpapin
Copy link
Contributor

The tests have finished and all failures are accounted for. @hzongaro, could you please rebase to fix the conflict?

String Builder Transformer uses the result of getStringUTF8Length
to estimate the StringBuilder buffer size needed to accommodate
appending a constant String to a StringBuilder.  That could
overestimate the space required.  This has been changed to use
getStringLength instead, to use the actual lengths of constant String
objects.  A test has also been added to detect integer overflow of the
capacity estimate, aborting the transformation, as StringBuilder.<init>
will throw a NegativeArraySizeException if the specified capacity is
negative.

Signed-off-by:  Henry Zongaro <[email protected]>
The return type of the JVM's getStringUTF8Length method has changed from
IDATA to UDATA.  This change adjusts the JIT compiler's uses of that
method.  In particular, the return type of
TR_FrontEnd::getStringUTF8Length and its overriding methods changes from
intptr_t to int32_t.  Similarly, the bufferSize argument of
TR_FrontEnd::getStringUTF8 becomes uintptr_t where it was int32_t.

The motivation was that the length of a UTF-8 encoded String could be
greater than 2^32 bytes, so the length could overflow on a 32-bit
platform.  All uses of getStringUTF8Length in the JIT involve
signatures, descriptors, method names and class names, which will never
be large enough to exceed the range of the int32_t type.  Just to be
cautious, however, this change includes an assertion test that the
computed length is in range for the int32_t type, allowing for a
maximum length of 2^31-2.  That ensures that any code that then uses
that length to allocate a buffer to contain the encoded String with a
NUL terminator will not overflow a 32-bit signed integer representation
for the length plus the NUL byte.

This change also introduces a method, getStringUTF8UnabbreviatedLength,
that returns a length value of type uint64_t if the JIT compiler ever
needs to determine the UTF-8 encoded length of an arbitrary String.  The
method is currently unused.

Finally, this change removes the VM_getStringUTF8Length JITServer
message type, which is never used.

Signed-off-by:  Henry Zongaro <[email protected]>
@hzongaro hzongaro force-pushed the long-utf8-string-length branch from fb3e0f2 to 998b809 Compare February 11, 2025 18:20
@hzongaro
Copy link
Member Author

The tests have finished and all failures are accounted for. @hzongaro, could you please rebase to fix the conflict?

@jdmpapin, thank you for investigating all those failures! Rebased.

@jdmpapin
Copy link
Contributor

The diffs are identical except for the update to CommunicationStream::MINOR_NUMBER. Doing just one cursory build after the rebase

Jenkins compile xlinux jdk17 depends eclipse-omr/omr#7620

@jdmpapin
Copy link
Contributor

Jenkins compile xlinux jdk17 depends eclipse-omr/omr#7620

@jdmpapin
Copy link
Contributor

jdmpapin commented Feb 12, 2025

The build segfaulted in the sampler thread 😮‍💨 😞

I really can't imagine it has anything to do with these changes. It doesn't seem to be a known issue, but it happened twice in a row, so I'm sure it will be one in no time...

@BradleyWood
Copy link
Member

@jdmpapin My local builds have been segfaulting with the same backtrace on the latest OMR/OpenJ9 changes. Not sure exactly what the issue is but its from the last ~20 commits.

@jdmpapin
Copy link
Contributor

jdmpapin commented Feb 14, 2025

Waiting for OMR to promote now that the build should be fixed by eclipse-omr/omr#7659

@jdmpapin
Copy link
Contributor

Repeating the cursory build

Jenkins compile xlinux jdk17 depends eclipse-omr/omr#7620

@jdmpapin jdmpapin merged commit a549556 into eclipse-openj9:master Feb 18, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:jit depends:omr Pull request is dependent on a corresponding change in OMR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants