-
Notifications
You must be signed in to change notification settings - Fork 737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change return type of getStringUTF8Length #21005
base: master
Are you sure you want to change the base?
Conversation
@jdmpapin, thank you for your off-line review of an earlier version of these changes. May I ask you to review this updated version? |
Thanks for your review, @jdmpapin. I have addressed your review comments. May I ask you to rereview? Once you are OK with the changes, I will squash the extra commits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, please squash
String Builder Transformer uses the result of getStringUTF8Length to estimate the StringBuilder buffer size needed to accommodate appending a constant String to a StringBuilder. That could overestimate the space required. This has been changed to use getStringLength instead, to use the actual lengths of constant String objects. A test has also been added to detect integer overflow of the capacity estimate, aborting the transformation, as StringBuilder.<init> will throw a NegativeArraySizeException if the specified capacity is negative. Signed-off-by: Henry Zongaro <[email protected]>
The return type of the JVM's getStringUTF8Length method has changed from IDATA to UDATA. This change adjusts the JIT compiler's uses of that method. In particular, the return type of TR_FrontEnd::getStringUTF8Length and its overriding methods changes from intptr_t to int32_t. Similarly, the bufferSize argument of TR_FrontEnd::getStringUTF8 becomes uintptr_t where it was int32_t. The motivation was that the length of a UTF-8 encoded String could be greater than 2^32 bytes, so the length could overflow on a 32-bit platform. All uses of getStringUTF8Length in the JIT involve signatures, descriptors, method names and class names, which will never be large enough to exceed the range of the int32_t type. Just to be cautious, however, this change includes an assertion test that the computed length is in range for the int32_t type, allowing for a maximum length of 2^31-2. That ensures that any code that then uses that length to allocate a buffer to contain the encoded String with a NUL terminator will not overflow a 32-bit signed integer representation for the length plus the NUL byte. This change also introduces a method, getStringUTF8UnabbreviatedLength, that returns a length value of type uint64_t if the JIT compiler ever needs to determine the UTF-8 encoded length of an arbitrary String. The method is currently unused. Finally, this change removes the VM_getStringUTF8Length JITServer message type, which is never used. Signed-off-by: Henry Zongaro <[email protected]>
a4974d4
to
fb3e0f2
Compare
Thanks, @jdmpapin! I have squashed the commits. |
Jenkins test sanity.functional,sanity.openjdk all jdk8,jdk11,jdk17,jdk21 depends eclipse-omr/omr#7620 |
It looks like there is a known problem — issue #21066 — affecting JITServer tests. I will rerun PR testing once that problem has been resolved. |
The return type of the JVM's
getStringUTF8Length
method has changed fromIDATA
toUDATA
. This change adjusts the JIT compiler's uses of that method. In particular, the return type ofTR_FrontEnd::getStringUTF8Length
and its overriding methods change fromintptr_t
toint32_t
. Similarly, thebufferSize
argument ofTR_FrontEnd::getStringUTF8
becomesuintptr_t
where it wasint32_t
.The motivation was that the length of a UTF-8 encoded String could be greater than 232 bytes, so the length could overflow
intptr_t
on a 32-bit platform. All uses ofgetStringUTF8Length
in the JIT involve signatures, descriptors, method names and class names, which will never be large enough to exceed the range of theint32_t
type. This was something that @jpmpapin was good enough to verify. Just to be cautious, however, this change includes an assertion test that the computed length is in range for theint32_t
type.This change also introduces a method,
getStringUTF8UnabbreviatedLength
, that returns a length value of typeuint64_t
if the JIT compiler ever needs to determine the UTF-8 encoded length of an arbitrary String. The method is currently unused.In addition, String Builder Transformer uses the result of
getStringUTF8Length
to estimate theStringBuilder
buffer size needed to accommodate appending a constantString
to aStringBuilder
. That could overestimate the space required. This has been changed to usegetStringLength
instead, to use the actual lengths of constantString
objects. A test has also been added to detect integer overflow of the capacity estimate, aborting the transformation, asStringBuilder.<init>
will throw a
NegativeArraySizeException
if the specified capacity is negative.This pull request requires a coordinated merge with OMR pull request eclipse-omr/omr#7620