Skip to content

Closes #23681: Unnecessary reference duplicate in backend #23685

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

guillaumecarraux
Copy link

@guillaumecarraux guillaumecarraux commented Aug 7, 2025

from #23681:

Compiler version

Scala 3.8.0-RC1-bin-20250731-fe6b7eb-NIGHTLY, JVM (21)

Minimized code

//> using scala 3.nightly

def spin = while  true do {}

class A

@main def main =
  //creating a large object
  var large: Array[Double] | Null = Array.fill(10_000_000)(1.0)
  val other = A()

  //comparing it with another object (on the rhs)
  println((other == large))

  //dropping the reference to the large object
  large = null

  //spin to check memory usage
  spin

Output

The memory grabbed by the large object is not freed, even though there is no variable holding it.

Root cause

I suspect this is because of the current backend behaviour when generating the byte code for an == operation (in BCodeBodyBuilder.scala:genEqEqPrimitive):

val eqEqTempLocal = locals.makeLocal(ObjectRef, nme.EQEQ_LOCAL_VAR.mangledString, defn.ObjectType, r.span)

It currently creates a copy of the right-hand side of the "==", so that it can be used later without reevaluating it. However, this copy can create a new GC root, and is a potential memory leak, invisible to the user.

Suggested fix

I suggest to replace this local variable copy by some stack operations, which might also be slightly faster.

current backend code:

val eqEqTempLocal = locals.makeLocal(ObjectRef, nme.EQEQ_LOCAL_VAR.mangledString, defn.ObjectType, r.span)
val lNull    = new asm.Label
val lNonNull = new asm.Label

genLoad(l, ObjectRef) //  load lhs
stack.push(ObjectRef)
genLoad(r, ObjectRef) // load rhs
stack.pop()
locals.store(eqEqTempLocal) // store rhs in a local variable
bc dup ObjectRef // duplicate top stack variable (lhs)
genCZJUMP(lNull, lNonNull, Primitives.EQ, ObjectRef, targetIfNoJump = lNull) // compare lhs with NULL

markProgramPoint(lNull)
bc drop ObjectRef // drop top stack variable (lhs)
locals.load(eqEqTempLocal) // load rhs then compare with NULL
genCZJUMP(success, failure, Primitives.EQ, ObjectRef, targetIfNoJump = lNonNull)

markProgramPoint(lNonNull)
locals.load(eqEqTempLocal) // load rhs then compare with lhs
genCallMethod(defn.Any_equals, InvokeStyle.Virtual)
genCZJUMP(success, failure, Primitives.NE, BOOL, targetIfNoJump)

I suggest replacing with the following simple stack operations, with the resulting operand stack in comment:

-          val eqEqTempLocal = locals.makeLocal(ObjectRef, nme.EQEQ_LOCAL_VAR.mangledString, defn.ObjectType, r.span)
           val lNull    = new asm.Label
           val lNonNull = new asm.Label
 
-          genLoad(l, ObjectRef)
+          genLoad(r, ObjectRef) //  load rhs --> (r)
           stack.push(ObjectRef)
-          genLoad(r, ObjectRef)
+          genLoad(l, ObjectRef) // load lhs --> (l,r)
           stack.pop()
-          locals.store(eqEqTempLocal)
           bc dup ObjectRef // duplicate top stack variable --> (l,l,r)
           genCZJUMP(lNull, lNonNull, Primitives.EQ, ObjectRef, targetIfNoJump = lNull) // compare lhs with NULL --> (l,r)
 
           markProgramPoint(lNull)
           bc drop ObjectRef // drop top stack variable --> (r)
-          locals.load(eqEqTempLocal)
           genCZJUMP(success, failure, Primitives.EQ, ObjectRef, targetIfNoJump = lNonNull) // --> (-)
 
           markProgramPoint(lNonNull)
-          locals.load(eqEqTempLocal)
+          emit(asm.Opcodes.SWAP) //swap l and r for correct .equals ordering --> (r,l)
           genCallMethod(defn.Any_equals, InvokeStyle.Virtual) // --> (-)
           genCZJUMP(success, failure, Primitives.NE, BOOL, targetIfNoJump)

The test DottyBytecodeTests:patmatControlFlow is modified to match the new bytecode.

@sjrd
Copy link
Member

sjrd commented Aug 7, 2025

We were actually considering generating a call to java.util.Objects.equals a few weeks ago. I believe that would also solve this issue, while (hopefully?) produce shorter bytecode.

/cc @lrytz

@lrytz
Copy link
Member

lrytz commented Aug 8, 2025

That would be nice. The motivation we brought it up is code coverage tools, the null branch is typically unused. cc @retronym

The Objects.equals call needs to be inlined by the JVM, as the call to Object.equals inside is megamorphic. I assume if that doesn't happen reliably we would find out pretty quickly, given how widespread case class equality is used?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants