Heap addresses are not reliable to produce diffs #31

@mttkay

I've been looking at heapy diff, and one thing I noticed is that it keys on an object's address to decide whether it was present in the heap dump we compare against.

Is this reliable? I looked at MRI, and the address written to the dump is just the object's VALUE pointer: https://github.com/ruby/ruby/blob/e315f3a1341f123051b75e589b746132c3510079/ext/objspace/objspace_dump.c#L238

The way I understood the Ruby GC to work is that each 40B slot (once populated) always contains an RVALUE, which is a union type so it can "morph" into a different type. If this object gets GC'ed, it is not removed from the heap page, but rather a flag is cleared that tags this slot (or object) as "empty": https://github.com/ruby/ruby/blob/6ef46f71c743507a0e2ae0eef14dce0539b0ff52/gc.c#L569. This makes a slot reusable by changing its union type, but its memory address does not change.

Wouldn't this mean that if an object is GC'ed between two snapshots and its slot is reused for a completely different object, the new object would be omitted from the heapy diff, because the slot address already appeared in the first snapshot?
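
To make the concern concrete, here is a minimal sketch of the scenario. It is not heapy's actual code: the dump lines, addresses, file names, and the diff_by helper are all made up for illustration, and I'm assuming the dumps carry the "file", "line", and "generation" fields, which as far as I know only show up when allocation tracing is enabled. It compares a diff keyed on address alone with one keyed on address plus a few other attributes.

```ruby
require "json"
require "set"

# Hypothetical dump lines, shaped like ObjectSpace.dump_all output.
BEFORE = [
  '{"address":"0x7f8a1c0a1000","type":"STRING","file":"app.rb","line":10,"generation":5}',
  '{"address":"0x7f8a1c0a1028","type":"ARRAY","file":"app.rb","line":12,"generation":5}'
].freeze

AFTER = [
  # Same slot address as the STRING above, but a different object
  # allocated after the original one was collected.
  '{"address":"0x7f8a1c0a1000","type":"HASH","file":"leak.rb","line":3,"generation":9}',
  '{"address":"0x7f8a1c0a1028","type":"ARRAY","file":"app.rb","line":12,"generation":5}'
].freeze

def parse(lines)
  lines.map { |line| JSON.parse(line) }
end

# Returns the objects in `after` whose key does not appear in `before`.
# (Illustrative helper, not part of heapy.)
def diff_by(before, after, &key)
  seen = parse(before).map(&key).to_set
  parse(after).reject { |obj| seen.include?(key.call(obj)) }
end

# Keyed on address only: the reused slot hides the new HASH.
by_address = diff_by(BEFORE, AFTER) { |o| o["address"] }

# Keyed on address plus type and allocation info: the new HASH is reported.
by_identity = diff_by(BEFORE, AFTER) do |o|
  o.values_at("address", "type", "file", "line", "generation")
end

puts "address-only diff: #{by_address.size} object(s)"
puts "composite-key diff: #{by_identity.size} object(s)"
```

The composite key is still only a heuristic: two different objects of the same type allocated from the same line in the same GC generation could collide on a reused slot, so this narrows the problem rather than eliminating it.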

I'm sure I'm missing something but I wanted to make sure I understand how this works. Thanks!
