Make work packet buffer size configurable from one location #1285

Merged
merged 6 commits into from
Mar 24, 2025
Changes from 1 commit
4 changes: 2 additions & 2 deletions src/plan/tracing.rs
@@ -3,7 +3,7 @@

use crate::scheduler::gc_work::{ProcessEdgesWork, SlotOf};
use crate::scheduler::{GCWorker, WorkBucketStage};
-use crate::util::ObjectReference;
+use crate::util::{self, ObjectReference};
use crate::vm::SlotVisitor;

/// This trait represents an object queue to enqueue objects during tracing.
@@ -25,7 +25,7 @@ pub struct VectorQueue<T> {

impl<T> VectorQueue<T> {
/// Reserve a capacity of this on first enqueue to avoid frequent resizing.
-const CAPACITY: usize = 4096;
+const CAPACITY: usize = util::constants::BUFFER_SIZE;

/// Create an empty `VectorObjectQueue`.
pub fn new() -> Self {
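
The "reserve on first enqueue" behaviour mentioned in the doc comment above can be sketched as follows. This is a simplified stand-in, not the mmtk-core implementation: the `buffer` field, the `push`, `is_full` and `take` methods, and the top-level `BUFFER_SIZE` constant are assumptions made for illustration.

```rust
/// Hypothetical stand-in for `util::constants::BUFFER_SIZE` from the diff.
pub const BUFFER_SIZE: usize = 4096;

/// Simplified sketch of a queue that reserves its capacity lazily.
pub struct VectorQueue<T> {
    buffer: Vec<T>,
}

impl<T> VectorQueue<T> {
    /// Capacity reserved on the first enqueue.
    const CAPACITY: usize = BUFFER_SIZE;

    /// Create an empty queue; `Vec::new` does not allocate.
    pub fn new() -> Self {
        Self { buffer: Vec::new() }
    }

    /// Whether the queue has reached the packet size and should be flushed.
    pub fn is_full(&self) -> bool {
        self.buffer.len() >= Self::CAPACITY
    }

    /// Push an element, reserving `CAPACITY` entries if the buffer is empty.
    pub fn push(&mut self, value: T) {
        if self.buffer.is_empty() {
            self.buffer.reserve(Self::CAPACITY);
        }
        self.buffer.push(value);
    }

    /// Take the buffered elements, leaving an empty buffer behind.
    pub fn take(&mut self) -> Vec<T> {
        std::mem::take(&mut self.buffer)
    }
}
```

Reserving lazily means a queue that never receives an element never allocates, while a queue that does receive one pays for a single allocation instead of repeated regrowth.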
2 changes: 1 addition & 1 deletion src/scheduler/gc_work.rs
@@ -556,7 +556,7 @@ pub trait ProcessEdgesWork:
/// Higher capacity means the packet will take longer to finish, and may lead to
/// bad load balancing. On the other hand, lower capacity would lead to higher cost
/// on scheduling many small work packets. It is important to find a proper capacity.
-const CAPACITY: usize = 4096;
+const CAPACITY: usize = util::constants::BUFFER_SIZE;
/// Do we update object reference? This has to be true for a moving GC.
const OVERWRITE_REFERENCE: bool = true;
/// If true, we do object scanning in this work packet with the same worker without scheduling overhead.
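
The capacity trade-off described in the doc comment above can be made concrete with a small sketch. The helper `split_into_packets` is hypothetical and is not part of mmtk-core; it only shows how the packet capacity determines how many packets a given amount of work becomes.

```rust
/// Hypothetical helper: split `items` into packets of at most `capacity`
/// elements each. The last packet may be smaller than the others.
pub fn split_into_packets<T: Clone>(items: &[T], capacity: usize) -> Vec<Vec<T>> {
    assert!(capacity > 0, "packet capacity must be non-zero");
    items.chunks(capacity).map(|chunk| chunk.to_vec()).collect()
}
```

For 10,000 slots, a capacity of 4096 yields 3 packets (few scheduling events, but coarse units to balance across workers), whereas a capacity of 512 yields 20 packets (finer balancing, more scheduling overhead).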
3 changes: 3 additions & 0 deletions src/util/constants.rs
@@ -22,6 +22,9 @@ pub const LOG_BYTES_IN_KBYTE: u8 = 10;
/// The number of bytes in a kilobyte
pub const BYTES_IN_KBYTE: usize = 1 << LOG_BYTES_IN_KBYTE;

+/// Work packet buffer size
+pub const BUFFER_SIZE: usize = 4096;
Collaborator

Since this is part of the public API, we should be more careful about naming.

This constant is a bit different from the others in the util::constants module, such as {LOG_,}BYTES_IN_{K,M,G}BYTE. BUFFER_SIZE is tied to the GC work packet design, and we chose 4096 arbitrarily (it may not be the optimal value). The others, by contrast, are mathematical constants.

And the name "buffer" alone does not make it clear what kind of buffer it is referring to.

I suggest we add a pub mod gc_work and split BUFFER_SIZE into two constants:

  • NODES_PACKET_SIZE: the number of nodes in a ScanObjects packet, and
  • EDGES_PACKET_SIZE: the number of edges (represented as Slot now, but may be represented as ObjectReference if we think appropriate for MarkSweep) in a ProcessEdgesWork work packet.

We can define both as 4096 so we don't change the semantics, but we may, in the future, make them different.
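
A minimal sketch of the proposed split, assuming the module and constant names from the suggestion above. The helper `new_edges_buffer` is hypothetical and only illustrates how a caller might consume one of the constants.

```rust
/// Module name taken from the suggestion above; not the merged API.
pub mod gc_work {
    /// Number of nodes (objects) buffered in one `ScanObjects` packet.
    pub const NODES_PACKET_SIZE: usize = 4096;
    /// Number of edges buffered in one `ProcessEdgesWork` packet.
    pub const EDGES_PACKET_SIZE: usize = 4096;
}

/// Hypothetical helper: allocate a buffer sized for one edges packet.
pub fn new_edges_buffer<T>() -> Vec<T> {
    Vec::with_capacity(gc_work::EDGES_PACKET_SIZE)
}
```

Both constants keep the current value, so behaviour is unchanged, but each can later be tuned independently.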

Collaborator Author

I agree with the naming, but I'm not so sure if we want to split the constant into two, at least in this PR.

Collaborator Author

> (represented as Slot now, but may be represented as ObjectReference if we think appropriate for MarkSweep)

I don't know what you mean here. An edge is a slot, no? Regardless of the GC algorithm used.

Collaborator

> > (represented as Slot now, but may be represented as ObjectReference if we think appropriate for MarkSweep)
>
> I don't know what you mean here. An edge is a slot, no? Regardless of the GC algorithm used.

You can find "Node-ObjRef", "Edge-ObjRef" and "Edge-Slot" in Angus' paper. In that paper, node and edge refer to the queuing strategy, while objref and slot refer to the thing it enqueues. For MarkSweep, since we don't need to update any slot, we can load from the slots immediately while we scan the object, and enqueue the referents directly.

In the current mmtk-core, Slot represents a slot, that is, something that holds a reference, and can be updated. But "edge" is more about the queuing strategy. The ProcessEdgesWork work packet does process edges, in the sense that (1) it provides the trace_object method which in theory traces an edge in the object graph and returns how the edge should be updated, and (2) its elements (slots) are created when scanning objects without looking at the child objects. In MarkSweep, we can in theory replace ProcessEdgesBase::slots with a Vec<ObjectReference> (i.e. the values in the slots). We can also replace it with a Vec<(Slot, ObjectReference)> (slots and their values) which is the "Edge-Tuple" strategy in Angus' paper.
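
The three edge-enqueuing variants described above can be contrasted with a toy model. Everything here is hypothetical: `Heap`, the index-based `ObjectReference` and `Slot`, and the `scan_*` methods are illustrative stand-ins that share only their names with mmtk-core types, and the sketch follows the description in this comment rather than the paper itself.

```rust
/// Toy object reference: an index into `Heap::objects`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(pub usize);

/// Toy slot: field number `field` of object number `object`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Slot {
    pub object: usize,
    pub field: usize,
}

/// Toy heap: each object is a list of fields that may hold a reference.
pub struct Heap {
    pub objects: Vec<Vec<Option<ObjectReference>>>,
}

impl Heap {
    /// Load the reference currently stored in `slot`, if any.
    pub fn load(&self, slot: Slot) -> Option<ObjectReference> {
        self.objects[slot.object][slot.field]
    }

    /// "Edge-Slot": enqueue the slots without looking at their values,
    /// so a later step can load them and update them if the referent moves.
    pub fn scan_edge_slot(&self, object: ObjectReference) -> Vec<Slot> {
        (0..self.objects[object.0].len())
            .map(|field| Slot { object: object.0, field })
            .collect()
    }

    /// "Edge-ObjRef": load each slot while scanning and enqueue only the
    /// referents. Enough for a non-moving collector such as MarkSweep.
    pub fn scan_edge_objref(&self, object: ObjectReference) -> Vec<ObjectReference> {
        self.scan_edge_slot(object)
            .into_iter()
            .filter_map(|slot| self.load(slot))
            .collect()
    }

    /// "Edge-Tuple": enqueue each non-empty slot together with its value.
    pub fn scan_edge_tuple(&self, object: ObjectReference) -> Vec<(Slot, ObjectReference)> {
        self.scan_edge_slot(object)
            .into_iter()
            .filter_map(|slot| self.load(slot).map(|referent| (slot, referent)))
            .collect()
    }
}
```

In all three variants the queue is filled while scanning the parent object; they differ only in what each queue element carries, which is why the packet size constant applies regardless of the element type.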

Member

I feel it doesn't matter too much whether we split it into two constants. At least for now, we do not have plans to have different values for edges vs nodes.

The documentation for the constant should explain why we expose it to users and how they should use it.

> I suggest we add a pub mod gc_work

The constant could be put in the scheduler module. We now tend to expose constants from their own modules, rather than putting them in util::constants.

Collaborator Author

Done


/// Some card scanning constants ported from Java MMTK.
/// As we haven't implemented card scanning, these are not used at the moment.
mod card_scanning {