diff --git a/docs/userguide/src/SUMMARY.md b/docs/userguide/src/SUMMARY.md index 14fe7ec1c9..73eb3362ca 100644 --- a/docs/userguide/src/SUMMARY.md +++ b/docs/userguide/src/SUMMARY.md @@ -30,7 +30,7 @@ - [Before Starting a Port](portingguide/before_start.md) - [How to Undertake a Port](portingguide/howto/prefix.md) - [NoGC](portingguide/howto/nogc.md) - - [Next Steps](portingguide/howto/next_steps.md) + - [Non-Moving GC](portingguide/howto/non_moving_gc.md) - [Debugging Tips](portingguide/debugging/prefix.md) - [Enabling Debug Assertions](portingguide/debugging/assertions.md) - [Performance Tuning](portingguide/perf_tuning/prefix.md) diff --git a/docs/userguide/src/portingguide/howto/next_steps.md b/docs/userguide/src/portingguide/howto/next_steps.md deleted file mode 100644 index 039f62445b..0000000000 --- a/docs/userguide/src/portingguide/howto/next_steps.md +++ /dev/null @@ -1,12 +0,0 @@ -# Next Steps - -Your choice of the next GC plan to implement depends on your situation. -If you’re developing a new VM from scratch, or if you are intimately familiar with the internals of your target VM, then implementing a SemiSpace collector is probably the best course of action. -Although the GC itself is rather simplistic, it stresses many of the key components of the MMTk <-> VM binding that will be required for later (and more powerful) GCs. -In particular, since it always moves objects, it is an excellent stress test. - -An alternative route is to implement MarkSweep. -This may be necessary in scenarios where the target VM doesn’t support object movement, or would require significant refactoring to do so. -This can then serve as a stepping stone for future, moving GCs such as SemiSpace. - -We hope to have an Immix implementation available soon, which provides a nice middle ground between moving and non-moving (since it copies opportunistically, and can cope with a strictly non-moving requirement if needs be). 
diff --git a/docs/userguide/src/portingguide/howto/nogc.md b/docs/userguide/src/portingguide/howto/nogc.md index de6368059b..8196df22c5 100644 --- a/docs/userguide/src/portingguide/howto/nogc.md +++ b/docs/userguide/src/portingguide/howto/nogc.md @@ -8,7 +8,7 @@ At a high level, in order to implement NoGC, we need to handle MMTk initializati If you're ever stuck at any point, feel free to send a message in the `#Porting` channel of our [Zulip](https://mmtk.zulipchat.com/)! -## Set up +## Set Up You want to set up the binding repository/directory structure before starting the port. For the sake of the tutorial guide we assume you have a directory structure similar to the one below. Note that such a directory structure is not a requirement[^1] but a recommendation. We assume you are using some form of version control system (such as `git` or `mercurial`) in this guide. [^1]: In fact some bindings may not be able to have such a directory structure due to the build tools used by the runtime. @@ -31,7 +31,7 @@ You may also find it helpful to take inspiration from the [OpenJDK binding](http For this guide, we will assume your runtime is implemented in C or C++ as they are the most common implementation languages. However note that your runtime does not *need* to be implemented in C/C++ to work with MMTk. -## Adding a Rust library to the runtime +## Adding a Rust Library to the Runtime We recommend learning the ins and outs of your runtime's build system. You should try and add a simple Rust "hello world" library to your runtime's code and build system to investigate how easy it will be to add MMTk. Unfortunately this step is highly dependent on the runtime build system. We recommend taking a look at what other bindings do, but keep in mind that no two runtime build systems are the same even if they are using the same build tools. 
In case the build system is too complex and you want get to hacking, a quick and dirty way to add MMTk could be to build a static and/or dynamic binary for MMTk and link it to the runtime directly, manually building new binaries as necessary, like so: @@ -44,14 +44,14 @@ In case the build system is too complex and you want get to hacking, a quick and Later, you can edit the runtime build process to build MMTk at the same time automatically. -**Note:** If the runtime you are targeting already links some Rust FFI libraries, then you may notice "multiple definition" linker errors for Rust stdlib functions. Unfortunately this is a current limitation of Rust FFI wherein all symbols are bundled together in the final C lib which will cause multiple definitions errors when two or more Rust FFI libraries are linked together. There is ongoing work to stabilize the Rust package format that would hopefully make it easier in the future. A current workaround would be to use the `-Wl,--allow-multiple-definition` linker flag, but this unfortunately isn't ideal as it increases code sizes. See [here](https://internals.rust-lang.org/t/pre-rfc-stabilize-a-version-of-the-rlib-format/17558) and [here](https://github.com/rust-lang/rust/issues/73632) for more details. +> **Note:** If the runtime you are targeting already links some Rust FFI libraries, then you may notice "multiple definition" linker errors for Rust stdlib functions. Unfortunately this is a current limitation of Rust FFI wherein all symbols are bundled together in the final C lib which will cause multiple definitions errors when two or more Rust FFI libraries are linked together. There is ongoing work to stabilize the Rust package format that would hopefully make it easier in the future. A current workaround would be to use the `-Wl,--allow-multiple-definition` linker flag, but this unfortunately isn't ideal as it increases code sizes. 
See [here](https://internals.rust-lang.org/t/pre-rfc-stabilize-a-version-of-the-rlib-format/17558) and [here](https://github.com/rust-lang/rust/issues/73632) for more details. -**Note:** It is *highly* recommended to also check-in the generated `Cargo.lock` file into your version control. This improves the reproducibility of the build and ensures the same package versions are used when building in the future in order to prevent random breakages. +> **Note:** It is *highly* recommended to also check-in the generated `Cargo.lock` file into your version control. This improves the reproducibility of the build and ensures the same package versions are used when building in the future in order to prevent random breakages. We recommend using the `debug` build when doing development work as it has helpful logging statements and assertions that will make catching bugs in your implementation easier. -## The `VMBinding` trait -Now let's actually start implementing the binding. Here we take a look at the Rust side of the binding first (i.e. `mmtk-X/mmtk`). What we want to do is implement the [`VMBinding`](https://docs.mmtk.io/api/mmtk/vm/trait.VMBinding.html) trait. +## The `VMBinding` Trait +Now let's actually start implementing the binding. Here we take a look at the MMTk-side of the binding first (i.e. `mmtk-X/mmtk`). What we want to do is implement the [`VMBinding`](https://docs.mmtk.io/api/mmtk/vm/trait.VMBinding.html) trait. The `VMBinding` trait is a "meta-trait" (i.e. a trait that encapsulates other traits) that we expect every binding to implement. In essence, it is the contract established between MMTk and the runtime. We discuss each of its seven key traits briefly: @@ -65,11 +65,11 @@ The `VMBinding` trait is a "meta-trait" (i.e. a trait that encapsulates other tr For the time-being we can implement all the above traits via `unimplemented!()` stubs. 
If you are using the Dummy VM binding as a starting point, you will have to edit some of the concrete implementations to `unimplemented!()`. Note that you should change the type that implements `VMBinding` from `DummyVM` to an appropriately named type for your runtime. For example, the OpenJDK binding defines the zero-struct [`OpenJDK`](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/lib.rs#L139-L162) which implements the `VMBinding` trait. -### Object model +### Object Model The `ObjectModel` trait is a fundamental trait describing the layout of an object to MMTk. This is important as MMTk's core doesn't know of how objects look like internally as each runtime will be different. There are certain key aspects you need to be aware of while implementing the `ObjectModel` trait. We discuss them in this section. -#### Header vs Side metadata +#### Header vs Side Metadata Per-object metadata can live in one of two places: in the object header or in a separate space used just for metadata. Each one has its pros and cons. @@ -77,7 +77,7 @@ Header metadata sits in close proximity to the actual object address but it is n The choice of metadata location depends on the runtime and its object model and header layout. For example the JikesRVM runtime reserved extra space at the start of each object for GC-related metadata. Such space may not be available in your runtime. In such cases you can use side metadata to reserve per-object metadata. -#### Local vs Global metadata +#### Local vs Global Metadata MMTk uses multiple GC policies and each policy may use a different set of object metadata from each other. A moving policy, for example, may require extra metadata (in comparison to a non-moving policy) to store the forwarding bits and forwarding pointer. Such a metadata, which is local to a policy, is referred to as "local" metadata. 
@@ -95,7 +95,7 @@ You might be interested in reading the *Demystifying Magic: High-level Low-level [^3]: https://users.cecs.anu.edu.au/~steveb/pubs/papers/vmmagic-vee-2009.pdf -#### Miscellaneous configuration options +#### Miscellaneous Configuration Options There are many constants in the `ObjectModel` trait that can be overridden in your binding in order to meet your runtime's requirements. For example, the `OBJECT_REF_OFFSET_LOWER_BOUND` constant which defines the minimum offset from allocation result start (i.e. the address that MMTk will return to the runtime) and the actual start of the object, i.e. the `ObjectReference`. In other words, the constant represents the minimum offset from the allocation result start such that the following invariant always holds: @@ -103,13 +103,13 @@ There are many constants in the `ObjectModel` trait that can be overridden in yo We recommend going through the [list of constants in the documentation](https://docs.mmtk.io/api/mmtk/vm/trait.ObjectModel.html) and seeing if the default values suit your runtime's semantics, changing them if required. -## MMTk initialization +## MMTk Initialization Now that we have most of the boilerplate set up, the next step is to initialize MMTk so that we can start allocating objects. ### Runtime-side changes Create a `mmtk.h` header file in the runtime folder of the binding (i.e. `mmtk-X/X`) which exposes the functions required to implement NoGC and `#include` it in the relevant runtime code. You can use the [DummyVM `mmtk.h` header file](https://github.com/mmtk/mmtk-core/blob/master/vmbindings/dummyvm/api/mmtk.h) as an example. -**Note:** It is convention to prefix all MMTk API functions exposed with `mmtk_` in order to avoid name clashes. It is *highly* recommended that you follow this convention. +> **Note:** It is convention to prefix all MMTk API functions exposed with `mmtk_` in order to avoid name clashes. It is *highly* recommended that you follow this convention. 
Having a clean heap API for MMTk to implement makes life easier. Some runtimes may already have a sufficiently clean abstraction such as OpenJDK after the merging of [JEP 304](https://openjdk.org/jeps/304). In (most) other cases, the runtime doesn't provide a clean enough heap API for MMTk to implement. In such cases, it is recommended to create a class (or equivalent) that abstracts allocation and other heap functions like what the [V8](https://chromium.googlesource.com/v8/v8/+/a9976e160f4755990ec065d4b077c9401340c8fb/src/heap/third-party/heap-api.h) and ART bindings do. This allows making minimal changes to the actual runtime and having a concrete implementation of the exposed heap API in the binding, reducing MMTk-specific code in the runtime. Ideally these changes are upstreamed like in the case of V8. @@ -154,14 +154,12 @@ void mmtk_set_heap_size(size_t min, size_t max); Now we can initialize MMTk in the runtime. Note that MMTk should ideally be initialized around when the default heap of the runtime is initialized. You will have to figure out where is the best location to initialize MMTk in your runtime. -Initializing MMTk requires two steps. First, we set the heap size by calling `mmtk_set_heap_size` with the initial heap size and the maximum heap size. Then, we initialize MMTk by calling `mmtk_init`. In the future, you may wish to make the heap size configurable via a command line argument or environment variable (See [setting options for MMTk](#setting-options-for-mmtk)). - - +Initializing MMTk requires two steps. First, we set the heap size by calling `mmtk_set_heap_size` with the initial heap size and the maximum heap size. Then, we initialize MMTk by calling `mmtk_init`. In the future, you may wish to make the heap size configurable via a command line argument or environment variable (See [Setting Options for MMTk](#setting-options-for-mmtk)). 
### MMTk-side changes -On the Rust side of the binding, we want to implement the two functions exposed by the `mmtk.h` file above. We use an [`MMTKBuilder`](https://docs.mmtk.io/api/mmtk/struct.MMTKBuilder.html) instance to actually create our concrete [`MMTK`](https://docs.mmtk.io/api/mmtk/struct.MMTK.html) instance. We recommend following the paradigm used by all our bindings wherein we have a `static` single `MMTK` instance and an `MMTKBuilder` instance that we can use to set relevant options. See the [OpenJDK binding](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/lib.rs#L169-L178) for an example. +On the MMTk-side of the binding, we want to implement the two functions exposed by the `mmtk.h` file above. We use an [`MMTKBuilder`](https://docs.mmtk.io/api/mmtk/struct.MMTKBuilder.html) instance to actually create our concrete [`MMTK`](https://docs.mmtk.io/api/mmtk/struct.MMTK.html) instance. We recommend following the paradigm used by all our bindings wherein we have a `static` single `MMTK` instance and an `MMTKBuilder` instance that we can use to set relevant options. See the [OpenJDK binding](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/lib.rs#L169-L178) for an example. -**Note:** MMTk currently assumes that there is only one `MMTK` instance in your runtime process. Multiple `MMTK` instances are currently not supported. +> **Note:** MMTk currently assumes that there is only one `MMTK` instance in your runtime process. Multiple `MMTK` instances are currently not supported. The `mmtk_set_heap_size` function is fairly straightforward. We recommend using the implementation in the [OpenJDK binding](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L94-L104). The `mmtk_init` function is straightforward as well. 
It should simply manually initialize the `MMTK` `static` variable using `lazy_static`, like [here](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L83-L86) in the OpenJDK binding. @@ -173,7 +171,7 @@ By this point, you should have MMTk initialized. If you are using a debug build [...] ``` -## Binding mutator threads to MMTk +## Binding Mutator Threads to MMTk For MMTk to allocate objects, it needs to be aware of mutator threads. MMTk only allows mutator threads to allocate objects. We do this by "binding" a mutator thread to MMTk when it is initialized in the runtime. @@ -201,7 +199,7 @@ The placement of the `mmtk_bind_mutator` call in the runtime depends on the runt ### MMTk-side changes -The Rust side of the binding should simply defer the actual implementation to [`mmtk::memory_manager::bind_mutator`](https://docs.mmtk.io/api/mmtk/memory_manager/fn.bind_mutator.html). See the [OpenJDK binding](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L106-L109) for an example. +The MMTk-side of the binding should simply defer the actual implementation to [`mmtk::memory_manager::bind_mutator`](https://docs.mmtk.io/api/mmtk/memory_manager/fn.bind_mutator.html). See the [OpenJDK binding](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L106-L109) for an example. ## Allocation Now we can finally implement the allocation functions. 
@@ -222,7 +220,7 @@ Add the following two functions to the `mmtk.h` file: * @param allocator the allocation semantics to use for the allocation * @return the address of the newly allocated object */ -void *mmtk_alloc(MmtkMutator mutator, size_t size, size_t align, +void* mmtk_alloc(MmtkMutator mutator, size_t size, size_t align, ssize_t offset, int allocator); /** @@ -250,17 +248,17 @@ For the time-being, you can ignore the `allocator` parameter in both these funct Finally, you need to call `mmtk_post_alloc` with the object address returned from the previous `mmtk_alloc` call in order to initialize object metadata. -**Note:** Currently MMTk assumes object sizes are multiples of the `MIN_ALIGNMENT`. If you encounter errors with alignment, a simple workaround would be to align the requested object size up to the `MIN_ALIGNMENT`. See [here](https://github.com/mmtk/mmtk-core/issues/730) for the tracking issue to fix this bug. +> **Note:** Currently MMTk assumes object sizes are multiples of the `MIN_ALIGNMENT`. If you encounter errors with alignment, a simple workaround would be to align the requested object size up to the `MIN_ALIGNMENT`. See [here](https://github.com/mmtk/mmtk-core/issues/730) for the tracking issue to fix this bug. ### MMTk-side changes -The Rust side of the binding should simply defer the actual implementation to [`mmtk::memory_manager::alloc`](https://docs.mmtk.io/api/mmtk/memory_manager/fn.alloc.html) and [`mmtk::memory_manager::post_alloc`](https://docs.mmtk.io/api/mmtk/memory_manager/fn.post_alloc.html) respectively. See the [OpenJDK](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L125-L136) [binding](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L151-L161) for an example. 
+The MMTk-side of the binding should simply defer the actual implementation to [`mmtk::memory_manager::alloc`](https://docs.mmtk.io/api/mmtk/memory_manager/fn.alloc.html) and [`mmtk::memory_manager::post_alloc`](https://docs.mmtk.io/api/mmtk/memory_manager/fn.post_alloc.html) respectively. See the [OpenJDK](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L125-L136) [binding](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L151-L161) for an example. Congratulations! At this point, you hopefully have object allocation working and can run simple programs with your runtime using MMTk! -## Miscellaneous implementation steps +## Miscellaneous Implementation Steps -### Setting options for MMTk +### Setting Options for MMTk The preferred method of setting [options for MMTk](https://docs.mmtk.io/api/mmtk/util/options/index.html) is by setting them via the `MMTKBuilder` instance. See [here](https://github.com/mmtk/mmtk-openjdk/blob/54a249e877e1cbea147a71aafaafb8583f33843d/mmtk/src/api.rs#L79) for an example in the OpenJDK binding. @@ -270,7 +268,7 @@ MMTk also supports setting options via environment variables. This is generally A full list of available options that you can set can be found [here](https://docs.mmtk.io/api/mmtk/util/options/struct.Options.html). -### Runtime-specific steps +### Runtime-specific Steps Often it is the case that the above changes are not enough to allow a runtime to work with MMTk. For example, for the ART binding, the runtime required that all inflated locks be deflated prior to writing the boot image. In order to fix this, we had to implement a heap visitor that visited each allocated object and checked if it had inflated locks, deflating them if they were. 
diff --git a/docs/userguide/src/portingguide/howto/non_moving_gc.md b/docs/userguide/src/portingguide/howto/non_moving_gc.md new file mode 100644 index 0000000000..8099ea1d88 --- /dev/null +++ b/docs/userguide/src/portingguide/howto/non_moving_gc.md @@ -0,0 +1,319 @@ +# Collecting Garbage: Getting Started with Integrating MMTk + +Your choice of the next GC plan to implement depends on your situation. +If you’re developing a new VM from scratch, or if you are intimately familiar with the internals of your target VM, then implementing a SemiSpace collector is probably the best course of action. +Although the GC itself is rather simplistic, it stresses many of the key components of the MMTk <-> VM binding that will be required for later (and more powerful) GCs. +In particular, since it always moves objects, it is an excellent stress test. +Otherwise, a non-moving GC like MarkSweep or a non-moving Immix implementation would work better. + +We note that most of the API you need to implement between the moving and non-moving GC will be the same (with moving GCs having to implement a few extra APIs), so regardless of what you choose, the steps in this guide will be applicable. +For this guide, we start by integrating a non-moving Immix implementation and then add support for moving objects. +In order to use a non-moving Immix implementation, enable the ["immix_non_moving" feature of mmtk-core](TODO(kunals)). +We also recommend turning the ["immix_zero_on_release" feature](TODO(kunals)) on for debugging. + +Like with the [NoGC guide](./nogc.md), "Runtime-side changes" mean any changes you have to make to your runtime or the part of the MMTk binding interfacing with the runtime; and "MMTk-side changes" mean any changes you have to make to the part of the MMTk binding interfacing with MMTk core. 
+ +## Initializing and Enabling Collection + +In the NoGC port, we actually skipped over initializing and enabling garbage collection as we were only concerned with allocating objects. This step is now required, as MMTk spawns its GC threads when you enable collection. +This is a separate step as it is often the case that the threading subsystem of a runtime has not been fully set up when the `MMTK` instance is created. + + + +### Runtime-side changes + +Add the following function to the `mmtk.h` file: + +```C +[...] + +/** + * Initialize collection for MMTk + * + * @param tls reference to the calling VMThread + */ +void mmtk_initialize_collection(VMThread tls); + +[...] +``` + +You should call this function after the threading subsystem of your runtime has been initialized and allows new threads to be spawned. +You can pass a reference to the calling runtime thread, but passing in a `nullptr` will also suffice. + +### MMTk-side changes + +The MMTk-side of the binding should simply defer the actual implementation to [`mmtk::memory_manager::initialize_collection`](https://docs.mmtk.io/api/mmtk/memory_manager/fn.initialize_collection.html). +See the [OpenJDK binding](https://github.com/mmtk/mmtk-openjdk/blob/0ed99cd8cf51bb5ff8184ef64f8236d85e960e87/mmtk/src/api.rs#L245-L248) for an example. + +## "Upcalls" Design Pattern + +The nature of the bi-directional API means that there are things that MMTk requires or expects from the runtime and vice versa. +While it is easy for a language runtime to use the API exposed by MMTk (the set of public functions in `mmtk.h`), it is not always easy for the Rust source of MMTk to directly call into the runtime, given that the runtime may be implemented in a different language. + +In order to facilitate this, we utilize a design pattern wherein we define a `struct` of function pointers that is passed on to MMTk during initialization. +These function pointers are essentially the API exposed by the VM to MMTk. 
+The `struct` is often termed "Upcalls" given that MMTk is calling up to the runtime. + +Let's take the example of a simple upcall and implement that: getting the size of a given object. + +> **Note:** If your runtime is already implemented in Rust, then it should be easy to directly call into your runtime from the MMTk binding, greatly simplifying the bi-directional API. + +### Runtime-side changes + +We define a new `struct` type with the desired upcall: + +```C +[...] + +// API from the runtime "Rt" to MMTk +typedef struct { + size_t (*size_of) (void* object); +} RtUpcalls; + +[...] +``` + +where "`Rt`" is the name of the runtime (for example, for OpenJDK it would be `OpenjdkUpcalls`). + +We also change the initialization function to take in a pointer to the upcalls: + +```C +[...] + +/** + * Initialize MMTk instance + * + * @param upcalls the set of Rt upcalls used by MMTk + */ +void mmtk_init(RtUpcalls* upcalls); + +[...] +``` + +Create a new file `mmtk_upcalls.h[pp]` (or whatever the naming scheme of your runtime is) in the runtime-side folder (`mmtk-X/X`) and declare a global instance of the upcalls: + +```C +#ifndef MMTK_RT_MMTK_UPCALLS_H +#define MMTK_RT_MMTK_UPCALLS_H + +#include "mmtk.h" + +// Single global instance of upcalls passed to MMTk +extern RtUpcalls rt_upcalls; + +#endif // MMTK_RT_MMTK_UPCALLS_H +``` + +This instance is then defined in `mmtk_upcalls.c[pp]`. +We define it this way because the upcall functions themselves are usually declared `static` (in C/C++, a `static` function is visible within its own file but not from other files), so only the `rt_upcalls` instance is exposed to other code: + +```C +#include "mmtk_upcalls.h" // Use the correct location/name + +static size_t size_of(void* object) { + // Runtime-specific implementation of size_of function +} + +RtUpcalls rt_upcalls = { + size_of, +}; +``` + +The `size_of` function above depends on how your runtime implements getting the size of an object. 
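As one illustration of the `size_of` upcall, here is a minimal sketch for an imaginary runtime whose objects all begin with a small header that records the payload size. The `ExampleHeader` type and its fields are hypothetical and are not part of MMTk or of any real runtime; your runtime will have its own way of computing an object's size.

```C
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical object header for an imaginary runtime (an assumption made
// for this sketch only): every object starts with this header, which
// records the size of the object body in bytes.
typedef struct {
    uint32_t type_id;      // hypothetical type tag
    uint32_t payload_size; // size of the object body, excluding the header
} ExampleHeader;

// Upcall implementation: the total size of an object is the size of its
// header plus the size of its payload.
static size_t size_of(void* object) {
    assert(object != NULL);
    const ExampleHeader* header = (const ExampleHeader*) object;
    return sizeof(ExampleHeader) + (size_t) header->payload_size;
}
```

A runtime that stores sizes in per-type descriptors rather than in each object would instead look up the descriptor from the object and read the size from there.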
+ +Finally, pass a pointer to the `rt_upcalls` `struct` where you call the `mmtk_init` function: + +```C +[...] + +#include "mmtk_upcalls.h" // Use the correct location/name + +// Initialize MMTk +mmtk_init(&rt_upcalls); + +[...] +``` + +### MMTk-side changes + +On the MMTk-side of the binding, we need to change the `mmtk_init` API to accept the `RtUpcalls` `struct` as defined above. +We will have to carefully redefine the same `struct` in Rust so that the Rust code can type-check the API correctly. +Unfortunately, this is a brittle approach since you have to carefully maintain the invariant that the Rust and C/C++ definitions of the upcalls `struct` are the same. +An avenue of research that could make it easier and less error-prone would be investigating [`cxx`](https://cxx.rs/) integration with MMTk. + +```Rust +[...] + +#[repr(C)] +/// API from the runtime "Rt" to MMTk +pub struct RtUpcalls { + pub size_of: extern "C" fn(object: ObjectReference) -> usize, +} + +/// Global pointer to the runtime's RtUpcalls instance +pub static mut UPCALLS: *const RtUpcalls = std::ptr::null(); + +[...] + +pub fn mmtk_init(upcalls: *const RtUpcalls) { + unsafe { UPCALLS = upcalls }; + // Keep the rest of the function the same +} + +[...] + +``` + +Now, in our implementation of the `ObjectModel` trait, we can implement the [`get_current_size`](TODO(kunals): API) function: + +```Rust +[...] + +fn get_current_size(object: ObjectReference) -> usize { + use mmtk::util::conversions; + conversions::raw_align_up(unsafe { ((*UPCALLS).size_of)(object) }, RtName::MIN_ALIGNMENT) +} + +[...] +``` + +We align the object size to the runtime's minimum alignment in case we want to copy the object while maintaining the alignment requirements. + +Astute readers may have noticed that there is an overhead of an indirect call which is not necessarily great for performance. +For performance, we can pull runtime-specific knowledge (such as internal `struct` definitions or state, etc.) into the MMTk-side of the binding to reduce cross-language function calls. 
+However, this is out of scope for this tutorial as the optimizations are highly runtime-dependent. + +## Spawning GC Threads + +You will notice that your runtime now panics immediately since MMTk is unable to spawn its GC threads. We need to implement the [`VMCollection::spawn_gc_thread`](https://docs.mmtk.io/api/mmtk/vm/trait.Collection.html#tymethod.spawn_gc_thread) API. + +Currently there are two kinds of GC threads: the Coordinator thread (also called the Controller thread) and GC Worker threads. +There is always exactly one Coordinator thread and its job is to coordinate GC activities between the different worker threads. +The Coordinator thread does not perform any GC activities itself. +The GC Worker threads actually perform GC activities such as root scanning, marking objects, etc. +The number of GC Worker threads can be controlled with the [`threads` MMTk option](https://docs.mmtk.io/api/mmtk/util/options/struct.Options.html#structfield.threads) (see the [NoGC guide](./nogc.md#setting-options-for-mmtk) for more information about setting MMTk options). + +> **Note:** Since the Coordinator thread always exists, even if we set the number of GC threads to 1, the actual number of threads spawned is 2. + +MMTk calls into the runtime to spawn GC threads since there are runtimes that expect all threads to be registered with them. + +### Runtime-side changes + +Spawning GC threads is highly dependent on your runtime's threading subsystem. +Given that MMTk expects the runtime to spawn the threads, we have to implement a new upcall: + +In `mmtk.h`: +```C +[...] 
+ +// Kind of GC thread (values must match the Rust `GcThreadKind` enum) +typedef enum { + MmtkGcController = 0, + MmtkGcWorker = 1 +} GcThreadKind; + +// API from the runtime "Rt" to MMTk +typedef struct { + size_t (*size_of) (void* object); + void (*spawn_gc_thread) (void* tls, GcThreadKind kind, void* ctx); +} RtUpcalls; + +/** + * Start the GC Controller thread + * + * @param tls the thread that will be used as the GC Controller + * @param context the context for the GC Controller + */ +void mmtk_start_gc_controller_thread(void* tls, void* context); + +/** + * Start a GC Worker thread + * + * @param tls the thread that will be used as the GC Worker + * @param context the context for the GC Worker + */ +void mmtk_start_gc_worker_thread(void* tls, void* context); + +[...] +``` + +In `mmtk_upcalls.c[pp]`: +```C +[...] + +static void spawn_gc_thread(void* tls, GcThreadKind kind, void* ctx) { + // Runtime-specific implementation of spawning GC threads +} + +RtUpcalls rt_upcalls = { + size_of, + spawn_gc_thread, +}; + +[...] +``` + +See the [OpenJDK binding](https://github.com/mmtk/mmtk-openjdk/blob/96e868b107b5b13c40c7f4946dff9ac96145c64e/openjdk/mmtkUpcalls.cpp#L95-L121) for an example. + +### MMTk-side changes + +Define the `GcThreadKind` `enum` in Rust: +```Rust +[...] + +/// Kind of GC thread +#[repr(C)] +pub enum GcThreadKind { + /// GC Controller Context thread + Controller = 0, + /// Simple GC Worker thread + Worker = 1, +} + +[...] +``` + +The MMTk-side changes should then simply call the `spawn_gc_thread` upcall defined above. +See the [OpenJDK binding](https://github.com/mmtk/mmtk-openjdk/blob/96e868b107b5b13c40c7f4946dff9ac96145c64e/mmtk/src/collection.rs#L39-L52) for an example. (Note that the OpenJDK binding uses `int`s directly to signify what kind of GC thread is being spawned, whereas we define the above `enum`.) + +If your runtime is single-threaded, or if it is too difficult to support creating MMTk GC threads in the runtime, then you could spawn GC threads in the MMTk-side of the binding instead. +For example, the Ruby binding does this. 
+ +## Suspending (and Resuming) Mutator Threads + +The first thing MMTk core does when it finds itself out of memory is block the mutator thread that failed the allocation. +This check only happens in the slow-path (when the runtime goes and gets a new thread-local buffer from MMTk). +You ha + +TODO(kunals): VM Companion Thread + +### Runtime-side changes +### MMTk-side changes + +## Miscellaneous API + +TODO(kunals): `mutators`, `get_current_size`, etc. + +### Runtime-side changes +### MMTk-side changes + +## Scanning Roots + +### Thread Roots + +### Runtime-specific Roots + +### Runtime-side changes +### MMTk-side changes + +## Scanning Objects + +### Runtime-side changes +### MMTk-side changes + +## Miscellaneous API + +TODO(kunals): `handle_user_collection_request`, `is_mmtk_object`, `pin_object`, etc. + +### Runtime-side changes +### MMTk-side changes