[RFC] Separate the runtime context within runtime instances. #4035

Open
@lum1n0us

Background

The primary purpose of this feature is to stop using C global variables to store runtime arguments and context.

This requirement has emerged from several embedded platforms that have no POSIX-like process isolation. On these platforms, conflicts arise when multiple tasks or threads call wasm_runtime_full_init() with different RuntimeInitArgs, for example with different MemAllocOption values to select different allocation strategies. This is because the incoming RuntimeInitArgs are stored in global variables, which are shared among all tasks and threads.

As a result, all tasks and threads are currently forced to share the same runtime configuration. This covers a range of settings, including but not limited to: memory allocation strategy, running mode, gc heap size, JIT policy, WASI context, registered native functions, logging levels, and error information. This is not the behavior that many products expect.
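
The conflict can be reproduced in a few lines. The sketch below is only an illustration: DemoInitArgs, DemoAllocKind and the demo_ functions are hypothetical stand-ins, not the real RuntimeInitArgs or wasm_runtime_full_init().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for RuntimeInitArgs / MemAllocOption;
   these are NOT the real WAMR definitions. */
typedef enum { DEMO_ALLOC_POOL, DEMO_ALLOC_SYSTEM } DemoAllocKind;

typedef struct {
    DemoAllocKind alloc_kind;
    size_t pool_size;
} DemoInitArgs;

/* One copy for the whole address space: every task or thread
   reads and writes the same storage. */
static DemoInitArgs g_init_args;

void demo_full_init(const DemoInitArgs *args)
{
    assert(args != NULL);
    g_init_args = *args; /* silently overwrites any earlier configuration */
}

DemoAllocKind demo_current_alloc_kind(void)
{
    return g_init_args.alloc_kind;
}

/* Task A asks for a pool allocator, task B for the system allocator.
   Returns the allocator kind that task A observes afterwards. */
DemoAllocKind demo_conflict(void)
{
    DemoInitArgs a = { DEMO_ALLOC_POOL, 64 * 1024 };
    DemoInitArgs b = { DEMO_ALLOC_SYSTEM, 0 };

    demo_full_init(&a); /* task A */
    demo_full_init(&b); /* task B, no process isolation in between */
    return demo_current_alloc_kind();
}
```

After demo_conflict() runs, task A sees task B's allocator choice, which is the kind of cross-task interference described above.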

Design choice

Let's refer to the outcome of wasm_runtime_full_init() as a runtime instance.

Smallest unit

The first question we need to address is whether to allow multiple runtime instances within a single task or thread. In other words, what is the smallest execution unit for a runtime instance? The concept of "nanoprocesses" comes to mind. If tasks or threads are the smallest unit, then Thread Local Storage (TLS) might be an option for storing the runtime context. However, if we agree that the smallest unit is smaller than a task or thread, or that it should not be tied to the current process/task/thread concepts at all (perhaps leaving the choice to the embedder), then creating a new runtime instance structure becomes necessary.

The question therefore becomes whether to use TLS to maintain the runtime context. One option binds each thread to a runtime context through TLS, so every thread has exactly one runtime context and cannot have more. The other option allows multiple runtime contexts within a single thread. The only difference is whether users can create multiple runtime contexts inside one thread. That could be helpful when users want to run multiple wasm instances in one worker thread and want each wasm instance to have its own runtime context, which is a classic execution environment for FaaS in the cloud.
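
A minimal sketch of the second option, assuming a hypothetical DemoRuntime type and demo_ functions (none of these exist in WAMR): the context is an ordinary object passed to every call, so one thread can hold as many as it likes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-runtime state; the fields are illustrative only. */
typedef struct DemoRuntime {
    int log_verbose_level;
    unsigned instance_count;
} DemoRuntime;

DemoRuntime *demo_runtime_create(int log_level)
{
    DemoRuntime *rt = calloc(1, sizeof(*rt));
    if (rt)
        rt->log_verbose_level = log_level;
    return rt;
}

void demo_runtime_destroy(DemoRuntime *rt)
{
    free(rt); /* free(NULL) is a no-op */
}

/* Every API takes the context explicitly instead of reading globals. */
unsigned demo_instantiate(DemoRuntime *rt)
{
    assert(rt != NULL);
    return ++rt->instance_count;
}

/* Two independent contexts driven from the same thread.
   Returns 1 if their state stayed separate, 0 if not, -1 on allocation failure. */
int demo_two_contexts_one_thread(void)
{
    DemoRuntime *tenant_a = demo_runtime_create(1);
    DemoRuntime *tenant_b = demo_runtime_create(5);
    int ok = -1;

    if (tenant_a && tenant_b) {
        demo_instantiate(tenant_a);
        demo_instantiate(tenant_a);
        demo_instantiate(tenant_b);
        ok = tenant_a->instance_count == 2
             && tenant_b->instance_count == 1
             && tenant_a->log_verbose_level != tenant_b->log_verbose_level;
    }
    demo_runtime_destroy(tenant_a);
    demo_runtime_destroy(tenant_b);
    return ok;
}
```

The cost of this flexibility is visible in the signatures: every caller has to obtain and pass a context.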

After examining both ideas with a proof of concept, the first option, a thread-level singleton, will be adopted, simply because it requires less modification work and, in some cases, is the only way to stay compatible with existing code. The significant advantage of using TLS to keep the runtime context is that it allows the original code to access the runtime context without additional input arguments. For example, wasm_runtime_malloc() should become wasm_runtime_malloc(WASMRuntime*) in the new framework, where WASMRuntime holds what are currently global variables of wasm_memory.c, such as the Memory_Mode and mem_allocator_t values. Without TLS, all callers of wasm_runtime_malloc() would have to change their code to pass a WASMRuntime* (not to mention obtaining one in the first place). With the help of TLS, the original function signature can be preserved, and all callers of the old APIs remain untouched.
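
The following sketch shows how TLS can preserve an old signature. It assumes a C11 compiler with _Thread_local; the WASMRuntime field and every demo_ function are hypothetical and only stand in for the real wasm_memory.c state and APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical runtime context; the single field stands in for the
   state that wasm_memory.c currently keeps in global variables. */
typedef struct WASMRuntime {
    size_t bytes_requested;
} WASMRuntime;

/* One slot per thread: the thread-level singleton. */
static _Thread_local WASMRuntime *tls_runtime;

void demo_bind_runtime(WASMRuntime *rt) { tls_runtime = rt; }
WASMRuntime *demo_current_runtime(void) { return tls_runtime; }

/* New-style API: the context is an explicit argument. */
void *demo_runtime_malloc2(WASMRuntime *rt, unsigned int size)
{
    if (!rt)
        return NULL; /* no runtime bound to this thread */
    rt->bytes_requested += size;
    return malloc(size);
}

/* Old-style API: signature unchanged, context fetched from TLS. */
void *demo_runtime_malloc(unsigned int size)
{
    return demo_runtime_malloc2(demo_current_runtime(), size);
}

/* Binds a runtime, allocates through the old signature, unbinds again.
   Returns the byte count recorded in that runtime. */
size_t demo_tls_roundtrip(void)
{
    WASMRuntime rt = { 0 };
    void *p;

    demo_bind_runtime(&rt);
    assert(demo_current_runtime() == &rt);
    p = demo_runtime_malloc(32);
    free(p);
    demo_bind_runtime(NULL);
    return rt.bytes_requested;
}
```

Existing callers keep calling the one-argument function; only the code that initializes a thread needs to know about the context.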

Global variables

Examine the global variables within the runtime library carefully, for example by listing the data symbols of the static archive:

    $ readelf -s --wide --demangle libvmlib.a | grep OBJECT

LLVM ORC JIT

The LLVM environment should be initialized just once per process. It is permissible to create multiple LLJIT instances for different target machines. Each LLJIT instance can include multiple passes with varying optimization levels and size requirements for the generated binary. Every LLJIT instance has its own thread-safe module to house jitted functions and its own JITDylib to manage symbols, which helps prevent symbols from being accessed unexpectedly. Thus, it appears to be acceptable for each runtime instance to create its own LLJIT instances, generate jitted code, and execute it.

The remaining issue to address is the potential consequences of calling LLVMInitializeXXX() multiple times within a single process.
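
One way to sidestep that question, sketched under the assumption that pthreads are available on the target, is to route every runtime instance through a once-guard so the process-wide initialization runs a single time. The demo_ names are hypothetical, and the callback body is a placeholder for the real LLVMInitializeXXX() calls.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t g_llvm_once = PTHREAD_ONCE_INIT;
static int g_llvm_init_calls; /* only written inside the once-callback */

/* Placeholder for the process-wide LLVM setup. A real build would call
   the LLVMInitializeXXX() functions here. */
static void demo_llvm_global_init(void)
{
    g_llvm_init_calls++;
}

/* Called by every runtime instance during its own initialization. */
void demo_runtime_jit_init(void)
{
    int rc = pthread_once(&g_llvm_once, demo_llvm_global_init);
    assert(rc == 0);
    (void)rc;
}

int demo_llvm_init_calls(void)
{
    return g_llvm_init_calls;
}
```

This does not answer whether repeated LLVMInitializeXXX() calls are harmful; it only makes the answer irrelevant for code that goes through the guard.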

ASM JIT

To be continued.

Consts

Some global variables are effectively read-only (.rodata) and can be shared across multiple runtime instances. Below is a brief list of such variables:

  • aot_stack_xxx
  • handle_table in interpreters
  • invokeNative_XXX
  • exception_msgs for conversion from exception ids to strings
  • MEMORY_PAGE_SIZE and wasm_limits_max_default in wasm_c_api.h
  • quick_aot_entries
  • native_symbols_xxx and native_globals_xxx
  • valid_xxx, bit_cnt_llvm_intrinsic, block_name_xxx, target_sym_map, g_intrinsic_xxx, section_ids for aot and llvm jit
  • more

Variables

Other global variables should be converted into fields of a runtime instance, so that each runtime instance holds and uses its own copy.

  • aot_error
  • g_shared_memory_lock, wait_map for shared-memory feature
  • externref_xxx for externref recording
  • reader, destroyer, register_module_xxx, loading_module_xxx for multi-module feature.
    It appears correct for all runtime instances to share the content of loaded module files (unmodified) while maintaining their own linking resources (instances).
  • runtime_ref_count, runtime_lock. These are utilized to create a singleton runtime instance, which may no longer be necessary. Plus singleton_engine and engine_lock in wasm_c_api.c
  • runtime_running_mode
  • llvm_jit_options
  • jit_options for fast-jit
  • g_context_dtors, g_wasi_context_keys
  • g_native_symbols_list
  • global_pool_size, free_func, realloc_func, malloc_func, enlarge_memory_error_user_data, enlarge_memory_error_cb, pool_allocator, memory_mode from wasm_memory
  • total_time_ms, last_time_ms, log_verbose_level from bh_log
  • prev_sig_act_XXX from pthread_manager
  • g_blocking_op_xxx from posix_blocking_ops
  • RuntimeInitArgs
  • more
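
As a rough illustration of that conversion, the sketch below gathers a handful of the variables listed above into one structure. DemoRuntimeContext, its field types, the buffer size, and the demo_ functions are all assumptions made for the example; the real types in WAMR differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for former globals. Field names follow the
   list above; the types are simplified placeholders. */
typedef struct DemoRuntimeContext {
    /* from wasm_memory */
    int memory_mode;
    size_t global_pool_size;
    void *(*malloc_func)(size_t size);
    void *(*realloc_func)(void *ptr, size_t size);
    void (*free_func)(void *ptr);
    /* from bh_log */
    int log_verbose_level;
    /* execution settings */
    int runtime_running_mode;
    /* error reporting */
    char aot_error[128];
} DemoRuntimeContext;

/* Zero-initializes a context; the zero defaults are placeholders. */
void demo_context_init(DemoRuntimeContext *ctx)
{
    assert(ctx != NULL);
    memset(ctx, 0, sizeof(*ctx));
}

/* Records an error in one context only, truncating if necessary. */
void demo_set_aot_error(DemoRuntimeContext *ctx, const char *msg)
{
    assert(ctx != NULL && msg != NULL);
    snprintf(ctx->aot_error, sizeof(ctx->aot_error), "%s", msg);
}
```

With the state held per context, an error recorded through demo_set_aot_error() on one runtime instance is not visible from another.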

Basic ideas of implementation

  • Every original API in wasm_export.h will have a new signature that includes an extra argument, WASMRuntime*. (After the feature is complete, the entire new set will be reviewed, and some may not need the extra argument.)
  • Add a new header, wasm_export2.h, to contain all new APIs.
  • In the original API implementation, retrieve the thread-local runtime context from TLS and call its new API with the returned WASMRuntime.
  • Besides the exported APIs, all internal APIs should include an extra argument as necessary.
