When compiling autoprecompiles from Crush assembly, we could have access to register lifetime information. This is a unique advantage of Crush over RISC-V, because we control the compiler from WebAssembly, which in turn has a concept of scopes.
Autoprecompiles already have the property that each used register is accessed exactly once. We could skip the register access entirely if both of the following conditions hold:
- (1) The first access to the register in the basic block is a write access, i.e., the previous value doesn't matter.
- (2) The register is not used again after the end of the basic block.
Under these conditions, the register is simply a temporary variable, storing some intermediate result that is consumed within the same basic block (i.e., within the autoprecompile circuit).
Note that we tried this before for RISC-V, but condition (2) is very hard to guarantee there because of RISC-V's dynamic jumps.
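The two conditions can be checked per basic block roughly as follows. This is a minimal Python sketch, not the actual implementation: `Access`, `removable_registers`, and the `dropped` set (registers known to be dead after the block, e.g. derived from lifetime information) are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One register access inside a basic block (hypothetical model)."""
    register: int
    is_write: bool


def removable_registers(accesses, dropped):
    """Return the registers that act as pure temporaries in one basic block.

    accesses: register accesses in program order (reads of an instruction
              are assumed to be listed before its writes).
    dropped:  registers whose value is known not to matter after the block.
    """
    first_is_write = {}
    for access in accesses:
        # Only the first access to each register decides the first condition.
        first_is_write.setdefault(access.register, access.is_write)
    return {
        reg
        for reg, starts_with_write in first_is_write.items()
        if starts_with_write and reg in dropped
    }


block = [
    Access(1, is_write=False),  # register 1 is read first: not removable
    Access(3, is_write=True),   # register 3 is written first ...
    Access(3, is_write=False),  # ... and consumed within the block
    Access(2, is_write=True),   # register 2 is written but still live later
]
result = removable_registers(block, dropped={3})
```

In this example only register 3 qualifies: register 1 depends on its previous value, and register 2 is not known to be dead after the block.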
First task: Evaluate expected gains
To evaluate whether this is a worthwhile direction, we should check how much we would expect to save with this optimization:
- In Crush, also output a `drop` annotation to indicate that the value of that register does not matter after this point (which allows us to detect condition (2)).
- For a given benchmark, analyze each basic block to find how many registers meet the conditions above and can be removed.
- For each such register, we save 6 columns (as an example): 2 related to the previous timestamp, and 4 to commit to the (unused) previous data of that register.
- Note that a 64-bit register access (e.g., in `ADD_64`) effectively accesses two 32-bit registers.
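The resulting estimate is simple arithmetic, sketched below in Python. It assumes the example figure of 6 columns per 32-bit register and assumes that a 64-bit register saves the same amount for each of its two 32-bit halves; `estimated_saved_columns` is a hypothetical helper, and the real per-register cost may differ.

```python
# Example figure from the text: 2 timestamp columns + 4 previous-data columns.
COLUMNS_PER_32BIT_REGISTER = 6


def estimated_saved_columns(removable_widths):
    """Estimate the columns saved in one basic block.

    removable_widths: bit widths (32 or 64) of the registers found removable.
    """
    total = 0
    for width in removable_widths:
        if width not in (32, 64):
            raise ValueError(f"unsupported register width: {width}")
        # Assumption: a 64-bit register counts as two 32-bit registers.
        total += (width // 32) * COLUMNS_PER_32BIT_REGISTER
    return total


# One removable 32-bit register plus one removable 64-bit register.
saved = estimated_saved_columns([32, 64])
```

Summing this value over all basic blocks of a benchmark gives a rough upper bound on the expected gain under these assumptions.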