The RUPTA Design
RUPTA is plugged into the Rust compiler, rustc, and is implemented as a callback that is invoked during the Rust compilation process.
Running rustc with a customized callback is straightforward:
let compiler = rustc_driver::RunCompiler::new(&at_args, &mut callbacks);
compiler.run();
Plugging into the existing Rust compiler offers significant advantages: RUPTA bypasses the need for scanning, parsing, macro expansion, type analysis, and syntactic desugaring. It has access to the compiler's symbol table and can leverage the compiler to de-virtualize functions. Additionally, RUPTA integrates smoothly with Cargo, the official Rust package manager, which alleviates concerns related to handling project dependencies.
The Rust compiler processes Rust programs through various Intermediate Representations (IRs): High-level IR (HIR), Mid-level IR (MIR), and LLVM IR. RUPTA operates on Rust's MIR, which is characterized by its structured format and rich type information.
MIR simplifies Rust by distilling it to a basic core, stripping away much of the complexity that is irrelevant to static analysis. It is tailored specifically to the Rust language and is higher-level than LLVM IR, which makes it easier to reason about Rust-specific constructs and concepts. Because of these advantages, MIR is the preferred choice for Rust analysis.
Because RUPTA works on MIR, we anticipate that its analysis results will be more readily accessible to other analysis tools.
RUPTA is designed to perform whole-program analysis on a Rust project. When given a target to analyze, it starts from an entry point (the main function by default) and analyzes all functions reachable from that point, including those in dependencies or libraries used by the project.
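Since RUPTA needs to resolve calls in reachable functions, specifying a generic function as an entry point can lead to incomplete analysis: its type parameters have no concrete instantiation, so calls that depend on them cannot be resolved.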
Like many other pointer analysis frameworks, RUPTA constructs a call graph on-the-fly during the analysis.
Rust supports dynamic dispatch through dynamic trait objects, which can hold values of any type that implements a specific trait. Virtual functions called on dynamic trait objects are resolved at runtime, based on the actual type of the objects. Rust also supports function pointers for dynamic runtime function invocation.
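The two forms of indirect call described here can be illustrated with a small, self-contained program. This is only a sketch: the names `Shape`, `Circle`, `Square`, `total_area`, `double`, `negate`, and `apply` are hypothetical and do not come from RUPTA.

```rust
// Illustrative only: every name in this block is hypothetical.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Dynamic dispatch: the callee of `s.area()` is selected at runtime
// from the concrete type stored behind each `dyn Shape`.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn double(x: i32) -> i32 {
    x * 2
}

fn negate(x: i32) -> i32 {
    -x
}

// Function pointer: the callee of `f(x)` depends on the value of `f`.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Square { side: 2.0 }),
    ];
    println!("total area: {}", total_area(&shapes));
    println!("{} {}", apply(double, 21), apply(negate, 21));
}
```

In both `total_area` and `apply`, the call target cannot be read off the call site alone; a pointer analysis has to determine what the trait object or function pointer may refer to.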
During its analysis, RUPTA may discover new potential targets for indirect function calls based on the points-to information computed for pointer variables. Consequently, RUPTA resolves pointers to dynamic trait objects and function pointers.
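The interplay between points-to information and call resolution can be sketched with a toy model. The code below is not RUPTA's implementation: the `Func` type, its fields, and `build_call_graph` are hypothetical, the model is flow-insensitive, and it only tracks function pointers.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical toy model of a function body.
struct Func {
    // (pointer variable, function whose address is stored into it)
    assigns: Vec<(&'static str, &'static str)>,
    // pointer variables this function calls through
    indirect_calls: Vec<&'static str>,
}

// Builds call edges while points-to sets are still growing: newly
// reachable callees contribute new facts, which may in turn reveal
// further call targets. Iterates until nothing changes.
fn build_call_graph(
    program: &BTreeMap<&'static str, Func>,
    entry: &'static str,
) -> BTreeSet<(&'static str, &'static str)> {
    let mut pts: BTreeMap<&'static str, BTreeSet<&'static str>> = BTreeMap::new();
    let mut reachable: BTreeSet<&'static str> = BTreeSet::new();
    let mut edges: BTreeSet<(&'static str, &'static str)> = BTreeSet::new();
    reachable.insert(entry);

    let mut changed = true;
    while changed {
        changed = false;
        for name in reachable.clone() {
            if let Some(func) = program.get(name) {
                // Only reachable functions contribute points-to facts.
                for &(ptr, target) in &func.assigns {
                    changed |= pts.entry(ptr).or_default().insert(target);
                }
                // Resolve indirect calls against the current points-to sets.
                for &ptr in &func.indirect_calls {
                    let targets = pts.get(ptr).cloned().unwrap_or_default();
                    for callee in targets {
                        changed |= edges.insert((name, callee));
                        changed |= reachable.insert(callee);
                    }
                }
            }
        }
    }
    edges
}

// A small hypothetical program: `main` stores `foo` in `fp` and calls
// through it; `foo` stores `bar` in `fp`; `unused` is never reached.
fn example_program() -> BTreeMap<&'static str, Func> {
    let mut p = BTreeMap::new();
    p.insert("main", Func { assigns: vec![("fp", "foo")], indirect_calls: vec!["fp"] });
    p.insert("foo", Func { assigns: vec![("fp", "bar")], indirect_calls: vec![] });
    p.insert("bar", Func { assigns: vec![], indirect_calls: vec![] });
    p.insert("unused", Func { assigns: vec![("fp", "baz")], indirect_calls: vec![] });
    p
}

fn main() {
    for (caller, callee) in build_call_graph(&example_program(), "main") {
        println!("{caller} -> {callee}");
    }
}
```

In this model, `bar` becomes a call target of `main` only after `foo` is discovered to be reachable, while `baz` never appears because `unused` is not reachable from the entry point.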
Generic data types are frequently used in Rust MIR, which complicates pointer analysis. However, Rust’s robust type system facilitates compile-time reasoning about generics. This capability allows a whole-program analysis on Rust MIR to fully resolve the generic data types in a program.
RUPTA determines the actual types of generics during analysis, treating generic functions with different concrete types as separate function instances and analyzing them individually.
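One way to picture "separate function instances" is to key each instance by the function together with its concrete type argument. The sketch below assumes a hypothetical `instance_key` helper and uses the standard library's `TypeId` as a stand-in for a resolved type; it does not reflect how RUPTA stores instances internally.

```rust
use std::any::TypeId;

// A generic function: each concrete `T` yields a distinct instance.
fn identity<T>(x: T) -> T {
    x
}

// Hypothetical key for one instance of a generic function:
// the function name paired with its concrete type argument.
fn instance_key<T: 'static>(function: &str) -> (String, TypeId) {
    (function.to_string(), TypeId::of::<T>())
}

fn main() {
    let a = identity(1_i32);
    let b = identity(String::from("x"));
    println!("{a} {b}");

    // Same generic function, different type arguments: different instances.
    let int_instance = instance_key::<i32>("identity");
    let string_instance = instance_key::<String>("identity");
    println!("distinct instances: {}", int_instance != string_instance);
}
```

Two calls with the same type argument map to the same key, so an analysis organized this way would process each instantiation once.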
RUPTA represents variables and abstract memory objects using Path, which closely resembles a Place in MIR. An object with a struct type is modeled with all its field information, including the projection and type for each field.
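As a rough mental model, a path can be thought of as a base variable plus a sequence of projections, much like a MIR place such as `(*_1).0`. The types below are a simplified, hypothetical sketch; the `Projection` enum, the fields of `Path`, and `field_paths` are assumptions for illustration, not RUPTA's actual definitions.

```rust
use std::fmt;

// Hypothetical projection steps, loosely modeled on MIR place projections.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Projection {
    Field(usize),
    Deref,
}

// Hypothetical path: a base local plus the projections applied to it.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Path {
    base: String,
    projections: Vec<Projection>,
}

impl Path {
    fn local(name: &str) -> Self {
        Path { base: name.to_string(), projections: Vec::new() }
    }

    fn field(&self, index: usize) -> Self {
        let mut p = self.clone();
        p.projections.push(Projection::Field(index));
        p
    }

    fn deref(&self) -> Self {
        let mut p = self.clone();
        p.projections.push(Projection::Deref);
        p
    }

    // Expands a struct-typed path into one (path, type) pair per field.
    fn field_paths(&self, field_types: &[&str]) -> Vec<(Path, String)> {
        field_types
            .iter()
            .enumerate()
            .map(|(i, ty)| (self.field(i), ty.to_string()))
            .collect()
    }
}

impl fmt::Display for Path {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut s = self.base.clone();
        for p in &self.projections {
            s = match p {
                Projection::Field(i) => format!("{s}.{i}"),
                Projection::Deref => format!("(*{s})"),
            };
        }
        write!(f, "{s}")
    }
}

fn main() {
    let place = Path::local("_1").deref().field(0);
    println!("{place}");
    for (path, ty) in Path::local("_2").field_paths(&["i32", "&u8"]) {
        println!("{path}: {ty}");
    }
}
```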
RUPTA transforms Rust MIR statements into a graph representation, the PAG. Each node (PAGNode) in the graph represents either a pointer variable or an abstract memory location, while each edge (PAGEdge) represents a constraint between two nodes. We classify PAG edges into different categories.
The PAG serves as the central data structure in RUPTA, capturing the flow of pointer values and their potential targets throughout the program. RUPTA propagates points-to information along the PAG edges.
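Propagation along graph edges can be sketched with a small worklist solver. The example assumes only two edge kinds, address-of and copy, in the style of a textbook inclusion-based analysis; the edge categories RUPTA actually uses are not listed here, and `NodeId`, `PagEdge`, and `solve` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

type NodeId = usize;

// Assumed edge kinds for illustration only.
enum PagEdge {
    // dst = &obj : `obj` is added to the points-to set of `dst`.
    Addr { obj: NodeId, dst: NodeId },
    // dst = src : everything `src` points to flows into `dst`.
    Copy { src: NodeId, dst: NodeId },
}

// Propagates points-to sets along edges until a fixed point is reached.
fn solve(edges: &[PagEdge]) -> BTreeMap<NodeId, BTreeSet<NodeId>> {
    let mut pts: BTreeMap<NodeId, BTreeSet<NodeId>> = BTreeMap::new();
    let mut worklist: VecDeque<NodeId> = VecDeque::new();

    // Seed the points-to sets from address-of edges.
    for e in edges {
        if let PagEdge::Addr { obj, dst } = e {
            if pts.entry(*dst).or_default().insert(*obj) {
                worklist.push_back(*dst);
            }
        }
    }

    // Push each changed node's set along its outgoing copy edges.
    while let Some(n) = worklist.pop_front() {
        let src_set = pts.get(&n).cloned().unwrap_or_default();
        for e in edges {
            if let PagEdge::Copy { src, dst } = e {
                if *src == n {
                    let dst_set = pts.entry(*dst).or_default();
                    let before = dst_set.len();
                    dst_set.extend(src_set.iter().copied());
                    if dst_set.len() > before {
                        worklist.push_back(*dst);
                    }
                }
            }
        }
    }
    pts
}

fn main() {
    // Nodes 0, 1, 2 stand for pointers p, q, r; 10 and 11 for objects a, b.
    let edges = vec![
        PagEdge::Addr { obj: 10, dst: 0 }, // p = &a
        PagEdge::Addr { obj: 11, dst: 1 }, // q = &b
        PagEdge::Copy { src: 0, dst: 2 },  // r = p
        PagEdge::Copy { src: 1, dst: 2 },  // r = q
    ];
    for (node, targets) in solve(&edges) {
        println!("pts({node}) = {targets:?}");
    }
}
```

A real solver would index outgoing edges per node rather than scanning the whole edge list; the linear scan keeps the sketch short.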