# From MIR to Binaries

- All of the preceding chapters of this guide have one thing in common: we never
- generated any executable machine code at all! With this chapter, all of that
- changes.
+ All of the preceding chapters of this guide have one thing in common:
+ we never generated any executable machine code at all!
+ With this chapter, all of that changes.

- So far, we've shown how the compiler can take raw source code in text format
- and transform it into [MIR]. We have also shown how the compiler does various
- analyses on the code to detect things like type or lifetime errors. Now, we
- will finally take the MIR and produce some executable machine code.
+ So far,
+ we've shown how the compiler can take raw source code in text format
+ and transform it into [MIR].
+ We have also shown how the compiler does various
+ analyses on the code to detect things like type or lifetime errors.
+ Now, we will finally take the MIR and produce some executable machine code.

[MIR]: ./mir/index.md

- > NOTE: This part of a compiler is often called the _backend_. The term is a bit
- > overloaded because in the compiler source, it usually refers to the "codegen
- > backend" (i.e. LLVM, Cranelift, or GCC). Usually, when you see the word "backend"
- > in this part, we are referring to the "codegen backend".
+ > NOTE: This part of a compiler is often called the _backend_.
+ > The term is a bit overloaded because in the compiler source,
+ > it usually refers to the "codegen backend" (i.e. LLVM, Cranelift, or GCC).
+ > Usually, when you see the word "backend" in this part,
+ > we are referring to the "codegen backend".

So what do we need to do?

- 0. First, we need to collect the set of things to generate code for. In
- particular, we need to find out which concrete types to substitute for
- generic ones, since we need to generate code for the concrete types.
- Generating code for the concrete types (i.e. emitting a copy of the code for
- each concrete type) is called _monomorphization_, so the process of
- collecting all the concrete types is called _monomorphization collection_.
+ 0. First, we need to collect the set of things to generate code for.
+ In particular,
+ we need to find out which concrete types to substitute for generic ones,
+ since we need to generate code for the concrete types.
+ Generating code for the concrete types
+ (i.e. emitting a copy of the code for each concrete type) is called _monomorphization_,
+ so the process of collecting all the concrete types is called _monomorphization collection_.
1. Next, we need to actually lower the MIR to a codegen IR
(usually LLVM IR) for each concrete type we collected.
- 2. Finally, we need to invoke the codegen backend, which runs a bunch of
- optimization passes, generates executable code, and links together an
- executable binary.
+ 2. Finally, we need to invoke the codegen backend,
+ which runs a bunch of optimization passes,
+ generates executable code,
+ and links together an executable binary.
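
To make step 0 more concrete, here is a minimal sketch of what "collecting the concrete types" can look like as a worklist traversal. Everything in it is a toy assumption made for illustration: the `Instance` struct, the `callees` call graph, and the `collect` function are hypothetical names, and each function is assumed to have a single type parameter that it forwards unchanged to its callees. It is not the compiler's actual collector.

```rust
use std::collections::{BTreeSet, VecDeque};

/// A toy "mono item": a generic function name plus the concrete type
/// substituted for its single type parameter.
/// (Hypothetical model for illustration, not a rustc type.)
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Instance {
    func: &'static str,
    ty: &'static str,
}

/// Toy call graph (an assumption): the generic functions that `func` calls,
/// forwarding its own type parameter unchanged.
fn callees(func: &str) -> &'static [&'static str] {
    match func {
        "main_helper" => &["push", "len"],
        "push" => &["grow"],
        _ => &[],
    }
}

/// Worklist traversal: start from the root instances and collect every
/// (function, concrete type) pair reachable from them.
fn collect(roots: &[Instance]) -> BTreeSet<Instance> {
    let mut seen = BTreeSet::new();
    let mut queue: VecDeque<Instance> = roots.iter().cloned().collect();
    while let Some(inst) = queue.pop_front() {
        if !seen.insert(inst.clone()) {
            continue; // already collected, do not revisit
        }
        for &callee in callees(inst.func) {
            queue.push_back(Instance { func: callee, ty: inst.ty });
        }
    }
    seen
}

fn main() {
    let roots = [
        Instance { func: "main_helper", ty: "u32" },
        Instance { func: "push", ty: "String" },
    ];
    // Each collected instance would get its own copy of generated code.
    for inst in collect(&roots) {
        println!("{}::<{}>", inst.func, inst.ty);
    }
}
```

In this toy model, `push` is collected twice, once for `u32` and once for `String`, which is the "one copy of the code per concrete type" idea described in step 0.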

[codegen1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html

The code for codegen is actually a bit complex due to a few factors:

- - Support for multiple codegen backends (LLVM, Cranelift, and GCC). We try to share as much
- backend code between them as possible, so a lot of it is generic over the
- codegen implementation. This means that there are often a lot of layers of
- abstraction.
+ - Support for multiple codegen backends (LLVM, Cranelift, and GCC).
+ We try to share as much backend code between them as possible,
+ so a lot of it is generic over the codegen implementation.
+ This means that there are often a lot of layers of abstraction.
- Codegen happens asynchronously in another thread for performance.
- The actual codegen is done by a third-party library (either of the 3 backends).
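
The first factor, shared code that is generic over the codegen implementation, can be pictured with a small trait-based sketch. The names `ToyBackend`, `LlvmLike`, `CraneliftLike`, and `codegen_all` are hypothetical and chosen only for this example; the real traits in the compiler are considerably richer, and the strings produced here merely stand in for generated code.

```rust
/// Hypothetical backend interface (an assumption for illustration only).
trait ToyBackend {
    /// Short label identifying the backend.
    fn name(&self) -> &'static str;
    /// Pretend to generate code for one function; returns a description
    /// instead of real machine code.
    fn emit_function(&self, func: &str) -> String;
}

struct LlvmLike;
struct CraneliftLike;

impl ToyBackend for LlvmLike {
    fn name(&self) -> &'static str {
        "llvm-like"
    }
    fn emit_function(&self, func: &str) -> String {
        format!("{}: {}", self.name(), func)
    }
}

impl ToyBackend for CraneliftLike {
    fn name(&self) -> &'static str {
        "cranelift-like"
    }
    fn emit_function(&self, func: &str) -> String {
        format!("{}: {}", self.name(), func)
    }
}

/// Shared driver logic, written once and generic over the backend.
/// This is the kind of layer that backend-independent code lives in.
fn codegen_all<B: ToyBackend>(backend: &B, functions: &[&str]) -> Vec<String> {
    functions.iter().map(|f| backend.emit_function(f)).collect()
}

fn main() {
    let functions = ["foo", "bar"];
    for line in codegen_all(&LlvmLike, &functions) {
        println!("{line}");
    }
    for line in codegen_all(&CraneliftLike, &functions) {
        println!("{line}");
    }
}
```

The trade-off is the one the list above describes: the driver is written once, but every backend-specific operation has to go through an extra layer of abstraction.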
@@ -48,5 +53,5 @@ while the [`rustc_codegen_llvm`][llvm] crate contains code specific to LLVM code
[llvm]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/index.html

At a very high level, the entry point is
- [`rustc_codegen_ssa::base::codegen_crate`][codegen1]. This function starts the
- process discussed in the rest of this chapter.
+ [`rustc_codegen_ssa::base::codegen_crate`][codegen1].
+ This function starts the process discussed in the rest of this chapter.