diff --git a/0102-compiling-lambda-sets.md b/0102-compiling-lambda-sets.md
new file mode 100644
index 0000000..55808fe
--- /dev/null
+++ b/0102-compiling-lambda-sets.md
@@ -0,0 +1,1354 @@
+# Compiling lambda sets
+
+Status: Review
+
+## Overview
+
+This document describes an approach to compiling lambda sets that will be
+simpler and more reliable than the approach used today.
+
+I'll start from the bottom and explain how to compile programs in general, then
+provide extensions for abilities and explain how to support multiple
+modules/caching in the future.
+
+I'll also rename "lambda sets" to "function sets", since I think that will be
+an easier name to work with.
+
+## Quick view
+
+Let's do a walkthrough of what we're trying to do. This overview will introduce
+the language and high-level details of the procedure. In subsequent sections
+I'll dive deeper on each of these steps.
+
+The pipeline we will take is the following. Arrows indicate the name of the
+procedure. The output of an arrow is the name of the intermediate
+representation produced.
+
+```mermaid
+flowchart TB
+ subgraph FRONT-END
+ T["Text (input source code)"]
+ T -->|parse| PA["ParseAST"]
+ PA -->|canonicalize| CA["CanAST"]
+ CA -->|solve| Sol["Solved"]
+ Sol -->|ab_well_formed| AbWF["AbWellFormed"]
+ end
+
+ subgraph MIDDLE-END
+ AbWF -->|type_specialize| TS["TypeSpecialized"]
+ TS -->|function_lift| FL["FunctionLifted"]
+ FL -->|function_solve| FS["FunctionSolved"]
+ FS -->|function_specialize| FSp["FunctionSpecialized"]
+ end
+
+ subgraph BACK-END
+ FSp -->|lower_ir| LIR["LowerIR"]
+
+ LIR -->|refcount/tco/morphic| LIR
+
+ LIR -->|gen_dev| MC["Machine code"]
+ LIR -->|gen_wasm| WASM["WASM"]
+ LIR -->|gen_llvm| LLVM["LLVM IR"]
+ LIR -->|"..other backends.."| OTHER["Code to evaluate a program"]
+ end
+```
+
+
+Detailed pipeline
+
+```python
+>> FRONT-END
+ The user may have input an invalid program at this point.
+
+# input source code
+ Text
+
+# create a machine-traversable tree of tokens
+# after this step, the program is syntactically valid
+Text -parse-> ParseAST
+
+# normalize program, desugar, resolve links between names like x = 1; x + x
+# after this step, all non-ability variable usages are defined
+ParseAST -canonicalize-> CanAST
+
+# run type inference and type checking on the program
+# after this step, the program is known to have no type errors
+CanAST -solve-> Solved
+
+# check that all concrete variables used as abilities satisfy the ability
+# after this step, there is no surface for user errors in the program
+Solved -ab_well_formed-> AbWellFormed
+
+>> MIDDLE-END
+ At this point, there can be no more errors that are due to an invalid program.
+ All errors here are compiler bugs.
+
+# replace all calls to generic functions and abilities with concrete instances
+# after this step, the program has no generic types
+AbWellFormed -type_specialize-> TypeSpecialized
+
+# lift all closures to the top-level and leave behind closure captures
+# after this step, the program has no more implicit closures
+TypeSpecialized -function_lift-> FunctionLifted
+
+# infer all sets of functions passed to higher-order function (HOF) calls
+# after this step, every call to a HOF is assigned a variable with the set of functions passed (function set) or a generalized variable
+FunctionLifted -function_solve-> FunctionSolved
+
+# replace all calls to HOFs with concrete copies
+# after this step, the program has no HOFs (the program is first-order)
+FunctionSolved -function_specialize-> FunctionSpecialized
+
+>> BACK-END
+ At this point, the program is equivalent to a subset of Roc with no HOFs and no generic types.
+
+# convert the program to a single-step IR and concrete memory layouts
+# after this step, the program is trivially reducible to ANF or SSA.
+FunctionSpecialized -lower_ir-> LowerIR
+
+# perform refcount-insertions and various optimizations
+LowerIR -refcount-> LowerIR
+LowerIR -tco-> LowerIR
+
+# infer the program's borrow/ownership semantics and specialize calls to
+# borrow-generic functions to instances of borrowing or owning calls
+LowerIR -morphic-> LowerIR
+
+# compile the program to one of a number of machine-codes or runtimes
+LowerIR -gen_dev-> machine code
+LowerIR -gen_wasm-> WASM
+LowerIR -gen_llvm-> LLVM IR -llvm-> machine code
+LowerIR -..other backends..-> ..
+```
+
+
+
+In this document, we will mostly just discuss the middle-end.
+
+Suppose we have the source program
+
+```roc
+apply = \f, x -> f x
+
+main =
+ num = 1
+ x1 = apply (\x -> x + num) 1
+ x2 = apply (\s -> "${s}!") "hi"
+ { x1, x2 }
+```
+
+where `main` is the entry-point of the program (exposed to the host, or an
+expect).
+
+### solve
+
+Suppose we have already parsed this program and resolved symbols
+(canonicalization). The next step is to infer the type of this program. I won't
+go into all the details here, but the idea is to assign "type variables" to
+variables and expressions that are solved via constraints (e.g. a string literal
+forces the type variable to be assigned to `Str`). For simplicity, let's suppose
+all number literals are `I64` for now.
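+
+As a toy illustration of the constraint-solving idea (this is a hedged sketch
+with invented names, not the real solver), unification over a miniature type
+language might look like:
+
+```rust
+// A miniature type language: type variables plus two primitive types.
+#[derive(Clone, Debug, PartialEq)]
+enum Ty { Var(usize), I64, Str }
+
+// Follow substitutions until we reach a primitive or an unbound variable.
+fn resolve(subs: &[Option<Ty>], ty: &Ty) -> Ty {
+    match ty {
+        Ty::Var(v) => match &subs[*v] {
+            Some(bound) => resolve(subs, bound),
+            None => ty.clone(),
+        },
+        _ => ty.clone(),
+    }
+}
+
+// Unify two types, binding unbound variables (occurs check omitted for brevity).
+fn unify(subs: &mut [Option<Ty>], a: &Ty, b: &Ty) -> Result<(), String> {
+    let (a, b) = (resolve(subs, a), resolve(subs, b));
+    match (a, b) {
+        (Ty::Var(v), other) | (other, Ty::Var(v)) => {
+            subs[v] = Some(other);
+            Ok(())
+        }
+        (a, b) if a == b => Ok(()),
+        (a, b) => Err(format!("type mismatch: {a:?} vs {b:?}")),
+    }
+}
+```
+
+The real solver also handles functions, records, generalization, and
+abilities; this only shows the core substitution mechanism.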
+
+Type inference must process definitions in topological order over the
+condensation of the dependency graph, which is a DAG. In this case the DAG is
+`apply -> main` at the top level, and `x1 -> {x1, x2} <- x2` inside `main`.
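+
+The dependency-ordered traversal itself is standard. A minimal sketch (the
+symbol names and the dependency map are invented for illustration):
+
+```rust
+use std::collections::HashMap;
+
+// Minimal DFS-based topological sort over a dependency map: every dependency
+// is emitted before its dependents (cycles are assumed absent, as in a DAG).
+fn topo_sort(deps: &HashMap<&'static str, Vec<&'static str>>) -> Vec<&'static str> {
+    fn visit(
+        sym: &'static str,
+        deps: &HashMap<&'static str, Vec<&'static str>>,
+        done: &mut Vec<&'static str>,
+    ) {
+        if done.contains(&sym) {
+            return; // already processed
+        }
+        for &dep in deps.get(sym).into_iter().flatten() {
+            visit(dep, deps, done);
+        }
+        done.push(sym);
+    }
+    let mut roots: Vec<&'static str> = deps.keys().copied().collect();
+    roots.sort(); // deterministic iteration order for the sketch
+    let mut done = Vec::new();
+    for sym in roots {
+        visit(sym, deps, &mut done);
+    }
+    done
+}
+```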
+
+I'll use the prefix `g_` to denote a type variable that can be instantiated
+with any type when used (it is a generalized, or generic, type). The type of
+this program is
+
+```roc
+apply : (g_f_arg1 -> g_f_ret), g_f_arg1 -> g_f_ret
+apply = \f, x -> f x
+
+main : { x1: I64, x2: Str }
+main =
+ num : I64
+ num = 1
+
+ x1 : I64
+ x1 = apply (\x -> x + num) 1 # apply here is : (I64 -> I64) -> I64
+
+ x2 : Str
+ x2 = apply (\s -> "${s}!") "hi" # apply here is : (Str -> Str) -> Str
+
+ { x1, x2 } # : { x1: I64, x2: Str }
+```
+
+### ab_well_formed
+
+Skipped for now
+
+### type_specialize
+
+The next thing we're going to do is get rid of all generalized types in this
+program. This is necessary so that calls to `apply` can operate on a concrete
+instance of `apply` that knows what the memory layout of its arguments and
+return type is. This produces more memory- and CPU-efficient code than
+operating over the
+"monomorphization". In this document I'll call it "type specialization", because
+there is another kind of specialization we'll see later.
+
+To perform type specialization, we create copies of generalized values for each
+concrete usage, and then fix up the call sites to reference the copy. Doing
+this over the program above, we end up with
+
+```roc
+apply_1 : (I64 -> I64), I64 -> I64
+apply_1 = \f, x -> f x
+
+apply_2 : (Str -> Str), Str -> Str
+apply_2 = \f, x -> f x
+
+main : { x1 : I64, x2: Str }
+main =
+ num : I64
+ num = 1
+
+ x1 : I64
+  x1 = apply_1 (\x -> x + num) 1
+
+ x2 : Str
+ x2 = apply_2 (\s -> "${s}!") "hi"
+
+ { x1, x2 }
+```
+
+### function_lift
+
+Next, we lift up all nested functions to the top-level. We'll want to allocate a
+new variable for all values that the function captures, and pass those to the
+function as well. We'll introduce a new operator, `@fn_pack`, that packages a
+capturing function into a tuple of `(function name, captures)`.
+
+If a function captures, we pass its captures as a separate record in the first
+parameter. When we need to hold on to a value of a capturing function, we pass
+around the `@fn_pack` tuple.
+
+```roc
+apply_1 : (I64 -> I64), I64 -> I64
+apply_1 = \f, x -> f x
+
+apply_2 : (Str -> Str), Str -> Str
+apply_2 = \f, x -> f x
+
+fn_1 : { num: I64 }, I64 -> I64
+fn_1 = \captures, x ->
+ num = captures.num
+ x + num
+
+fn_2 : Str -> Str
+fn_2 = \s -> "${s}!"
+
+main : { x1 : I64, x2: Str }
+main =
+ num : I64
+ num = 1
+
+ fn_1_pack = @fn_pack( fn_1, { num } )
+
+ x1 : I64
+ x1 = apply_1 fn_1_pack 1
+
+ fn_2_pack = @fn_pack( fn_2 )
+
+ x2 : Str
+ x2 = apply_2 fn_2_pack "hi"
+
+ { x1, x2 }
+```
+
+### function_solve
+
+We're now ready to set up what we need for lambda-set specialization, which
+I'll call function specialization from now on.
+
+The key idea here is to start thinking about functions as "generic" or "concrete" over the values of functions they take in.
+
+For example, the higher-order function `mapListI64 : List I64, (I64 -> I64) -> List I64` is generic over functions `(I64 -> I64)` in the second parameter.
+
+This is very similar to type specialization. Like specializing generic types to
+concrete types, we want to specialize generic higher-order functions to concrete instances of functions they use.
+
+This matters for the same reason type specialization matters.
+While the memory layout of `List I64` is now known to be concrete, a function `(I64 -> I64)` has a variable memory layout depending on how many values it captures.
+At minimum, the runtime must be given a function pointer.
+If the function captures variables, those must be passed as well. Since
+`(I64 -> I64)` tells us nothing about the size of the captures, in general the
+captures must be placed somewhere in memory (typically allocated on the heap)
+and a pointer to them must be passed. The general form for compiling functions
+like this is to pass a tuple of `(function pointer, captures pointer)` for each
+function argument.
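+
+To make the cost concrete, here is a hedged Rust sketch (names invented; this
+is not compiler code) contrasting the generic strategy, where captures are
+erased behind a heap allocation, with a specialized one, where the capture
+layout is known at the call site:
+
+```rust
+// Generic strategy: the callee only sees an opaque (function pointer,
+// captures pointer) pair; captures must live behind a heap allocation.
+fn apply_generic(f: Box<dyn Fn(i64) -> i64>, x: i64) -> i64 {
+    f(x) // indirect call through a pointer
+}
+
+// Specialized strategy: the captures are a flat struct with a known layout,
+// so there is no allocation and no pointer indirection.
+struct Fn1Captures { num: i64 }
+
+fn fn_1(captures: &Fn1Captures, x: i64) -> i64 {
+    x + captures.num
+}
+
+fn apply_specialized(captures: Fn1Captures, x: i64) -> i64 {
+    fn_1(&captures, x) // direct, statically-dispatched call
+}
+```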
+
+It also matters for other optimization reasons which will be discussed later.
+
+Our goal is to eliminate the need for indirecting the captures behind a pointer,
+and we'll see that in many cases we can even do much better than that.
+
+Okay, so first we annotate all function types as either generic (higher-order)
+or consisting of a set of concrete functions.
+Function types are given a variable, orthogonal to type variables, called "function set variables".
+A function set variable is either generic, in which case it is prefixed with `G_`, or otherwise concrete, in which case it is annotated as a union of function values that can be in that position.
+Each function value is accompanied by a record of the values that it captures.
+For example, a function type `I64 -[f {a: I64}, g {b: Str}, h]-> I64`
+has the function set `[f {a: I64}, g {b: Str}, h]`,
+indicating that the function is either `f` with a capture of `a: I64`,
+`g` with a capture of `b: Str`, or `h` with no captures.
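+
+Lowered naively into Rust, the hypothetical set `[f {a: I64}, g {b: Str}, h]`
+would become a tag union plus a dispatching call (the function bodies here are
+invented for illustration):
+
+```rust
+// Each member of the function set becomes a variant carrying its captures.
+enum FnSet {
+    F { a: i64 },    // f captures a: I64
+    G { b: String }, // g captures b: Str
+    H,               // h captures nothing
+}
+
+// Calling a value of the set dispatches on the variant.
+fn call(f: &FnSet, x: i64) -> i64 {
+    match f {
+        FnSet::F { a } => x + a,
+        FnSet::G { b } => x + b.len() as i64,
+        FnSet::H => x,
+    }
+}
+```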
+
+This is pretty much the same as type inference of tag unions. Solving for our
+program, we get
+
+```roc
+apply_1 : (I64 -G_1-> I64), I64 -[apply_1]-> I64
+apply_1 = \f, x -> f x
+
+apply_2 : (Str -G_2-> Str), Str -[apply_2]-> Str
+apply_2 = \f, x -> f x
+
+fn_1 : { num: I64 }, I64 -[fn_1]-> I64
+fn_1 = \captures, x ->
+ num = captures.num
+ x + num
+
+fn_2 : Str -[fn_2]-> Str
+fn_2 = \s -> "${s}!"
+
+main : { x1 : I64, x2: Str }
+main =
+ num : I64
+ num = 1
+
+ fn_1_pack : FunctionSet[fn_1 {num: I64}]
+ fn_1_pack = @fn_pack( fn_1, { num } )
+
+ x1 : I64
+ x1 = apply_1 fn_1_pack 1 # apply_1 has type (I64 -[fn_1 {num: I64}]-> I64) -[apply_1]-> I64
+
+  fn_2_pack : FunctionSet[fn_2]
+ fn_2_pack = @fn_pack( fn_2 )
+
+ x2 : Str
+ x2 = apply_2 fn_2_pack "hi" # apply_2 has type (Str -[fn_2]-> Str) -[apply_2]-> Str
+
+ { x1, x2 }
+```
+
+### function_specialize
+
+To perform function specialization, we create copies of generic higher-order
+functions (HOFs) for each concrete usage, and then fix up the call sites to reference the copy.
+Moreover, we re-write each function set to be a tag union that we pass to the
+function. Let's do this in two steps to understand how this works.
+
+First, we replace all HOFs with specialized copies and update the usage sites
+
+```roc
+apply_1_1 : (I64 -[fn_1 {num: I64}]-> I64), I64 -[apply_1]-> I64
+apply_1_1 = \f, x -> f x
+
+apply_2_1 : (Str -[fn_2]-> Str), Str -[apply_2]-> Str
+apply_2_1 = \f, x -> f x
+
+fn_1 : { num: I64 }, I64 -[fn_1]-> I64
+fn_1 = \captures, x ->
+ num = captures.num
+ x + num
+
+fn_2 : Str -[fn_2]-> Str
+fn_2 = \s -> "${s}!"
+
+main : { x1 : I64, x2: Str }
+main =
+ num : I64
+ num = 1
+
+ fn_1_pack : FunctionSet[fn_1 {num: I64}]
+ fn_1_pack = @fn_pack( fn_1, { num } )
+
+ x1 : I64
+ x1 = apply_1_1 fn_1_pack 1
+
+ fn_2_pack : FunctionSet[fn_2]
+ fn_2_pack = @fn_pack( fn_2 )
+
+ x2 : Str
+  x2 = apply_2_1 fn_2_pack "hi"
+
+ { x1, x2 }
+```
+
+Next, we need to make sure we're passing a union of functions the caller can
+actually match on to figure out what to call. We'll do so by rewriting the passed functions as tag unions.
+We can also now get rid of the `@fn_pack` constructor and just pass the captures
+directly as the payload of the tag union.
+In this case it's not particularly interesting because each function set is a
+singleton.
+
+Rearranging to a topological sort of the dependencies after rewriting, we get:
+
+```roc
+fn_1 : { num: I64 }, I64 -> I64
+fn_1 = \captures, x ->
+ num = captures.num
+ x + num
+
+fn_2 : Str -> Str
+fn_2 = \s -> "${s}!"
+
+apply_1_1 : [Fn_1 {num: I64}], I64 -> I64
+apply_1_1 = \f, x -> when f is
+ Fn_1 captures -> fn_1 captures x
+
+apply_2_1 : [Fn_2], Str -> Str
+apply_2_1 = \f, x -> when f is
+ Fn_2 -> fn_2 x
+
+main : { x1 : I64, x2: Str }
+main =
+ num : I64
+ num = 1
+
+ fn_1_pack : [Fn_1 {num: I64}]
+ fn_1_pack = Fn_1 {num}
+
+ x1 : I64
+ x1 = apply_1_1 fn_1_pack 1
+
+ fn_2_pack : [Fn_2]
+ fn_2_pack = Fn_2
+
+ x2 : Str
+  x2 = apply_2_1 fn_2_pack "hi"
+
+ { x1, x2 }
+```
+
+### lower_ir
+
+Finally, we have a first-order program with no generic types and no generic
+calls. We can convert this to a lower-level IR that can then be used to add
+refcounts or to lower easily to machine code, as the Mono IR is used today. We
+rewrite all functions to procedures, and types to their layout in memory
+(`{..}` is a contiguous struct, `[..]` is a union with a discriminant)
+
+```
+procedure fn_1 (captures: {I64}, x: I64) -> I64 {
+ num = captures.0
+ ret = Num_add_I64 (x, num)
+ return ret
+}
+
+procedure fn_2 (s: Str) -> Str {
+ ret = Str_concat (s, "!")
+ return ret
+}
+
+procedure apply_1_1 (f: [0 {I64}], x: I64) -> I64 {
+ f_discriminant = f.0
+ is_0 = Bool_eq_U64 (f_discriminant, 0)
+ if is_0 then {
+ captures = f.1
+ ret = fn_1 (captures, x)
+ return ret
+ } else {
+ @crash()
+ }
+}
+
+procedure apply_2_1 (f: [0 {}], x: Str) -> Str {
+ f_discriminant = f.0
+ is_0 = Bool_eq_U64 (f_discriminant, 0)
+ if is_0 then {
+    ret = fn_2 (x)
+    return ret
+  } else {
+ @crash()
+ }
+}
+
+procedure main () -> {I64, Str} {
+ num = 1
+ fn_1_pack = {0, num}
+ temp_1 = 1
+ x1 = apply_1_1 (fn_1_pack, temp_1)
+ fn_2_pack = {0}
+ temp_2 = Str.alloc "hi"
+ x2 = apply_2_1 (fn_2_pack, temp_2)
+  ret = {x1, x2}
+ return ret
+}
+```
+
+And using type-directed application, we can reduce all singleton sets to
+unconditional calls
+
+```
+procedure fn_1 (captures: I64, x: I64) -> I64 {
+ ret = Num_add_I64 (x, captures)
+ return ret
+}
+
+procedure fn_2 (s: Str) -> Str {
+ ret = Str_concat (s, "!")
+ return ret
+}
+
+procedure apply_1_1 (f: I64, x: I64) -> I64 {
+ ret = fn_1 (f, x)
+ return ret
+}
+
+procedure apply_2_1 (f: {}, x: Str) -> Str {
+ ret = fn_2 (x)
+ return ret
+}
+
+procedure main () -> {I64, Str} {
+ num = 1
+ fn_1_pack = num
+ temp_1 = 1
+ x1 = apply_1_1 (fn_1_pack, temp_1)
+ fn_2_pack = {}
+ temp_2 = Str.alloc "hi"
+ x2 = apply_2_1 (fn_2_pack, temp_2)
+ ret = {x1, x2}
+ return ret
+}
+```
+
+Okay, that's it. Let's dive into each step with some more detail.
+
+## solve
+
+This should proceed as is done today. The existing implementation can be
+simplified significantly once these changes are made.
+
+## ab_well_formed
+
+Existing ability resolution in solve should be removed. The current procedure is
+self-contained but quite involved because it tries to do well-formedness
+checking, resolution of abilities, and function solving at the same time.
+
+`ab_well_formed` does not actually need to be a separate pass - it is easier to
+run it during inference - but I have broken it out just so it's easier to
+explain.
+
+First off, let's clarify what we're trying to do here. We're operating over the
+input program. That means we want to check the following cases:
+
+1. Generic ability used concretely as a parameter
+
+ ```
+ test = \x -> x == x
+
+ test ""
+ ```
+
+ From here on out, I will use type variables prefixed with `_` to mean type
+ variables that are not generalized, i.e. can be instantiated to at most one type
+ (weak type variables). Generalized variables have no `_` prefix.
+
+ During inference we will have solved `test : a -> Bool | a has Eq`. When
+ inferring `test ""`, where `test` is instantiated to `_a1 -> Bool | _a1 has Eq`,
+ we will unify `_a1 has Eq ~ Str`. It is at this time that we want to record the
+ constraint `Str implements Eq` (to be solved now or later).
+
+1. Generic ability used concretely as a value
+
+ ```
+ ability Materialize has
+ materialize : {} -> a
+
+ test = \{} -> materialize {}
+
+ test (@Q {})
+ ```
+
+ Note that this follows the same pattern as the case above.
+
+1. Generic ability used generically
+
+ ```
+   test1 = \x -> x == x
+
+   test2 = \x ->
+     dbg x
+     test1 x
+ ```
+
+ During inference we will have solved `test1 : a -> Bool | a has Eq`.
+
+ Taking `x : _b1`, `dbg x` forces `x : _b1 has Inspect`
+
+ When inferring `test1 x`, where
+
+ ```
+ test1 : _a1 -> Bool | _a1 has Eq
+ x : _b1 has Inspect
+ ```
+
+ we will solve `_a1 = _b1 has Eq, Inspect`. Now, we generalize `test2` and end up
+ with
+
+ ```
+ test1 : a -> Bool | a has Eq
+
+ test2 : b -> Bool | b has Eq, Inspect
+ ```
+
+ Note that this case is why we cannot resolve abilities until we have fully
+ specialized types.
+
+1. Shy abilities/Evil abilities
+
+   Shy abilities are those that cannot be resolved because they hide away from
+   any concrete type. For example,
+
+ ```
+ ability Shy has
+ show : {} -> a
+ hide : a -> {}
+
+ test1 : {} -> {}
+ test1 = \{} -> hide (show {})
+ ```
+
+ We will never know what instance (if any) of `Shy` the author intended, so this
+ must be caught as an error. Fortunately, the way to do so is pretty simple.
+
+ Whenever a generalizable scope (function scope) is entered, create a lookaside
+ table. Store in the lookaside table all introduced variables that are bound to
+   an ability. When done checking the generalizable scope, generalize, and then
+   scan through the type variables in the lookaside table. If any are weak (not
+   concrete and not generalized), they must be an ability bound to a type that
+   will never be resolved. Emit an error for those instances.
+
+These cases are all that is needed to check that abilities in a program are
+well-formed. Because abilities can only escape through input and output types of
+functions (shy abilities are the only exception), they will ultimately all be
+resolved, because the entry point of a Roc program has a concrete type.
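+
+The lookaside-table check can be sketched as follows (the names and the
+`VarState` shape are assumptions for illustration, not the real data model):
+
+```rust
+#[derive(Clone, Copy, PartialEq)]
+enum VarState {
+    Weak, // not concrete and not generalized
+    Concrete,
+    Generalized,
+}
+
+// After generalizing a function scope, scan the ability-bound variables
+// recorded in that scope's lookaside table. Any variable still weak is bound
+// to an ability that can never be resolved: report it as a shy ability.
+fn shy_ability_errors(lookaside: &[(&'static str, VarState)]) -> Vec<&'static str> {
+    lookaside
+        .iter()
+        .filter(|(_, state)| *state == VarState::Weak)
+        .map(|(name, _)| *name)
+        .collect()
+}
+```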
+
+## type_specialize
+
+### General idea
+
+Up to this point, each procedure has relied on running on a program in
+topological dependency order. To specialize a program, we must turn around and
+run in reverse order: starting from the entry point of the program, we traverse
+everything until we have specialized all generic functions we need to.
+
+There are two reasons to do this. The first is that the entry point of the
+program is concrete and determines all other types needed in the program. The
+second is that this provides a free form of dead-code elimination.
+
+For the sake of simplicity, let's lump all the code together into one module
+(sometimes called a "compilation unit"). We'll discuss later how to recover
+parallelization within a compilation unit, but let's suppose for now that we are
+going to type-specialize the entire program in a single thread.
+
+The first thing to do is to define the data types we're going to work with.
+Let's say that our input is a struct of the AbWellFormed AST and some metadata,
+like the so-called `Subs` that holds substitutions of type variables in the
+current compiler implementation.
+
+
+
+`ab_well_formed` and `type_specialized` AST/type definition
+
+
+```rust
+// mod ab_well_formed = abwf
+
+struct AbWellFormed {
+ program: abwf::AST,
+ subs: abwf::Subs,
+ // ..others..
+}
+```
+
+Our output will look like
+
+```rust
+// mod type_specialized = ts
+// Ignoring mutability, boxing, etc.
+
+enum AST {
+ // Same shape as the existing AST, modulo type variables
+}
+
+/// Reference to a Type
+struct TypeVariable {
+ // contents hidden
+}
+
+enum Type {
+ // Same as existing type shape, but without any unbound variables
+ Primitive(..),
+ Record(Vec<(String, TypeVariable)>),
+    TagUnion(Vec<(String, Vec<TypeVariable>)>),
+ // ..etc
+}
+
+impl Type {
+ const UNFILLED: Type = /* some sentinel value to mark placeholder types */
+    const VOID: Type = Type::TagUnion(Vec::new());
+}
+
+struct TypeContext {
+ // contents hidden
+}
+
+impl TypeContext {
+ /// `deref(tvar)` returns the `Type` referenced by tvar
+ pub fn deref(&self, tvar: TypeVariable) -> &Type;
+ /// `tvar = create(ty)` results in `deref(tvar) = ty`
+ pub fn create(&mut self, ty: Type) -> TypeVariable;
+ /// `set(tvar, type)` results in `deref(tvar) = type`
+ pub fn set(&mut self, tvar: TypeVariable, ty: Type);
+ /// `link(tvar1, tvar2)` results in `ty_eq(tvar1, tvar2) = true`
+ pub fn link(&mut self, tvar1: TypeVariable, tvar2: TypeVariable);
+ /// `ty_eq(tvar1, tvar2)` is true iff `deref(tvar1)` is equivalent to `deref(tvar2)`
+ pub fn ty_eq(&self, tvar1: TypeVariable, tvar2: TypeVariable) -> bool;
+}
+
+struct TypeSpecialized {
+ ast: AST,
+ type_context: TypeContext,
+}
+```
+
+
+Our approach will be the following:
+
+- Create a `SpecializationQueue`, a queue (FIFO or LIFO) of tuples to specialize. Each
+ tuple consists of the symbol to specialize from the `abwf` AST, the type
+ to specialize to from the `abwf` type definition, and a specialization key.
+ The specialization key is a tuple `(original_symbol: abwf::Symbol, ty: TypeVariable)`.
+ The `SpecializationQueue` should also keep track of a mapping `SpecializationKey -> ts::Symbol`, mapping specialization keys to their specialized symbol.
+- Initialize the queue with the entry point of the program.
+- While the queue is non-empty:
+ - Pull the next entry from the queue, `specialization_goal`.
+ - Look up the type and definition of the `specialization_goal`'s symbol from
+ the `abwf` AST. Call this `target`.
+ - Create a copy of the `target`'s type, and create a lookaside table that maps
+ `type_map: abwf::TypeVariable -> ts::TypeVariable`. More on this later.
+ - Run `abwf::unify (target.type, specialization_goal.goal_type)`. This forces the
+ type of the `target`, which we are about to specialize, to have the types
+ of the goal, and furthermore propagate those types to sites in the
+ definition.
+ - Initialize a mapping `symbol_map: abwf::Symbol -> ts::Symbol`.
+ - Walk the AST of `target.definition`. Create a new instance of each symbol
+ discovered. For each type discovered, run a procedure `lower_type:
+ type_map, variable: abwf::TypeVariable -> ts::TypeVariable` that converts
+ the type from the `abwf` AST to the types needed in the specialization
+ AST. This process should just walk and instantiate new instances of all
+ symbols and all types.
+ - Whenever a call is encountered:
+ - Set `specialization_key` = create a `SpecializationKey` for the called symbol.
+ - Set `specialization_symbol = specialization_queue.get_or_create_specialization_symbol(specialization_key)`.
+ - Replace the call's symbol with `specialization_symbol`.
+
+
+
+Definition of `SpecializationQueue`
+
+
+```rust
+struct SpecializationKey {
+ original_symbol: abwf::Symbol,
+ ty: ts::TypeVariable,
+}
+
+impl PartialEq for SpecializationKey {
+ fn eq(&self, other: &SpecializationKey) -> bool {
+ self.original_symbol == other.original_symbol &&
+ // Comparing the types for equivalence is important. See below for more details.
+ type_context.ty_eq(self.ty, other.ty)
+ }
+}
+
+struct SpecializationGoal {
+ key: SpecializationKey,
+ goal_type: abwf::TypeVariable,
+ goal_symbol: ts::Symbol,
+}
+
+struct SpecializationQueue {
+    queue: VecDeque<SpecializationGoal>,
+    specialization_symbols: Map<SpecializationKey, ts::Symbol>,
+    specializations: Map<ts::Symbol, ts::AST>,
+}
+
+impl SpecializationQueue {
+ /// `next_to_specialize()` returns the next goal for which a specialization should be created.
+    pub fn next_to_specialize(&mut self) -> Option<SpecializationGoal>;
+ // INVARIANT: If `next_to_specialize() = None`, then `self.specialization_symbols.keys() = self.specializations.keys()`
+
+ /// `get_or_create_specialization_symbol(key, goal_type)` returns the symbol corresponding to the specialization of the `key`.
+ /// If the specialization does not exist, it will be enqueued.
+ pub fn get_or_create_specialization_symbol(&mut self, key: SpecializationKey, goal_type: abwf::TypeVariable) -> ts::Symbol {
+ match self.specialization_symbols.get(&key) {
+ Some(specialization_symbol) => *specialization_symbol,
+ None => {
+ let goal_symbol = create_new_symbol();
+ // Note that hashing of the key only works if there is a stable hash for types; otherwise, a specialization must be found by equivalence.
+ // See commentary later in this section.
+ self.specialization_symbols.insert(key, goal_symbol);
+ let goal = SpecializationGoal { key, goal_type, goal_symbol };
+ self.queue.push_back(goal);
+                goal_symbol
+ }
+ }
+ }
+
+ /// `insert_specialization(goal, specialization)` records the specialization for the given `goal`.
+ pub fn insert_specialization(&mut self, goal: SpecializationGoal, ast: ts::AST);
+
+ pub fn get_specializations(&self) -> ts::AST;
+}
+```
+
+
+
+Implementation of `type_specialize::specialize`
+
+```rust
+pub fn specialize(abwf: abwf::AbWellFormed) -> ts::TypeSpecialized {
+ let mut ctx = Context {
+ abwf: &abwf,
+ specialization_queue: SpecializationQueue::new(),
+ type_context: TypeContext::new(),
+    };
+    let mut specialized_entry_points: Vec<ts::Symbol> = Vec::new();
+ for (original_symbol, goal_type) in find_entry_points(abwf) {
+ let mut type_map = Map::new();
+ let ty = lower_type(&ctx, &mut type_map, goal_type);
+ let key = SpecializationKey { original_symbol, ty };
+        let specialized_entry_point = ctx.specialization_queue.get_or_create_specialization_symbol(key, goal_type);
+ specialized_entry_points.push(specialized_entry_point);
+ }
+
+ while let Some(goal) = ctx.specialization_queue.next_to_specialize() {
+ let original_ast = find_definition(ctx.abwf, goal.key.original_symbol);
+ let original_type = original_ast.type();
+ // IMPORTANT: must create a state for unifying the goal type that is orthogonal to the rest of the program.
+ // See later commentary.
+ let snapshot = ctx.abwf.subs.snapshot();
+
+ ctx.abwf.subs.unify(original_type, goal.goal_type);
+
+ let mut type_map = Map::new();
+ let mut symbol_map = Map::new();
+ let spec_ast = lower_ast(&ctx, &mut type_map, &mut symbol_map, original_ast);
+ ctx.specialization_queue.insert_specialization(goal, spec_ast);
+
+ ctx.abwf.subs.rollback(snapshot);
+ }
+
+ let ast = ctx.specialization_queue.get_specializations();
+ ts::TypeSpecialized {
+ ast,
+        type_context: ctx.type_context,
+ }
+}
+
+struct Context {
+ abwf: &abwf::AbWellFormed,
+ specialization_queue: SpecializationQueue,
+ type_context: TypeContext,
+}
+
+fn find_entry_points(abwf: abwf::AbWellFormed) -> Vec<(abwf::Symbol, abwf::TypeVariable)>;
+
+fn find_definition(abwf: abwf::AbWellFormed, symbol: abwf::Symbol) -> abwf::AST;
+
+fn lower_type(ctx: &Context, type_map: &mut Map, ty: abwf::TypeVariable) -> ts::TypeVariable {
+ if let Some(spec_ty) = type_map.get(ty) {
+ // If the same type variable was already lowered, re-use that to preserve the relationship.
+ return spec_ty;
+ }
+
+ // Must map the new type up-front to handle recursive types.
+ let spec_ty = ctx.type_context.create(Type::UNFILLED);
+ let spec_ty_content = match ctx.abwf.subs.get(ty) {
+ // Only interesting case - unbound types at this point are never used by a value, so they can be the void type
+ RigidVar(_) | FlexVar(_) => Type::VOID,
+ // ..etc..
+ };
+ ctx.type_context.set(spec_ty, spec_ty_content);
+ spec_ty
+}
+
+fn lower_ast(
+    ctx: &Context,
+ type_map: &mut Map,
+ symbol_map: &mut Map,
+ ast: &abwf::AST
+) -> ts::AST {
+ let original_type = ast.type();
+ let spec_type = lower_type(ctx, type_map, original_type);
+
+ let ast_node = match ast {
+ abwf::AST::Variable(x) => {
+ // If the variable is in the symbol_map, it must be in the local scope and we replace it (assuming local symbols are never generalized).
+ // Otherwise, it is a top-level and we must replace it with the specialization.
+ let x1 = match symbol_map.get(x) {
+                Some(x1) => *x1,
+                None => ctx.specialization_queue.get_or_create_specialization_symbol(
+                    SpecializationKey { original_symbol: x, ty: spec_type },
+                    original_type,
+                ),
+ };
+ ts::AST::Variable(x1)
+ }
+ abwf::AST::Let(x, e) => {
+ // x = e
+ let x1 = create_new_symbol_from(x);
+ symbol_map.insert(x, x1);
+ let e1 = lower_ast(ctx, type_map, symbol_map, e);
+ ts::AST::Let(x1, e1)
+ }
+ // ..etc..
+    };
+
+ // NOTE that this probably doesn't exist unless all AST nodes are untyped.
+ // If AST nodes themselves contain types, then replace the types in-line.
+ // However, a disjoint AST and Type may be easier, where e.g. AST { Call(fn: TypedAst, args: TypedAst[]) }; TypedAST = { ty: TypeVariable, expr: AST }
+ ts::TypedAST(spec_type, ast_node)
+}
+```
+
+
+
+### Storage of types
+
+The best way to store types is in a similar union-find format that is used to
+store variables in passes up to this point. This enables the preservation of
+links between types for use in latter passes as well. For example, up to this
+point, if we had
+
+```
+\x -> x
+```
+
+we would have typed
+
+```
+\(x: _t1) -> (x: _t2)
+_t1 = _t2 (both variables point to the same underlying type)
+```
+
+Now, when we convert to the lower AST, we want to continue preserving this
+relationship. This is the reason for the `type_map` above.
+
+The particular implementation doesn't matter too much and should be opaque.
+`TypeVariable` can be an index into an array in `TypeContext`, `TypeVariable`
+could be a reference-counted pointer to a `Type`, etc.
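+
+For instance, one possible (assumed, not prescribed) representation backs
+`TypeVariable` with an index into a `Vec` of slots, where `link` installs a
+redirect so that both variables observe later `set`s:
+
+```rust
+#[derive(Clone, Copy, PartialEq, Debug)]
+struct TypeVariable(usize);
+
+#[derive(Clone, PartialEq, Debug)]
+enum Type {
+    Unfilled,                 // placeholder, as in Type::UNFILLED above
+    Primitive(&'static str),
+    Redirect(TypeVariable),   // this slot is an alias of another slot
+}
+
+struct TypeContext {
+    slots: Vec<Type>,
+}
+
+impl TypeContext {
+    fn create(&mut self, ty: Type) -> TypeVariable {
+        self.slots.push(ty);
+        TypeVariable(self.slots.len() - 1)
+    }
+
+    // Follow redirect chains to the representative slot (union-find style,
+    // without path compression).
+    fn resolve(&self, mut tvar: TypeVariable) -> TypeVariable {
+        while let Type::Redirect(next) = &self.slots[tvar.0] {
+            tvar = *next;
+        }
+        tvar
+    }
+
+    fn deref(&self, tvar: TypeVariable) -> &Type {
+        let root = self.resolve(tvar);
+        &self.slots[root.0]
+    }
+
+    fn set(&mut self, tvar: TypeVariable, ty: Type) {
+        let root = self.resolve(tvar);
+        self.slots[root.0] = ty;
+    }
+
+    fn link(&mut self, tvar1: TypeVariable, tvar2: TypeVariable) {
+        let root = self.resolve(tvar1);
+        self.slots[root.0] = Type::Redirect(tvar2);
+    }
+}
+```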
+
+### Comparing specialization keys for equivalence, and stable specialization keys
+
+When checking if a specialization key exists, it is important to compare keys for
+equivalence of their types, not just equality of pointers.
+
+To see why this is, consider the program
+
+```
+foo = \x -> x
+
+call1 = \x -> foo x
+
+call2 = \x -> foo x
+
+main = (call1 "", call2 "")
+```
+
+In this case, because the specializations for `call1` and `call2` are processed
+independently, both will check if there is a specialization for `(foo, sometvar
+= Str -> Str)` that exists. Because the specializations are independent, the
+type variable `sometvar` will differ between the two, since it will be created
+fresh. If only comparing pointer equality, the two type variables will not be
+seen as equal, and a duplicate specialization of `foo` will be created. If the
+structure of the type is compared, it will be seen to be equivalent.
+
+There are a few ways of going about comparing types for equivalence. The most
+straightforward way (which is necessary regardless of the final approach) is to
+compare types for structural equality, up to recursion. The way we handle
+recursion is to keep track of the pointers of type variables being compared. If
+a pair of type variables is compared again, we know the two types being compared
+are recursive. As long as the types were equivalent up to that point, we then
+know they are equivalent. Note that `lower_type` explicitly handles recursion,
+making this safe.
+
+```rust
+fn type_equivalent(ctx: &TypeContext, tv1: TypeVariable, tv2: TypeVariable) -> bool {
+ // VariableId is some value that uniquely identifies a TypeVariable by pointer equality.
+ let mut seen: Set<(VariableId, VariableId)> = Set::new();
+ type_equivalent_help(ctx, &mut seen, tv1, tv2)
+}
+
+fn type_equivalent_help(ctx: &TypeContext, seen: &mut Set<(VariableId, VariableId)>, tv1: TypeVariable, tv2: TypeVariable) -> bool {
+ if tv1.variable_id() == tv2.variable_id() {
+ return true;
+ }
+ let visited_pair = (tv1.variable_id(), tv2.variable_id());
+ if seen.has(&visited_pair) {
+ return true;
+ }
+ seen.insert(visited_pair);
+
+    let is_equivalent = match (ctx.deref(tv1), ctx.deref(tv2)) {
+        (Tuple(tvs1), Tuple(tvs2)) => {
+            tvs1.len() == tvs2.len() &&
+                tvs1.iter().zip(tvs2.iter()).all(|(tv1, tv2)| type_equivalent_help(ctx, seen, *tv1, *tv2))
+        }
+        // ..etc..
+    };
+
+ seen.remove(&visited_pair);
+
+ is_equivalent
+}
+```
+
+Another method is to perform equivalence checking up-front by interning the
+types at the time of creation of a type variable. Interning pushes the cost of
+comparison to the front, so that the invariant becomes
+
+```
+type_equivalent(tv1, tv2) = true if and only if tv1.variable_id() == tv2.variable_id()
+```
+
+This makes subsequent checks for equality much faster.
+
+A simple interning mechanism would be
+
+```rust
+fn create(&mut self, ty: Type) -> TypeVariable {
+    let hash = hash(&ty);
+    let next_idx = self.interned.len();
+    let (_, idx) = self.interned
+        .raw_entry_mut()
+        .from_key_hashed_nocheck(hash, &ty)
+        .or_insert(ty, next_idx);
+    TypeVariable(*idx)
+}
+```
+
+However, we must now figure out how to hash types correctly. If types have no
+cycles (are not recursive), or recursive types are nominal (identified by a
+unique name, like Module+Identifier), then there is no problem - and actually
+the equivalence check does not need to check for seen unique IDs.
+
+If types are structurally recursive, then we must somehow collapse types with
+the same structural representation but with different unique IDs. For example,
+consider these two types
+
+```
+TypeVariable(0) = Type::Union([("Cons", [TypeVariable(0)]), ("Nil", [])])
+TypeVariable(1) = Type::Union([("Cons", [TypeVariable(1)]), ("Nil", [])])
+```
+
+These two types should intern to the same representation, but hashing them
+naively would result in two separate hashes. One way to handle this is to
+build a "canonical form" of a type before interning, replacing recursion sites
+with monotonically increasing IDs from 0. In this case, the canonical form would
+be
+
+```
+TypeVariable(0) = Type::Union([("Cons", [TypeVariable(0)]), ("Nil", [])])
+```
+
+in both cases. Then, after interning the value, store a mapping of `canonical
+form -> intern ID`. If the same canonical form is seen again, return the
+mapped intern ID.
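+
+To make the canonicalization step concrete, here is a minimal, self-contained
+sketch. The `Type`/`CanonType` definitions and the arena layout are
+illustrative only, not the compiler's actual representations: recursion sites
+are replaced by a back-reference counting how far up the spine the enclosing
+type is, so structurally identical recursive types at different arena slots
+canonicalize to equal values.
+
+```rust
+// Illustrative-only types; the real compiler's representations will differ.
+#[derive(Clone, PartialEq, Eq, Hash, Debug)]
+enum CanonType {
+    // A recursion site, replaced by how many types up the spine its binder is.
+    RecRef(usize),
+    Union(Vec<(String, Vec<CanonType>)>),
+}
+
+// Source types live in an arena; tag arguments are arena indices (may be cyclic).
+enum Type {
+    Union(Vec<(String, Vec<usize>)>),
+}
+
+fn canonicalize(arena: &[Type], tv: usize, stack: &mut Vec<usize>) -> CanonType {
+    if let Some(pos) = stack.iter().position(|&v| v == tv) {
+        // Already canonicalizing this variable: emit a stable back-reference.
+        return CanonType::RecRef(stack.len() - 1 - pos);
+    }
+    stack.push(tv);
+    let Type::Union(tags) = &arena[tv];
+    let out = CanonType::Union(
+        tags.iter()
+            .map(|(name, args)| {
+                (name.clone(), args.iter().map(|&a| canonicalize(arena, a, stack)).collect())
+            })
+            .collect(),
+    );
+    stack.pop();
+    out
+}
+
+fn main() {
+    // TypeVariable(0) and TypeVariable(1) from the example above.
+    let arena = vec![
+        Type::Union(vec![("Cons".to_string(), vec![0]), ("Nil".to_string(), vec![])]),
+        Type::Union(vec![("Cons".to_string(), vec![1]), ("Nil".to_string(), vec![])]),
+    ];
+    let c0 = canonicalize(&arena, 0, &mut Vec::new());
+    let c1 = canonicalize(&arena, 1, &mut Vec::new());
+    assert_eq!(c0, c1); // same canonical form => same mapped intern ID
+    println!("equal: {}", c0 == c1);
+}
+```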
+
+With nominally annotated recursive types, this problem goes away. To see why,
+suppose we had been lowering two types
+
+```
+abwf::TypeVariable(0) = Type::Nominal(ModuleA.ConsList, TypeVariable(1) = Type::Union([("Cons", [TypeVariable(0)]), ("Nil", [])]))
+abwf::TypeVariable(1) = Type::Nominal(ModuleA.ConsList, TypeVariable(2) = Type::Union([("Cons", [TypeVariable(1)]), ("Nil", [])]))
+```
+
+When we intern `Type::Nominal`, it is enough to intern by the nominal name and
+any type arguments. That means we first intern by the composite hash of `Type::Nominal +
+ModuleA.ConsList + [type args, zero in this case]`. This is a stable hash with
+no dynamic IDs, and maps the same in both cases.
+
+Interning is likely necessary, and is significantly easier to do if recursive
+types are first forced to be nominal - which has other benefits in type checking
+and error reporting as well. Unless or until that change is made, I suggest
+going with strict checking of equivalence to de-duplicate specializations.
+
+### Creating orthogonal type and symbol states during specialization
+
+Symbols and types in the source program have a 1:many relationship relative to
+symbols and types after specialization. This is the reason for creating new
+instances of types and symbols in specialized instances of functions, and the
+presence of the `type_map` and `symbol_map` when specializing the AST of a
+function.
+
+For example, consider
+
+```
+apply : _t1 -> _t1
+apply = \x -> x
+
+main = (apply "", apply 1)
+```
+
+Here, I'll use `_tN` to refer to the same type variable, and the symbol of
+an identifier is the identifier name.
+
+If we did not instantiate new symbols in the created specializations, we would
+end up with
+
+```
+apply_1 = \x -> x
+apply_2 = \x -> x
+
+main = (apply_1 "", apply_2 1)
+```
+
+This may not seem like a huge problem, but it can be significant later on in the
+compilation process when symbols are expected to be globally unique.
+
+If we did not keep a `symbol_map` for freshly instantiated symbols in a scope,
+we would lose track of bindings between symbols, ending up creating new
+instances for `apply`'s argument but not its body:
+
+```
+apply_1 = \x_1 -> x
+apply_2 = \x_2 -> x
+
+main = (apply_1 "", apply_2 1)
+```
+
+Similar reasoning applies for the value of `type_map`. Without it, we would
+create separate type variables that do not point to the same underlying type
+instance between `apply`'s argument and its body:
+
+```
+apply_1 = \(x_1: _t1 = Str) -> (x_1: _t2 = Str) # type_equivalent(_t1, _t2), but _t1.unique_id() != _t2.unique_id()
+apply_2 = \(x_2: _t3 = I64) -> (x_2: _t4 = I64) # type_equivalent(_t3, _t4), but _t3.unique_id() != _t4.unique_id()
+
+main = (apply_1 "", apply_2 1)
+```
+
+Again, this is not terribly important at this stage, but it leads to a more
+compact graph and makes it easier to preserve relations between types later on.
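+
+As a toy illustration of how the `symbol_map` keeps occurrences of a binding
+pointing at the same fresh symbol, here is a minimal instantiation pass. The
+`Expr`/`Symbol` shapes and the `Fresh` counter are hypothetical stand-ins for
+the compiler's actual types:
+
+```rust
+use std::collections::HashMap;
+
+// Hypothetical symbol and AST types, for illustration only.
+#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
+struct Symbol(u32);
+
+#[derive(Debug, PartialEq)]
+enum Expr {
+    Var(Symbol),
+    Lambda(Symbol, Box<Expr>),
+}
+
+struct Fresh(u32);
+impl Fresh {
+    fn next(&mut self) -> Symbol {
+        self.0 += 1;
+        Symbol(self.0)
+    }
+}
+
+// Copy a function body, giving every binding site a fresh symbol. The
+// symbol_map ensures later occurrences of the same binding resolve to the
+// same fresh symbol, so the argument and body stay linked.
+fn instantiate(e: &Expr, symbol_map: &mut HashMap<Symbol, Symbol>, fresh: &mut Fresh) -> Expr {
+    match e {
+        Expr::Var(s) => Expr::Var(*symbol_map.get(s).unwrap_or(s)),
+        Expr::Lambda(param, body) => {
+            let new_param = fresh.next();
+            symbol_map.insert(*param, new_param);
+            Expr::Lambda(new_param, Box::new(instantiate(body, symbol_map, fresh)))
+        }
+    }
+}
+
+fn main() {
+    // apply = \x -> x
+    let x = Symbol(0);
+    let apply = Expr::Lambda(x, Box::new(Expr::Var(x)));
+    let mut fresh = Fresh(0);
+    // Each specialization gets its own symbol_map and its own fresh symbols.
+    let spec1 = instantiate(&apply, &mut HashMap::new(), &mut fresh);
+    let spec2 = instantiate(&apply, &mut HashMap::new(), &mut fresh);
+    println!("{:?}", spec1); // Lambda(Symbol(1), Var(Symbol(1)))
+    println!("{:?}", spec2); // Lambda(Symbol(2), Var(Symbol(2)))
+}
+```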
+
+Finally, the `1:many` relationship between `abwf::TypeVariable`s and
+`ts::TypeVariable`s is the reason why types from the `abwf` program must be
+processed independently for each specialization, without affecting the rest of
+the program. When we perform the first specialization of `apply: g_1 -> g_1`
+(generalized) to `Str -> Str`, we unify `g_1 -> g_1 ~ Str -> Str`, forcing the
+intermediate program state
+
+```
+apply : Str -> Str
+apply = \x -> x
+
+apply_1 = // About to be created
+apply_2 = // To be created later
+
+main = (apply_1 "", apply_2 1)
+```
+
+If we do not roll back the type state to `apply`'s generalized form after
+completing the work for `apply_1`, then when we go to create the specialization
+`apply_2`, we will attempt to unify `Str -> Str ~ I64 -> I64`, which we cannot
+recover from.
+
+The simplest way to avoid this is to snapshot the program type state when we go
+to specialize a function, perform type changes within that snapshot, and then
+throw away that snapshot at the end.
+
+There are other approaches as well, for example cloning the generalized type of
+`apply` before unification for specialization, and moreover cloning every type
+in the body of `apply` during its specialization. This approach requires also
+keeping track of a mapping `abwf_type_map : abwf::TypeVariable ->
+abwf::TypeVariable` that re-uses already-cloned type variables, again to preserve
+relationships between types in the function body.
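+
+The snapshot approach can be sketched as follows. `TypeStore`, `Content`, and
+the single-slot store are stand-ins, not the real type context; the point is
+only that writes to pre-snapshot slots are journaled and undone on rollback:
+
+```rust
+// Illustrative type store with snapshot/rollback; names are hypothetical.
+#[derive(Clone, Debug, PartialEq)]
+enum Content {
+    Flex, // generalized / unbound
+    Str,
+}
+
+struct TypeStore {
+    slots: Vec<Content>,
+}
+
+struct Snapshot {
+    len: usize,
+    saved: Vec<(usize, Content)>,
+}
+
+impl TypeStore {
+    fn snapshot(&self) -> Snapshot {
+        Snapshot { len: self.slots.len(), saved: Vec::new() }
+    }
+
+    // Journal the old content of a pre-snapshot slot the first time it is written.
+    fn set(&mut self, snap: &mut Snapshot, idx: usize, c: Content) {
+        if idx < snap.len && !snap.saved.iter().any(|(i, _)| *i == idx) {
+            snap.saved.push((idx, self.slots[idx].clone()));
+        }
+        self.slots[idx] = c;
+    }
+
+    fn rollback(&mut self, snap: Snapshot) {
+        self.slots.truncate(snap.len);
+        for (i, c) in snap.saved {
+            self.slots[i] = c;
+        }
+    }
+}
+
+fn main() {
+    // `apply : g -> g` starts generalized.
+    let mut store = TypeStore { slots: vec![Content::Flex] };
+
+    // Specializing to `Str -> Str` unifies the generalized slot with Str...
+    let mut snap = store.snapshot();
+    store.set(&mut snap, 0, Content::Str);
+    assert_eq!(store.slots[0], Content::Str);
+
+    // ...and rolling back restores the generalized form for the next specialization.
+    store.rollback(snap);
+    assert_eq!(store.slots[0], Content::Flex);
+    println!("after rollback: {:?}", store.slots[0]);
+}
+```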
+
+### No generalization besides on the top-level
+
+The procedure here relies on the fact that there are no generalized functions
+within a top-level function scope. I believe the current state of Roc is that
+functions within a top-level function are not generalized, so this is not
+anything new.
+
+Supporting generalized functions within a scope is also possible. The easiest
+way to do so would be to run function lifting before type specialization (see
+below).
+
+### Handling abilities
+
+During type specialization is also the best time to resolve the target of
+ability calls. The relevant change in the sample code above would be
+
+```
+ while let Some(goal) = ctx.specialization_queue.next_to_specialize() {
+ let original_definition = find_definition(ctx.abwf, goal.key.original_symbol);
+ let (original_ast, original_type) = match original_definition {
+ AbilityDefinition(ability_member) => {
+ // Lines up the type of the ability target in ability_def with the concrete type to determine what type the ability should be found from
+ let target_type = find_ability_type(ability_member, goal.goal_type);
+ // Resolve the definition of the ability member for this type - either an existing definition associated with the opaque type,
+ // or synthesize the ability on-the-fly.
+ let resolved_ability_definition = resolve_ability(target_type, ability_member);
+ (resolved_ability_definition.ast(), resolved_ability_definition.type())
+ }
+ def => (def.ast(), def.type()),
+        };
+
+ // rest of existing code
+ }
+```
+
+Abilities are one reason it is easier to perform type specialization with all
+modules compiled together. Abilities can entirely circumvent the DAG of
+topological module dependencies when a specialization is in question. Consider
+for example
+
+```
+# module Dep
+
+ability Test has
+ destroy: a -> {} | a has Test
+
+do_destroy : a -> {} | a has Test
+do_destroy = \x -> destroy x
+
+# module Use
+
+import Dep.{ Test, do_destroy }
+
+Q := {} has [Test { destroy: q_destroy }]
+
+q_destroy = \@Q {} -> {}
+
+main = do_destroy (@Q {})
+```
+
+The topological dependency graph is `Dep.do_destroy -> Use.main`.
+
+However, the specialized program is
+
+```
+# From Use
+q_destroy_1 = \@Q {} -> {}
+
+# From Dep
+do_destroy_1 = \x -> q_destroy_1 x
+
+# From Use
+main = do_destroy_1 (@Q {})
+```
+
+Forming the dependency graph `Use.q_destroy -> Dep.do_destroy -> Use.main`.
+As such, type specialization in the presence of ad-hoc polymorphism requires
+ad-hoc re-entry into modules, and there is no longer a strong invariant that
+dependencies form an acyclic graph with regard to the modules they come from.
+
+### Useful debugging tools
+
+- A pretty-printer for the resulting AST, run snapshot tests with this (print
+ symbols uniquely)
+- A type-checker for the resulting AST. This should be a straightforward
+ type-checker that just calls `type_equivalent` - there is no need for
+ inference, since all types are already resolved. Good for checking the
+ output is type-correct.
+- A tree-walking checker that all symbols with the same identifier are unique.
+ Useful for making sure that symbols are correctly instantiated to fresh
+ symbols when a generic function is specialized multiple times.
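+
+The symbol-uniqueness checker can be sketched as a short traversal. The AST
+shape here is hypothetical; the checker simply asserts that no binding site
+introduces a symbol that was already bound elsewhere:
+
+```rust
+use std::collections::HashSet;
+
+// Illustrative symbol and AST types, not the compiler's actual definitions.
+#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
+struct Symbol(u32);
+
+enum Expr {
+    Var(Symbol),
+    Bind(Symbol, Box<Expr>, Box<Expr>), // let sym = e1 in e2
+}
+
+// Walk the tree; report the first symbol bound at more than one site.
+fn check_unique_bindings(e: &Expr, bound: &mut HashSet<Symbol>) -> Result<(), Symbol> {
+    match e {
+        Expr::Var(_) => Ok(()),
+        Expr::Bind(s, e1, e2) => {
+            if !bound.insert(*s) {
+                // Duplicate binding: a specialization forgot to freshen this symbol.
+                return Err(*s);
+            }
+            check_unique_bindings(e1, bound)?;
+            check_unique_bindings(e2, bound)
+        }
+    }
+}
+
+fn main() {
+    // let s0 = s0 in (let s1 = s1 in s0) -- all binding sites distinct.
+    let ok = Expr::Bind(Symbol(0), Box::new(Expr::Var(Symbol(0))),
+                        Box::new(Expr::Bind(Symbol(1), Box::new(Expr::Var(Symbol(1))),
+                                            Box::new(Expr::Var(Symbol(0))))));
+    assert!(check_unique_bindings(&ok, &mut HashSet::new()).is_ok());
+
+    // Symbol(0) bound twice -- the checker flags it.
+    let dup = Expr::Bind(Symbol(0), Box::new(Expr::Var(Symbol(0))),
+                         Box::new(Expr::Bind(Symbol(0), Box::new(Expr::Var(Symbol(0))),
+                                             Box::new(Expr::Var(Symbol(0))))));
+    assert_eq!(check_unique_bindings(&dup, &mut HashSet::new()), Err(Symbol(0)));
+    println!("checks passed");
+}
+```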
+
+### Recovering parallelization
+
+One simple way to recover parallel compilation, if it is required, is to lump
+all modules together before type specialization, but submit specialization jobs
+to a worker pool. Since each specialization of a function is independent, there
+is no contention except over global state. Global state may include derived
+implementations, or if an interner is used for the type context, the type
+context interner. However, global state may also have a thread-local cache.
+
+I would suggest an implementation without parallelism first.
+
+### Recovering caching
+
+One way to implement coarse caching in this context is to keep track of the
+transient specializations submitted from a particular function. This requires
+- a stable hash of a source program's function AST, after symbol resolution
+- a stable hash of `SpecializationKey`s (see previous discussion)
+
+If this is available, then the following data can be stored
+
+```
+type AstHash = /* stable hash of the source program AST for the function, and all its transitive dependencies */
+
+// Reverse index of the transitive specializations needed for a particular function
+function_needed_specializations_cache: AstHash -> (SpecializationKey, AstHash)[]
+// Index of specialization key to specialization
+specialization_cache: (SpecializationKey, AstHash) -> Specialization
+```
+
+When a specialization goal is popped off the queue, if there is an entry for the
+goal in the reverse index, it is enough to load the referenced specializations
+instead of computing new ones.
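+
+A sketch of the lookup side of this scheme, with stand-in types for the hashes,
+keys, and specializations (all hypothetical): a hit in the reverse index yields
+the keys of the transitive specializations, which are then loaded from the
+specialization cache rather than recomputed.
+
+```rust
+use std::collections::HashMap;
+
+// Hypothetical shapes for the two cache indices described above.
+type AstHash = u64;
+type SpecializationKey = (String, String); // (original symbol, stable type key)
+type Specialization = String;              // stands in for a specialized AST
+
+struct Caches {
+    // Reverse index: function AST hash -> transitive specialization keys.
+    needed: HashMap<AstHash, Vec<(SpecializationKey, AstHash)>>,
+    // Specialization key -> cached specialization.
+    specs: HashMap<(SpecializationKey, AstHash), Specialization>,
+}
+
+impl Caches {
+    // On a hit, return the cached specializations; on a miss, the caller
+    // computes fresh ones.
+    fn lookup(&self, ast: AstHash) -> Option<Vec<&Specialization>> {
+        let entries = self.needed.get(&ast)?;
+        entries.iter().map(|k| self.specs.get(k)).collect()
+    }
+}
+
+fn main() {
+    let key = (("apply".to_string(), "Str -> Str".to_string()), 42u64);
+    let caches = Caches {
+        needed: HashMap::from([(7u64, vec![key.clone()])]),
+        specs: HashMap::from([(key, "apply_1 = \\x -> x".to_string())]),
+    };
+    assert!(caches.lookup(7).is_some());  // cached: load, don't recompute
+    assert!(caches.lookup(8).is_none());  // miss: compute specializations fresh
+    println!("hit: {:?}", caches.lookup(7).unwrap());
+}
+```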
+
+## function_lift
+
+TODO
+
+Basic transformation
+
+Note: making sure to re-write symbols of parameters that are lifted up and
+instantiating new types
+
+Extension - trivial compilation from here with erased types
+
+## function_solve
+
+TODO
+
+Instantiating the new program - must instantiate new types if didn't do so before
+Caching types during instantiation
+
+Type inference algo
+
+Fixing up capture types, if needed
+
+## function_specialize
+
+TODO
+
+Type lowering
+
+Specializations, deduplication + checking for equality
+
+Instantiating new symbols in copied functions
+
+Interning: is it important?
+
+Type caching within a function
+
+Extension: handling erasure here
+
+## lower_ir
+
+TODO
+
+Lowering to layouts
+
+Simple lowering of shape
+
+Extension: lowering patterns before this, or after?
+
+## Extensions
+
+TODO
+
+Supporting per-module compilation
+
+Supporting caching
+
+What can be combined? What cannot?