Update FAQ on initialization order #3017

Merged
merged 2 commits into from
Feb 22, 2025
190 changes: 103 additions & 87 deletions _overviews/FAQ/initialization-order.md
@@ -7,158 +7,174 @@ permalink: /tutorials/FAQ/:title.html

## Example

To understand the problem, let's pick the following concrete example.
The following example illustrates how classes in a subclass relation
witness the initialization of two fields which are inherited from
their top-most parent. The values are printed during the constructor
of each class, that is, when an instance is initialized.

abstract class A {
val x1: String
val x2: String = "mom"

println("A: " + x1 + ", " + x2)
println(s"A: $x1, $x2")
}
class B extends A {
val x1: String = "hello"

println("B: " + x1 + ", " + x2)
println(s"B: $x1, $x2")
}
class C extends B {
override val x2: String = "dad"

println("C: " + x1 + ", " + x2)
println(s"C: $x1, $x2")
}

Let's observe the initialization order through the Scala REPL:
In the Scala REPL we observe:

scala> new C
A: null, null
B: hello, null
C: hello, dad

Only when we get to the constructor of `C` are both `x1` and `x2` initialized. Therefore, constructors of `A` and `B` risk running into `NullPointerException`s.
Only when we get to the constructor of `C` are both `x1` and `x2` properly initialized.
Therefore, constructors of `A` and `B` risk running into `NullPointerException`s,
since fields are null-valued until set by a constructor.

## Explanation
A 'strict' or 'eager' val is one which is not marked lazy.

In the absence of "early definitions" (see below), initialization of strict vals is done in the following order.
A "strict" or "eager" val is a `val` which is not a `lazy val`.
Initialization of strict vals is done in the following order:

1. Superclasses are fully initialized before subclasses.
2. Otherwise, in declaration order.

Naturally when a val is overridden, it is not initialized more than once. So though x2 in the above example is seemingly defined at every point, this is not the case: an overridden val will appear to be null during the construction of superclasses, as will an abstract val.

There is a compiler flag which can be useful for identifying this situation:

**-Xcheckinit**: Add runtime check to field accessors.

It is inadvisable to use this flag outside of testing. It adds significantly to the code size by putting a wrapper around all potentially uninitialized field accesses: the wrapper will throw an exception rather than allow a null (or 0/false in the case of primitive types) to silently appear. Note also that this adds a *runtime* check: it can only tell you anything about code paths which you exercise with it in place.

Using it on the opening example:

% scalac -Xcheckinit a.scala
% scala -e 'new C'
scala.UninitializedFieldError: Uninitialized field: a.scala: 13
at C.x2(a.scala:13)
at A.<init>(a.scala:5)
at B.<init>(a.scala:7)
at C.<init>(a.scala:12)

### Solutions ###
2. Within the body or "template" of a class, vals are initialized in declaration order,
the order in which they are written in source.
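
The declaration-order rule can be illustrated with a minimal sketch. The names `DeclarationOrderDemo`, `Pair`, `first`, and `second` are hypothetical and not part of the FAQ; a compiler may warn about the forward reference.

```scala
object DeclarationOrderDemo {
  class Pair {
    // `first` is declared before `second`, so its initializer runs first
    // and reads `second` while that field still holds the default null.
    val first: String = s"second is $second"
    val second: String = "ready"
  }

  // Returns the two field values as seen after construction has finished.
  def run(): (String, String) = {
    val p = new Pair
    (p.first, p.second)
  }
}
```

Here `run()` should return `("second is null", "ready")`: swapping the two declarations would let `first` observe the initialized value.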

When a `val` is overridden, it's more precise to say that its accessor method (the "getter") is overridden.
So the access to `x2` in class `A` invokes the overridden getter in class `C`.
That getter reads the underlying field `C.x2`.
This field is not yet initialized during the construction of `A`.
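
As a minimal sketch of this dispatch, the following records what each constructor observes. The names `GetterDemo`, `Base`, `Derived`, `greeting`, and `seen` are illustrative assumptions, not taken from the FAQ.

```scala
import scala.collection.mutable.ListBuffer

object GetterDemo {
  // Collects what each constructor observes, in order.
  val seen: ListBuffer[String] = ListBuffer.empty[String]

  class Base {
    val greeting: String = "mom"
    // Calls the overridden getter in Derived, whose field is not yet set.
    seen += s"Base sees: $greeting"
  }

  class Derived extends Base {
    override val greeting: String = "dad"
    seen += s"Derived sees: $greeting"
  }

  def run(): List[String] = {
    seen.clear()
    new Derived
    seen.toList
  }
}
```

Calling `GetterDemo.run()` should yield `List("Base sees: null", "Derived sees: dad")`: the value `"mom"` assigned in `Base` is never observed through the getter, because the getter always reads the field declared in `Derived`.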

## Mitigation

The [`-Wsafe-init` compiler flag](https://docs.scala-lang.org/scala3/reference/other-new-features/safe-initialization.html)
in Scala 3 enables a compile-time warning for accesses to uninitialized fields:

-- Warning: Test.scala:8:6 -----------------------------------------------------
8 | val x1: String = "hello"
| ^
| Access non-initialized value x1. Calling trace:
| ├── class B extends A { [ Test.scala:7 ]
| │ ^
| ├── abstract class A { [ Test.scala:1 ]
| │ ^
| └── println(s"A: $x1, $x2") [ Test.scala:5 ]
| ^^

In Scala 2, the `-Xcheckinit` flag adds runtime checks in the generated bytecode to identify accesses of uninitialized fields.
That code throws an exception when an uninitialized field is referenced
that would otherwise be used as a `null` value (or `0` or `false` in the case of primitive types).
Note that these runtime checks only report code that is actually executed at runtime.
Although these checks can be helpful to find accesses to uninitialized fields during development,
it is never advisable to enable them in production code due to the performance cost.

## Solutions

Approaches for avoiding null values include:

#### Use lazy vals ####

abstract class A {
val x1: String
lazy val x2: String = "mom"
### Use class / trait parameters

abstract class A(val x1: String, val x2: String = "mom") {
println("A: " + x1 + ", " + x2)
}
class B extends A {
lazy val x1: String = "hello"

class B(x1: String = "hello", x2: String = "mom") extends A(x1, x2) {
println("B: " + x1 + ", " + x2)
}
class C extends B {
override lazy val x2: String = "dad"

class C(x2: String = "dad") extends B(x2 = x2) {
println("C: " + x1 + ", " + x2)
}
// scala> new C
// A: hello, dad
// B: hello, dad
// C: hello, dad

Usually the best answer. Unfortunately you cannot declare an abstract lazy val. If that is what you're after, your options include:
Values passed as parameters to the superclass constructor are available in its body.

1. Declare an abstract strict val, and hope subclasses will implement it as a lazy val or with an early definition. If they do not, it will appear to be uninitialized at some points during construction.
2. Declare an abstract def, and hope subclasses will implement it as a lazy val. If they do not, it will be re-evaluated on every access.
3. Declare a concrete lazy val which throws an exception, and hope subclasses override it. If they do not, it will... throw an exception.
Scala 3 also [supports trait parameters](https://docs.scala-lang.org/scala3/reference/other-new-features/trait-parameters.html).

An exception during initialization of a lazy val will cause the right-hand side to be re-evaluated on the next access: see SLS 5.2.
Note that overriding a `val` class parameter is deprecated / disallowed in Scala 3.
Doing so in Scala 2 can lead to surprising behavior.

Note that using multiple lazy vals creates a new risk: cycles among lazy vals can result in a stack overflow on first access.
### Use lazy vals

#### Use early definitions ####
abstract class A {
val x1: String
val x2: String = "mom"
lazy val x1: String
lazy val x2: String = "mom"

println("A: " + x1 + ", " + x2)
}
class B extends {
val x1: String = "hello"
} with A {
class B extends A {
lazy val x1: String = "hello"

println("B: " + x1 + ", " + x2)
}
class C extends {
override val x2: String = "dad"
} with B {
class C extends B {
override lazy val x2: String = "dad"

println("C: " + x1 + ", " + x2)
}
// scala> new C
// A: hello, dad
// B: hello, dad
// C: hello, dad

Early definitions are a bit unwieldy, there are limitations as to what can appear and what can be referenced in an early definitions block, and they don't compose as well as lazy vals: but if a lazy val is undesirable, they present another option. They are specified in SLS 5.1.6.
Note that abstract `lazy val`s are supported in Scala 3, but not in Scala 2.
In Scala 2, you can define an abstract `val` or `def` instead.
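
A sketch of the abstract `def` approach, which should compile under both Scala 2 and Scala 3. The names `AbstractDefDemo`, `Greeter`, `LazyWorld`, and `StrictWorld` are hypothetical.

```scala
object AbstractDefDemo {
  abstract class Greeter {
    def name: String // abstract def standing in for an abstract lazy val
    // Evaluated during Greeter's constructor, before subclass fields are set.
    val greeting: String = s"hello, $name"
  }

  // A lazy val may implement the def; it is initialized on first access.
  class LazyWorld extends Greeter {
    lazy val name: String = "world"
  }

  // A strict val may also implement the def, but it is still null
  // while Greeter's constructor runs.
  class StrictWorld extends Greeter {
    val name: String = "world"
  }

  def lazyGreeting(): String = (new LazyWorld).greeting
  def strictGreeting(): String = (new StrictWorld).greeting
}
```

With these definitions, `lazyGreeting()` should return `"hello, world"` and `strictGreeting()` should return `"hello, null"`, which shows that the abstract `def` only helps when subclasses actually implement it lazily.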

Note that early definitions are deprecated in Scala 2.13; they will be replaced by trait parameters in Scala 3. So, early definitions are not recommended for use if future compatibility is a concern.
An exception during initialization of a lazy val will cause the right-hand side to be re-evaluated on the next access; see SLS 5.2.

#### Use constant value definitions ####
abstract class A {
val x1: String
val x2: String = "mom"
Note that using multiple lazy vals incurs a new risk: cycles among lazy vals can result in a stack overflow on first access.
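Lazy vals are thread-safe by default, so cyclic access from multiple threads also risks deadlock; in Scala 3, a lazy val can opt out of thread safety with the `@threadUnsafe` annotation.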

println("A: " + x1 + ", " + x2)
}
class B extends A {
val x1: String = "hello"
final val x3 = "goodbye"
### Use a nested object
Review comment (Member): Isn't that basically the same solution as a lazy val?

Reply (Contributor): no it is much shorter to type object than lazy val, even discounting the ease of striking the space bar


println("B: " + x1 + ", " + x2)
}
class C extends B {
override val x2: String = "dad"
For purposes of initialization, an object that is not top-level is the same as a lazy val.

println("C: " + x1 + ", " + x2)
There may be reasons to prefer a lazy val, for example to specify the type of an implicit value,
or an object where it is a companion to a class. Otherwise, the most convenient syntax may be preferred.

As an example, uninitialized state in a subclass may be accessed during construction of a superclass:

class Adder {
var sum = 0
def add(x: Int): Unit = sum += x
add(1) // in LogAdder, the `added` set is not initialized yet
}
class LogAdder extends Adder {
private var added: Set[Int] = Set.empty
override def add(x: Int): Unit = { added += x; super.add(x) }
}
abstract class D {
val c: C
val x3 = c.x3 // no exceptions!
println("D: " + c + " but " + x3)

In this case, the state can be initialized on demand by wrapping it in a local object:

class Adder {
var sum = 0
def add(x: Int): Unit = sum += x
add(1)
}
class E extends D {
val c = new C
println(s"E: ${c.x1}, ${c.x2}, and $x3...")
class LogAdder extends Adder {
private object state {
var added: Set[Int] = Set.empty
}
import state._
override def add(x: Int): Unit = { added += x; super.add(x) }
}
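
The two `LogAdder` variants can be compared side by side in a small harness. This is a sketch only: the names `NestedObjectDemo`, `StrictLogAdder`, `ObjectLogAdder`, and `log` are hypothetical additions, and the failure mode assumed for the strict variant is a `NullPointerException` from calling `+` on the uninitialized set.

```scala
import scala.util.Try

object NestedObjectDemo {
  class Adder {
    var sum = 0
    def add(x: Int): Unit = sum += x
    add(1) // runs before any subclass field has been initialized
  }

  // Strict state: `added` is still null when Adder's constructor calls add(1).
  class StrictLogAdder extends Adder {
    private var added: Set[Int] = Set.empty
    override def add(x: Int): Unit = { added += x; super.add(x) }
  }

  // State wrapped in a nested object, which is initialized on first access.
  class ObjectLogAdder extends Adder {
    private object state {
      var added: Set[Int] = Set.empty
    }
    import state._
    override def add(x: Int): Unit = { added += x; super.add(x) }
    def log: Set[Int] = added // hypothetical accessor, for inspection only
  }

  // True if constructing the strict variant fails with a NullPointerException.
  def strictFails(): Boolean =
    Try(new StrictLogAdder).failed.toOption
      .exists(_.isInstanceOf[NullPointerException])

  // Constructs the object-based variant, adds one more value, returns the log.
  def objectLog(): Set[Int] = {
    val adder = new ObjectLogAdder
    adder.add(2)
    adder.log
  }
}
```

Under these assumptions, `strictFails()` should be `true`, while `objectLog()` should return `Set(1, 2)`, including the value added during superclass construction.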
//scala> new E
//D: null but goodbye
//A: null, null
//B: hello, null
//C: hello, dad
//E: hello, dad, and goodbye...

Sometimes all you need from an interface is a compile-time constant.
### Early definitions: deprecated

Scala 2 supports early definitions, but they are deprecated in Scala 2.13 and unsupported in Scala 3.
See the [migration guide](https://docs.scala-lang.org/scala3/guides/migration/incompat-dropped-features.html#early-initializer) for more information.

Constant value definitions (specified in SLS 4.1 and available in Scala 2)
and inlined definitions (in Scala 3) can work around initialization order issues
because they can supply constant values without evaluating an instance that is not yet initialized.

Constant values are stricter than strict and earlier than early definitions and have even more limitations,
Review comment (Contributor): Omitting compile-time mechanisms to work around runtime issues seems important. Probably more is available in Scala 3 besides final vals.

Let's also respect that this was the original one-question FAQ. Maybe a minute of silence?

Reply (Contributor): Just noticed definitinos! Either I want the t-shirt, or I want to name the next subatomic particle.

Definitinos allow the Scala Quantum Matrix (SQuaM) to infer your classically ambiguous implicit.

as they must be constants. They are specified in SLS 4.1.