From 0d68382c13061d1d07816580efc82a86d66abbd9 Mon Sep 17 00:00:00 2001
From: Lukas Rytz
Date: Tue, 7 May 2024 19:21:51 +0200
Subject: [PATCH 1/2] Update FAQ on initialization order

---
 _overviews/FAQ/initialization-order.md | 148 ++++++++++++-------------
 1 file changed, 72 insertions(+), 76 deletions(-)

diff --git a/_overviews/FAQ/initialization-order.md b/_overviews/FAQ/initialization-order.md
index 787281f0db..3caa2b5b72 100644
--- a/_overviews/FAQ/initialization-order.md
+++ b/_overviews/FAQ/initialization-order.md
@@ -7,7 +7,7 @@ permalink: /tutorials/FAQ/:title.html
 
 ## Example
 
-To understand the problem, let's pick the following concrete example.
+The following example illustrates the problem:
 
     abstract class A {
       val x1: String
@@ -26,7 +26,7 @@ To understand the problem, let's pick the following concrete example.
       println("C: " + x1 + ", " + x2)
     }
 
-Let's observe the initialization order through the Scala REPL:
+In the Scala REPL we observe:
 
     scala> new C
     A: null, null
@@ -36,51 +36,50 @@ Let's observe the initialization order through the Scala REPL:
 Only when we get to the constructor of `C` are both `x1` and `x2` initialized. Therefore, constructors of `A` and `B` risk running into `NullPointerException`s.
 
 ## Explanation
 
-A 'strict' or 'eager' val is one which is not marked lazy.
-In the absence of "early definitions" (see below), initialization of strict vals is done in the following order.
+A "strict" or "eager" val is one which is not marked lazy.
+Initialization of strict vals is done in the following order:
 
 1. Superclasses are fully initialized before subclasses.
 2. Otherwise, in declaration order.
 
-Naturally when a val is overridden, it is not initialized more than once. So though x2 in the above example is seemingly defined at every point, this is not the case: an overridden val will appear to be null during the construction of superclasses, as will an abstract val.
+When a `val` is overridden, in fact its accessor method (the "getter") is overridden.
+So the access to `x2` in class `A` in fact invokes the overridden getter in class `C` which reads the underlying field `C.x2`.
+This field is not yet initialized during the construction of `A`.
 
-There is a compiler flag which can be useful for identifying this situation:
+## Mitigation
 
-**-Xcheckinit**: Add runtime check to field accessors.
+The [`-Ysafe-init` compiler flag](https://docs.scala-lang.org/scala3/reference/other-new-features/safe-initialization.html) in Scala 3 enables compiler warnings for accesses to uninitialized fields:
 
-It is inadvisable to use this flag outside of testing. It adds significantly to the code size by putting a wrapper around all potentially uninitialized field accesses: the wrapper will throw an exception rather than allow a null (or 0/false in the case of primitive types) to silently appear. Note also that this adds a *runtime* check: it can only tell you anything about code paths which you exercise with it in place.
+    -- Warning: Test.scala:8:6 ------------------
+    8 |  val x1: String = "hello"
+      |      ^
+      |      Access non-initialized value x1. Calling trace:
+      |      ├── class B extends A { [ Test.scala:7 ]
+      |      │   ^
+      |      ├── abstract class A { [ Test.scala:1 ]
+      |      │   ^
+      |      └── println("A: " + x1 + ", " + x2) [ Test.scala:5 ]
+      |                          ^^
 
-Using it on the opening example:
+In Scala 2, the `-Xcheckinit` flag adds runtime checks in the generated bytecode to identify accesses of uninitialized fields.
+The code then throws an exception rather than allowing a `null` (or `0` / `false` in the case of primitive types) to silently appear.
+Note that these runtime checks only test code that is actually exectued at runtime.
+The flag can be helpful to find accesses to uninitialized fields, but it should never be used in production due to its performance overhead.
 
-    % scalac -Xcheckinit a.scala
-    % scala -e 'new C'
-    scala.UninitializedFieldError: Uninitialized field: a.scala: 13
-        at C.x2(a.scala:13)
-        at A.<init>(a.scala:5)
-        at B.<init>(a.scala:7)
-        at C.<init>(a.scala:12)
-
-### Solutions ###
+## Solutions
 
 Approaches for avoiding null values include:
 
-#### Use lazy vals ####
-
-    abstract class A {
-      val x1: String
-      lazy val x2: String = "mom"
+### Use class / trait parameters
 
+    abstract class A(val x1: String, val x2: String = "mom") {
       println("A: " + x1 + ", " + x2)
     }
-    class B extends A {
-      lazy val x1: String = "hello"
-
+    class B(x1: String = "hello", x2: String = "mom") extends A(x1, x2) {
       println("B: " + x1 + ", " + x2)
     }
-    class C extends B {
-      override lazy val x2: String = "dad"
-
+    class C(x2: String = "dad") extends B(x2 = x2) {
       println("C: " + x1 + ", " + x2)
     }
     // scala> new C
@@ -88,31 +87,29 @@ Approaches for avoiding null values include:
     // B: hello, dad
     // C: hello, dad
 
-Usually the best answer. Unfortunately you cannot declare an abstract lazy val. If that is what you're after, your options include:
+Values passed as parameters to the superclass constructor are available in its body.
 
-1. Declare an abstract strict val, and hope subclasses will implement it as a lazy val or with an early definition. If they do not, it will appear to be uninitialized at some points during construction.
-2. Declare an abstract def, and hope subclasses will implement it as a lazy val. If they do not, it will be re-evaluated on every access.
-3. Declare a concrete lazy val which throws an exception, and hope subclasses override it. If they do not, it will... throw an exception.
+Scala 3 also [supports trait parameters](https://docs.scala-lang.org/scala3/reference/other-new-features/trait-parameters.html).
 
-An exception during initialization of a lazy val will cause the right-hand side to be re-evaluated on the next access: see SLS 5.2.
+Note that overriding a `val` class parameter is deprecated / disallowed in Scala 3.
+Doing so in Scala 2 can lead to surprising behavior.
 
-Note that using multiple lazy vals creates a new risk: cycles among lazy vals can result in a stack overflow on first access.
+### Use lazy vals
 
-#### Use early definitions ####
     abstract class A {
-      val x1: String
-      val x2: String = "mom"
+      lazy val x1: String
+      lazy val x2: String = "mom"
 
       println("A: " + x1 + ", " + x2)
     }
-    class B extends {
-      val x1: String = "hello"
-    } with A {
+    class B extends A {
+      lazy val x1: String = "hello"
+
       println("B: " + x1 + ", " + x2)
     }
-    class C extends {
-      override val x2: String = "dad"
-    } with B {
+    class C extends B {
+      override lazy val x2: String = "dad"
+
       println("C: " + x1 + ", " + x2)
     }
     // scala> new C
@@ -120,45 +117,44 @@ Note that using multiple lazy vals creates a new risk: cycles among lazy vals ca
     // B: hello, dad
     // C: hello, dad
 
-Early definitions are a bit unwieldy, there are limitations as to what can appear and what can be referenced in an early definitions block, and they don't compose as well as lazy vals: but if a lazy val is undesirable, they present another option. They are specified in SLS 5.1.6.
+Note that abstract `lazy val`s are supported in Scala 3, but not in Scala 2.
+In Scala 2, you can define an abstract `val` or `def` instead.
 
-Note that early definitions are deprecated in Scala 2.13; they will be replaced by trait parameters in Scala 3. So, early definitions are not recommended for use if future compatibility is a concern.
+An exception during initialization of a lazy val will cause the right-hand side to be re-evaluated on the next access: see SLS 5.2.
 
-#### Use constant value definitions ####
-    abstract class A {
-      val x1: String
-      val x2: String = "mom"
+Note that using multiple lazy vals creates a new risk: cycles among lazy vals can result in a stack overflow on first access.
 
-      println("A: " + x1 + ", " + x2)
-    }
-    class B extends A {
-      val x1: String = "hello"
-      final val x3 = "goodbye"
+### Use a nested object
 
-      println("B: " + x1 + ", " + x2)
-    }
-    class C extends B {
-      override val x2: String = "dad"
+Sometimes, uninitialized state in a subclass is accessed during construction of a superclass:
 
-      println("C: " + x1 + ", " + x2)
+    class Adder {
+      var sum = 0
+      def add(x: Int): Unit = sum += x
+      add(1)
     }
-    abstract class D {
-      val c: C
-      val x3 = c.x3 // no exceptions!
-      println("D: " + c + " but " + x3)
+    class LogAdder extends Adder {
+      private var added: Set[Int] = Set.empty
+      override def add(x: Int): Unit = { added += x; super.add(x) }
     }
-    class E extends D {
-      val c = new C
-      println(s"E: ${c.x1}, ${c.x2}, and $x3...")
+
+In this case the state can be initialized on demand by wrapping it into a local object:
+
+    class Adder {
+      var sum = 0
+      def add(x: Int): Unit = sum += x
+      add(1)
     }
-    //scala> new E
-    //D: null but goodbye
-    //A: null, null
-    //B: hello, null
-    //C: hello, dad
-    //E: hello, dad, and goodbye...
+    class LogAdder extends Adder {
+      private object state {
+        var added: Set[Int] = Set.empty
+      }
+      import state._
+      override def add(x: Int): Unit = { added += x; super.add(x) }
+    }
+
+### Early definitions: deprecated
 
-Sometimes all you need from an interface is a compile-time constant.
+Scala 2 supports early definitinos, but they are deprecated in Scala 2.13 and unsupported in Scala 3.
+See the [migration guide](https://docs.scala-lang.org/scala3/guides/migration/incompat-dropped-features.html#early-initializer) for more information.
 
-Constant values are stricter than strict and earlier than early definitions and have even more limitations,
-as they must be constants. They are specified in SLS 4.1.
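Review note on the "Explanation" and "Use class / trait parameters" text in this patch: the getter-override behaviour can be checked with a small sketch. It is a minimal illustration, not part of the patch. `A`, `B` and `C` mirror the FAQ example; `seenByA`, `seenByP`, `P`, `Q`, `R` and `InitOrderSketch` are hypothetical names introduced only here. Instead of printing in the constructor, each base class records what it observed.

```scala
// Hypothetical names (not from the FAQ): seenByA, seenByP, P, Q, R, InitOrderSketch.

// Strict vals: A's constructor reads x1 and x2 through the overridden getters,
// whose backing fields in B and C have not been assigned yet.
abstract class A {
  val x1: String
  val x2: String = "mom"
  val seenByA: String = s"$x1, $x2"
}
class B extends A { val x1: String = "hello" }
class C extends B { override val x2: String = "dad" }

// Class parameters: the values are passed to the superclass constructor,
// so they are already available in its body.
abstract class P(val x1: String, val x2: String = "mom") {
  val seenByP: String = s"$x1, $x2"
}
class Q(x1: String = "hello", x2: String = "mom") extends P(x1, x2)
class R(x2: String = "dad") extends Q(x2 = x2)

object InitOrderSketch {
  def main(args: Array[String]): Unit = {
    println(new C().seenByA) // null, null
    println(new R().seenByP) // hello, dad
  }
}
```

Recording the observation in a field keeps the sketch testable without capturing standard output; the values match the REPL transcripts quoted in the FAQ.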
From 19967a96c7cf8ed92f76788cda107ceb5d377c46 Mon Sep 17 00:00:00 2001
From: Som Snytt
Date: Sat, 22 Feb 2025 02:11:11 -0800
Subject: [PATCH 2/2] More words for initialization FAQ

---
 _overviews/FAQ/initialization-order.md | 68 +++++++++++++++++---------
 1 file changed, 44 insertions(+), 24 deletions(-)

diff --git a/_overviews/FAQ/initialization-order.md b/_overviews/FAQ/initialization-order.md
index 3caa2b5b72..ebe07308c6 100644
--- a/_overviews/FAQ/initialization-order.md
+++ b/_overviews/FAQ/initialization-order.md
@@ -7,23 +7,26 @@ permalink: /tutorials/FAQ/:title.html
 
 ## Example
 
-The following example illustrates the problem:
+The following example illustrates how classes in a subclass relation
+witness the initialization of two fields which are inherited from
+their top-most parent. The values are printed during the constructor
+of each class, that is, when an instance is initialized.
 
     abstract class A {
       val x1: String
       val x2: String = "mom"
 
-      println("A: " + x1 + ", " + x2)
+      println(s"A: $x1, $x2")
     }
     class B extends A {
       val x1: String = "hello"
 
-      println("B: " + x1 + ", " + x2)
+      println(s"B: $x1, $x2")
     }
     class C extends B {
       override val x2: String = "dad"
 
-      println("C: " + x1 + ", " + x2)
+      println(s"C: $x1, $x2")
     }
 
 In the Scala REPL we observe:
@@ -33,39 +36,46 @@ In the Scala REPL we observe:
     B: hello, null
     C: hello, dad
 
-Only when we get to the constructor of `C` are both `x1` and `x2` initialized. Therefore, constructors of `A` and `B` risk running into `NullPointerException`s.
+Only when we get to the constructor of `C` are both `x1` and `x2` properly initialized.
+Therefore, constructors of `A` and `B` risk running into `NullPointerException`s,
+since fields are null-valued until set by a constructor.
 
 ## Explanation
 
-A "strict" or "eager" val is one which is not marked lazy.
+A "strict" or "eager" val is a `val` which is not a `lazy val`.
 Initialization of strict vals is done in the following order:
 
 1. Superclasses are fully initialized before subclasses.
-2. Otherwise, in declaration order.
+2. Within the body or "template" of a class, vals are initialized in declaration order,
+   the order in which they are written in source.
 
-When a `val` is overridden, in fact its accessor method (the "getter") is overridden.
-So the access to `x2` in class `A` in fact invokes the overridden getter in class `C` which reads the underlying field `C.x2`.
+When a `val` is overridden, it's more precise to say that its accessor method (the "getter") is overridden.
+So the access to `x2` in class `A` invokes the overridden getter in class `C`.
+That getter reads the underlying field `C.x2`.
 This field is not yet initialized during the construction of `A`.
 
 ## Mitigation
 
-The [`-Ysafe-init` compiler flag](https://docs.scala-lang.org/scala3/reference/other-new-features/safe-initialization.html) in Scala 3 enables compiler warnings for accesses to uninitialized fields:
+The [`-Wsafe-init` compiler flag](https://docs.scala-lang.org/scala3/reference/other-new-features/safe-initialization.html)
+in Scala 3 enables a compile-time warning for accesses to uninitialized fields:
 
-    -- Warning: Test.scala:8:6 ------------------
+    -- Warning: Test.scala:8:6 -----------------------------------------------------
     8 |  val x1: String = "hello"
       |      ^
       |      Access non-initialized value x1. Calling trace:
-      |      ├── class B extends A { [ Test.scala:7 ]
+      |      ├── class B extends A {         [ Test.scala:7 ]
       |      │   ^
-      |      ├── abstract class A { [ Test.scala:1 ]
+      |      ├── abstract class A {          [ Test.scala:1 ]
       |      │   ^
-      |      └── println("A: " + x1 + ", " + x2) [ Test.scala:5 ]
-      |                          ^^
+      |      └── println(s"A: $x1, $x2")     [ Test.scala:5 ]
+      |                        ^^
 
 In Scala 2, the `-Xcheckinit` flag adds runtime checks in the generated bytecode to identify accesses of uninitialized fields.
-The code then throws an exception rather than allowing a `null` (or `0` / `false` in the case of primitive types) to silently appear.
-Note that these runtime checks only test code that is actually exectued at runtime.
-The flag can be helpful to find accesses to uninitialized fields, but it should never be used in production due to its performance overhead.
+That code throws an exception when an uninitialized field is referenced
+that would otherwise be used as a `null` value (or `0` or `false` in the case of primitive types).
+Note that these runtime checks only report code that is actually executed at runtime.
+Although these checks can be helpful to find accesses to uninitialized fields during development,
+it is never advisable to enable them in production code due to the performance cost.
 
 ## Solutions
 
@@ -120,25 +130,31 @@ Doing so in Scala 2 can lead to surprising behavior.
 Note that abstract `lazy val`s are supported in Scala 3, but not in Scala 2.
 In Scala 2, you can define an abstract `val` or `def` instead.
 
-An exception during initialization of a lazy val will cause the right-hand side to be re-evaluated on the next access: see SLS 5.2.
+An exception during initialization of a lazy val will cause the right-hand side to be re-evaluated on the next access; see SLS 5.2.
 
-Note that using multiple lazy vals creates a new risk: cycles among lazy vals can result in a stack overflow on first access.
+Note that using multiple lazy vals incurs a new risk: cycles among lazy vals can result in a stack overflow on first access.
+When lazy vals are annotated as thread-safe in Scala 3, they risk deadlock.
 
 ### Use a nested object
 
-Sometimes, uninitialized state in a subclass is accessed during construction of a superclass:
+For purposes of initialization, an object that is not top-level is the same as a lazy val.
+
+There may be reasons to prefer a lazy val, for example to specify the type of an implicit value,
+or an object where it is a companion to a class. Otherwise, the most convenient syntax may be preferred.
+
+As an example, uninitialized state in a subclass may be accessed during construction of a superclass:
 
     class Adder {
       var sum = 0
       def add(x: Int): Unit = sum += x
-      add(1)
+      add(1) // in LogAdder, the `added` set is not initialized yet
     }
     class LogAdder extends Adder {
       private var added: Set[Int] = Set.empty
       override def add(x: Int): Unit = { added += x; super.add(x) }
     }
 
-In this case the state can be initialized on demand by wrapping it into a local object:
+In this case, the state can be initialized on demand by wrapping it in a local object:
 
     class Adder {
       var sum = 0
@@ -155,6 +171,10 @@ In this case, the state can be initialized on demand by wrapping it in a local o
 
 ### Early definitions: deprecated
 
-Scala 2 supports early definitinos, but they are deprecated in Scala 2.13 and unsupported in Scala 3.
+Scala 2 supports early definitions, but they are deprecated in Scala 2.13 and unsupported in Scala 3.
 See the [migration guide](https://docs.scala-lang.org/scala3/guides/migration/incompat-dropped-features.html#early-initializer) for more information.
 
+Constant value definitions (specified in SLS 4.1 and available in Scala 2)
+and inlined definitions (in Scala 3) can work around initialization order issues
+because they can supply constant values without evaluating an instance that is not yet initialized.
+
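Review note on the "Use a nested object" section: the two `LogAdder` variants from the FAQ text can be compared side by side in one runnable sketch. This is only an illustration under stated assumptions. The class names `EagerLogAdder` and `LazyLogAdder`, the accessor `seen`, and the object `NestedObjectSketch` are hypothetical and exist only to let both variants live in one file and be observed from outside.

```scala
import scala.util.Try

class Adder {
  var sum = 0
  def add(x: Int): Unit = sum += x
  add(1) // runs inside Adder's constructor, before any subclass field is assigned
}

// Eager field (the FAQ's first LogAdder): `added` is still null when
// Adder's constructor calls the overridden add, so `added += x` fails.
class EagerLogAdder extends Adder {
  private var added: Set[Int] = Set.empty
  override def add(x: Int): Unit = { added += x; super.add(x) }
}

// Nested object (the FAQ's second LogAdder): `state` is created on first
// access, so it is ready even during Adder's constructor.
class LazyLogAdder extends Adder {
  private object state {
    var added: Set[Int] = Set.empty
  }
  import state._
  override def add(x: Int): Unit = { added += x; super.add(x) }
  def seen: Set[Int] = added // hypothetical accessor, for inspection only
}

object NestedObjectSketch {
  def main(args: Array[String]): Unit = {
    println(Try(new EagerLogAdder).isFailure) // true (NullPointerException)
    val ok = new LazyLogAdder
    ok.add(2)
    println(ok.seen) // Set(1, 2)
    println(ok.sum)  // 3
  }
}
```

The nested object behaves like a lazy val for initialization purposes, which is why the call made from the superclass constructor already sees an empty set rather than `null`.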