The current architecture has two pain points:

- The trivia-before-unindent problem: `expressionExceedsPageWidth` constructs `beforeLong +> expr +> afterLong` where `afterLong = unindent`. Trailing trivia (comments) emitted inside `expr` by `leaveNode` interact badly with the `unindent`: there's no clean place to splice `UnIndentBy` into the event stream. The current fix (`captureTrailingTriviaEvents`) requires releasing trivia from nodes, running a dummy context, capturing events, splicing, and replaying. This needs to happen at each of 45+ call sites.
- ShortExpression double-execution: `expressionExceedsPageWidth` runs `expr` once in `ShortExpression` mode, checks if it fits, and if not, runs `expr` again in normal mode. The `colWithNlnWhenItemIsMultiline` pattern similarly needs to know if items are multiline before deciding separators.

Both problems share a root cause: the writer event stream is append-only and immutable, so you can't go back and modify what was already emitted.
See comment-after-design.md for the concrete example that motivated this rethink (trailing comments before closing ]/} brackets).
Replace the immutable `Queue<WriterEvent>` with a mutable doubly-linked list (DLL). Defer string materialization to the end. During `CodePrinter`, only track lightweight metadata (line count, column, indent) for formatting decisions.
We use a custom DLL rather than `System.Collections.Generic.LinkedList<T>`. The key operation that `LinkedList<T>` cannot support is O(1) truncation (restore): its node links are encapsulated, so truncating k trailing events takes k `RemoveLast()` calls, i.e. O(k). For a large expression that doesn't fit on one line, k could be thousands of events.
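To make the truncation cost concrete, here is a minimal sketch against the BCL `LinkedList<T>`. The helper name `truncateAfter` and the use of `int` as a stand-in for events are assumptions made for illustration; the point is only that dropping a suffix needs one `RemoveLast()` call per node.

```fsharp
open System.Collections.Generic

/// Hypothetical helper: drop every node after `snapshot` from a BCL LinkedList.
/// Returns the number of RemoveLast() calls, which grows with the suffix length.
/// Assumes `snapshot` belongs to `list`.
let truncateAfter (list: LinkedList<int>) (snapshot: LinkedListNode<int>) =
    let mutable removed = 0
    while not (obj.ReferenceEquals(list.Last, snapshot)) do
        list.RemoveLast()
        removed <- removed + 1
    removed

let bclEvents = LinkedList<int>()
let bclSnapshot = bclEvents.AddLast(0)   // the node we want to rewind to

for i in 1..1000 do
    bclEvents.AddLast(i) |> ignore       // speculative events appended afterwards

let removeCalls = truncateAfter bclEvents bclSnapshot
printfn "RemoveLast calls: %d, remaining nodes: %d" removeCalls bclEvents.Count
```

With direct access to the `Next` link, the same rewind is a single pointer assignment regardless of how many events were appended.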
The custom DLL is ~50 lines and does exactly what we need. No Count field — nothing uses it. The two "cursor" use cases (scan new events, restore to snapshot) both work via saved node references.

    [<AllowNullLiteral>]
    type EventNode(event: WriterEvent) =
        member val Event = event with get, set
        member val Prev: EventNode = null with get, set
        member val Next: EventNode = null with get, set

    type EventList() =
        member val Head: EventNode = null with get, set
        member val Tail: EventNode = null with get, set

        /// O(1) append: returns the node for future reference
        member this.Append(event: WriterEvent) =
            let node = EventNode(event)

            if isNull this.Tail then
                this.Head <- node
                this.Tail <- node
            else
                node.Prev <- this.Tail
                this.Tail.Next <- node
                this.Tail <- node

            node

        /// O(1) insert after a given node
        member this.InsertAfter(after: EventNode, event: WriterEvent) =
            let node = EventNode(event)
            node.Prev <- after
            node.Next <- after.Next

            if not (isNull after.Next) then
                after.Next.Prev <- node
            else
                this.Tail <- node

            after.Next <- node
            node

        /// O(1) insert before a given node
        member this.InsertBefore(before: EventNode, event: WriterEvent) =
            let node = EventNode(event)
            node.Next <- before
            node.Prev <- before.Prev

            if not (isNull before.Prev) then
                before.Prev.Next <- node
            else
                this.Head <- node

            before.Prev <- node
            node

        /// O(1) remove
        member this.Remove(node: EventNode) =
            if not (isNull node.Prev) then node.Prev.Next <- node.Next else this.Head <- node.Next
            if not (isNull node.Next) then node.Next.Prev <- node.Prev else this.Tail <- node.Prev

        /// O(1) snapshot: save a reference to the current tail
        member this.Snapshot() : EventNode = this.Tail

        /// O(1) restore: truncate everything after the snapshot point
        member this.Restore(snapshot: EventNode) =
            if isNull snapshot then
                this.Head <- null
                this.Tail <- null
            else
                snapshot.Next <- null
                this.Tail <- snapshot

`Context` holds one `EventList` instance. All formatting operations append to the same DLL. This replaces the current `WriterEvents: Queue<WriterEvent>`.
Because the DLL is shared and mutable, speculative execution (short-expression checks, dummy probes) must use snapshot/restore rather than re-running on a separate context. A missed restore corrupts the event stream.
Two use cases, both via saved node references:

- Non-destructive scan (cursor): save tail reference, run `expr`, walk from `savedNode.Next` forward to inspect new events. Keep events. Used by `isMultilineItem`.
- Destructive restore (rewind): save tail reference + full `WriterModel`, run `expr`, check result, either keep or truncate DLL back to saved node and restore `WriterModel`. Used by `expressionExceedsPageWidth` and `futureNlnCheck`/`exceedsWidth` (dummy probes).

Nested speculative execution (e.g. outer expressionExceedsPageWidth contains inner expressionExceedsPageWidth) works naturally: snapshots are ordered along the DLL. Restoring an outer snapshot implicitly discards everything the inner level did. The call stack enforces LIFO ordering.
A debug assertion should verify that the snapshot node is still reachable from the tail on restore — this catches misuse where a child snapshot is used after its parent was restored.
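One way such a check could look is sketched below. The `Node` type and the `isReachableFromTail` helper are hypothetical stand-ins, not the actual `EventNode` API; the walk is O(n), so it would presumably only run in debug builds.

```fsharp
// Minimal stand-in for EventNode (assumption): only the Prev link matters here.
[<AllowNullLiteral>]
type Node(id: int) =
    member _.Id = id
    member val Prev: Node = null with get, set

/// Hypothetical check: is `snapshot` reachable by following Prev links from `tail`?
/// A null snapshot means "empty list" and is always considered valid.
let isReachableFromTail (tail: Node) (snapshot: Node) =
    let rec walk (node: Node) =
        if obj.ReferenceEquals(node, snapshot) then true
        elif isNull node then false
        else walk node.Prev

    walk tail

/// Helper for the demo: create a node linked after `prev`.
let link (prev: Node) (n: int) =
    let node = Node(n)
    node.Prev <- prev
    node

let a = link null 1   // outer snapshot taken here
let b = link a 2      // inner snapshot taken here
let c = link b 3

// Before any restore, the tail is `c` and both snapshots are valid.
let outerValid = isReachableFromTail c a
// After restoring the outer snapshot, the tail is `a`; the inner snapshot is stale.
let innerValidAfterOuterRestore = isReachableFromTail a b
printfn "outer valid: %b, inner valid after outer restore: %b" outerValid innerValidAfterOuterRestore
```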

    type Snapshot = { Node: EventNode; Model: WriterModel }

On restore: truncate DLL to `Node`, reset `WriterModel` to `Model`. The `Mode` field (e.g. `ShortExpression`) is handled explicitly by the caller after restore, same as today.
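The destructive restore pattern can be sketched as follows. This is a simplified model under stated assumptions: events are plain strings, the model tracks only a column, and `Writer` and `tryShort` are hypothetical names rather than the real `Context` API.

```fsharp
// Simplified stand-ins (assumptions): the real WriterEvent/WriterModel are richer.
[<AllowNullLiteral>]
type Node(text: string) =
    member _.Text = text
    member val Prev: Node = null with get, set
    member val Next: Node = null with get, set

type Model = { Column: int }
type Snapshot = { Node: Node; Model: Model }

type Writer() =
    member val Tail: Node = null with get, set
    member val Model: Model = { Column = 0 } with get, set

    /// Append an event and update the metadata in the same step.
    member this.Write(text: string) =
        let node = Node(text)
        node.Prev <- this.Tail
        if not (isNull this.Tail) then this.Tail.Next <- node
        this.Tail <- node
        this.Model <- { Column = this.Model.Column + text.Length }

    /// O(1): remember the current tail and model.
    member this.Snapshot() = { Node = this.Tail; Model = this.Model }

    /// O(1): cut the list after the saved node and reset the model.
    member this.Restore(snapshot: Snapshot) =
        if not (isNull snapshot.Node) then snapshot.Node.Next <- null
        this.Tail <- snapshot.Node
        this.Model <- snapshot.Model

/// Hypothetical probe: run `expr` and keep its events only if the line still fits.
let tryShort (maxWidth: int) (expr: Writer -> unit) (writer: Writer) =
    let snapshot = writer.Snapshot()
    expr writer
    let fits = writer.Model.Column <= maxWidth
    if not fits then writer.Restore(snapshot)
    fits

let writer = Writer()
writer.Write "let x = "
let kept = tryShort 20 (fun w -> w.Write "someVeryLongExpression") writer
printfn "kept=%b tail=%s column=%d" kept writer.Tail.Text writer.Model.Column
```

Because the probe writes into the shared list, forgetting the `Restore` call on the failing path would leave the speculative events in place, which is the corruption risk described above.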
futureNlnCheck and exceedsWidth currently create a throwaway context with Queue.empty. With the DLL, they instead snapshot, run the probe, read the answer from WriterModel, and restore. No separate collection needed. WithDummy simplifies to setting Mode = Dummy (and MaxLineLength = Int32.MaxValue) without swapping the event collection.
ShortExpression mode provides early-exit optimization: once ConfirmedMultiline is set, WriterModel.update stops updating the model, and nested expressionExceedsPageWidth calls short-circuit entirely. This remains valuable with the DLL — events still append (and get discarded on restore), but the metadata freeze prevents unnecessary work in nested calls.
Replace Lines: string list with LineCount: int. The Lines content is only inspected in two places during formatting (genTrivia and addFinalNewline), and both can derive what they need by walking backward from the DLL tail to the nearest newline event.

    type WriterModel =
        { LineCount: int
          Column: int
          Indent: int
          AtColumn: int
          WriteBeforeNewline: string
          Mode: WriterModelMode }

`WriteBeforeNewline` stays as a model field for now; changing it isn't needed to solve the core problems.

- Metadata update (runs during formatting): `WriterModel.update` processes each event as it's appended, updating `LineCount`, `Column`, `Indent`, `AtColumn`, `WriteBeforeNewline`, and `ShortExpression` mode state. No string building.
- String materialization (runs once at the end): `dump` walks the DLL head-to-tail, building output strings. This is essentially a second pass that only cares about producing the final text.

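The split between the two passes can be illustrated with a reduced model. The event cases, the three-field `Model`, and the rule that a newline resets the column to the current indent are assumptions for this sketch; a plain F# list stands in for the DLL.

```fsharp
// Reduced event and model types (assumptions): only what the sketch needs.
type Event =
    | Write of string
    | WriteLine
    | IndentBy of int
    | UnIndentBy of int

type Model = { LineCount: int; Column: int; Indent: int }

/// Phase 1: cheap metadata update per event; no strings are built.
let update (model: Model) ev =
    match ev with
    | Write text -> { model with Column = model.Column + text.Length }
    | WriteLine -> { model with LineCount = model.LineCount + 1; Column = model.Indent }
    | IndentBy n -> { model with Indent = model.Indent + n }
    | UnIndentBy n -> { model with Indent = max 0 (model.Indent - n) }

/// Phase 2: materialize the text once, after all formatting decisions are made.
let dump (events: Event list) =
    let sb = System.Text.StringBuilder()
    let mutable indent = 0

    for ev in events do
        match ev with
        | Write text -> sb.Append(text) |> ignore
        | WriteLine -> sb.Append('\n').Append(String.replicate indent " ") |> ignore
        | IndentBy n -> indent <- indent + n
        | UnIndentBy n -> indent <- max 0 (indent - n)

    sb.ToString()

let sampleEvents = [ Write "let x ="; IndentBy 4; WriteLine; Write "1"; UnIndentBy 4 ]
let finalModel = List.fold update { LineCount = 1; Column = 0; Indent = 0 } sampleEvents
let text = dump sampleEvents
printfn "%A" finalModel
printfn "%s" text
```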
When genTrivia needs to check "does the current line have content?" or "is the last character a space?", walk backward from the DLL tail to the nearest newline event, collecting Write/WriteComment texts. This replaces List.tryHead on Lines. The walk is bounded by the number of events on the current line — typically a handful.
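A minimal sketch of that backward walk, assuming a reduced event type with only `Write` and `WriteLine` and a hypothetical `currentLine` helper (the real code would also handle `WriteComment` and the other newline cases):

```fsharp
// Reduced event type (assumption): the real WriterEvent has more cases.
type Event =
    | Write of string
    | WriteLine

[<AllowNullLiteral>]
type Node(ev: Event) =
    member _.Event = ev
    member val Prev: Node = null with get, set

/// Collect the text of the current line by walking backward from the tail
/// until the nearest WriteLine (or the start of the list).
let currentLine (tail: Node) =
    let rec loop (node: Node) acc =
        if isNull node then
            acc
        else
            match node.Event with
            | WriteLine -> acc
            | Write text -> loop node.Prev (text :: acc)

    loop tail [] |> String.concat ""

/// Helper for the demo: link a new node after `tail` and return it as the new tail.
let append (tail: Node) ev =
    let node = Node(ev)
    node.Prev <- tail
    node

let tail =
    [ Write "let a"; WriteLine; Write "let b"; Write " " ]
    |> List.fold append null

let line = currentLine tail
let lineHasContent = line.Trim().Length > 0
let lastCharIsSpace = line.EndsWith " "
printfn "line=%A hasContent=%b lastCharIsSpace=%b" line lineHasContent lastCharIsSpace
```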
Instead of the release/capture/replay dance:

- `genExpr lastItem` runs normally: `leaveNode` emits comment + `WriteLineBecauseOfTrivia` into the DLL
- Walk backward from `events.Tail` to find the trivia boundary (`WriteComment` + `WriteLineBecauseOfTrivia` pattern)
- Insert `UnIndentBy` between the comment and the trailing newline, an O(1) in-place splice
- No dummy context, no `ReleaseContentAfter`, no `captureTrailingTriviaEvents`

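The steps above can be sketched as follows. The reduced `Event` type, the `Events` class (tail-only, no head tracking), and the `spliceUnindent` helper are assumptions for illustration; the sketch only recognizes the simplest boundary, a single comment followed by one trivia newline.

```fsharp
// Reduced event type (assumption): only the cases this sketch needs.
type Event =
    | Write of string
    | WriteComment of string
    | WriteLineBecauseOfTrivia
    | UnIndentBy of int

[<AllowNullLiteral>]
type Node(ev: Event) =
    member _.Event = ev
    member val Prev: Node = null with get, set
    member val Next: Node = null with get, set

type Events() =
    member val Tail: Node = null with get, set

    member this.Append(ev: Event) =
        let node = Node(ev)
        node.Prev <- this.Tail
        if not (isNull this.Tail) then this.Tail.Next <- node
        this.Tail <- node

    /// O(1): relink the neighbours of `before` around a new node.
    member this.InsertBefore(before: Node, ev: Event) =
        let node = Node(ev)
        node.Next <- before
        node.Prev <- before.Prev
        if not (isNull before.Prev) then before.Prev.Next <- node
        before.Prev <- node

let isComment ev = match ev with WriteComment _ -> true | _ -> false

/// Hypothetical splice: if the stream ends with WriteComment + WriteLineBecauseOfTrivia,
/// put UnIndentBy between them; otherwise append it at the end.
let spliceUnindent (size: int) (events: Events) =
    let tail = events.Tail

    let endsWithTrivia =
        not (isNull tail)
        && tail.Event = WriteLineBecauseOfTrivia
        && not (isNull tail.Prev)
        && isComment tail.Prev.Event

    if endsWithTrivia then
        events.InsertBefore(tail, UnIndentBy size)
    else
        events.Append(UnIndentBy size)

/// Demo helper: read the list back in forward order.
let toList (events: Events) =
    let rec loop (node: Node) acc =
        if isNull node then acc else loop node.Prev (node.Event :: acc)

    loop events.Tail []

let stream = Events()

[ Write "lastItem"; WriteComment "// note"; WriteLineBecauseOfTrivia ]
|> List.iter (fun ev -> stream.Append(ev))

spliceUnindent 4 stream
let spliced = toList stream
printfn "%A" spliced
```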
Replace the 4 opaque Context -> Context parameters with structured data:

    type LongExpressionLayout =
        | IndentAndUnindent        // indent +> sepNln ... unindent
        | DoubleIndentAndUnindent  // indent +> indent +> sepNln ... unindent +> unindent
        | NewlineOnly              // sepNln ... (no unindent)

The function owns the splice logic: after `expr` runs on the long path, if the layout involves unindent, walk backward from DLL tail, find the trivia boundary, splice `UnIndentBy` before the trailing newline. This fixes all ~43 call sites at once.
The beforeShort/afterShort parameters stay (they're just sepSpace/sepNone variations). Stroustrup variants bypass expressionExceedsPageWidth entirely and are unaffected.
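As a rough illustration of why structured data helps, the function can inspect the layout instead of calling opaque closures. The helpers `indentLevels` and `needsUnindentSplice` below are hypothetical and not part of the proposed API.

```fsharp
type LongExpressionLayout =
    | IndentAndUnindent
    | DoubleIndentAndUnindent
    | NewlineOnly

/// Hypothetical helper: how many indent levels the long path opens (and must close).
let indentLevels layout =
    match layout with
    | IndentAndUnindent -> 1
    | DoubleIndentAndUnindent -> 2
    | NewlineOnly -> 0

/// The trivia splice is only needed when there is something to unindent.
let needsUnindentSplice layout = indentLevels layout > 0

for layout in [ IndentAndUnindent; DoubleIndentAndUnindent; NewlineOnly ] do
    printfn "%A: levels=%d splice=%b" layout (indentLevels layout) (needsUnindentSplice layout)
```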
Two new WriterEvent cases:

    | Start        // marks the beginning of a colWithNlnWhenItemIsMultiline block
    | Placeholder  // placeholder separator between items, resolved after all items are emitted

Both are no-ops in `WriterModel.update`.
Flow:

1. Emit `Start` marker, save its DLL node reference
2. For each item, emit a `Placeholder` marker (save reference), then run the item's `expr`
3. After all items are emitted, walk backward from DLL tail to the `Start` marker
4. For each `Placeholder`, check the events between it and the next placeholder (or tail) to determine if that item was multiline
5. Replace each `Placeholder` node's event with the appropriate separator (`sepNln` or `sepNln + sepNln`)
6. Remove the `Start` marker and all resolved `Placeholder` markers

No re-execution. No dummy contexts. Items are formatted once.
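The flow can be sketched on a reduced model. Several things here are assumptions rather than the planned implementation: the `Separator of int` case stands in for `sepNln` (1) or `sepNln + sepNln` (2), the first placeholder is simply removed, and a separator is doubled when either neighbouring item is multiline.

```fsharp
type Event =
    | Write of string
    | WriteLine
    | Start
    | Placeholder
    | Separator of newlines: int   // stand-in for sepNln (1) or sepNln + sepNln (2)

[<AllowNullLiteral>]
type Node(ev: Event) =
    member val Event = ev with get, set
    member val Prev: Node = null with get, set
    member val Next: Node = null with get, set

type Events() =
    member val Tail: Node = null with get, set

    member this.Append(ev: Event) =
        let node = Node(ev)
        node.Prev <- this.Tail
        if not (isNull this.Tail) then this.Tail.Next <- node
        this.Tail <- node
        node

    member this.Remove(node: Node) =
        if not (isNull node.Prev) then node.Prev.Next <- node.Next
        if not (isNull node.Next) then node.Next.Prev <- node.Prev else this.Tail <- node.Prev

/// Walk backward from the tail to `start`, pairing each Placeholder with
/// whether the item that follows it contains a newline.
let collectItems (events: Events) (start: Node) =
    let rec loop (node: Node) sawNewline acc =
        if isNull node || obj.ReferenceEquals(node, start) then
            acc
        else
            match node.Event with
            | Placeholder -> loop node.Prev false ((node, sawNewline) :: acc)
            | WriteLine -> loop node.Prev true acc
            | _ -> loop node.Prev sawNewline acc

    loop events.Tail false []

/// Resolve every placeholder in place, then drop the Start marker.
let resolve (events: Events) (start: Node) =
    let items = collectItems events start

    match items with
    | [] -> ()
    | (first, _) :: _ ->
        events.Remove(first) // no separator before the first item (assumption)

        for ((_, prevMultiline), (node, multiline)) in List.pairwise items do
            node.Event <- Separator(if prevMultiline || multiline then 2 else 1)

    events.Remove(start)

let toList (events: Events) =
    let rec loop (node: Node) acc =
        if isNull node then acc else loop node.Prev (node.Event :: acc)

    loop events.Tail []

let stream = Events()
let startNode = stream.Append(Start)

for item in [ [ Write "a" ]; [ Write "b"; WriteLine; Write "c" ]; [ Write "d" ]; [ Write "e" ] ] do
    stream.Append(Placeholder) |> ignore
    for ev in item do
        stream.Append(ev) |> ignore

resolve stream startNode
let resolved = toList stream
printfn "%A" resolved
```

Each item's events are emitted exactly once; only the placeholder nodes are rewritten afterwards.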
Big bang on a branch:

1. Introduce `EventList` type (new file, e.g. `EventList.fs`)
2. Add `Start` and `Placeholder` cases to `WriterEvent`
3. Change `Context.WriterEvents` from `Queue<WriterEvent>` to `EventList`
4. Update `WriterModel`: replace `Lines: string list` with `LineCount: int`
5. Make everything compile with `failwith` stubs where needed
6. Fix tests starting from `CodePrinterHelperFunctionsTests.fs` (simplest, illustrates core patterns), then work up to the full formatter suite
7. Refactor `expressionExceedsPageWidth` to use `LongExpressionLayout` DU
8. Refactor `colWithNlnWhenItemIsMultiline` to use `Start`/`Placeholder` markers

The test suite is the validation — if all tests pass, we're done.
The initial setup (steps 1–6 above) is done: EventList replaces Queue, WriterModel.Lines is gone, dump walks the DLL, CodePrinterHelperFunctionsTests all pass. ~538 of ~2776 tests fail. The work below is ordered from lowest risk to highest complexity.
Done. CodePrinterHelperFunctionsTests.fs now has 48 tests (47 passing, 1 skipped) covering:

- dump edge cases: `WriteBeforeNewline`, `WriteLineInsideStringConst`, `WriteLineInsideTrivia`, trailing space trimming, leading blank line stripping (normal + selection mode)
- Separator helpers: `sepSpace` dedup, `sepNlnForTrivia`, `sepNlnUnlessLastEventIsNewline`, `lastWriteEventIsNewline`
- WriteBeforeNewline-aware helpers: `sepNlnWhenWriteBeforeNewlineNotEmpty` both paths
- Speculative formatting probes: `futureNlnCheck` (true/false/no-trace), `exceedsWidth` (true/false/no-trace)
- Speculative formatting rollback: `expressionFitsOnRestOfLine` fits path, `isShortExpression` both paths, `isSmallExpression` both paths, `autoIndentAndNlnIfExpressionExceedsPageWidth` both paths, `sepSpaceOrIndentAndNlnIfExpressionExceedsPageWidth` both paths
- Leading expression inspection: `leadingExpressionResult` coordinates, `leadingExpressionIsMultiline` multiline/single-line

The Context.fsi signature file was also reorganized into logical groups: Types, Core event machinery, Indentation, Separators, Conditionals and combinators, Collection traversal, Option handling, Speculative formatting, Leading expression inspection, Multiline item handling, WriteBeforeNewline-aware helpers, Stroustrup-specific.
The replay path in `colWithNlnWhenItemIsMultiline` is likely the biggest source of test failures. When both the current and previous items are single-line, the function discards the optimistic events and replays `expr` on `acc.Context`. But with the mutable DLL, the optimistic events (from lines 1102–1110) are still in the list when the replay happens at line 1116.
This needs a CreateBackupPoint before the optimistic path and RollbackTo before the replay. Add tests for:

- All items single-line → separators are just `sepNln`
- One multiline item → extra blank line around it
- Mix of single and multiline items
- Items with leading trivia newlines (the `newlineBetweenLastWriteEvent` check)

Once this replay is fixed, a large batch of failures should resolve.
Once all tests pass with the current architecture, refactor expressionExceedsPageWidth to use the structured LongExpressionLayout DU. This replaces the 4 opaque Context -> Context parameters and centralizes the trivia-before-unindent splice logic. This is the payoff that motivated the whole rethink.
Replace the current "run, check multiline, maybe replay" pattern with the marker-based approach: emit Start, emit Placeholder between items, then resolve all placeholders in a single backward pass. No re-execution needed.
With the trivia reassignment changes (findNodeBeforeWithMatchingColumn, leaf-node heuristic), comments that were previously ContentBefore on a closing bracket are now ContentAfter on the last content item. This is correct for indentation purposes — the comment stays at the content's indent level. But it has a side effect: the trailing trivia events (WriteLineBecauseOfTrivia, WriteTrivia "// ...", WriteLineBecauseOfTrivia) are now part of the last item's event stream.
When speculative formatting checks whether an expression fits on one line (isSmallExpression, expressionFitsOnRestOfLine, futureNlnCheck), the trailing comment makes the expression appear multiline or wider than it actually is. For example:

    // Before trivia reassignment: comment is ContentBefore on `]`
    Html.a [ prop.className "navbar-item" ]  // fits on one line ✓

    // After trivia reassignment: comment is ContentAfter on `Html.a [...]`
    Html.a [ prop.className "navbar-item" ]  // the trivia events make this "multiline"
    (* block comment *)                      // → forces multiline layout unnecessarily

This affects multiple code patterns:
Elmish-style expressions: list items with trailing comments get expanded to multiline when they would otherwise fit on one line.
Let bindings: a single-line binding like let a = b with a trailing comment becomes multiline because the speculative check in genBinding (via sepSpaceOrIndentAndNlnIfExpressionExceedsPageWidth) sees the trivia events and decides the expression doesn't fit. This forces b onto the next line with indentation:

    // Input:
    let a = b
            // yozora
    c

    // Output (b is pushed to next line because trivia makes it "multiline"):
    let a =
        b
        // yozora
    c

    // If the short path is forced, b stays inline but the comment doesn't align:
    let a = b
    // yozora  ← at column 0, not aligned with b at column 8
    c

Neither outcome is ideal. The binding expression is simultaneously single-line (the code `b`) and multiline (the code + trailing trivia). The formatter cannot currently distinguish these perspectives.
The formatted output is valid F# and idempotent in all cases, but more verbose than necessary. This is an accepted limitation of the trivia reassignment improvement — the indentation correctness gains (comments at the right column before closing brackets) outweigh the occasional unnecessary multiline expansion.
Possible future solutions:

- Trim trailing trivia from width checks: `isSmallExpression`/`futureNlnCheck` could stop counting events after the last non-trivia content, similar to how `isMultilineItem` skips leading trivia.
- Separate content width from trivia width: Track the "content column" (before trivia) separately in `WriterModel`, so width checks use the content width.
- Defer trivia emission: Don't emit ContentAfter trivia during `genExpr`; capture it and replay after the width check. This is essentially the `captureTrailingTriviaEvents` approach from the `comment-after-rebased` branch, but applied to width checks rather than unindent splicing.
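
The first option (trimming trailing trivia from width checks) might look roughly like the sketch below. It is an illustration only: the reduced `Event` type and the `contentWidth` helper are assumptions, and a plain list stands in for the events produced by the probe.

```fsharp
// Reduced event type (assumption): only the cases this sketch needs.
type Event =
    | Write of string
    | WriteComment of string
    | WriteLineBecauseOfTrivia

let isTrivia ev =
    match ev with
    | WriteComment _
    | WriteLineBecauseOfTrivia -> true
    | Write _ -> false

let eventWidth ev =
    match ev with
    | Write text
    | WriteComment text -> text.Length
    | WriteLineBecauseOfTrivia -> 0

/// Hypothetical width check: ignore trivia events at the end of the stream,
/// then measure what remains.
let contentWidth (events: Event list) =
    events
    |> List.rev
    |> List.skipWhile isTrivia
    |> List.sumBy eventWidth

let binding =
    [ Write "let a = b"; WriteLineBecauseOfTrivia; WriteComment "// yozora"; WriteLineBecauseOfTrivia ]

let fullWidth = List.sumBy eventWidth binding
let trimmedWidth = contentWidth binding
printfn "full=%d trimmed=%d" fullWidth trimmedWidth
```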