Skip to content

Add CFG-to-CProverGOTO translation#574

Open
tautschnig wants to merge 9 commits intomainfrom
tautschnig/use-unstructured
Open

Add CFG-to-CProverGOTO translation#574
tautschnig wants to merge 9 commits intomainfrom
tautschnig/use-unstructured

Conversation

@tautschnig
Copy link
Contributor

The existing Stmt-to-GOTO translation (ToCProverGOTO.lean) interleaves control-flow lowering — emitting conditional jumps, managing forward references, patching targets — with CBMC-specific concerns like source locations and instruction encoding. This makes the code hard to extend to additional backends and difficult to reason about independently.

Description of changes:

This commit introduces a CFG-based path that separates those responsibilities:

Core Stmt → CFG (via stmtsToCFG) → CProverGOTO instructions

The first step (stmtsToCFG from StructuredToUnstructured) handles control-flow lowering once, producing a backend-agnostic CFG. The second step (detCFGToGotoTransform, added here) is a straightforward mapping from CFG blocks to GOTO instructions. This separation makes it easier to target additional backends — each only needs to consume the CFG — and to reason about control-flow lowering independently.

Three gaps remain before this can replace the direct path:

  1. Source locations on control flow: StructuredToUnstructured discards MetaData from ite/loop/block/exit statements, so transfer commands in the CFG carry no source location information.

  2. Loop contracts: the direct path emits #spec_loop_invariant and #spec_decreases as named sub-expressions on the backward-edge GOTO (for CBMC DFCC). In the CFG, invariants become plain asserts and measures are discarded.

  3. Core.CmdExt.call: this translation handles Imperative.Cmd only. Core procedure calls need a command translator analogous to coreStmtsToGoto in CoreToGOTOPipeline.lean.

No changes to StructuredToUnstructured are needed: the existing stmtsToCFG already works with Core.Command because its CmdT parameter is generic, and Core.Command has the required HasPassiveCmds and HasBool instances.

Tests cover sequential commands, if-then-else, loops (including back-edge verification), empty programs, assert/assume, and havoc.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

The existing Stmt-to-GOTO translation (ToCProverGOTO.lean) interleaves
control-flow lowering — emitting conditional jumps, managing forward
references, patching targets — with CBMC-specific concerns like source
locations and instruction encoding. This makes the code hard to extend
to additional backends and difficult to reason about independently.

This commit introduces a CFG-based path that separates those
responsibilities:

  Core Stmt → CFG (via stmtsToCFG) → CProverGOTO instructions

The first step (stmtsToCFG from StructuredToUnstructured) handles
control-flow lowering once, producing a backend-agnostic CFG. The
second step (detCFGToGotoTransform, added here) is a straightforward
mapping from CFG blocks to GOTO instructions. This separation makes
it easier to target additional backends — each only needs to consume
the CFG — and to reason about control-flow lowering independently.

Three gaps remain before this can replace the direct path:

1. Source locations on control flow: StructuredToUnstructured discards
   MetaData from ite/loop/block/exit statements, so transfer commands
   in the CFG carry no source location information.

2. Loop contracts: the direct path emits #spec_loop_invariant and
   #spec_decreases as named sub-expressions on the backward-edge GOTO
   (for CBMC DFCC). In the CFG, invariants become plain asserts and
   measures are discarded.

3. Core.CmdExt.call: this translation handles Imperative.Cmd only.
   Core procedure calls need a command translator analogous to
   coreStmtsToGoto in CoreToGOTOPipeline.lean.

No changes to StructuredToUnstructured are needed: the existing
stmtsToCFG already works with Core.Command because its CmdT parameter
is generic, and Core.Command has the required HasPassiveCmds and
HasBool instances.

Tests cover sequential commands, if-then-else, loops (including
back-edge verification), empty programs, assert/assume, and havoc.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new CFG→CBMC CProverGOTO translation path for Imperative deterministic CFGs, plus a StrataTest test suite validating the emitted instruction arrays. This supports the longer-term goal of separating control-flow lowering (structured→CFG) from backend-specific encoding (CFG→GOTO).

Changes:

  • Introduce Imperative.detCFGToGotoTransform to translate CFG String (DetBlock ...) into CProverGOTO instructions with a two-pass emit+patch strategy.
  • Add regression tests covering sequential code, ite, loop, empty CFG, assume/assert, and havoc cases.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
Strata/DL/Imperative/CFGToCProverGOTO.lean New CFG-to-GOTO translator (emit blocks, then patch GOTO targets).
StrataTest/DL/Imperative/CFGToCProverGOTO.lean New tests exercising the CFG-based translation output and basic patching properties.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Validate that cfg.entry matches the first block label, returning an
  error if not (entry block must be listed first for correct execution).
- Fail with an error on unresolved label references instead of silently
  leaving GOTO targets as none.
- Fix source-location TODO comment: DetTransferCmd already carries
  MetaData; the real gap is that StructuredToUnstructured fills it with
  MetaData.empty.
- Strengthen test assertions: check that *all* emitted GOTOs have
  resolved targets, not just that some do.
- Add tests for entry-block validation and all-GOTOs-resolved on
  sequential programs.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
tautschnig and others added 3 commits March 13, 2026 19:36
- Fix loc variable shadowing: rename inner binding to targetLoc.
- Call patchGotoTargets once with all resolved patches instead of once
  per patch. Replace labelMap List with Std.HashMap for O(1) lookups.
- Switch pendingPatches from List with prepend to Array with push for
  consistency with the rest of the code.
- Add test for the unresolved-label error path.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@shigoel shigoel enabled auto-merge March 13, 2026 22:18
shigoel
shigoel previously approved these changes Mar 13, 2026
The loop termination feature (PR #581) added HasIdent, HasFvar,
HasIntOrder, and HasNot constraints to stmtsToCFG, but the
CFGToCProverGOTO test was not updated with the required instances.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Copy link
Contributor

@atomb atomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one bit that seems odd to me, but I don't know if it's as intended or not.

tautschnig and others added 2 commits March 20, 2026 09:32
… display

The Instruction toString previously showed the code field for all
instruction types, which for GOTO instructions displayed the
meaningless default 'skip'. Now GOTO instructions show their target
location number instead, making the control flow visible:

  Before: GOTO skip [guard]
  After:  GOTO 5 [guard]

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants