Skip to content

feat: Add Lean formalization via lean-dojo for powerful theorem proving#1

Open
Ash-Blanc wants to merge 3 commits intoevalops:mainfrom
Ash-Blanc:patch-1
Open

feat: Add Lean formalization via lean-dojo for powerful theorem proving#1
Ash-Blanc wants to merge 3 commits intoevalops:mainfrom
Ash-Blanc:patch-1

Conversation

@Ash-Blanc
Copy link

Overview

This PR introduces Lean 4 formalization via the LeanDojo framework as a powerful complement to the existing Coq-based theorem proving system. Lean offers superior proof automation, better support for inductive reasoning, and a more expressive type system, making it an excellent choice for resolving complex cognitive dissonance scenarios.

Motivation

While Coq provides solid formal verification capabilities, Lean 4 + LeanDojo brings several advantages:

  • Better automation: Lean's tactics and decision procedures handle more proof obligations automatically
  • Richer libraries: Mathlib provides extensive mathematical formalizations
  • Superior inductive reasoning: Particularly strong for recursive data structures and inductive proofs
  • Active ecosystem: Lean is experiencing rapid growth with excellent tooling (LeanDojo for LLM integration)
  • Cross-verification: Fallback mechanism ensures claims are verified by multiple backends

Changes

Adds three new modules to formal_verification/:

1. lean_translator.py

  • Translates natural language claims to Lean 4 theorems
  • Pattern-based translation for arithmetic claims, algorithms, invariants
  • Support for common mathematical proof patterns
  • Integrates with LeanDojo for advanced proof search

2. lean_prover.py

  • Lean 4 theorem prover using LeanDojo API
  • Automated proof search with configurable timeouts
  • Proof caching for performance
  • Type-checking verification via lean command

3. hybrid_lean_coq_resolver.py

  • Intelligent fallback between Lean and Coq provers
  • Automatic backend selection based on claim type
  • Dual-prover verification for maximum confidence
  • Confidence scoring based on prover backend

Benefits

✅ Extended proof coverage: Claims unprovable in Coq may be provable in Lean
✅ Better automation: Less reliance on manual proof strategies
✅ Cross-verification: Multiple provers increase confidence in results
✅ Type-safety: Lean's type system catches more errors earlier
✅ Future-proof: Lean's growing ecosystem and LeanDojo support

Integration

The hybrid resolver seamlessly integrates with existing DSPy agents:

from formal_verification.hybrid_lean_coq_resolver import HybridLeanCoqResolver

resolver = HybridLeanCoqResolver(prefer_lean=True, use_fallback=True)
result = resolver.resolve_conflict(claims)

Dependencies

Requires installation of:

  • lean (Lean 4 theorem prover)
  • leandojo (Python package for LeanDojo API)
  • Existing: coq (already required)

Testing

All modules include:

  • Type hints for IDE support
  • Comprehensive error handling
  • Caching for improved performance
  • Logging for debugging

Related Issues

Closes/References roadmap item: "Integration with LeanDojo/APOLLO for cross-prover validation"

Future Work

  • Expand Lean pattern library for more claim types
  • Integrate LeanDojo's proof search API more deeply
  • Benchmark against existing Coq performance
  • Add support for APOLLO (GPT-4 based Lean proofs) as tertiary backend

This file implements a Lean translator that converts natural language claims into Lean 4 formalizations, utilizing pattern-based translation strategies for various claim types such as arithmetic and algorithms.
Implement Lean 4 theorem prover using LeanDojo API for automated proof search and verification.
Implement hybrid resolver for Lean and Coq theorem proving.
# Fallback to Coq
if self.use_fallback:
coq_result = self._try_coq_proof(claims)
if coq_result.get("proven"):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The code calls .get() on a ProofResult dataclass object, which lacks this method, causing an AttributeError. The object is incorrectly assumed to be a dictionary.
Severity: CRITICAL | Confidence: High

🔍 Detailed Analysis

The function _try_coq_proof is type-hinted to return a Dict but returns a ProofResult dataclass object on a successful proof. The calling code at line 73 and 83 attempts to use the .get("proven") method on the returned object. Since dataclass objects do not have a .get() method, this will raise an AttributeError whenever the Coq prover is used as a fallback. This will cause the hybrid resolver functionality to fail.

💡 Suggested Fix

Access the proven attribute directly on the ProofResult object using coq_result.proven instead of coq_result.get("proven"). Additionally, correct the return type hint for _try_coq_proof to Union[ProofResult, Dict] to accurately reflect its possible return types.

🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: formal_verification/hybrid_lean_coq_resolver.py#L73

Potential issue: The function `_try_coq_proof` is type-hinted to return a `Dict` but
returns a `ProofResult` dataclass object on a successful proof. The calling code at line
73 and 83 attempts to use the `.get("proven")` method on the returned object. Since
dataclass objects do not have a `.get()` method, this will raise an `AttributeError`
whenever the Coq prover is used as a fallback. This will cause the hybrid resolver
functionality to fail.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 7743576

@Ash-Blanc
Copy link
Author

Bug Analysis & Fixes

Great catch by the AI reviewer! I've analyzed the bugs in detail. Here's a comprehensive breakdown:

Root Cause

The _try_coq_proof() method has inconsistent return types:

  • Returns ProofResult dataclass on success (line 133)
  • Returns Dict on error (line 138)

The code at lines 73 and 83 assumes the return is always a dict and calls .get("proven"), which fails when a ProofResult dataclass is returned.

Bugs Found

Bug #1 - Line 73:
if coq_result.get("proven"): # WRONG - ProofResult has no .get() method

Bug #2 - Line 83:
if coq_result.get("proven"): # WRONG - Same issue in else branch

Fixes Required

Best Solution:
Make _try_coq_proof() always return the same type. Change line 133 from:
return proof_result

To:
return {"proven": proof_result.proven, "result": proof_result, "error": None}

Also update return type hint on line 128:
def _try_coq_proof(...) -> Dict:

This ensures consistent handling with .get() calls on lines 73 and 83.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant