System Overview¶
This document describes the current runtime architecture in src/lean_proof_auto_mcp
and aligns terminology with the implemented code paths.
1. Topology¶
flowchart TB
Client["MCP Client"] --> FastMCP["FastMCP app (server.py)"]
FastMCP --> Router["ToolRouter"]
subgraph Tools["Tool Entry Points (tools/*.py)"]
ScanFile["scan_file"]
ScanTheorem["scan_theorem"]
RankTargets["rank_targets"]
Verify["verify"]
Probe["probe"]
ProbeFile["probe_file"]
SearchAuto["search_automated_proof"]
TryAuto["try_automated_proof"]
GetCtx["get_proof_context"]
end
subgraph Core["Core Domain"]
VerifyDomain["verify_domain"]
ProbeDomain["probe_domain"]
ValidateDomain["validate_proof_domain"]
SearchOrch["search_orchestrator"]
Harness["harness_construction"]
CtxExtractor["context_extractor"]
StaticAnalysis["indexer/features/scoring/ranking"]
end
subgraph Adapters["Adapters and Lean Runtime"]
ServerMgr["lean/server_manager"]
Querier["lean/querier"]
Validator["lean/validator"]
ProofState["lean/proof_state"]
Workspace["adapters/workspace_provider"]
Metadata["observability/metadata_collector"]
end
Router --> ScanFile
Router --> ScanTheorem
Router --> RankTargets
Router --> Verify
Router --> Probe
Router --> ProbeFile
Router --> SearchAuto
Router --> TryAuto
Router --> GetCtx
ScanFile --> StaticAnalysis
ScanTheorem --> StaticAnalysis
RankTargets --> StaticAnalysis
Verify --> VerifyDomain
Probe --> ProbeDomain
ProbeFile --> ProbeDomain
SearchAuto --> SearchOrch
TryAuto --> ValidateDomain
GetCtx --> CtxExtractor
VerifyDomain --> Validator
VerifyDomain --> Workspace
VerifyDomain --> Metadata
ProbeDomain --> Querier
ProbeDomain --> Validator
ProbeDomain --> Harness
ProbeDomain --> Workspace
SearchOrch --> Querier
SearchOrch --> Validator
SearchOrch --> Harness
SearchOrch --> ProofState
ValidateDomain --> Querier
ValidateDomain --> Validator
ValidateDomain --> Harness
ValidateDomain --> ProofState
CtxExtractor --> Querier
CtxExtractor --> ProofState
Querier --> ServerMgr
Validator --> ServerMgr
ProofState --> ServerMgr
2. Primary Runtime Pattern¶
This is the dominant runtime shape for stateful tools (verify, probe,
probe_file, search_automated_proof, try_automated_proof, get_proof_context).
sequenceDiagram
participant C as MCP Client
participant S as FastMCP Tool Fn
participant T as tools/*.py
participant H as Core Handler/Orchestrator
participant P as Ports
participant A as Adapters
participant L as Lean REPL
C->>S: tools/call(name, args)
S->>T: router.dispatch(name, args)
T->>T: validate/coerce args
T->>T: build command/config
T->>T: wire composition root
T->>H: handle(command) or search(...)
H->>P: call abstract dependency
P->>A: concrete adapter call
A->>L: FileCommand/Command
L-->>A: diagnostics/results
A-->>H: domain result
H-->>T: result object
T->>T: status mapping + normalization
T-->>S: response dict
S-->>C: MCP JSON response
3. Range-Based Harness Construction¶
3.1 Historical note¶
Harness construction evolved multiple times. Earlier generations relied on string/line parsing of Lean declarations. The current implementation replaced that approach with range-driven source splicing using Lean-provided declaration value ranges.
3.2 Current implementation¶
Implemented in core/harness_construction.py as RangeBasedHarnessConstructor.
It is pure core logic: callers must provide source text and declarations.
3.3 Splicing algorithm¶
- Resolve target theorem deterministically:
- exact
full_name - exact
name - local-name fallback with explicit ambiguity checks
- Classify target value-range shape (
assign,by,where,term_body,tactic_body,equation_clauses, unsafe forms). - Classify proof attempt as tactic-mode or term-mode.
- Build splice plan:
- target declaration gets attempted proof replacement
- non-target theorem/lemma declarations are replaced with
sorry-safe forms - Enforce exactly one target splice.
- Apply splices bottom-up (reverse order) so offsets remain valid.
- Optionally inject extra imports (for example
import Aesop).
3.4 Safety and replacement behavior¶
- Target declaration kinds allowed:
theorem,lemma,instance. - Non-target
sorryreplacements are restricted to safe kinds (theorem,lemma). - Unsafe/empty range shapes fail fast for target declarations.
- Equation-clause non-target bodies are skipped instead of unsafe rewrites.
3.5 Failure modes¶
HarnessError.error_type values:
theorem_not_foundambiguous_theorem_idunsafe_target_rangetarget_splice_cardinalityinvalid_proof_attemptconstruction_failed
3.6 Harness execution flow¶
flowchart LR
A["Tool entry (probe/search/try)"] --> B["Querier.extract_declarations + read_source_file"]
B --> C["RangeBasedHarnessConstructor.construct"]
C --> D["build_splice_plan"]
D --> E["apply_splices (bottom-up)"]
E --> F["Harness code"]
F --> G["LeanInteractProofValidator.validate_harness_code / validate_proof"]
G --> H["ValidationResult"]
H --> I["Tool-level status normalization"]
Probe-specific note: probe can inject trace_config options into generated
harness code before Lean execution (set_option ...), so harness execution can
include targeted Lean tracing when requested.
3.7 Why this replaced previous approaches¶
- Lean already exposes exact declaration/value ranges.
- Splicing by Lean ranges avoids reimplementing Lean syntax parsing.
- Error classification is clearer: construction failures are separated from Lean proof diagnostics.
- The approach is deterministic and easier to test with property/unit tests.
4. get_proof_context Architecture¶
4.1 Execution path¶
tools/get_proof_context.pyvalidates args and builds composition root.ContextExtractor.extract_context(...)orchestrates context assembly.LeanInteractQuerier.get_theorem_context(...)loads declarations and returns theorem statement/proof/scope.- Optional
LeanInteractProofStateInspectorcall enriches hypotheses. - Optional similar-proof search scans declarations and computes similarity.
- Response is normalized into
status=success|fail|error.
4.2 Performance profile¶
Dominant latency is usually Lean declaration extraction
(FileCommand(..., declarations=True)) on large Lean files. The querier keeps
an in-memory declarations cache per file path; repeated calls in the same
process may avoid re-running the expensive extraction.
4.3 Diagram¶
flowchart LR
A["get_proof_context tool"] --> B["ContextExtractor.extract_context"]
B --> C["Querier.get_theorem_context"]
C --> D["extract_declarations (FileCommand declarations=True)"]
D --> E["Lean REPL elaboration"]
E --> F["TheoremContext"]
F --> G["Optional proof-state enrichment"]
G --> H["Optional similar-proof scan"]
H --> I["Response normalization (success/fail/error)"]
5. Shared runtime decisions reflected in code¶
- Lean server reuse is keyed by project root via
lean/server_manager.get_shared_server_manager. - Workspace auto-detection currently selects
nonemode in adapter logic. - Tool API version is uniform across registered tools (
1.1.0).