System Architecture

Overview

Aspect	Details
Purpose	Edit-agnostic safety evaluation framework for ML model weight modifications.
Audience	Developers extending InvarLock, operators debugging pipelines, security reviewers.
Core components	CLI shells, Core/runtime policy layer, Guard chain, Reporting/artifact subsystem.
Design goals	Torch-independent core, edit-agnostic guards, deterministic evaluation, explicit artifact contracts, full provenance.
Source of truth	`src/invarlock/core/*.py`, `src/invarlock/reporting/*.py`, `src/invarlock/runtime_provenance.py`, `src/invarlock/runtime_verify.py`, `src/invarlock/cli/commands/*.py`, `src/invarlock/cli/run_*.py`, `src/invarlock/guards/*.py`.

See the Glossary for definitions of terms such as the canonical guard chain, policy digest, and measurement contract.

Quick Reference
High-Level Architecture
Component Layers
Pipeline Flow
Guard Chain Architecture
Report Generation Flow
Architecture Guardrails
Key Design Decisions
Module Dependencies
Extension Points
Related Documentation

Quick Reference

System overview showing user input, processing stages, and report outputs.

High-Level Architecture

InvarLock follows a layered architecture with clear separation of concerns:

Layered architecture connecting CLI shells, policy contracts, runtime services, guards, and reporting files.

Component Layers

CLI Layer (`src/invarlock/cli/`)

Typer-based command shells providing user-facing entry points. The command modules should stay thin: parse arguments, call core/reporting owners, render output, and map failures to exit codes.

Shell support modules such as cli/config_execution.py, cli/run_execution.py, cli/run_config.py, cli/run_pairing.py, cli/run_overhead.py, and cli/run_artifacts.py belong to this boundary layer as well. They can perform CLI-facing adaptation and console/event rendering, but they must not become policy owners.
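The thin-shell rule can be illustrated with a minimal sketch. It uses the standard-library `argparse` as a stand-in for Typer so that it runs without extra dependencies. The names `ReportInputError`, `load_report`, and `main`, and the exit codes 0 and 2, are hypothetical and are not taken from the InvarLock source.

```python
import argparse
import json
import sys
from pathlib import Path
from typing import Optional, Sequence


class ReportInputError(Exception):
    """Hypothetical owner-layer error for unreadable or malformed reports."""


def load_report(path: Path) -> dict:
    """Owner-layer function (hypothetical): no CLI imports, no printing."""
    try:
        payload = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        raise ReportInputError(f"cannot read report {path}: {exc}") from exc
    if not isinstance(payload, dict):
        raise ReportInputError(f"report {path} is not a JSON object")
    return payload


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Shell: parse arguments, call the owner, render output, map failures to exit codes."""
    parser = argparse.ArgumentParser(prog="verify-sketch")
    parser.add_argument("report", type=Path)
    args = parser.parse_args(argv)
    try:
        report = load_report(args.report)
    except ReportInputError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 2  # assumed exit code for input failures
    print(f"ok: {len(report)} top-level keys")
    return 0
```

Because `load_report` neither prints nor exits, a non-CLI caller can reuse it and handle `ReportInputError` in its own way. A script entry point would call `raise SystemExit(main())`.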

Command	Purpose	Primary Output
`evaluate`	Compare baseline vs subject with pinned windows	report JSON + MD
`verify`	Validate report against schema and pairing	Exit code + messages
`report`	Render/compare reports and report outputs	MD/HTML/JSON artifacts
`doctor`	Environment diagnostics	Health check output
`advanced`	Maintenance workflows such as evidence packs, policy packs, plugins, and calibration	Exit code + workflow-specific artifacts
`version`	Emit package and schema version information	Version string

Core Policy / Contracts (`src/invarlock/core/`, `src/invarlock/reporting/`)

Deterministic policy, artifact-contract, and report-verification owners shared by the CLI and non-CLI entrypoints.

Module	Responsibility
`evaluate_contract.py`	Baseline-report validation and emitted run-artifact contract enforcement for `evaluate`
`evaluate_plan.py`	Evaluation result policy, degradation classification, and emitted outcome shaping
`report_inputs.py`	Canonical report path resolution and JSON-object validation
`doctor_findings.py`	Structured doctor findings and optional report cross-check analysis
`verify_contract.py`	Structured report-verification service used by `verify` and evidence-pack flows
`runtime_manifest_verify.py` + `runtime_provenance.py`	Authoritative runtime-manifest verification and runtime-provenance ownership for report verification
`run_policy.py`	Shared run policy helpers such as split choice, PM thresholds, and overhead policy
`run_retry_policy.py`	Retry-attempt summaries and retry state transitions
`run_snapshot_contract.py` + `run_snapshot_policy.py`	Snapshot planning, restore behavior, and retry transitions
`run_guard_overhead_policy.py`	Guard-overhead normalization, summary building, and report shaping
`run_provenance_contract.py` + `run_report_contract.py`	Run provenance and run-report assembly contracts
`run_report_payload_policy.py`	Deterministic payload shaping for context, metrics, guards, and flags

Runtime Provenance Verification Ownership

Runtime provenance uses a single verifier implementation:

core/runtime_manifest_verify.py is the authoritative verifier for runtime.manifest.json plus report-digest binding checks.
runtime_verify.py and cli/runtime_verify.py are the programmatic and CLI entrypoints for that verifier.
runtime_provenance.py calls the same verifier when invarlock verify enforces runtime provenance on container-backed reports.
Product behavior does not depend on finding an external verifier binary on PATH; verifier semantics are package-native and deterministic across installs.
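A report-digest binding check of the kind described above might look like the following sketch. It assumes the manifest stores a SHA-256 hex digest of the report under a `report_digest` key and that the digest is taken over a canonical JSON encoding. Both the field name and the canonicalization are illustrative assumptions, not the actual `runtime.manifest.json` layout.

```python
import hashlib
import json
from typing import List


def canonical_digest(report: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(report, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def verify_binding(manifest: dict, report: dict) -> List[str]:
    """Return a list of problems; an empty list means the binding holds."""
    problems: List[str] = []
    expected = manifest.get("report_digest")  # hypothetical field name
    if not isinstance(expected, str):
        problems.append("manifest has no report_digest")
    elif expected != canonical_digest(report):
        problems.append("report digest does not match manifest")
    return problems
```

The check uses only the standard library, so its result cannot depend on which tools are installed on the host. That is the property the single-verifier design is after.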

Core Runtime (`src/invarlock/core/`)

Pipeline orchestration without direct torch imports (torch-independent coordination).

Module	Responsibility
`runner.py` + `runner_*.py`	Pipeline phases: prepare → guards → edit → eval → finalize
`api.py`	Protocol definitions for ModelAdapter, ModelEdit, Guard
`bootstrap.py`	BCa bootstrap CI computation for paired metrics
`checkpoint.py`	Snapshot/restore primitives for retry loops
`registry.py`	Plugin discovery and registration
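To show how a runner can coordinate the phases without importing torch, here is a minimal sketch built on `typing.Protocol`. Only `load` and `describe` on the adapter come from the adapter example in this document. `apply`, `prepare`, `validate`, and `run_pipeline` are hypothetical names, and the toy classes operate on plain lists rather than real models.

```python
import math
from typing import Any, Dict, List, Protocol


class ModelAdapter(Protocol):
    name: str
    def load(self, model_id: str, device: str) -> Any: ...
    def describe(self, model: Any) -> dict: ...


class ModelEdit(Protocol):
    name: str
    def apply(self, model: Any) -> Any: ...  # hypothetical method name


class Guard(Protocol):
    name: str
    def prepare(self, model: Any) -> None: ...   # hypothetical method name
    def validate(self, model: Any) -> bool: ...  # hypothetical method name


def run_pipeline(adapter: ModelAdapter, edit: ModelEdit, guards: List[Guard],
                 model_id: str, device: str = "cpu") -> Dict[str, Any]:
    model = adapter.load(model_id, device)            # prepare
    for guard in guards:                              # guards (pre-edit state)
        guard.prepare(model)
    edited = edit.apply(model)                        # edit
    results = {g.name: g.validate(edited) for g in guards}  # eval (post-edit)
    return {"model": adapter.describe(edited),        # finalize
            "guards": results, "passed": all(results.values())}


class ListAdapter:
    name = "toy_list"
    def load(self, model_id: str, device: str) -> List[float]:
        return [0.5, -1.25, 2.0]
    def describe(self, model: List[float]) -> dict:
        return {"n_params": len(model)}


class RoundEdit:
    name = "toy_round"
    def apply(self, model: List[float]) -> List[float]:
        return [round(w, 1) for w in model]


class FiniteGuard:
    name = "finite"
    def prepare(self, model: List[float]) -> None:
        self.expected_len = len(model)
    def validate(self, model: List[float]) -> bool:
        return len(model) == self.expected_len and all(math.isfinite(w) for w in model)
```

The runner touches the model only through the three protocols, so any framework-specific code stays inside the adapter, edit, and guard implementations.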

Guard Layer (`src/invarlock/guards/`)

Four-guard pipeline for edit safety validation.

Guard	Focus	Key Metric
`invariants`	Structural integrity, NaN/Inf checks	`validation.invariants_pass`
`spectral`	Weight matrix spectral norm stability	κ-threshold violations
`rmt`	Activation edge-risk via Random Matrix Theory	ε-band compliance
`variance`	Variance equalization with A/B gate	Predictive gain
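As an illustration of the kind of check the `spectral` guard performs, the sketch below estimates each matrix's spectral norm by power iteration and flags matrices whose norm grew by more than a factor κ relative to the baseline. Treating κ as a cap on the edited-to-baseline ratio is an assumption made for this example. The real guard's threshold semantics, and the names `spectral_norm` and `spectral_violations`, may differ.

```python
import math
from typing import Dict, List, Sequence

Matrix = Sequence[Sequence[float]]


def spectral_norm(matrix: Matrix, iterations: int = 100) -> float:
    """Largest singular value via power iteration on A^T A (pure Python)."""
    rows, cols = len(matrix), len(matrix[0])
    # Deterministic start vector; it must not be orthogonal to the top singular vector.
    v = [1.0 / math.sqrt(cols)] * cols
    sigma = 0.0
    for _ in range(iterations):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = A v
        sigma = math.sqrt(sum(x * x for x in u))
        if sigma == 0.0:
            return 0.0
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # w = A^T u
        norm_w = math.sqrt(sum(x * x for x in w))
        v = [x / norm_w for x in w]
    return sigma


def spectral_violations(baseline: Dict[str, Matrix], edited: Dict[str, Matrix],
                        kappa: float) -> List[str]:
    """Names of matrices whose spectral norm ratio (edited / baseline) exceeds kappa."""
    violations = []
    for name in sorted(baseline):
        base = spectral_norm(baseline[name])
        new = spectral_norm(edited[name])
        if base == 0.0:
            ratio = math.inf if new > 0.0 else 1.0
        else:
            ratio = new / base
        if ratio > kappa:
            violations.append(name)
    return violations
```

The check reads only weight values, never the edit that produced them, so it applies equally to quantization, pruning, or a LoRA merge.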

Reporting Layer (`src/invarlock/reporting/`)

Report generation, validation, persistence, and rendering.

Module	Responsibility
`report_schema.py`	Evaluation report schema and structural validation
`report_validation.py`	Canonical validation-flag computation
`report_make.py`	Public evaluation-report entrypoint that coordinates the split report-making owners
`report_make_inputs.py`	Input normalization, baseline reference building, and build-section extraction
`report_make_assembly.py`	Policy/provenance/guard assembly and report build-context composition
`report_make_output.py`	Final evaluation-report shaping and output payload construction
`report_bundle.py`	Evaluation-bundle persistence, manifest writing, and evidence attachment
`report_contract.py`	Input loading and report-generation planning
`report_console.py`	Console/report validation summary helpers used by CLI/reporting surfaces
`report_summary.py`	Shared executive-summary/view-model derivation for reporting surfaces
`render.py`	Markdown rendering for evaluation reports
`html.py`	HTML export with styling
`report_files.py`	Raw run-report JSON/Markdown/HTML persistence
`evidence.py`	Evidence file normalization and attachment helpers
`telemetry.py`	Performance metrics collection

Pipeline Flow

Evaluation pipeline from baseline and subject runs into normalized metric comparison, policy application, and report rendering.

Guard Chain Architecture

Guard chain execution across pre-edit and post-edit checks.

Report Generation Flow

Architecture Guardrails

The shell/core split is enforced by design and by targeted architecture guard tests. The intended invariants are:

No lazy exports in package roots such as adapters/__init__.py or guards/__init__.py. Package roots should expose only explicit canonical exports.
No rmt_legacy references in production source. RMT ownership lives in rmt.py, rmt_analysis.py, rmt_detection.py, and rmt_math.py.
No dependency-map orchestration in command shells. Public command owners must stay thin and must not rebuild giant deps dictionaries or inject callables to recreate removed indirection.
No compatibility-only command signatures once a canonical owner contract exists. Example: lens-metric calculation takes a required MetricsConfig instead of deprecated per-call overrides.
No CLI imports inside owner layers. Modules under src/invarlock/core/ and src/invarlock/reporting/ must stay callable without importing invarlock.cli.

These guardrails keep the CLI as an imperative shell while policy, contracts, and verdict computation remain reusable from non-CLI flows such as evidence-pack verification and programmatic execution.
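An architecture guard test for the "no CLI imports inside owner layers" invariant could be sketched with the standard-library `ast` module. The helper name `forbidden_imports` is hypothetical, and the actual tests in the repository may be structured differently.

```python
import ast
from typing import List


def forbidden_imports(source: str, forbidden_prefix: str = "invarlock.cli") -> List[str]:
    """Return the sorted, de-duplicated import targets that fall under the forbidden package.

    Absolute imports anywhere in the module are inspected, including imports
    nested inside functions. Relative imports are ignored in this sketch.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Cover both "from invarlock.cli import x" and "from invarlock import cli".
            names = [node.module] + [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if name == forbidden_prefix or name.startswith(forbidden_prefix + "."):
                found.add(name)
    return sorted(found)
```

A test would read every file under `src/invarlock/core/` and `src/invarlock/reporting/` and assert that the helper returns an empty list for each one.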

Key Design Decisions

Decision	Rationale	Implementation
Torch-independent core	`runner.py` coordinates without importing torch; adapters encapsulate torch-specific logic.	Adapter protocol in `core/api.py`
Edit-agnostic guards	Guards work with any weight modification (quantization, pruning, LoRA merge).	Guard protocol validates model state, not edit type
Tier-based policies	Calibrated thresholds in `tiers.yaml` for balanced/conservative/aggressive safety profiles.	Policy resolution in `guards/policies.py`
Deterministic evaluation	Seed bundle + window pairing schedules ensure reproducible metrics.	`meta.seeds`, `dataset.windows.stats` tracking
Functional-core / imperative-shell split	Keep policy, artifact contracts, and verdict computation reusable outside the CLI while CLI modules stay thin.	`core/*.py` + `reporting/*.py` owners called from `cli/commands/*.py`
Single verifier ownership	Runtime-manifest verification should not vary with host tooling, so it must use one product implementation.	`core/runtime_manifest_verify.py`, `runtime_verify.py`, `runtime_provenance.py`
Plugin architecture	Entry points for guards, adapters, edits enable extension without core changes.	`importlib.metadata` discovery in `core/registry.py`
Log-space primary metrics	Paired ΔlogNLL with BCa bootstrap avoids ratio math bias.	`core/bootstrap.py` implementation
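The log-space metric row can be made concrete with a small sketch of a BCa (bias-corrected and accelerated) bootstrap interval for the mean of paired per-window differences. It assumes each delta is `subject_logNLL - baseline_logNLL` for one window and that the statistic is the mean. The function name, defaults, and sign convention are illustrative and do not describe the contents of `core/bootstrap.py`.

```python
import random
from statistics import NormalDist, fmean
from typing import Sequence, Tuple


def bca_interval(deltas: Sequence[float], n_boot: int = 2000,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """BCa bootstrap confidence interval for the mean paired delta."""
    n = len(deltas)
    if n < 2:
        raise ValueError("need at least two paired windows")
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    theta_hat = fmean(deltas)
    boots = sorted(fmean(rng.choices(deltas, k=n)) for _ in range(n_boot))
    norm = NormalDist()

    # Bias correction: how far the bootstrap distribution sits from theta_hat.
    below = sum(b < theta_hat for b in boots)
    prop = min(max(below / n_boot, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = norm.inv_cdf(prop)

    # Acceleration from leave-one-out (jackknife) means.
    total = sum(deltas)
    jack = [(total - d) / (n - 1) for d in deltas]
    jack_mean = fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(q: float) -> float:
        z = norm.inv_cdf(q)
        return norm.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    def pick(q: float) -> float:
        index = min(max(int(q * n_boot), 0), n_boot - 1)
        return boots[index]

    return pick(adjusted_quantile(alpha / 2)), pick(adjusted_quantile(1 - alpha / 2))
```

Because the deltas are in log space, exponentiating both bounds (`math.exp(lo)`, `math.exp(hi)`) yields an interval for the ratio of the underlying quantities without averaging ratios directly.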

Module Dependencies

Module dependency graph linking CLI shells, shared contracts, runtime owners, and extension surfaces.

Extension Points

InvarLock supports extension via entry points without modifying core code.

Extension Type	Entry Point Group	Example
Adapters	`invarlock.adapters`	`hf_causal`, `hf_mlm`
Guards	`invarlock.guards`	`invariants`, `spectral`, `rmt`, `variance`
Edits	`invarlock.edits`	`quant_rtn`, `noop`

Custom Adapter Example

```python
# my_adapter.py
from torch import nn  # torch-specific code belongs in adapters, not in the core

from invarlock.core.api import ModelAdapter


class MyAdapter(ModelAdapter):
    name = "my_custom_adapter"

    def load(self, model_id: str, device: str) -> nn.Module:
        # Custom loading logic
        ...

    def describe(self, model: nn.Module) -> dict:
        # Return model metadata
        ...
```

```toml
# pyproject.toml
[project.entry-points."invarlock.adapters"]
my_custom_adapter = "my_adapter:MyAdapter"
```

Troubleshooting

Import errors in torch-free context: ensure invarlock.core imports stay torch-independent; use adapters for torch operations.
Guard preparation failures: check tier policy compatibility; use context.run.strict_guard_prepare: false for debugging.
Report generation errors: verify baseline and subject reports exist and have compatible window structures.

Observability

Pipeline phases emit timing via print_timing_summary() in the CLI.
Guard results are recorded in report.guards[] and in the report's validation.* flags.
Telemetry fields include memory_mb_peak, latency_ms_*, duration_s.

Related Documentation

CLI Reference: Command usage and options
Guards Reference: Guard configuration and evidence
Configuration Schema: YAML config structure
Reports: Report schema and verification
Assurance Case Overview: Assurance claims and evidence