One Run Lifecycle
Overview
| Aspect | Details |
|---|---|
| Purpose | Map one evaluate -> verify -> report journey to code and artifact owners. |
| Audience | Maintainers, reviewers auditing the assurance boundary, contributors tracing failures. |
| Contract scope | Current strict assurance flow and report v1 artifacts. |
| Source of truth | src/invarlock/cli/commands/evaluate.py, src/invarlock/core/evaluate_plan.py, src/invarlock/core/assurance_contract.py, src/invarlock/reporting/verify_contract.py. |
This page maps one evaluate -> verify -> report journey to the code and
artifact surfaces reviewers inspect.
Quick Start
The minimal end-to-end trace for a single comparison:
```bash
invarlock evaluate --allow-network \
  --baseline gpt2 \
  --subject distilgpt2 \
  --baseline-adapter auto --subject-adapter auto \
  --profile ci \
  --assurance strict \
  --report-out reports/eval
invarlock verify --assurance strict reports/eval/evaluation.report.json
invarlock report html -i reports/eval/evaluation.report.json -o reports/eval/evaluation.html
```
Each stage emits artifacts that the next stage consumes, so reviewers can pause
after any stage and inspect the surfaces listed in the Stage Map table. By
default, the evaluate command runs model-loading work in the runtime container;
host execution must be requested as an explicit bypass and is a non-assurance
path.
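For scripted runs, the same three commands can be chained so that a failed stage stops the journey before the next stage reads its artifacts. The following is a minimal sketch, not part of invarlock: `build_commands` and `run_pipeline` are hypothetical helper names, the flags are copied verbatim from the commands above, and the sketch assumes each `invarlock` command exits non-zero on failure.

```python
import subprocess
from pathlib import Path


def build_commands(report_dir: str = "reports/eval") -> list[list[str]]:
    """Return the three Quick Start commands as argument lists."""
    out = Path(report_dir)
    report_json = (out / "evaluation.report.json").as_posix()
    report_html = (out / "evaluation.html").as_posix()
    return [
        ["invarlock", "evaluate", "--allow-network",
         "--baseline", "gpt2", "--subject", "distilgpt2",
         "--baseline-adapter", "auto", "--subject-adapter", "auto",
         "--profile", "ci", "--assurance", "strict",
         "--report-out", out.as_posix()],
        ["invarlock", "verify", "--assurance", "strict", report_json],
        ["invarlock", "report", "html", "-i", report_json, "-o", report_html],
    ]


def run_pipeline(report_dir: str = "reports/eval") -> None:
    """Run evaluate, verify, and report in order (hypothetical helper)."""
    for cmd in build_commands(report_dir):
        # Assumption: a failing stage exits non-zero. check=True then raises
        # CalledProcessError, so later stages never run on stale artifacts.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` requires `invarlock` on `PATH`; `build_commands()` alone can be used to print or log the exact invocations without running anything.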
Stage Map
| Stage | Code surface | Artifact surface |
|---|---|---|
| CLI planning | invarlock.cli.commands.evaluate, invarlock.core.evaluate_plan | selected profile, tier, preset, adapter, runtime policy |
| Runtime policy | invarlock.runtime_security, invarlock.cli.evaluate_phases | runtime.manifest.json |
| Config loading | invarlock.core.config_loader | normalized run config, context.assurance |
| Component resolution | invarlock.cli.run_execution, guard/adapter/edit registries | resolved adapter, edit, and guard order |
| Guard execution | invarlock.core.runner, invarlock.guards.* | guard evidence and statuses |
| Metric computation | invarlock.core.bootstrap, runner metric helpers | paired delta log-loss, ratio, CI fields |
| Report assembly | invarlock.reporting.report_make | evaluation.report.json |
| Verification | invarlock.reporting.verify_contract | verifier pass/fail details |
| Human report | invarlock report html | rendered HTML report |
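When tracing a failure, it can help to hold the stage table as data and look up which stage owns a given artifact. The sketch below is illustrative only: the `Stage` tuple, the `STAGES` list, and `owner_of` are hypothetical names, and the rows are transcribed from the table above rather than read from invarlock itself.

```python
from typing import NamedTuple, Optional


class Stage(NamedTuple):
    name: str
    code_surface: str
    artifact_surface: str


# Rows transcribed from the stage table, in pipeline order.
STAGES = [
    Stage("CLI planning",
          "invarlock.cli.commands.evaluate, invarlock.core.evaluate_plan",
          "selected profile, tier, preset, adapter, runtime policy"),
    Stage("Runtime policy",
          "invarlock.runtime_security, invarlock.cli.evaluate_phases",
          "runtime.manifest.json"),
    Stage("Config loading", "invarlock.core.config_loader",
          "normalized run config, context.assurance"),
    Stage("Component resolution",
          "invarlock.cli.run_execution, guard/adapter/edit registries",
          "resolved adapter, edit, and guard order"),
    Stage("Guard execution", "invarlock.core.runner, invarlock.guards.*",
          "guard evidence and statuses"),
    Stage("Metric computation",
          "invarlock.core.bootstrap, runner metric helpers",
          "paired delta log-loss, ratio, CI fields"),
    Stage("Report assembly", "invarlock.reporting.report_make",
          "evaluation.report.json"),
    Stage("Verification", "invarlock.reporting.verify_contract",
          "verifier pass/fail details"),
    Stage("Human report", "invarlock report html", "rendered HTML report"),
]


def owner_of(artifact_fragment: str) -> Optional[Stage]:
    """Return the first stage whose artifact surface mentions the fragment."""
    needle = artifact_fragment.lower()
    for stage in STAGES:
        if needle in stage.artifact_surface.lower():
            return stage
    return None
```

For example, `owner_of("runtime.manifest.json")` returns the Runtime policy row, which points at `invarlock.runtime_security` and `invarlock.cli.evaluate_phases` as the code to read first.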
Assurance Boundary
The strict assurance boundary starts at CLI planning and ends at verifier
acceptance. Strict mode is not inferred from profile names alone; it is
recorded in assurance.mode and checked by the verifier.
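To make the point about profile names concrete, here is a small pre-check sketch. It is not the verifier and does not replace `invarlock verify`. It assumes the report JSON carries a top-level `assurance` object with a `mode` key whose strict value is the string `"strict"`; both the layout and the value are assumptions based on the `assurance.mode` field name and the `--assurance strict` flag, and the function names are hypothetical.

```python
from collections.abc import Mapping
from typing import Any


def recorded_assurance_mode(report: Mapping[str, Any]) -> Any:
    """Return report["assurance"]["mode"], or None when it is not recorded."""
    assurance = report.get("assurance")
    if not isinstance(assurance, Mapping):
        return None
    return assurance.get("mode")


def claims_strict(report: Mapping[str, Any]) -> bool:
    """True only when strict mode is explicitly recorded in the report.

    The profile name is deliberately never consulted: a profile called
    "strict" or "ci" says nothing about the assurance mode.
    """
    # Assumption: the strict value is the literal string "strict".
    return recorded_assurance_mode(report) == "strict"
```

A report loaded with `json.load` can be passed directly; a report that only names a profile, with no recorded mode, yields `False`.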
Debugging Rule
When a strict report fails verification, trace the failure back to the earliest stage whose evidence caused it and fix the problem there. Do not patch the report artifact by hand. The Stage Map table lets you follow a failure back to the owning code path and to the artifact where the evidence is recorded.
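The "earliest stage first" rule can be sketched as a small ordering helper. This is an illustration, not invarlock code: `STAGE_ORDER` copies the stage names from the Stage Map table in pipeline order, and `earliest_stage` is a hypothetical function that picks where to start debugging when several stages look suspect.

```python
from typing import Iterable

# Stage names from the Stage Map table, in pipeline order.
STAGE_ORDER = (
    "CLI planning",
    "Runtime policy",
    "Config loading",
    "Component resolution",
    "Guard execution",
    "Metric computation",
    "Report assembly",
    "Verification",
    "Human report",
)


def earliest_stage(suspect_stages: Iterable[str]) -> str:
    """Return the suspect stage that runs first in the pipeline."""
    names = list(suspect_stages)
    if not names:
        raise ValueError("no suspect stages given")
    unknown = [name for name in names if name not in STAGE_ORDER]
    if unknown:
        raise ValueError(f"unknown stage(s): {unknown}")
    # Later stages only consume earlier evidence, so the fix belongs
    # at the smallest pipeline index among the suspects.
    return min(names, key=STAGE_ORDER.index)
```

If the verifier complains and the guard evidence also looks wrong, `earliest_stage(["Verification", "Guard execution"])` points at Guard execution, so the guard code and its evidence are inspected before the report or the verifier.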