# Compare baseline and subject on the default runtime-container path
invarlock evaluate --allow-network \
--baseline gpt2 \
--subject gpt2 \
--report-out reports/eval
# Render HTML from the emitted evaluation bundle
invarlock report html -i reports/eval/evaluation.report.json -o reports/eval/evaluation.html
invarlock report explain --evaluation-report reports/eval/evaluation.report.json
Model-loading commands run in the runtime container by default. The container
is bypassed only when you explicitly request a host-side workflow with
invarlock evaluate --execution-mode host.
Repo-owned presets under configs/ remain available for maintainers, but the
quick-start path above stays wheel-compatible by using direct flags only.
Concepts
runs/ is scratch space: evaluate emits baseline/subject working artifacts there.
reports/ is evidence: archive evaluation.report.json and runtime.manifest.json
for audit, plus any HTML or evidence-pack outputs you distribute.
Evaluation bundles reference baseline/subject report artifacts; keep them
together to preserve pairing and make later review easier.
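
The archiving advice above can be sketched as a small shell helper. This is an illustration only, not part of invarlock: the function name archive_evidence and its two arguments are hypothetical, and it assumes runtime.manifest.json sits in the same report directory as evaluation.report.json. It refuses to archive an incomplete bundle and packs the whole directory so paired artifacts stay together.

```shell
#!/usr/bin/env bash
# Sketch: bundle an evaluation report directory into a tarball for audit.
# archive_evidence and the assumed file layout are illustrative, not invarlock features.
archive_evidence() {
  local report_dir="$1" archive="$2"
  local required
  # Refuse to archive a bundle that is missing either audit file.
  for required in evaluation.report.json runtime.manifest.json; do
    if [ ! -f "$report_dir/$required" ]; then
      echo "missing $report_dir/$required" >&2
      return 1
    fi
  done
  # Archive the whole directory so HTML and paired artifacts travel together.
  tar -czf "$archive" -C "$(dirname "$report_dir")" "$(basename "$report_dir")"
}
```

For the quick-start layout, a call might look like archive_evidence reports/eval eval-evidence.tar.gz. Archiving the directory rather than individual files is a deliberate choice here: it keeps any HTML output next to the JSON it was rendered from.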