API Guide

Overview

  • Purpose: Programmatic interface for running the InvarLock pipeline and generating reports.
  • Audience: Python callers building scripted workflows or integrations.
  • Supported surface: Stable contract surfaces remain CLI/report/contract-read paths; CoreRunner.execute, RunConfig, ModelAdapter, ModelEdit, Guard, and direct reporting helpers are advanced/non-stable.
  • Requires: invarlock[adapters] for HF adapters, invarlock[edits] for built-in edits, invarlock[guards] for guard math, invarlock[eval] for dataset providers.
  • Network: Offline by default; CLI runs use evaluate --allow-network, while Python callers set INVARLOCK_ALLOW_NETWORK=1 to download models or datasets.
  • Inputs: Model instance, adapter, edit, guard list, RunConfig, optional calibration data.
  • Outputs / Artifacts: RunReport object; optional event logs/checkpoints; evaluation bundles via invarlock.reporting.make_report(...) and report_bundle.save_evaluation_bundle(...).
  • Source of truth: src/invarlock/core/runner.py, src/invarlock/core/api.py, src/invarlock/cli/config_execution.py, src/invarlock/reporting/report_make.py, src/invarlock/reporting/report_make_inputs.py, src/invarlock/reporting/report_make_assembly.py, src/invarlock/reporting/report_make_output.py, src/invarlock/reporting/report_bundle.py, src/invarlock/reporting/report_console.py, src/invarlock/reporting/report_files.py, src/invarlock/reporting/report_schema.py.

Quick Start

from invarlock.adapters.auto import HF_Auto_Adapter
from invarlock.core.api import RunConfig
from invarlock.core.runner import CoreRunner
from invarlock.edits import RTNQuantEdit
from invarlock.guards.invariants import InvariantsGuard
from invarlock.guards.spectral import SpectralGuard

adapter = HF_Auto_Adapter()
model = adapter.load_model("gpt2", device="auto")

edit = RTNQuantEdit(bitwidth=8, per_channel=True, group_size=128, clamp_ratio=0.005)
guards = [InvariantsGuard(), SpectralGuard(sigma_quantile=0.95, deadband=0.10)]

config = RunConfig(device="auto")
report = CoreRunner().execute(model, adapter, edit, guards, config)

print("status:", report.status)
print("primary metric:", report.metrics.get("primary_metric"))

For real primary-metric values, pass calibration_data (see Concepts). Without it, the runner uses lightweight mock metrics so the pipeline can finish.
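
If "gpt2" is not already cached locally, the first load needs network access. As noted in the Overview, Python callers opt in by setting INVARLOCK_ALLOW_NETWORK=1 before loading:

import os

# Allow model/dataset downloads for this process (InvarLock is offline by default).
os.environ["INVARLOCK_ALLOW_NETWORK"] = "1"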

Concepts

  • Pipeline phases: prepare → guard prepare → edit → guard validate → eval → finalize/rollback.
  • Calibration data: indexable batches (list/sequence) with input_ids, optional attention_mask, and optional labels. Preview/final windows are sliced from this sequence.
  • Auto configuration: auto_config controls tier/policy resolution and is recorded under report.meta["auto"] for report generation.
  • Snapshots: retries use snapshot/restore; configure via context.snapshot.* when using YAML configs.
  • Reports: generated from a RunReport plus a baseline report via invarlock.reporting.make_report, then persisted as an evaluation bundle with invarlock.reporting.report_bundle.save_evaluation_bundle.
  • Verification: the CLI-side invarlock verify command enforces runtime provenance (via runtime.manifest.json) for container-backed outputs, in addition to schema and pairing checks.

Responsibility lanes

  • User code: Build RunConfig, call execute, consume RunReport.
  • CoreRunner: Orchestrate phases, apply edit, assemble status + metrics.
  • Adapter: Load/describe model, snapshot/restore.
  • Guards: prepare/validate, return typed decisions (allow/monitor/rollback/block).
  • Eval: Build windows, compute primary metric + tail metrics.
  • Report: make_report(report, baseline) + save_evaluation_bundle(...) for evaluation-bundle generation.

Note: CoreRunner coordinates each lane.

Reference

CoreRunner.execute

CoreRunner.execute is the primary entry point for advanced/non-stable programmatic runs.

report = CoreRunner().execute(
    model,
    adapter,
    edit,
    guards,
    config,
    calibration_data=calibration_data,
    auto_config=auto_config,
    edit_config=edit_config,
    preview_n=preview_n,
    final_n=final_n,
)

  • model (Any): Loaded model instance.
  • adapter (ModelAdapter): Adapter that can describe/snapshot/restore the model.
  • edit (ModelEdit or EditLike): Edit operation to apply.
  • guards (list[Guard]): Guard instances to validate after the edit.
  • config (RunConfig): Runtime settings (device, thresholds, event logs).
  • calibration_data (Any): Optional calibration batches for evaluation.
  • auto_config (dict[str, Any]): Optional tier/policy hints (recorded into report meta).
  • edit_config (dict[str, Any]): Overrides passed to edit.apply(...).
  • preview_n / final_n (int | None): Override preview/final counts; defaults to slicing calibration data.
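
A sketch of a fuller call that supplies calibration batches and auto-configuration hints; the batch contents and hint values are illustrative only (see Calibration data format and Auto config hints below):

calibration_data = [
    # Illustrative batches; real runs usually build these from a dataset provider.
    {"input_ids": [[101, 102, 103]], "attention_mask": [[1, 1, 1]]},
    {"input_ids": [[104, 105, 106]], "attention_mask": [[1, 1, 1]]},
]

report = CoreRunner().execute(
    model,
    adapter,
    edit,
    guards,
    config,
    calibration_data=calibration_data,
    auto_config={"enabled": True, "tier": "balanced"},
    preview_n=1,  # illustrative window sizes
    final_n=1,
)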

RunConfig

RunConfig controls runtime behavior in the core runner.

  • device (default "auto"): Resolves to CUDA → MPS → CPU.
  • max_pm_ratio (default 1.5): Max acceptable primary-metric ratio before rollback.
  • spike_threshold (default 2.0): Catastrophic spike ratio for immediate rollback.
  • event_path (default None): Path to JSONL event log (optional).
  • checkpoint_interval (default 0): 0 disables checkpoints.
  • dry_run (default False): Skip mutations and produce a report.
  • verbose (default False): Enables extra logging.
  • context (default {}): Free-form context passed to guards/eval.
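
Assuming RunConfig accepts its documented fields as keyword arguments, a customized configuration might look like this (paths and flag values are illustrative):

from invarlock.core.api import RunConfig

config = RunConfig(
    device="auto",
    max_pm_ratio=1.5,
    spike_threshold=2.0,
    event_path="runs/events.jsonl",  # write a JSONL event log
    checkpoint_interval=0,           # 0 disables checkpoints
    dry_run=False,
    verbose=True,
)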

Auto config hints

auto_config is recorded in report.meta["auto"] and used for tier resolution.

  • enabled: Whether auto mode is enabled.
  • tier: Tier label (balanced, conservative, aggressive).
  • probes: Micro-probe count (0–10).
  • target_pm_ratio: Target ratio for auto tuning (CLI default: 2.0).
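
Because auto_config is a plain dict recorded under report.meta["auto"], a sketch using the keys above could be:

auto_config = {
    "enabled": True,
    "tier": "balanced",      # or "conservative" / "aggressive"
    "probes": 4,             # micro-probe count (0-10)
    "target_pm_ratio": 2.0,  # CLI default
}

report = CoreRunner().execute(model, adapter, edit, guards, config, auto_config=auto_config)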

RunReport fields

  • meta: Execution metadata (device, seeds, config snapshot).
  • edit: Edit metadata and deltas.
  • guards: Guard results keyed by guard name.
  • metrics: Primary metric + telemetry values.
  • evaluation_windows: Captured preview/final windows (if enabled).
  • status: pending, running, success, failed, or rollback.
  • error: Error string when status=failed.
  • context: Run context propagated to guards/eval.
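
A short sketch of inspecting these fields after a run; the specific meta key and guard result shape are illustrative, assuming dict-like containers as described above:

print(report.status)                         # "success", "rollback", "failed", ...
print(report.metrics.get("primary_metric"))  # primary metric + telemetry
print(report.meta.get("device"))             # execution metadata (illustrative key)
for name, result in report.guards.items():   # guard results keyed by guard name
    print(name, result)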

Failure outcomes

  • Monitor: a guard returns decision: monitor. Evidence: report.guards[].decision = monitor; report.status = success.
  • Rollback: a guard returns decision: rollback, or guard/primary-metric gates fail. Evidence: report.status = rollback; report.meta.rollback_reason.
  • Failed: unrecoverable runner exception. Evidence: report.status = failed; report.error.
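
A sketch of handling these outcomes in caller code, assuming guard results and report.meta are dict-like as in the examples above:

if report.status == "success":
    flagged = [
        name for name, result in report.guards.items()
        if result.get("decision") == "monitor"
    ]
    if flagged:
        print("run succeeded, guards in monitor state:", flagged)
elif report.status == "rollback":
    print("rolled back:", report.meta.get("rollback_reason"))
elif report.status == "failed":
    print("runner error:", report.error)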

Interfaces

ModelAdapter, ModelEdit, and Guard are defined in invarlock.core.api.

from invarlock.core.api import Guard, ModelAdapter, ModelEdit

class CustomGuard(Guard):
    name = "custom_guard"

    def prepare(self, model, adapter, calib, policy):
        return {"ready": True}

    def validate(self, model, adapter, context):
        return {"passed": True, "decision": "monitor", "metrics": {"ok": 1}}

Notes:

  • The runner calls prepare(...) when the guard implements it (GuardWithPrepare).
  • validate(...) is always called during the guard phase.
  • validate(...) should emit the typed decision vocabulary: allow, monitor, rollback, or block.
  • Optional lifecycle helpers (before_edit, after_edit, finalize) are only invoked when you manage guards manually (for example via GuardChain).

GuardChain helper

GuardChain provides lifecycle helpers for manually coordinating guard calls:

from invarlock.core.api import GuardChain

chain = GuardChain([guard])
chain.prepare_all(model, adapter, calib, policy_config)
chain.before_edit_all(model)
chain.after_edit_all(model)
chain.finalize_all(model)

Calibration data format

Calibration batches should be indexable and yield dict-like objects:

batch = {
    "input_ids": [[101, 102, 103]],
    "attention_mask": [[1, 1, 1]],
    # optional
    "labels": [[101, 102, 103]],
}

If your calibration data is an iterator without __len__, set INVARLOCK_ALLOW_CALIBRATION_MATERIALIZE=1 to allow the runner to materialize it.
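
For example, to hand the runner a generator (which has no __len__), opt in to materialization first; note that this copies every batch into memory:

import os

os.environ["INVARLOCK_ALLOW_CALIBRATION_MATERIALIZE"] = "1"

def batch_iter():
    # Illustrative generator of calibration batches.
    yield {"input_ids": [[101, 102, 103]], "attention_mask": [[1, 1, 1]]}

report = CoreRunner().execute(model, adapter, edit, guards, config, calibration_data=batch_iter())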

Evaluation window helpers

You can build calibration batches from dataset providers:

from invarlock.eval.data import get_provider

provider = get_provider("wikitext2")
preview, final = provider.windows(
    tokenizer,
    preview_n=64,
    final_n=64,
    seq_len=512,
    stride=512,
)

calibration = [
    {"input_ids": ids, "attention_mask": mask}
    for ids, mask in zip(
        preview.input_ids + final.input_ids,
        preview.attention_masks + final.attention_masks,
        strict=False,
    )
]

Reports (canonical helpers)

from invarlock.reporting.render import render_report_markdown
from invarlock.reporting.report_make import make_report
from invarlock.reporting.report_schema import validate_report

report = make_report(report, baseline_report)
validate_report(report)
print(render_report_markdown(report))
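
To persist the generated report as an evaluation bundle, use invarlock.reporting.report_bundle.save_evaluation_bundle; its exact signature is not documented here, so this sketch assumes it takes the report plus an output location:

from invarlock.reporting.report_bundle import save_evaluation_bundle

# Assumed call shape: the generated report and a destination directory.
save_evaluation_bundle(report, "runs/evaluation_bundle")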

Exceptions

Core exceptions live in invarlock.core.exceptions:

  • ModelLoadError, AdapterError, EditError, GuardError, ConfigError
  • InvarlockError (base class)
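
Since InvarlockError is the shared base class, callers can catch it (or a specific subclass) around adapter and runner calls; this sketch assumes a failed model load surfaces as ModelLoadError:

from invarlock.core.exceptions import InvarlockError, ModelLoadError

try:
    model = adapter.load_model("gpt2", device="auto")
except ModelLoadError as exc:
    print("model load failed:", exc)
except InvarlockError as exc:
    print("other InvarLock error:", exc)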

Troubleshooting

  • DEPENDENCY-MISSING during adapter load: install the matching extra (e.g., pip install "invarlock[adapters]") and retry.
  • No calibration data provided warnings: pass calibration_data to CoreRunner.execute (or use the CLI, which handles datasets automatically).
  • Calibration data not indexable: pass a list/sequence or set INVARLOCK_ALLOW_CALIBRATION_MATERIALIZE=1 to allow materialization.
  • Guard prepare failures in CI/Release: adjust guard policies or set context.run.strict_guard_prepare: false for local debugging only.

Observability

  • RunReport.meta, RunReport.guards, RunReport.metrics, and RunReport.evaluation_windows are the canonical inspection points (windows can be omitted when INVARLOCK_STORE_EVAL_WINDOWS=0).
  • If RunConfig.event_path is set, an event log is written as JSONL (see the sketch below).
  • Reports produced by make_report can be validated with invarlock.reporting.report_schema.validate_report or with the invarlock verify CLI.
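
Because the event log is plain JSONL, it can be inspected with standard tooling; the path below assumes RunConfig.event_path was set as in the earlier example, and the event fields themselves are not specified here:

import json

with open("runs/events.jsonl") as fh:  # the path set via RunConfig.event_path
    for line in fh:
        event = json.loads(line)
        print(event)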