Model Adapters

Overview

| Aspect | Details |
| --- | --- |
| Purpose | Load models, describe structure, and snapshot/restore state for edits and guards. |
| Audience | CLI users choosing model.adapter and Python callers instantiating adapters. |
| Supported surface | Core HF adapters, auto-match adapters, and Linux-only quantized adapters. |
| Requires | invarlock[adapters] or invarlock[hf] for core HF adapters; invarlock[onnx] for hf_causal_onnx; invarlock[gpu], invarlock[awq], invarlock[gptq] for quantized adapters. |
| Network | Offline by default; set INVARLOCK_ALLOW_NETWORK=1 for model downloads. |
| Inputs | model.id (HF repo or local path), adapter name, device. |
| Outputs / Artifacts | Loaded model object; optional snapshots; exported model directories when enabled. |
| Source of truth | src/invarlock/adapters/*, src/invarlock/plugins/hf_*_adapter.py. |

Quick Start

```bash
# Install core HF adapters + evaluation stack
pip install "invarlock[hf]"

# Inspect adapter availability
invarlock plugins adapters

# Compare & evaluate with adapter auto-selection
INVARLOCK_ALLOW_NETWORK=1 invarlock evaluate \
  --baseline gpt2 \
  --subject gpt2 \
  --adapter auto
```

```python
from invarlock.adapters import HF_Auto_Adapter

adapter = HF_Auto_Adapter()
model = adapter.load_model("gpt2", device="auto")
print(adapter.describe(model)["model_type"])
```

Adapter availability is broader than the published assurance basis. GPT-2 and BERT currently back the published calibrated basis; repo-shipped pilot configs for Mistral 7B and Qwen2 7B are for experimentation until supporting artifacts are attached.

Concepts

  • Adapters hide model-specific logic: they handle loading, structure description, and snapshot/restore so edits/guards stay model-agnostic.
  • Auto selection: use adapter: auto (config/CLI shortcut) or --adapter hf_auto (adapter plugin) to choose a concrete role adapter (hf_causal, hf_mlm, hf_seq2seq, hf_causal_onnx) plus quant adapters when detected. Local paths can use config.json; remote IDs fall back to name heuristics and default to hf_causal when unsure.
  • Quantized adapters (hf_bnb, hf_awq, hf_gptq) handle their own device placement; avoid calling .to(...) on the loaded model.
  • Snapshot strategy: HF adapters expose snapshot/restore and snapshot_chunked/restore_chunked (large-model friendly). The CLI selects the strategy automatically via context.snapshot.*.

Auto adapter mapping

| model_type family | Adapter |
| --- | --- |
| mistral / mixtral / qwen / yi | hf_causal |
| gpt2 / opt / neo-x / phi | hf_causal |
| bert / roberta | hf_mlm |
| t5 / bart | hf_seq2seq |

Auto inspects config.model_type; remote models may need network access to fetch their config.
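The mapping above can be sketched as a simple lookup. This is an illustrative sketch only; resolve_adapter is a hypothetical helper, not invarlock's actual resolver, which also consults quantization metadata and name heuristics.

```python
# Illustrative model_type -> adapter lookup mirroring the table above.
_FAMILY_MAP = {
    "mistral": "hf_causal", "mixtral": "hf_causal", "qwen": "hf_causal",
    "yi": "hf_causal", "gpt2": "hf_causal", "opt": "hf_causal",
    "neo-x": "hf_causal", "phi": "hf_causal",
    "bert": "hf_mlm", "roberta": "hf_mlm",
    "t5": "hf_seq2seq", "bart": "hf_seq2seq",
}

def resolve_adapter(model_type: str) -> str:
    # Unknown families fall back to hf_causal, mirroring the
    # documented default when auto-detection is unsure.
    return _FAMILY_MAP.get(model_type.lower(), "hf_causal")
```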

Capability matrix (at a glance)

| Adapter family | Snapshot/restore | Guard compatibility | Platform |
| --- | --- | --- | --- |
| HF PyTorch (hf_causal, hf_mlm, hf_seq2seq) | Yes | Full | All |
| Quantized (hf_bnb, hf_awq, hf_gptq) | Best-effort | Full when modules exposed | Linux |
| ONNX (hf_causal_onnx) | No | Eval-only | All |

Machine-readable adapter capability metadata is published at contracts/adapter_capabilities.json and surfaced through invarlock plugins adapters --json.

Reference

Supported adapters

| Adapter | Models / Purpose | Requires | Platform support | Notes |
| --- | --- | --- | --- | --- |
| hf_causal | Decoder-only causal LMs (dense + MoE + GPT2-like) | invarlock[adapters] | All platforms with torch | Default causal LM adapter. |
| hf_mlm | BERT/RoBERTa/DeBERTa MLMs | invarlock[adapters] | All platforms with torch | Loads AutoModelForMaskedLM when possible. |
| hf_seq2seq | T5/encoder-decoder models | invarlock[adapters] | All platforms with torch | For seq2seq evaluation. |
| hf_causal_onnx | Optimum/ONNX Runtime causal LMs | invarlock[onnx] | All platforms | Inference-only; snapshot/restore not supported. |
| hf_auto | Auto-select HF adapter | invarlock[adapters] | All platforms with torch | Delegates to a role adapter; prefers quant adapters when detected. |
| hf_bnb | bitsandbytes quantized LMs | invarlock[gpu] | Linux only | Uses device_map="auto"; no .to(). |
| hf_awq | AWQ quantized LMs | invarlock[awq] | Linux only | Requires autoawq/triton. |
| hf_gptq | GPTQ quantized LMs | invarlock[gptq] | Linux only | Requires auto-gptq/triton. |

Adapter capabilities

| Adapter class | Snapshot/restore | Guard compatibility | Notes |
| --- | --- | --- | --- |
| PyTorch HF adapters (hf_causal, hf_mlm, hf_seq2seq) | Yes | Full (module access) | Uses HFAdapterMixin snapshots. |
| Quantized HF adapters (hf_bnb, hf_awq, hf_gptq) | Yes (best-effort) | Full when modules are exposed | Avoid explicit .to() calls. |
| ONNX adapter (hf_causal_onnx) | No | Eval-only | Use edit: noop and expect guard limitations. |

Adapter selection (adapter: auto)

Automatic resolution uses local config.json (if model.id is a directory) and simple heuristics to choose a concrete built-in adapter name.

  • Decoder-only causal → hf_causal
  • BERT/RoBERTa/DeBERTa/ALBERT → hf_mlm
  • T5/BART → hf_seq2seq
An auto-selection config looks like:

```yaml
model:
  id: mistralai/Mistral-7B-v0.1
  adapter: auto
  device: auto
```

Configuration examples

```yaml
# Standard causal LM run
model:
  id: gpt2
  adapter: hf_causal
  device: auto
```

```yaml
# Bitsandbytes quantized load (Linux + gpu extra)
model:
  id: mistralai/Mistral-7B-v0.1
  adapter: hf_bnb
  quantization_config:
    quant_method: bitsandbytes
    bits: 8
```

```yaml
# ONNX Runtime inference-only adapter (use with edit: noop)
model:
  id: /path/to/onnx-model
  adapter: hf_causal_onnx
edit:
  name: noop
```

Adapter load arguments

Adapter loaders pass through standard Hugging Face from_pretrained arguments:

| Key | Common use | Applies to |
| --- | --- | --- |
| dtype | Force float16/bfloat16 | HF adapters |
| device_map | Sharding/placement | HF adapters |
| trust_remote_code | Enable custom model code | HF adapters |
| revision | Pin model revision | HF adapters |
| cache_dir | Cache location | HF adapters |

Adapter describe fields

adapter.describe(model) returns a dictionary containing:

  • n_layer, heads_per_layer, mlp_dims, tying (required for guard gates)
  • model_type, model_class, and adapter-specific metadata
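Because guard gates depend on the structural keys above, a caller can validate describe() output before running edits. This is an illustrative sketch; check_structure is a hypothetical helper, not part of the adapter API.

```python
# Keys the guard gates require, per the describe() contract above.
REQUIRED_KEYS = {"n_layer", "heads_per_layer", "mlp_dims", "tying"}

def check_structure(info: dict) -> None:
    # Fail fast if describe() output lacks keys guard gates need.
    missing = REQUIRED_KEYS - info.keys()
    if missing:
        raise ValueError(f"describe() missing required keys: {sorted(missing)}")
```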

Snapshot strategy

```python
snapshot = adapter.snapshot(model)
try:
    # mutate model
    ...
finally:
    # always roll back to the pre-edit state
    adapter.restore(model, snapshot)
```

For large models, use chunked snapshots:

```python
import shutil

snap_dir = adapter.snapshot_chunked(model)
try:
    # mutate model
    ...
finally:
    # roll back, then delete the on-disk snapshot
    adapter.restore_chunked(model, snap_dir)
    shutil.rmtree(snap_dir, ignore_errors=True)
```
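The restore-and-clean-up pattern can be wrapped in a context manager so the snapshot directory is always removed, even when the mutation raises. chunked_snapshot here is a hypothetical convenience wrapper, not part of the adapter API; it only assumes the snapshot_chunked/restore_chunked methods documented above.

```python
import shutil
from contextlib import contextmanager

@contextmanager
def chunked_snapshot(adapter, model):
    # Hypothetical wrapper around the adapter's chunked snapshot pair.
    snap_dir = adapter.snapshot_chunked(model)
    try:
        yield snap_dir  # mutate the model inside the with-block
    finally:
        # Roll back the model, then remove the on-disk snapshot.
        adapter.restore_chunked(model, snap_dir)
        shutil.rmtree(snap_dir, ignore_errors=True)
```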

Troubleshooting

  • Adapter missing from invarlock plugins adapters: install the required extra (invarlock[adapters], invarlock[gpu], invarlock[gptq], invarlock[awq]).
  • Linux-only adapters not available: hf_bnb, hf_awq, and hf_gptq are only published for Linux in pyproject.toml.
  • Quantized model .to() errors: avoid explicit .to(); load with the adapter and let it manage device placement.
  • ONNX adapter guard failures: hf_causal_onnx is inference-only; use edit: noop and avoid guards that require PyTorch module access.

Observability

  • invarlock plugins adapters --json reports readiness and missing extras.
  • report.context["plugins"] and report plugins.adapters record adapter discovery for audit trails.