Model Adapters

Overview

| Aspect | Details |
| --- | --- |
| Purpose | Load models, describe structure, and snapshot/restore state for edits and guards. |
| Audience | CLI users choosing model.adapter and Python callers instantiating adapters. |
| Supported surface | Core HF text and image-text adapters, auto-match adapters, platform-dependent BNB, and Linux-only AWQ/GPTQ quantized adapters. |
| Requires | invarlock[adapters] or invarlock[hf] for core HF adapters; invarlock[gpu], invarlock[awq], invarlock[gptq] for quantized adapters. |
| Network | Offline by default; use evaluate --allow-network when a run needs model downloads. |
| Inputs | model.id (HF repo or local path), adapter name, device. |
| Outputs / Artifacts | Loaded model object; optional snapshots; exported model directories when enabled. |
| Source of truth | src/invarlock/adapters/*, src/invarlock/plugins/hf_*_adapter.py. |

Quick Start

# Install core HF adapters + evaluation stack
pip install "invarlock[hf]"

# Inspect adapter availability
invarlock advanced plugins adapters

# Compare & evaluate with adapter auto-selection
invarlock evaluate --allow-network \
  --baseline gpt2 \
  --subject gpt2 \
  --adapter auto

The CLI example above uses the runtime container by default. Add --execution-mode host only for host-side compare/evaluate workflows that intentionally bypass that boundary.

from invarlock.adapters.auto import HF_Auto_Adapter

adapter = HF_Auto_Adapter()
model = adapter.load_model("gpt2", device="auto")
print(adapter.describe(model)["model_type"])

Adapter availability is broader than the published assurance basis: GPT-2 and BERT back the calibrated basis, while the repo-included pilot configs for Mistral 7B, Ministral 3 (text-only), Qwen2 7B, Qwen2.5 7B, Qwen2.5 14B, and additional experimental families remain experimental until supporting artifacts are attached. See the Model Family Catalog for the authoritative family-by-family inventory.

Concepts

  • Adapters hide model-specific logic: they handle loading, structure description, and snapshot/restore so edits/guards stay model-agnostic.
  • Auto selection: use adapter: auto (config/CLI shortcut) or --adapter hf_auto (adapter plugin) to choose a concrete role adapter (hf_causal, hf_mlm, hf_seq2seq) plus quant adapters when detected. Local paths can use config.json; remote IDs fall back to name heuristics and default to hf_causal when unsure. Image-text models use the explicit hf_multimodal adapter rather than adapter auto.
  • Quantized adapters (hf_bnb, hf_awq, hf_gptq) handle their own device placement; avoid calling .to(...) on the loaded model.
  • Snapshot strategy: HF adapters expose snapshot/restore and snapshot_chunked/restore_chunked (large-model friendly). The CLI selects the strategy automatically via context.snapshot.*.

Auto adapter mapping

| model_type family | Adapter |
| --- | --- |
| llama / mistral / mistral3 / mixtral / qwen / gemma / OLMo / yi | hf_causal |
| gpt2 / gpt_oss / opt / neo-x / phi | hf_causal |
| bert / roberta | hf_mlm |
| t5 / bart | hf_seq2seq |

Auto inspects config.model_type; remote models may need network for config.
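The mapping above can be sketched as a simple lookup. The function below is illustrative only (it mirrors the table, not invarlock's actual resolver), including the documented default of hf_causal when the family is unknown:

```python
# Illustrative sketch of the model_type -> adapter mapping above.
# Not invarlock's actual resolver; the sets mirror the documented table.
MLM_TYPES = {"bert", "roberta"}
SEQ2SEQ_TYPES = {"t5", "bart"}

def resolve_adapter(model_type: str) -> str:
    """Map a config.model_type value to a role adapter name."""
    mt = model_type.lower()
    if mt in MLM_TYPES:
        return "hf_mlm"
    if mt in SEQ2SEQ_TYPES:
        return "hf_seq2seq"
    # Documented fallback: default to hf_causal when unsure.
    return "hf_causal"
```

For example, resolve_adapter("bert") returns hf_mlm, while an unrecognized family falls back to hf_causal.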

Capability matrix (at a glance)

| Adapter family | Snapshot/restore | Guard compatibility | Platform |
| --- | --- | --- | --- |
| HF text (hf_causal, hf_mlm, hf_seq2seq) | Yes | Full | All |
| HF image-text (hf_multimodal) | Yes | Full when decoder layers are exposed | All |
| Quantized (hf_bnb) | Best-effort | Full when modules exposed | Platform-dependent |
| Quantized (hf_awq, hf_gptq) | Best-effort | Full when modules exposed | Linux |

Machine-readable adapter capability metadata is published at contracts/adapter_capabilities.json and surfaced through invarlock advanced plugins adapters --json.
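Tooling can consume that metadata programmatically. The sketch below uses a hypothetical, simplified shape for the capability mapping; the authoritative schema is whatever contracts/adapter_capabilities.json actually publishes:

```python
import json

# Hypothetical, simplified capability shape for illustration only.
# The authoritative schema lives in contracts/adapter_capabilities.json.
example = json.loads("""
{
  "hf_causal": {"snapshot": "yes", "platform": "all"},
  "hf_bnb":    {"snapshot": "best-effort", "platform": "platform-dependent"},
  "hf_awq":    {"snapshot": "best-effort", "platform": "linux"}
}
""")

def linux_only(caps: dict) -> list:
    """List adapters restricted to Linux in a capability mapping."""
    return [name for name, c in caps.items() if c.get("platform") == "linux"]
```

A CI gate could use such a helper to skip Linux-only adapters on other hosts.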

Reference

Supported adapters

| Adapter | Models / Purpose | Requires | Platform support | Notes |
| --- | --- | --- | --- | --- |
| hf_causal | Decoder-only causal LMs (dense + MoE + GPT2-like) | invarlock[adapters] | All platforms with torch | Default causal LM adapter. |
| hf_mlm | BERT/RoBERTa/DeBERTa MLMs | invarlock[adapters] | All platforms with torch | Loads AutoModelForMaskedLM when possible. |
| hf_multimodal | Image-text generation models exposed through HF AutoModelForImageTextToText | invarlock[adapters] | All platforms with torch | Single-image vision_text evaluation with explicit adapter selection. |
| hf_seq2seq | T5/encoder-decoder models | invarlock[adapters] | All platforms with torch | For seq2seq evaluation. |
| hf_auto | Auto-select HF adapter | invarlock[adapters] | All platforms with torch | Delegates to a role adapter; prefers quant adapters when detected. |
| hf_bnb | Bitsandbytes quantized LMs | invarlock[gpu] | Platform-dependent | Uses device_map="auto"; no .to(). Latest bitsandbytes wheels can work outside Linux/CUDA when the runtime imports cleanly. |
| hf_awq | AWQ quantized LMs | invarlock[awq] | Linux only | Requires autoawq/triton. |
| hf_gptq | GPTQ quantized LMs | invarlock[gptq] | Linux only | Requires auto-gptq/triton; packaged extras currently stop at upstream-supported pre-3.13 Python stacks, and newer Python/CUDA combinations may require a vendor build. |

Adapter capabilities

| Adapter class | Snapshot/restore | Guard compatibility | Notes |
| --- | --- | --- | --- |
| PyTorch HF adapters (hf_causal, hf_mlm, hf_multimodal, hf_seq2seq) | Yes | Full (module access); multimodal full when decoder layers are exposed | Uses HFAdapterMixin snapshots. |
| Quantized HF adapters (hf_bnb, hf_awq, hf_gptq) | Yes (best-effort) | Full when modules are exposed | Avoid explicit .to() calls. |

Adapter selection (adapter: auto)

Automatic resolution uses local config.json (if model.id is a directory) and simple heuristics to choose a concrete built-in adapter name.

  • Decoder-only causal → hf_causal
  • BERT/RoBERTa/DeBERTa/ALBERT → hf_mlm
  • T5/BART → hf_seq2seq

Example:

model:
  id: mistralai/Mistral-7B-v0.1
  adapter: auto
  device: auto
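For local paths, the first step of that resolution is just reading config.json. A minimal sketch of that step (illustrative, not invarlock's implementation):

```python
import json
from pathlib import Path
from typing import Optional

def model_type_from_local(path: str) -> Optional[str]:
    """Read model_type from a local checkpoint's config.json, if present.

    Illustrative sketch of the documented local-path resolution;
    not invarlock's actual code.
    """
    cfg = Path(path) / "config.json"
    if not cfg.is_file():
        # Remote ID or missing config: fall back to name heuristics.
        return None
    return json.loads(cfg.read_text()).get("model_type")
```

When this returns None, resolution falls back to the name heuristics and ultimately to hf_causal, as described above.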

Configuration examples

# Standard causal LM run
model:
  id: gpt2
  adapter: hf_causal
  device: auto

# Bitsandbytes quantized load (gpu extra; platform-dependent)
model:
  id: mistralai/Mistral-7B-v0.1
  adapter: hf_bnb
  quantization_config:
    quant_method: bitsandbytes
    bits: 8

Adapter load arguments

Adapter loaders pass through standard Hugging Face from_pretrained arguments:

| Key | Common use | Applies to |
| --- | --- | --- |
| dtype | Force float16/bfloat16 | HF adapters |
| device_map | Sharding/placement | HF adapters |
| trust_remote_code | Enable custom model code only with explicit --allow-remote-code / INVARLOCK_ALLOW_REMOTE_CODE=1 | HF adapters |
| revision | Pin model revision | HF adapters |
| cache_dir | Cache location | HF adapters |
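Calling code can make the pass-through set explicit before forwarding arguments. The allow-list filter below is illustrative (it mirrors the table; invarlock's real handling may differ):

```python
# Illustrative allow-list filter for from_pretrained pass-through kwargs.
# The set mirrors the table above; not invarlock's actual handling.
PASSTHROUGH_KEYS = {"dtype", "device_map", "trust_remote_code",
                    "revision", "cache_dir"}

def split_load_kwargs(kwargs: dict) -> tuple:
    """Separate recognized from_pretrained kwargs from everything else."""
    passed = {k: v for k, v in kwargs.items() if k in PASSTHROUGH_KEYS}
    rest = {k: v for k, v in kwargs.items() if k not in PASSTHROUGH_KEYS}
    return passed, rest
```

Unrecognized keys can then be rejected or logged instead of silently forwarded.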

Adapter describe fields

adapter.describe(model) returns a dictionary containing:

  • n_layer, heads_per_layer, mlp_dims, tying (required for guard gates)
  • model_type, model_class, and adapter-specific metadata
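Since the first group of keys is required for guard gates, a caller can sanity-check a describe() result before running guards. An illustrative check (not part of invarlock):

```python
# Keys the guard gates require, per the list above.
REQUIRED_DESCRIBE_KEYS = ("n_layer", "heads_per_layer", "mlp_dims", "tying")

def missing_describe_keys(info: dict) -> list:
    """Return required describe() keys absent from an adapter's output."""
    return [k for k in REQUIRED_DESCRIBE_KEYS if k not in info]
```

An empty result means the guard-gate requirements are met; anything else names the gaps.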

Snapshot strategy

snapshot = adapter.snapshot(model)
try:
    # mutate model
    ...
finally:
    adapter.restore(model, snapshot)

For large models, use chunked snapshots:

import shutil

snap_dir = adapter.snapshot_chunked(model)
try:
    # mutate model
    ...
finally:
    adapter.restore_chunked(model, snap_dir)
    shutil.rmtree(snap_dir, ignore_errors=True)
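The snapshot-then-restore pattern above can be wrapped in a context manager so edits always roll back, even when they raise. A minimal sketch over the snapshot/restore API (the wrapper itself is not part of invarlock):

```python
from contextlib import contextmanager

@contextmanager
def restored(adapter, model):
    """Snapshot on entry, restore on exit, even if the edit raises.

    Illustrative wrapper over the adapter snapshot/restore API;
    not shipped by invarlock.
    """
    snapshot = adapter.snapshot(model)
    try:
        yield model
    finally:
        adapter.restore(model, snapshot)
```

Usage: `with restored(adapter, model): apply_edit(model)` leaves the model unchanged if apply_edit fails.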

Troubleshooting

  • Adapter missing from invarlock advanced plugins adapters: install the required extra (invarlock[adapters], invarlock[gpu], invarlock[gptq], invarlock[awq]).
  • Linux-only adapters not available: hf_awq and hf_gptq depend on triton and remain Linux-only in pyproject.toml.
  • GPTQ install fails even on Linux/CUDA: auto-gptq packaging is upstream-dependent; Python 3.13+ and some newer CUDA stacks may require a pinned or vendor wheel, or a supported interpreter, beyond pip install "invarlock[gptq]".
  • Bitsandbytes not detected: hf_bnb is platform-dependent. If the backend imports cleanly, invarlock advanced plugins adapters will report it as ready even on non-CUDA hosts.
  • Quantized model .to() errors: avoid explicit .to(); load with the adapter and let it manage device placement.

Observability

  • invarlock advanced plugins adapters --json reports readiness and missing extras.
  • report.context["plugins"] and report plugins.adapters record adapter discovery for audit trails.