Release
Stronger paired stats + stricter controls when you need them
Token-weighted paired bootstrap lands across the pipeline, strictness toggles expand, and CI/release pairing expectations become explicit and enforceable.
Release: InvarLock 0.3.3 — More explainable failures (and fewer mysteries)
Highlights
- Token-weighted paired Δlog-loss bootstrap support (core + primary metric + variance guard).
- Window pairing enforcement becomes more explicit (overlap/duplicates/mismatch detection).
- Strictness toggles and report metadata improvements for clearer evaluation outcomes.
0.3.3 tightens the statistical backbone of paired evaluation. The paired Δlog-loss bootstrap work is more than a "numbers" change: it makes drift conclusions more faithful to what was actually evaluated (token-weighted and paired, not loosely aggregated).
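To make "token-weighted and paired" concrete, here is a minimal sketch of a paired bootstrap over per-window Δlog-loss. It is an illustration under stated assumptions, not InvarLock's implementation: the function name, its parameters, and the percentile confidence interval are all hypothetical choices. Each window contributes one (baseline, edited) pair, windows are resampled together so the pairing is never broken, and each window's delta is weighted by its token count.

```python
import random
from typing import Sequence, Tuple


def paired_token_weighted_bootstrap(
    baseline_loss: Sequence[float],
    edited_loss: Sequence[float],
    token_counts: Sequence[int],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point, ci_low, ci_high) for the token-weighted mean delta log-loss.

    Hypothetical helper for illustration; inputs are per-window mean log-losses.
    """
    if not (len(baseline_loss) == len(edited_loss) == len(token_counts)):
        raise ValueError("windows must be paired one-to-one")
    if len(baseline_loss) == 0:
        raise ValueError("need at least one window")
    if any(t <= 0 for t in token_counts):
        raise ValueError("token counts must be positive")

    # Paired: the delta is taken within each window before any resampling.
    deltas = [e - b for b, e in zip(baseline_loss, edited_loss)]

    def weighted_mean(indices: Sequence[int]) -> float:
        # Token-weighted: long windows count for more than short ones.
        total = sum(token_counts[i] for i in indices)
        return sum(deltas[i] * token_counts[i] for i in indices) / total

    n = len(deltas)
    rng = random.Random(seed)
    point = weighted_mean(range(n))

    # Resample window indices with replacement; delta and weight move together.
    stats = sorted(
        weighted_mean([rng.randrange(n) for _ in range(n)])
        for _ in range(n_resamples)
    )
    ci_low = stats[int((alpha / 2) * (n_resamples - 1))]
    ci_high = stats[int((1 - alpha / 2) * (n_resamples - 1))]
    return point, ci_low, ci_high
```

With two windows whose deltas are 0.0 and 1.0 and whose token counts are 1 and 3, the point estimate is 0.75 rather than the unweighted 0.5, which is the difference the token weighting is meant to capture.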
It also makes CI/release expectations blunt and explicit: perfect pairing, non-overlapping windows, and coverage floors are no longer "best effort"; they are enforced. That is a theme in this release: fewer fuzzy edges, more things you can confidently point to.
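The kind of check this implies can be sketched as follows. Everything here is an assumption made for illustration: the `Window` type, the `pairing_violations` function, and the reading of "coverage floor" as a minimum window count are hypothetical and do not reflect InvarLock's actual API or thresholds.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Window:
    """Hypothetical evaluation window: an id plus a token span [start, end)."""
    window_id: str
    start: int
    end: int


def pairing_violations(
    baseline: Sequence[Window],
    edited: Sequence[Window],
    min_windows: int,
) -> List[str]:
    """Return human-readable violations; an empty list means the run passes."""
    problems: List[str] = []

    # Duplicates: every window id may appear only once per side.
    for name, side in (("baseline", baseline), ("edited", edited)):
        ids = [w.window_id for w in side]
        if len(set(ids)) != len(ids):
            problems.append(f"duplicate window ids in {name}")

    # Perfect pairing: same windows, same spans, same order on both sides.
    if list(baseline) != list(edited):
        problems.append("pairing mismatch between baseline and edited windows")

    # Non-overlap: after sorting by start, each window must begin
    # at or after the end of the previous one.
    ordered = sorted(baseline, key=lambda w: (w.start, w.end))
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start < prev.end:
            problems.append(
                f"overlap between {prev.window_id} and {cur.window_id}"
            )
            break

    # Coverage floor (assumed here to mean a minimum number of windows).
    if len(baseline) < min_windows:
        problems.append(
            f"coverage below floor: {len(baseline)} < {min_windows} windows"
        )

    return problems
```

Returning a list of reasons instead of a bare boolean fits the release's emphasis on explainable failures: a CI job can fail closed on any non-empty result and still print exactly which expectation was violated.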
And when things do go wrong, reports carry better context (including evaluation soft-fail metadata), which helps turn failures into something you can diagnose instead of something you just re-run blindly.
For more details, see CHANGELOG.md.