Release

Token-weighted paired statistics and stricter release gates

Token-weighted paired bootstrap lands across the pipeline, strictness toggles expand, and CI/release pairing expectations become explicit and enforceable.

December 21, 2025

InvarLock Team

Release: InvarLock 0.3.3 - Paired bootstrap, strictness toggles, and clearer failures

Highlights

Token-weighted paired Δlog-loss bootstrap support (core + primary metric + variance guard).
Window pairing enforcement becomes more explicit (overlap/duplicates/mismatch detection).
Strictness toggles and report metadata improvements for clearer evaluation outcomes.

0.3.3 tightens the statistical backbone of paired evaluation. The paired Δlog-loss bootstrap work is more than a “numbers” change: it makes drift conclusions more faithful to what was actually evaluated (token-weighted and paired, not loosely aggregated).
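To make "token-weighted and paired" concrete, here is a minimal sketch of a paired bootstrap over evaluation windows. It is an illustration under stated assumptions, not InvarLock's implementation: the function names, the sign convention (edited minus baseline), and the percentile interval are all hypothetical choices made for this example.

```python
import math
import random
from typing import Sequence, Tuple


def token_weighted_delta(
    baseline: Sequence[float], edited: Sequence[float], tokens: Sequence[int]
) -> float:
    """Token-weighted mean of per-window (edited - baseline) log-loss."""
    total = sum(tokens)
    return sum(t * (e - b) for b, e, t in zip(baseline, edited, tokens)) / total


def paired_bootstrap_ci(
    baseline: Sequence[float],
    edited: Sequence[float],
    tokens: Sequence[int],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point estimate, lower bound, upper bound) for the delta.

    Hypothetical helper: each resample draws window indices with
    replacement, so a window's baseline loss, edited loss, and token
    count always travel together (that is the "paired" part).
    """
    if not (len(baseline) == len(edited) == len(tokens)) or len(tokens) == 0:
        raise ValueError("paired inputs must be non-empty and equal in length")
    if any(t <= 0 for t in tokens):
        raise ValueError("every window needs a positive token count")

    n = len(tokens)
    rng = random.Random(seed)
    point = token_weighted_delta(baseline, edited, tokens)

    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        num = sum(tokens[i] * (edited[i] - baseline[i]) for i in idx)
        den = sum(tokens[i] for i in idx)
        stats.append(num / den)

    # Percentile interval over the sorted resample statistics.
    stats.sort()
    lo = stats[math.floor((alpha / 2) * (n_resamples - 1))]
    hi = stats[math.ceil((1 - alpha / 2) * (n_resamples - 1))]
    return point, lo, hi
```

With two windows of 1 and 3 tokens, where only the short window regresses by 0.5, the token-weighted delta is 0.125 rather than the unweighted 0.25, which is the kind of difference "not loosely aggregated" refers to.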

It also makes CI/release expectations blunt and explicit: perfect pairing, non-overlapping windows, and coverage floors are no longer “best effort”; they are enforced. That is a theme in this release: fewer fuzzy edges, more things you can confidently point to.
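As a rough picture of what enforcing these expectations can look like, the sketch below checks two sets of windows for duplicates, mismatched pairing, overlap, and a coverage floor. Everything here is assumed for illustration: the `Window` type, the `pairing_errors` function, and the reading of "coverage floor" as a minimum number of paired windows are hypothetical and may not match how InvarLock defines or implements these checks.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Window:
    """Hypothetical evaluation window over a token stream."""
    window_id: str
    start: int  # inclusive token offset
    end: int    # exclusive token offset


def pairing_errors(
    baseline: Sequence[Window], edited: Sequence[Window], min_windows: int
) -> List[str]:
    """Return a list of human-readable violations; empty means the gate passes."""
    errors: List[str] = []
    arms = (("baseline", baseline), ("edited", edited))

    # Duplicates: the same window id appears twice within one arm.
    for name, windows in arms:
        ids = [w.window_id for w in windows]
        dupes = sorted({i for i in ids if ids.count(i) > 1})
        if dupes:
            errors.append(f"{name}: duplicate window ids {dupes}")

    # Mismatch: both arms must list identical windows in the same order.
    if list(baseline) != list(edited):
        errors.append("mismatch: baseline and edited windows are not perfectly paired")

    # Overlap: within an arm, no two windows may share tokens.
    for name, windows in arms:
        ordered = sorted(windows, key=lambda w: (w.start, w.end))
        for prev, cur in zip(ordered, ordered[1:]):
            if cur.start < prev.end:
                errors.append(
                    f"{name}: windows {prev.window_id} and {cur.window_id} overlap"
                )

    # Coverage floor (assumed here to mean a minimum paired-window count).
    if len(baseline) < min_windows:
        errors.append(
            f"coverage: {len(baseline)} windows is below the floor of {min_windows}"
        )
    return errors
```

Returning every violation at once, rather than stopping at the first, fits the release's emphasis on failures you can diagnose instead of re-running blindly.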

And when things do go wrong, reports carry better context (including evaluation soft-fail metadata), which helps turn failures into something you can diagnose instead of something you just re-run blindly.

For the immutable release record, read the tagged CHANGELOG.md for v0.3.3.
