
Runtime Manifests and Why Provenance Must Travel With the Result

Ink/charcoal doodle: a report and runtime manifest travel together into verification as one container-backed evidence bundle.

A strong evaluation result should carry its runtime provenance with it. In InvarLock, that means the runtime manifest travels next to the report and is rechecked by invarlock verify.

InvarLock Team

Research Note: provenance is weakest when it stays behind in the runtime

Highlights

  • runtime.manifest.json is a runtime provenance sidecar, not decoration.
  • It binds a report to its runtime context and is enforced by invarlock verify on container-backed outputs.
  • Keeping the manifest next to the report is part of the evidence contract, not an archival preference.

Many systems record runtime context somewhere, but leave it stranded in logs, hidden state, or ephemeral CI metadata. That makes later review harder than it should be.

InvarLock's public manifest contract goes further than that.

The public contracts reference names runtime.manifest.json as a versioned contract. It sits next to evaluation.report.json, carries specific fields about the report, config, execution mode, and runtime flags, and is part of what later verification expects to see.

What The Runtime Manifest Actually Binds

The contract is concrete about the job. The manifest records the report path and hash, config path and source, execution mode, and a small runtime block describing the image reference, image digest, and runtime permissions such as network and remote-code allowance.
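To make that field list easier to picture, here is a minimal sketch of the manifest shape in Python. Every field name, the example image reference, and the example values are assumptions made for illustration. The versioned contract in the public contracts reference is the authority on the real schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class RuntimeBlock:
    # Hypothetical field names; the real contract defines the actual keys.
    image_ref: str
    image_digest: str
    network_allowed: bool
    remote_code_allowed: bool


@dataclass
class RuntimeManifest:
    # Hypothetical field names mirroring the categories the contract lists.
    report_path: str
    report_sha256: str
    config_path: str
    config_source: str
    execution_mode: str  # illustrative values: "container" or "host-bypass"
    runtime: RuntimeBlock


def build_manifest(report_bytes: bytes) -> dict:
    """Build an illustrative manifest dict for the given report bytes."""
    manifest = RuntimeManifest(
        report_path="evaluation.report.json",
        report_sha256=hashlib.sha256(report_bytes).hexdigest(),
        config_path="config.yaml",       # placeholder path
        config_source="file",            # placeholder source label
        execution_mode="container",
        runtime=RuntimeBlock(
            image_ref="registry.example/eval:1.0",  # placeholder image
            image_digest="sha256:" + "0" * 64,      # placeholder digest
            network_allowed=False,
            remote_code_allowed=False,
        ),
    )
    # asdict() also converts the nested RuntimeBlock into a plain dict.
    return asdict(manifest)


if __name__ == "__main__":
    print(json.dumps(build_manifest(b'{"metric": 0.5}'), indent=2))
```

The point of the sketch is the grouping: one part identifies the report, one identifies the config, and a small runtime block states which image ran and which permissions were left on.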

That is enough to support runtime-provenance review.

It tells a reviewer which report the manifest is describing, what configuration surface produced it, whether the run happened in the container path or host-bypass path, and which runtime affordances were left on. That is a far better posture than asking a reviewer to "trust our CI settings."

Why Verify Needs It

The public docs do not treat the manifest as optional decoration. Both the reports reference and the CLI reference say that container-backed evaluation outputs are expected to carry runtime.manifest.json next to evaluation.report.json, and invarlock verify checks that adjacency.

That matters because provenance only helps if it survives into the same place where the result is later interpreted.

Once the manifest travels with the report, verification can fail closed when the runtime provenance sidecar is missing. That is better than silently accepting a report whose runtime boundary can no longer be reconstructed.
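As an illustration of what failing closed means here, the sketch below refuses to accept a report unless the sidecar sits next to it. It is not the implementation of invarlock verify. The function name, the exception type, and the report_sha256 key are hypothetical, and the hash comparison is an assumed extra step that follows from the manifest recording the report hash.

```python
import hashlib
import json
from pathlib import Path


class ProvenanceError(Exception):
    """Raised when a report cannot be tied to a runtime manifest."""


def check_pair(report_path: Path) -> dict:
    """Return the parsed manifest, or raise if provenance is missing or stale."""
    # Adjacency: the sidecar must live in the same directory as the report.
    manifest_path = report_path.with_name("runtime.manifest.json")
    if not manifest_path.is_file():
        raise ProvenanceError(f"missing sidecar: {manifest_path}")

    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))

    # Assumed check: the recorded hash must match the report on disk.
    actual = hashlib.sha256(report_path.read_bytes()).hexdigest()
    if manifest.get("report_sha256") != actual:
        raise ProvenanceError("report hash does not match manifest")

    return manifest
```

The design choice worth noticing is that there is no fallback branch. A missing or mismatched manifest is an error, not a warning.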

Why A Sidecar Is Better Than Hidden Runtime State

The artifact layout docs make the design choice practical: archive the report and manifest together.

This is important because the manifest is not trying to replace the report. It is trying to travel with it. The report remains the evaluation summary. The manifest remains the runtime provenance surface. Keeping them adjacent preserves both interpretability and auditability.

That sidecar design is stronger than burying provenance in CI logs or a database that the reviewer may never see.
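One way to keep the pair from drifting apart is to make the archive step itself refuse a lone report. The helper below is a hypothetical sketch, not part of InvarLock. Only the two file names come from the public docs, and the directory layout is assumed.

```python
import shutil
from pathlib import Path

# The two file names come from the docs; everything else is illustrative.
PAIR = ("evaluation.report.json", "runtime.manifest.json")


def archive_pair(run_dir: Path, archive_dir: Path) -> list[Path]:
    """Copy the report and its manifest together, or copy nothing at all."""
    missing = [name for name in PAIR if not (run_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"refusing to archive incomplete pair: {missing}")

    archive_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves file metadata such as modification times.
    return [Path(shutil.copy2(run_dir / name, archive_dir / name)) for name in PAIR]
```

Checking for both files before copying either means the archive never holds a report whose sidecar was left behind.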

What The Manifest Still Does Not Prove

The runtime manifest makes a deliberately narrow promise.

It does not prove that the metric conclusion is correct or that the dataset was the right one. It does not prove content safety, alignment, or deployment readiness. It proves something smaller and still valuable: the report is accompanied by a stated runtime context that can be rechecked later.

That is the right level of ambition for runtime provenance.

Claim Map

The practical path is:

  • emit evaluation.report.json
  • emit adjacent runtime.manifest.json
  • keep them together in the archive
  • run invarlock verify against that pair later

That is a much stronger provenance story than leaving execution details behind in the machine that ran the job.

Limitations

  • This note is about the runtime-manifest contract, not a new benchmark result.
  • Provenance is necessary for rechecking runtime context, but it does not by itself settle the quality claim.
