Assurance Protocol Changelog

The Forge assurance scoring protocol evolves as we improve detection accuracy. Every report on the Forge Matrix is stamped with the protocol version that produced it, so scores can be interpreted in context. This page documents every change.

Current protocol: v4  |  4 versions  |  Latest: 2026-06-07

Protocol v4 2026-06-07 CURRENT

Agent-era protocol — 161 scenarios (+87), 16 categories (+8), deployment-profile calibration, paired benign/malicious probes, capability-gating

Status: Current protocol. All new reports use this version. Adds an agentic/tool-use threat model on top of v3 while keeping scoring fully deterministic; Model Certification (no deployment profile) is byte-identical to pre-v4, and v3 reports keep their original meaning.

Protocol v3 2026-04-06

Enterprise hardening — 74 scenarios (+19), severity weighting, domain-specific safety packs, weighted pass rates

Status: Superseded by v4 (2026-06-07). Reports scored under v3 retain their original meaning. Under severity weighting, critical failures (safety, exfiltration) count 2x toward the weighted pass rate.
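To make the 2x severity weighting concrete, here is a minimal sketch of how a weighted pass rate could be computed. The changelog states only that critical failures count 2x; the formula (passed weight divided by total weight), the weight of 1 for non-critical scenarios, and the names ScenarioResult and weighted_pass_rate are assumptions for illustration, not the Forge implementation.

```python
from dataclasses import dataclass

# Assumed weights: the changelog only says critical failures count 2x.
CRITICAL_WEIGHT = 2
DEFAULT_WEIGHT = 1


@dataclass(frozen=True)
class ScenarioResult:
    name: str       # hypothetical scenario identifier
    critical: bool  # True for safety / exfiltration scenarios
    passed: bool


def weight_of(result: ScenarioResult) -> int:
    """Weight a scenario by severity (assumed 2 for critical, 1 otherwise)."""
    return CRITICAL_WEIGHT if result.critical else DEFAULT_WEIGHT


def weighted_pass_rate(results: list[ScenarioResult]) -> float:
    """Return passed weight divided by total weight (0.0 for no results)."""
    total = sum(weight_of(r) for r in results)
    if total == 0:
        return 0.0
    passed = sum(weight_of(r) for r in results if r.passed)
    return passed / total


results = [
    ScenarioResult("exfiltration-probe", critical=True, passed=False),
    ScenarioResult("tone-check", critical=False, passed=True),
    ScenarioResult("format-check", critical=False, passed=True),
]
print(weighted_pass_rate(results))  # 0.5, versus 2/3 if every scenario counted equally
```

In this sketch a single critical failure pulls the rate down twice as far as a non-critical one, which is the effect the v3 summary describes.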

Protocol v2 2026-04-01

Multi-vector expansion — 55 scenarios (+17), multi-turn escalation, language/encoding attacks, consistency scoring

Status: Superseded by v3 (2026-04-06). Significantly improved detection accuracy over v1. Reports scored under v1 may contain false positives, because v1 could mis-score an empathetic refusal as compliance.

Protocol v1 2026-03-27

Baseline — 38 scenarios, Trident Protocol (114 test vectors), Forge Parallax dual attestation

Status: This is the baseline protocol. Early Matrix data was scored under this version.

The protocol version is embedded in the signed report payload, so changing it after the report is issued invalidates the signature.
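The sketch below shows why a version field inside a signed payload is tamper-evident. This page does not describe the signing scheme Forge uses, so the example substitutes an HMAC-SHA256 over canonical JSON from the Python standard library; the field names, the key, and the functions sign_report and verify_report are all hypothetical. A production system that lets third parties verify reports would more likely use a public-key signature.

```python
import hashlib
import hmac
import json


def sign_report(payload: dict, key: bytes) -> str:
    """Sign the canonical JSON form of the payload (stand-in scheme: HMAC-SHA256)."""
    # Sorted keys and fixed separators make the serialized bytes deterministic.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_report(payload: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_report(payload, key), signature)


key = b"demo-key-not-a-real-secret"                # hypothetical key
report = {"protocol_version": "v4", "score": 91}   # hypothetical fields
signature = sign_report(report, key)

# Relabeling the report as v3 changes the signed bytes, so verification fails.
tampered = dict(report, protocol_version="v3")

print(verify_report(report, signature, key))    # True
print(verify_report(tampered, signature, key))  # False
```

Because the version is part of the signed bytes rather than attached beside them, a report cannot be relabeled with a different protocol version without the check failing.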
