Hallucination Transparency Report

How BenchSlap prevents AI-generated legal misinformation

Data as of April 2026

Verification Corpus

Utah appellate opinions (1996-present)          13,359
Utah rules and statutes indexed                 22,895
Authorities cryptographically pinned (SHA-256)  35,325
Case dispositions pre-extracted                 8,968
Overruled/superseded cases flagged              129
Treatment graph edges                           238

AEGIS PRIME Verification Gates

Layer 1: Citation existence check                                           Active
Layer 2: Holding text containment (SHA-256 + shingles)                      Active
Layer 3: Structural fact verification (disposition, panel, binding weight)  Active
Verification time per citation                                              < 5 ms
Gate bypass possible                                                        No

What Gets Blocked

Fabricated case citation                            HARD BLOCK
Wrong disposition (e.g., "affirmed" when reversed)  HARD BLOCK
Dissent quoted as majority opinion                  HARD BLOCK
Overruled case cited as current law                 HARD BLOCK
Dicta presented as binding holding                  HARD BLOCK
Holding not found in opinion text                   Soft warning
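
The table above amounts to a fixed policy mapping each failure type to an action. A minimal sketch of such a mapping follows; the names Failure, Action, POLICY, and decide are hypothetical and are not taken from the BenchSlap codebase.

```python
from enum import Enum
from typing import Iterable, Optional


class Action(Enum):
    HARD_BLOCK = "hard block"
    SOFT_WARNING = "soft warning"


class Failure(Enum):
    FABRICATED_CITATION = "fabricated case citation"
    WRONG_DISPOSITION = "wrong disposition"
    DISSENT_AS_MAJORITY = "dissent quoted as majority opinion"
    OVERRULED_AS_CURRENT = "overruled case cited as current law"
    DICTA_AS_HOLDING = "dicta presented as binding holding"
    HOLDING_NOT_FOUND = "holding not found in opinion text"


# Hypothetical policy table mirroring the rows listed above.
POLICY = {
    Failure.FABRICATED_CITATION: Action.HARD_BLOCK,
    Failure.WRONG_DISPOSITION: Action.HARD_BLOCK,
    Failure.DISSENT_AS_MAJORITY: Action.HARD_BLOCK,
    Failure.OVERRULED_AS_CURRENT: Action.HARD_BLOCK,
    Failure.DICTA_AS_HOLDING: Action.HARD_BLOCK,
    Failure.HOLDING_NOT_FOUND: Action.SOFT_WARNING,
}


def decide(failures: Iterable[Failure]) -> Optional[Action]:
    """Return the strictest action triggered, or None if nothing failed."""
    actions = {POLICY[f] for f in failures}
    if Action.HARD_BLOCK in actions:
        return Action.HARD_BLOCK
    if Action.SOFT_WARNING in actions:
        return Action.SOFT_WARNING
    return None
```

In this sketch a single hard-block failure outweighs any number of soft warnings, which is one plausible reading of the table rather than a documented rule.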

How It Works

Every citation passes through three independent verification layers before reaching the user.

Layer 1: Does the citation exist? The citation is checked first against our local database of 13,359 opinions, then against utcourts.gov, CourtListener, and the Caselaw Access Project. If the case cannot be found in any of them, the draft is blocked.
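
The lookup order described for Layer 1 can be sketched as a local check followed by fallbacks. This is an illustration under assumptions: normalize and citation_exists are hypothetical names, the external sources are modeled as plain callables rather than real API clients, and the citation strings in the usage note are placeholders.

```python
import re
from typing import Callable, Iterable, Set


def normalize(citation: str) -> str:
    """Collapse whitespace and case so equivalent citations compare equal."""
    return re.sub(r"\s+", " ", citation).strip().casefold()


def citation_exists(
    citation: str,
    local_index: Set[str],
    remote_lookups: Iterable[Callable[[str], bool]],
) -> bool:
    """Check the local index first, then each fallback source in order."""
    key = normalize(citation)
    if key in local_index:
        return True
    # Assumption: each fallback takes a normalized citation and returns
    # True only when that source confirms the case exists.
    return any(lookup(key) for lookup in remote_lookups)
```

For example, with local_index = {normalize("2015 UT 12")}, the call citation_exists("2015  ut 12", local_index, []) returns True without consulting any fallback, while a citation unknown to every source returns False and the draft would be blocked.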

Layer 2: Does the holding match? The AI's claimed holding is checked for textual containment against the actual stored opinion text. Each authority is cryptographically pinned with SHA-256, so any modification of the stored text produces a hash mismatch that is detected.
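
A minimal sketch of how hash pinning and shingle-based containment could fit together is shown below. The shingle size of 5 words, the 0.8 threshold, and every function name are assumptions made for illustration; the report does not state the actual parameters.

```python
import hashlib
import re
from typing import Set, Tuple


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def shingles(text: str, k: int = 5) -> Set[Tuple[str, ...]]:
    """Set of overlapping k-word shingles from lowercased word tokens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(claim: str, opinion: str, k: int = 5) -> float:
    """Fraction of the claim's shingles that also occur in the opinion."""
    claim_shingles = shingles(claim, k)
    if not claim_shingles:
        return 0.0
    return len(claim_shingles & shingles(opinion, k)) / len(claim_shingles)


def verify_holding(claim: str, opinion: str, pinned_hash: str,
                   threshold: float = 0.8) -> bool:
    """Refuse tampered text, then test containment (threshold is assumed)."""
    if sha256_hex(opinion) != pinned_hash:
        raise ValueError("stored opinion text does not match its pinned hash")
    return containment(claim, opinion) >= threshold
```

The hash check runs first so that a containment score is never computed against text that has drifted from the pinned version.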

Layer 3: Do the structural facts match? Pre-extracted facts (disposition, panel composition, binding weight, treatment history) are compared against the AI's claims. "The court affirmed" is checked against a stored disposition enum. This is a database lookup, not an AI judgment. It takes less than 5 milliseconds. It is deterministic.
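
The enum comparison described for Layer 3 might look like the sketch below. The Disposition members, the CaseFacts fields, and structural_mismatches are hypothetical, and treatment history is left out to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List


class Disposition(Enum):
    AFFIRMED = "affirmed"
    REVERSED = "reversed"
    VACATED = "vacated"
    REMANDED = "remanded"
    DISMISSED = "dismissed"


@dataclass(frozen=True)
class CaseFacts:
    disposition: Disposition
    panel: FrozenSet[str]   # names of the judges on the panel
    binding: bool           # whether the opinion is binding authority


def structural_mismatches(claimed: CaseFacts, stored: CaseFacts) -> List[str]:
    """List every structural fact where the claim disagrees with the record."""
    mismatches = []
    if claimed.disposition is not stored.disposition:
        mismatches.append("disposition")
    # A claim may name only some panel members, but none outside the panel.
    if not claimed.panel <= stored.panel:
        mismatches.append("panel")
    if claimed.binding != stored.binding:
        mismatches.append("binding weight")
    return mismatches
```

Because the check is plain equality and subset comparison on stored values, the same inputs always produce the same result, which is the deterministic property the paragraph above describes.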

The gates cannot be bypassed. There is no skip button. There is no admin override. There is no environment variable that turns verification off. 77 automated tests guard this enforcement against regression.
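
As an illustration of what an enforcement test of this kind could check, the sketch below defines a toy gate and two regression tests. Everything here is hypothetical: release_draft, KNOWN_CITATIONS, the SKIP_VERIFICATION variable name, and the placeholder citations are not drawn from the actual test suite.

```python
import inspect
import os
from typing import List
from unittest import mock

# Hypothetical stand-in for the verified corpus.
KNOWN_CITATIONS = frozenset({"2015 UT 12"})


def release_draft(citations: List[str]) -> List[str]:
    """Release a draft only if every citation is known; no skip path exists."""
    unknown = [c for c in citations if c not in KNOWN_CITATIONS]
    if unknown:
        raise PermissionError(f"draft blocked, unverified citations: {unknown}")
    return list(citations)


def test_gate_ignores_environment() -> None:
    """Setting a skip-style variable must not let a bad draft through."""
    with mock.patch.dict(os.environ, {"SKIP_VERIFICATION": "1"}):
        try:
            release_draft(["2099 UT 999"])
        except PermissionError:
            return
    raise AssertionError("gate released an unverified draft")


def test_gate_has_no_override_parameter() -> None:
    """The gate's signature must expose no flag that could disable it."""
    params = list(inspect.signature(release_draft).parameters)
    assert params == ["citations"]
```

Tests like these fail the build if someone later adds an override argument or an environment check, which is one way a no-bypass guarantee can be kept from eroding.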

Comparison

ChatGPT hallucination rate (Stanford 2023)  GPT-3.5: 69% / GPT-4: 36%
BenchSlap hallucination rate (Utah corpus)  Structurally impossible
How?                                        Closed corpus + deterministic verification