Blog
Research, results, and what we found when we looked.
Launch 2026-02-22
We Verified Code from 4 AI Platforms. Average Score: 40/100
Bolt 42. Lovable 42. Replit 44. Claude 35. 21 bugs across 4 projects. 0 passed.
Research 2026-02-22
Why AI Hallucination Is Mathematically Inevitable
Four independent proofs from four research groups. Every future model will hallucinate. This changes how you build.
Experiment 2026-02-22
We Tried Training Models to Verify Themselves. It Made Them Worse.
120 curated pairs: 91.5%. 2,000 pairs: 77.4%. More data caused catastrophic collapse. Verification stays external.