Assay

From research to production.

Verified results across benchmarks, pilots, and live deployments.

464

tasks verified across HumanEval + SWE-bench

100%

pass@5 on HumanEval (164/164)

354

claims verified in LVR pilot

27

bugs found and fixed automatically
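For readers unfamiliar with the metric, here is a minimal sketch of how a pass@5 score is commonly computed, using the standard unbiased pass@k estimator. The page does not say how the figure above was calculated, so the per-task sample count (five) and the sample data below are assumptions for illustration only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one task: n samples drawn, c of them correct."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Mean pass@k over tasks; each entry is an (n, c) pair for one task."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data: 164 tasks, 5 samples each, at least one correct per task.
hypothetical_results = [(5, 1)] * 164
score = benchmark_pass_at_k(hypothetical_results, k=5)
print(f"pass@5 = {score:.1%}")
```

Under these assumptions, a score of 164/164 means every task had at least one passing solution among its five attempts. It does not mean every attempt passed.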

Published and protected.

Peer-reviewed research and intellectual property.

What we discovered.

Two findings that define the opportunity.

Verification can't be baked in

In our RLVF experiments, adding more training data lowered model performance (84.1% → 78.0%). Verification therefore has to remain an external loop around the model rather than something trained into it. This is a permanent moat, not a feature gap.
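To make "external loop" concrete, the sketch below shows the general shape of a generate, verify, and retry cycle in which the verifier sits outside the model. It is an illustration only, not Assay's implementation: `verify_loop`, `toy_generate`, and `toy_verify` are hypothetical names, and the toy generator stands in for a model call.

```python
from typing import Callable, Optional, Tuple

Candidate = Callable[[int, int], int]


def verify_loop(
    generate: Callable[[Optional[str]], Candidate],
    verify: Callable[[Candidate], Tuple[bool, Optional[str]]],
    max_rounds: int = 5,
) -> Tuple[Optional[Candidate], int]:
    """Ask for a candidate, check it externally, and feed failures back."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        candidate = generate(feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate, round_no
    return None, max_rounds


def toy_generate(feedback: Optional[str]) -> Candidate:
    # Stand-in for a model: the first attempt is buggy, the retry is fixed.
    if feedback is None:
        return lambda a, b: a - b
    return lambda a, b: a + b


def toy_verify(candidate: Candidate) -> Tuple[bool, Optional[str]]:
    # External check: run the candidate against known input/output cases.
    cases = [((1, 2), 3), ((0, 0), 0)]
    for args, expected in cases:
        got = candidate(*args)
        if got != expected:
            return False, f"add{args} returned {got}, expected {expected}"
    return True, None


accepted, rounds = verify_loop(toy_generate, toy_verify)
print(f"accepted after {rounds} round(s)")
```

The generator never decides on its own that it is done. Only the external check can accept a candidate, which is the property the finding above depends on.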

Full codebase reconstruction works

LVR pilot: three verification loops reconstructed an entire ERP accounting domain. 126 files generated. 354 claims verified. 27 bugs fixed. Zero scaffolding in output.

Interested?

Tell us how you want to work together.
