April 6, 2026 · Agents · Tool · API

Reducto Deep Extract: When One Pass Is Not Enough, Send an Agent Loop

Document extraction has always been a single-pass game. Throw an LLM at a PDF, hope for the best, clean up the mess manually. Reducto just changed the rules with Deep Extract, launched yesterday, and it is genuinely clever.

The idea is dead simple but the execution matters. Instead of extracting data once and calling it done, Deep Extract runs an agentic loop: extract, verify results against the source document, identify what is missing or wrong, re-extract, repeat until a quality threshold is met. It is basically a code review but for structured data. The agent checks its own homework.
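To make the loop concrete, here is a minimal sketch of the extract, verify, re-extract pattern described above. It is not Reducto's implementation: `deep_extract_loop`, `toy_extract`, `toy_verify`, and `max_passes` are hypothetical names invented for illustration, and the toy verifier cheats by comparing against a known answer, where a real one would check the extraction against the source document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]


@dataclass
class LoopResult:
    record: Record
    passes: int
    unresolved: List[str] = field(default_factory=list)


def deep_extract_loop(
    extract: Callable[[Optional[List[str]], Record], Record],
    verify: Callable[[Record], List[str]],
    max_passes: int = 5,
) -> LoopResult:
    """Extract, verify, and re-extract flagged fields until clean or out of passes."""
    record: Record = {}
    problems: Optional[List[str]] = None  # None means "extract everything"
    for n in range(1, max_passes + 1):
        # Merge the new pass over what we already have.
        record = {**record, **extract(problems, record)}
        problems = verify(record)
        if not problems:
            return LoopResult(record, n)
    # Pass budget exhausted: report what is still wrong instead of hiding it.
    return LoopResult(record, max_passes, problems or [])


# Toy stand-ins (hypothetical): the "document" is a dict of true values.
SOURCE = {"invoice_no": "INV-7", "total": 120.0, "currency": "USD"}


def toy_extract(problems, current):
    if problems is None:
        # First pass misses the currency and misreads the total.
        return {"invoice_no": "INV-7", "total": 12.0}
    # Later passes revisit only the fields the verifier flagged.
    return {name: SOURCE[name] for name in problems}


def toy_verify(record):
    # Names of fields that are missing or do not match the source.
    return [k for k, v in SOURCE.items() if record.get(k) != v]


result = deep_extract_loop(toy_extract, toy_verify)
print(result.passes, result.record)
```

The design choice worth noticing is the pass budget: a loop like this needs a stopping rule besides "everything verified", and whatever is still unresolved should be surfaced rather than silently returned as if it were correct.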

The numbers are hard to argue with. Customers went from 10 to 20 percent field accuracy with frontier models alone to 99 to 100 percent with Deep Extract. In production beta, it has already processed over 28 million fields, on documents up to 2,500 pages long. And it outperforms expert human labelers on extraction tasks. That last part is the one that should make you pay attention.

You enable it with a single parameter: deep_extract set to true. It integrates with Reducto's existing Extract API. The tradeoff is time: the agentic loop takes longer than a single pass, but it is still faster than having a human reviewer do the same check. For complex documents like invoice line items, brokerage statements, or equipment manifests, where missing one field means real money lost, the tradeoff is obvious.
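As a rough illustration of what "a single parameter" means in practice, the sketch below builds a JSON request body with the flag switched on. Only the `deep_extract` name comes from the article. The `document_url` and `schema` keys, the `build_extract_request` helper, and the position of the flag in the body are assumptions made for this example, not Reducto's documented request format, so check the official API reference before relying on any of it.

```python
import json


def build_extract_request(document_url: str, schema: dict, deep: bool = False) -> str:
    """Build a JSON body for a hypothetical extract call."""
    body = {
        "document_url": document_url,  # hypothetical key, not from the article
        "schema": schema,              # hypothetical key, not from the article
        "deep_extract": deep,          # the one parameter the article names
    }
    return json.dumps(body, indent=2)


# A made-up schema for an invoice; the field names are illustrative only.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_no": {"type": "string"},
        "total": {"type": "number"},
    },
}

payload = build_extract_request(
    "https://example.com/invoice.pdf", invoice_schema, deep=True
)
print(payload)
```

The point is that the rest of the request stays the same as a single-pass extraction; the flag is the only thing that changes.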

Reducto is YC-backed, with investment from a16z. Deep Extract also supports custom verification criteria, so you can tell it things like "make sure all line items sum to the stated total." Every extracted field comes with citations and bounding boxes for audit trails.
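The article does not say how those criteria are expressed to Deep Extract, so the following only shows what the "line items sum to the stated total" rule amounts to as a plain check. The record shape (`line_items`, `amount`, `total`), the function name, and the one-cent tolerance are all assumptions made for this sketch.

```python
from decimal import Decimal
from typing import List


def line_items_match_total(
    record: dict, tolerance: Decimal = Decimal("0.01")
) -> List[str]:
    """Return a list of problems; an empty list means the rule passed."""
    total = Decimal(str(record["total"]))
    # Decimal avoids the float rounding noise that makes money checks flaky.
    computed = sum(
        (Decimal(str(item["amount"])) for item in record.get("line_items", [])),
        Decimal("0"),
    )
    if abs(computed - total) > tolerance:
        return [f"line items sum to {computed}, stated total is {total}"]
    return []


good = {
    "line_items": [{"amount": "40.00"}, {"amount": "60.50"}],
    "total": "100.50",
}
bad = {
    "line_items": [{"amount": "40.00"}, {"amount": "60.50"}],
    "total": "110.50",
}

print(line_items_match_total(good))
print(line_items_match_total(bad))
```

A rule like this returns a description of the failure rather than a bare boolean, which gives a re-extraction pass something specific to go looking for.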

https://reducto.ai/blog/reducto-deep-extract-agent

The pattern here is bigger than document extraction. Agent-in-the-loop is replacing human-in-the-loop everywhere the verification step is well-defined. If you can write the rules for what correct looks like, an agent can check it. Deep Extract is one of the cleanest implementations of this idea so far.