a pdf goes in. typed json comes out.
hand us a pdf
web upload for people; one mcp call for agents.
the managed pipeline parses it
layout, tables across page breaks, six languages — you never see the pipeline.
typed json comes back
a download for you; a presigned url for your agent, so its context stays clean.
one portal for people.
one mcp server for agents.
for people
a folder portal. upload pdfs, watch them parse, download json. rename, move, delete — a file manager, not a developer tool. some customers never write a line of code.
for agents
two mcp tools: claritize does the work,
claritize_check watches it. async by design —
submit a long document, keep working, harvest the result when it lands.
results arrive as a presigned url, never inline.
the json waits in storage; your agent reads it from there, outside its context window, where it costs no tokens.
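a minimal sketch of the submit, keep working, harvest loop described above. only the tool names claritize and claritize_check come from this page; the in-memory server, the argument names, and the response fields (job_id, status, result_url) are assumptions made for illustration, not the real tool schema.

```python
class FakeClaritizeServer:
    """In-memory stand-in for the MCP server. All field names are hypothetical."""

    def __init__(self, polls_until_done=3):
        self.polls_until_done = polls_until_done
        self.jobs = {}

    def claritize(self, pdf_path):
        # Submit returns right away with a job id; parsing happens elsewhere.
        job_id = f"job-{len(self.jobs) + 1}"
        self.jobs[job_id] = self.polls_until_done
        return {"job_id": job_id, "status": "processing"}

    def claritize_check(self, job_id):
        # Report "processing" until the simulated job finishes.
        if self.jobs[job_id] > 0:
            self.jobs[job_id] -= 1
            return {"job_id": job_id, "status": "processing"}
        # The result is a URL pointing at the JSON, never the JSON itself.
        url = f"https://storage.example/{job_id}.json?signature=placeholder"
        return {"job_id": job_id, "status": "done", "result_url": url}


def submit_and_harvest(server, pdf_path, do_other_work, max_polls=10):
    """Submit a document, keep working between checks, return the result URL."""
    job = server.claritize(pdf_path)
    for _ in range(max_polls):
        status = server.claritize_check(job["job_id"])
        if status["status"] == "done":
            return status["result_url"]
        do_other_work()  # the agent stays productive while the job runs
    raise TimeoutError(f"{job['job_id']} not finished after {max_polls} checks")


work_log = []
result_url = submit_and_harvest(
    FakeClaritizeServer(), "contract.pdf", lambda: work_log.append("other task")
)
print(result_url)
```

the agent only ever holds a short url string; it decides separately whether to fetch the json, and how much of it to read.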
receipts. tables. six languages.
receipts
merchant, date, line items, tax broken out by jurisdiction, tip, total, last 4 of the card. restaurant, retail, gas, online — every flavor.
receipt/v1 · out of the box
tables
multi-page tables stitched across page breaks. header rows become field names; merged cells flatten intelligently. tables come back as arrays of objects, ready for whatever's downstream.
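to make "header rows become field names" and "arrays of objects" concrete, here is a small sketch. it is not claritize's pipeline: it assumes each page fragment is already a list of rows of cell strings, that the first row of the first fragment is the header, and that a continuation page may repeat that header. the function name rows_to_objects is hypothetical.

```python
def rows_to_objects(page_fragments):
    """Stitch per-page table fragments into one list of dicts keyed by header."""
    if not page_fragments or not page_fragments[0]:
        return []
    header = page_fragments[0][0]
    records = []
    for index, fragment in enumerate(page_fragments):
        # The first fragment carries the header row; skip it.
        rows = fragment[1:] if index == 0 else fragment
        # A continuation page may repeat the header; drop the repeat.
        if index > 0 and rows and rows[0] == header:
            rows = rows[1:]
        records.extend(dict(zip(header, row)) for row in rows)
    return records


pages = [
    [["item", "qty"], ["coffee", "2"]],  # page 1: header plus one row
    [["item", "qty"], ["bagel", "1"]],   # page 2: header repeated
    [["tea", "3"]],                      # page 3: continuation, no header
]
print(rows_to_objects(pages))
# [{'item': 'coffee', 'qty': '2'}, {'item': 'bagel', 'qty': '1'}, {'item': 'tea', 'qty': '3'}]
```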
where generic ocr visibly fails
multi-language
english, spanish, french, german, italian, and portuguese — native, with mixed-language documents handled per page. need another? more on request. we extract in the source language — translation stays yours.
en · es · fr · de · it · pt
and by extension: forms, thousand-page documents, bad scans. same pipeline, no special code path.
zero data retention.
claritize never trains on, sells, shares, or analyzes your document content. we process documents to return structured data — and that is the end of our relationship with the content.
coming soon.
per-page pricing, no surprise bills.