Document Parsing API

Document parsing API for loan origination.

A single endpoint for any lending document. Bank statements, tax returns, IDs, audited financials. Typed, scanned, photographed, handwritten.

Kita is a document parsing API built for lenders. Send the document. Get structured JSON, fraud signals, and citations. Available as a standalone service or as part of the full Kita stack.

Definition

What is a document parsing API?

A document parsing API turns documents into structured data over a REST endpoint. Modern document parsing APIs use vision and language models to handle any format, including photos, scans, and handwriting, without per-form template configuration. The output is JSON or a structured export.

How it works

01

One endpoint, many document types

POST the file to /api/v1/documents with a document_type parameter. Get back structured JSON with extracted data, metadata, fraud signals, and confidence scores. 50+ supported types.
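As a rough illustration of the request described above, the sketch below builds (but does not send) an upload using only the Python standard library. The host, the multipart/form-data encoding, the "file" field name, the Bearer auth header, and the "bank_statement" type value are all assumptions made for the example; only the path and the document_type parameter come from this page. The developer documentation is the authority on the real request shape.

```python
import uuid
import urllib.request

# Placeholder host (assumption); the path is the one quoted in this page's FAQ.
API_URL = "https://api.example.invalid/api/v1/documents"


def build_parse_request(file_bytes: bytes, filename: str, document_type: str,
                        api_key: str, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for one document."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    parts = [
        delimiter,
        b'Content-Disposition: form-data; name="document_type"',
        b"",
        document_type.encode(),
        delimiter,
        # "file" as the field name is an assumption, not a documented name.
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        delimiter + b"--",
        b"",
    ]
    body = b"\r\n".join(parts)
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To send it, pass the returned object to urllib.request.urlopen and decode the JSON body of the response.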

02

Vision-language, not template OCR

Reads layout, tables, handwriting, signatures, and stamps. Handles photos, scans, and screenshots without preprocessing. Works on documents the model has never seen.

03

Output ready for your pipeline

JSON for systems, CSV and Excel for spreadsheets, webhooks for async processing. Field-level confidence so you can route low-confidence outputs to human review.
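One way to act on field-level confidence is to split a parsed document into fields that can flow straight through and fields that need a human look. The sketch below assumes each field arrives as an object with "value" and "confidence" keys and uses a made-up 0.90 threshold; both the response shape and the sample values are hypothetical, not taken from the Kita schema.

```python
from typing import Any

REVIEW_THRESHOLD = 0.90  # hypothetical cut-off; tune per field and risk appetite


def split_by_confidence(fields: dict[str, dict[str, Any]],
                        threshold: float = REVIEW_THRESHOLD):
    """Return (accepted, needs_review) as field-name -> value mappings."""
    accepted: dict[str, Any] = {}
    needs_review: dict[str, Any] = {}
    for name, field in fields.items():
        # A missing confidence is treated as 0.0, so the field is always reviewed.
        confident = field.get("confidence", 0.0) >= threshold
        target = accepted if confident else needs_review
        target[name] = field.get("value")
    return accepted, needs_review


# Hypothetical response fragment, for illustration only.
sample_fields = {
    "account_holder": {"value": "J. Santos", "confidence": 0.99},
    "closing_balance": {"value": "15230.50", "confidence": 0.72},
}
accepted, needs_review = split_by_confidence(sample_fields)
```

Here the account holder would be accepted and the closing balance queued for review. A single global threshold is the simplest choice; amounts that drive a credit decision may deserve a stricter one than descriptive fields.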

Comparison

Document parsing API vs. legacy OCR.

Aspect | Legacy OCR API | Kita document parsing API
Setup per document type | Weeks of template configuration | Day-one support for 50+ types
Handwriting and photos | Fails outside clean scans | Handles photo, scan, screenshot, handwritten
Output | Raw fields you have to interpret | Credit signals plus raw data
Fraud signals | Not included | Tampering, mismatches, forgery patterns
Latency | Variable; depends on form complexity | Sub-30s typical, async webhooks for batch
Maintenance | Templates break when forms change | Generalizes across format variations

Who it's for

Built for the three lender scenarios we serve.

Loan origination systems

Plug parsing into your LOS.

Send borrower-uploaded documents to the parsing API from your LOS workflow. Get structured data back to populate the file. Cuts hours of manual entry.

Underwriting platforms

Feed your decision engine.

Parse documents on intake and feed the structured output into your scoring model or credit-decisioning system. Reduces the document-handling overhead.

Compliance and audit

Document verification at scale.

Verify identity, income, and business registration documents at the volume your origination pipeline requires. Built-in fraud signals reduce manual review queues.

The product

Kita Capture API

The document parsing API is delivered through Kita Capture. REST endpoints, Python and JavaScript SDKs, webhooks, and a portal for low-volume use. Full schemas and examples in the developer documentation.


FAQ

Common questions

How do I integrate the document parsing API?

POST a document to /api/v1/documents with the document_type parameter. The API returns either a synchronous response for small documents or a job ID for async processing, in which case a webhook delivers the result when it is ready. SDKs are available for Python and JavaScript.
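Because a submission can come back either finished or as a job ID, client code needs to handle both paths. The sketch below tracks pending jobs and matches webhook deliveries to them. The "job_id" and "result" keys are hypothetical names chosen for the example; the real response and webhook payloads are defined in the developer documentation.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ParseOutcome:
    job_id: Optional[str]             # present when work was queued
    result: Optional[dict[str, Any]]  # present when extracted data is available

    @property
    def pending(self) -> bool:
        return self.result is None


def interpret_response(raw_body: bytes) -> ParseOutcome:
    """Decode a response or webhook body using the assumed key names."""
    payload = json.loads(raw_body)
    return ParseOutcome(job_id=payload.get("job_id"), result=payload.get("result"))


class JobTracker:
    """Matches webhook deliveries to jobs that were returned as pending."""

    def __init__(self) -> None:
        self.pending: set[str] = set()
        self.completed: dict[str, dict[str, Any]] = {}

    def on_submit(self, raw_body: bytes) -> Optional[dict[str, Any]]:
        """Return the result for a synchronous response, else remember the job."""
        outcome = interpret_response(raw_body)
        if not outcome.pending:
            return outcome.result
        if outcome.job_id is None:
            raise ValueError("response carried neither a result nor a job ID")
        self.pending.add(outcome.job_id)
        return None

    def on_webhook(self, raw_body: bytes) -> bool:
        """Store a finished job; ignore deliveries for unknown or unfinished jobs."""
        outcome = interpret_response(raw_body)
        if outcome.pending or outcome.job_id not in self.pending:
            return False
        self.pending.discard(outcome.job_id)
        self.completed[outcome.job_id] = outcome.result
        return True
```

Ignoring webhook calls for job IDs that were never submitted is a deliberate choice here: it keeps stray or replayed deliveries from polluting the completed set. A production receiver would also verify the webhook's authenticity, which this sketch leaves out.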

What document types are supported?

Bank statements, tax filings (BIR, SAT, IRS), payslips, audited financial statements, government IDs, business registrations, mobile money exports, e-wallet records, utility bills, invoices, loan agreements, vehicle and property titles. 50+ supported types with custom extraction for documents outside the list.

What is the output format?

JSON by default with a structured schema per document type. CSV and Excel exports are available for spreadsheet workflows. The full schema for each document type is published in the documentation.

How fast is the document parsing API?

Most documents parse in under 30 seconds. Batch and async modes are available for high-volume use. Latency budgets and SLAs are configurable for production deployments.

Can the document parsing API detect fraud?

Yes. The API flags tampering, metadata inconsistencies, forgery patterns, and cross-document mismatches. Fraud signals are returned alongside the extracted data and are calibrated per market, because fraud patterns differ by country.
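Since fraud signals arrive with the extracted data, a consumer can fan them out to different review queues. The sketch below assumes each signal is an object with a "type" key whose values mirror the four categories named above; the type strings, the queue names, and the catch-all rule are all hypothetical and would need to be aligned with the published schema.

```python
from collections import defaultdict
from typing import Any

# Hypothetical mapping from signal type to the team that reviews it.
QUEUE_BY_SIGNAL = {
    "tampering": "fraud_ops",
    "forgery_pattern": "fraud_ops",
    "metadata_inconsistency": "document_review",
    "cross_document_mismatch": "underwriting",
}
DEFAULT_QUEUE = "document_review"  # catch-all for unrecognized signal types


def queue_fraud_signals(signals: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group fraud signals by the review queue that should handle them."""
    queues: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for signal in signals:
        queue = QUEUE_BY_SIGNAL.get(signal.get("type"), DEFAULT_QUEUE)
        queues[queue].append(signal)
    return dict(queues)
```

Sending unknown types to a default queue, instead of dropping them, means a newly introduced signal category still reaches a reviewer.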

How accurate is the document parsing?

Around 98 percent end-to-end accuracy on supported document types. Per-field confidence is exposed in the API response so downstream systems can route low-confidence fields to human review.

Is the document parsing API secure?

Kita is in active engagement for ISO 27001 and SOC 2 Type II audits. Engagement letters available on request. Data residency options for the EU and other jurisdictions. Encryption in transit and at rest.

Parse any lending document. One API.

Get an API key and start parsing in under an hour.