Kita API
Upload, poll, deliver. The Kita API turns scanned bank statements, payslips, IDs, and tax filings into structured, validated, fraud-checked data — across PH, MX, and ID schemas. Upload a file, pass a URL, or send base64 — get back clean JSON. Supports webhooks for async processing.
https://api.usekita.com
Auth: Bearer token
Format: JSON / multipart
MCP Server (Claude & Codex)
Plug the full Kita docs reference — every endpoint, every response envelope, every document-type schema — straight into Claude Code, Claude Desktop, or the Codex CLI via the official MCP server (kita-docs-mcp on npm). One command and your agent can search, fetch schemas, and cite the same content that renders on this page, with no browser round-trips.
Claude Code
claude mcp add kita-docs -- npx -y kita-docs-mcp
Tools exposed: search_docs(query), get_schema(document_type), and list_document_types().
Resources: every section available as docs://schemas/<type> (response envelopes) and docs://guides/<slug> (general docs).
Restart your CLI and run /mcp (Claude) or codex mcp ls (Codex) to confirm kita-docs is connected.
Quick Start
1. File Upload (multipart)
Upload a local file directly. The Python SDK handles polling automatically.
Terminal
pip install kita
Python
from kita import KitaClient
client = KitaClient(api_key="kita_prod_...")
# Process a document
result = client.process("statement.pdf", "bank_statement")
result.save_json("output.json")
2. File URL (S3 presigned URL / any public URL)
Pass a file_url and Kita fetches the file server-side. Works with S3 presigned URLs, GCS signed URLs, or any publicly accessible link.
Python
from kita import KitaClient
client = KitaClient(api_key="kita_prod_YOUR_API_KEY")
result = client.process_url(
    "https://your-bucket.s3.amazonaws.com/docs/invoice.pdf?X-Amz-...",
    document_type="sales_invoice"
)
print(result.status, result.extracted_data)
3. File URL + Webhook (fire-and-forget)
Add a webhook_url and the request returns immediately. Kita POSTs to your URL when processing completes or fails — no polling needed.
Python
from kita import KitaClient
client = KitaClient(api_key="kita_prod_YOUR_API_KEY")
# Fire-and-forget: returns immediately with document_id
result = client.process_url(
    "https://your-bucket.s3.amazonaws.com/docs/invoice.pdf?X-Amz-...",
    document_type="bank_statement",
    webhook_url="https://your-server.com/webhooks/kita"
)
# Kita POSTs to your webhook_url when processing completes
4. Poll for Results
If you're not using webhooks, poll the job endpoint until status is "completed" or "failed".
Python
# Poll for results (SDK handles this automatically with client.process)
result = client.get_result(document_id)
print(result.status) # "completed" | "processing" | "failed"
print(result.extracted_data)
Supported file formats: .pdf, .png, .jpg, .jpeg, .tiff, .bmp
Common document types: bank_statement, sales_invoice, credit_report, payslip, bill, general, auto_classify, and more — see Document Types.
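For callers that talk to the HTTP API directly rather than through the SDK, the polling in step 4 could be sketched as below. This is a minimal illustration, not the SDK's implementation: `poll_until_done` and its `fetch` callback are hypothetical names, and the 2-second interval and 600-second timeout simply mirror the defaults shown for `client.process`.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which polling should stop (taken from the Quick Start text).
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until its "status" is terminal or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        sleep(interval)


# Usage with a stand-in fetch function (no network involved).
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "extracted_data": {}},
])
final = poll_until_done(lambda: next(_responses), interval=0.0)
print(final["status"])
```

In real use, `fetch` would wrap a GET to /api/v1/documents/jobs/{document_id} and return the parsed JSON body.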
Authentication
Every request requires an API key passed in the Authorization header. Get your key from the Kita dashboard.
HTTP Header
Authorization: Bearer kita_prod_your_key_here
Keep your API key secret: never expose it in client-side code or public repositories. Use the KITA_API_KEY environment variable on your server.
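One way to follow this advice when calling the HTTP API without the SDK is to build the header from the environment and fail fast when the key is missing. The helper name `auth_headers` is hypothetical; only the header format and the KITA_API_KEY variable come from this page.

```python
import os
from typing import Dict, Mapping


def auth_headers(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the Authorization header from KITA_API_KEY, failing fast if unset."""
    api_key = environ.get("KITA_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("KITA_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}


# Demonstration with an explicit mapping instead of the real environment.
headers = auth_headers({"KITA_API_KEY": "kita_prod_your_key_here"})
print(headers["Authorization"])
```

Passing the environment as a parameter keeps the key out of source code and makes the helper easy to test.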
Postman
Set up a Postman environment to test any endpoint interactively.
1. Create an environment named Kita API with variables: baseUrl, apiKey, documentId.
2. Set Auth type to Bearer Token, value {{apiKey}}.
3. POST {{baseUrl}}/api/v1/documents: form-data with file (File) + document_type (Text), or JSON body with file_url.
4. GET {{baseUrl}}/api/v1/documents/jobs/{{documentId}}: poll until status: "completed".
Environment Variables
{
  "baseUrl": "https://api.usekita.com",
  "apiKey": "kita_prod_your_key_here",
  "documentId": "12345"
}
Document Types
Pass the document_type field with any upload. Values are case-insensitive.
| Type | What it extracts |
|---|---|
| Legacy pipeline (full structured extraction) | |
| bank_statement | Transactions, balances, flow metrics, category breakdowns, per-row tamper detection, fraud signals |
| audited_financial_statement | AFS: balance sheet, income statement, cash flow, notes/schedules, ratios, completeness scoring, risk flags |
| general_information_sheet | PH SEC GIS: company info, stockholders, directors, officers, beneficial owners, compliance |
| Universal pipeline: Financial | |
| bank_certificate | Bank certification: account holder, balances, certification date |
| credit_card_statement | Transactions, balances, payment info |
| loan_statement | Loan details, payment schedule, balances |
| mobile_banking_screenshot | App-screenshot statements: account snapshot, recent transactions |
| passbook | Bank passbook: extraction + fraud only (no credit assessment) |
| payslip | Earnings, deductions, net pay, employer info, statutory IDs, fraud score |
| business_financials | Informal / management financials: line items, totals |
| bill | Provider, amounts, due dates, verification signals |
| receipt | Receipt (POS): items, totals, payment method |
| remittance_slip | Remittance details, amounts |
| sales_invoice | Seller/buyer, line items, totals, VAT, invoice signals |
| Universal pipeline: Identity & Employment | |
| government_id | Government-issued ID: name, ID number, dates, photo (passport, INE, KTP, driver's license, etc.) |
| certificate_of_employment | Employer, employee, position, dates, salary |
| barangay_clearance | PH Barangay Clearance: residency, purpose |
| Universal pipeline: Credit & Regional Legal | |
| credit_report | Accounts, KYC, bureau score, payment history, credit metrics; extraction + fraud only (no credit assessment) |
| slik | Indonesian SLIK (OJK) credit report: debtor profile, facilities, collateral, collectibility |
| acta_constitutiva | MX Articles of Incorporation: shareholders, capital, legal reps, notarial info |
| mx_legal | Other Mexican legal documents (poderes, actas de asamblea, constancia situación fiscal, etc.) |
| indo_legal | Indonesian legal documents (akta pendirian/perubahan, SK Kemenkumham, NIB, etc.) |
| Universal pipeline: Generic & Combined | |
| combined_document | Multi-document bundle, auto-segmented per page |
| other_document | Generic catch-all: { document_data, labeled_fields, signatories } envelope. Used as the fallback for any document_type slug without a dedicated vocabulary. |
| Experimental / fallback slugs | |
| Accepted at upload time but route through other_document; output is the generic envelope, not a type-specific schema. | |
| bir_2303 | Generic fallback (no dedicated vocabulary yet). |
| bir_2307 | Generic fallback. |
| tin_id | Generic fallback. |
| secretarys_certificate | Generic fallback. |
| business_registration_dti | Generic fallback. |
| business_registration_sec | Generic fallback. |
| certificate_of_incorporation | Generic fallback. |
| business_permit | Generic fallback. |
| mayors_permit | Generic fallback. |
| purchase_order | Generic fallback. |
| bill_of_lading | Generic fallback. |
| proof_of_billing | Generic fallback. |
| land_title | Generic fallback. |
| vehicle_registration | Generic fallback. |
| insurance_policy | Generic fallback. |
| loan_agreement | Generic fallback. |
| income_tax_return | Generic fallback. |
Supported file formats: .pdf, .png, .jpg, .jpeg, .tiff, .bmp
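Because fallback slugs return the generic envelope rather than a type-specific schema, a client may want to know which output shape to expect before uploading. The sketch below is a hypothetical helper (`expected_output` is not part of the SDK) with sets copied from the table above. The table may not be exhaustive: the Quick Start also mentions general and auto_classify, which are reported here as "not listed".

```python
# Slugs with a dedicated vocabulary, copied from the Document Types table.
DEDICATED = {
    "bank_statement", "audited_financial_statement", "general_information_sheet",
    "bank_certificate", "credit_card_statement", "loan_statement",
    "mobile_banking_screenshot", "passbook", "payslip", "business_financials",
    "bill", "receipt", "remittance_slip", "sales_invoice", "government_id",
    "certificate_of_employment", "barangay_clearance", "credit_report", "slik",
    "acta_constitutiva", "mx_legal", "indo_legal", "combined_document",
}
# Slugs that route through other_document (plus other_document itself).
GENERIC = {
    "other_document", "bir_2303", "bir_2307", "tin_id", "secretarys_certificate",
    "business_registration_dti", "business_registration_sec",
    "certificate_of_incorporation", "business_permit", "mayors_permit",
    "purchase_order", "bill_of_lading", "proof_of_billing", "land_title",
    "vehicle_registration", "insurance_policy", "loan_agreement",
    "income_tax_return",
}


def expected_output(document_type: str) -> str:
    """Classify a slug by the output shape the table above describes."""
    slug = document_type.strip().lower()  # values are case-insensitive
    if slug in DEDICATED:
        return "type-specific schema"
    if slug in GENERIC:
        return "generic envelope"
    return "not listed"


print(expected_output("Bank_Statement"))
print(expected_output("bir_2307"))
```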
Endpoints
All endpoints are authenticated via Authorization: Bearer <key>. Responses are JSON unless otherwise noted.
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/documents | Upload a file, base64, or URL for processing (supports webhook_url) |
| GET | /api/v1/documents/jobs/{document_id} | Retrieve full result by document ID |
| GET | /api/documents/{id}/summary | 48 summary metrics (bank statements) |
| GET | /api/v1/documents/{id}/custom-export | Download org-configured Excel export |
| GET | /api/v1/documents/{id}/export | Schema-based Excel export (audited_financial_statement, credit report, SLIK) |
| GET | /api/v1/documents | List documents with optional filters |
| POST | /api/v1/verify | Cross-document verification across 2–50 processed documents |
| POST | /api/v1/batch | Create a batch job or process from URL |
| GET | /api/v1/batch/{batch_id} | Check batch status and progress |
| GET | /api/v1/batch/{batch_id}/results | Get full results for all batch documents |
| POST | /api/v1/documents/merge | Merge multiple files into one document for processing |
| PUT | /api/v1/documents/{id}/transactions | Edit transactions on a processed document |
| POST | /api/v1/documents/{id}/transactions/revert | Restore original extracted transactions |
| POST | /api/v1/documents/{id}/transactions/revalidate | Re-run validation and metrics after edits |
POST /api/v1/documents
Three input methods: provide exactly one of file, file_base64, or file_url.
| Parameter | Required | Description |
|---|---|---|
| file | * | Multipart file upload (PDF, PNG, JPG) |
| file_base64 | * | Base64-encoded file content (data URI prefix accepted) |
| file_url | * | Public or S3 presigned URL to fetch the file from |
| filename | † | Required with file_base64 and file_url |
| document_type | Yes | Document type (see table above) |
| password | No | PDF password if encrypted |
| webhook_url | No | URL to receive a POST when processing completes or fails |
* Provide exactly one of file, file_base64, or file_url. † filename is inferred from multipart upload but required for base64 and URL inputs.
Option A: File upload (multipart)
Python
result = client.process(
    "statement.pdf",
    "bank_statement",
    wait=True,           # Wait for completion (default: True)
    poll_interval=2,     # Seconds between status checks
    timeout=600,         # Max wait time in seconds
    password=None,       # PDF password if encrypted
    show_progress=True   # Show progress spinner
)
GET /api/v1/documents/jobs/{document_id}
Python
result = client.get_result(12345)
print(result.metadata)
result.save_json("output.json")
Transaction Editing
After a document is processed, you can programmatically edit its transactions, restore the originals, or re-run validation and metrics on edited data. This is useful for correcting OCR errors, re-categorizing transactions, or adjusting values before generating reports.
Transaction editing is available for bank_statement and passbook document types. After editing, call /revalidate to re-compute metrics and fraud signals against the updated transactions.
PUT /api/v1/documents/{id}/transactions
Replace the transaction array on a processed document. Pass the full updated list; this is a full replacement, not a patch.
Python
updated = client.edit_transactions(12345, transactions=[
    {
        "date": "01-02-2024",
        "description": "SALARY CREDIT",
        "credit": 30000.00,
        "debit": None,
        "balance": 80000.00,
        "category": "income",
        "subcategory": "salary",
        "transaction_type": "credit",
    },
])
print(updated.status, updated.transactions_count)
POST /api/v1/documents/{id}/transactions/revert
Discard all edits and restore the original extracted transactions. This also resets metrics and validation to their original state.
Python
reverted = client.revert_transactions(12345)
print(reverted.status, reverted.transactions_count)
POST /api/v1/documents/{id}/transactions/revalidate
Re-run validation checks and re-compute all metrics (inflow/outflow, category breakdowns, extended metrics) and fraud signals against the current transaction data. Call this after editing transactions.
Python
result = client.revalidate_transactions(12345)
print(result.status)
print(result.metrics) # re-computed metrics
Typical workflow: PUT /transactions to edit → POST /revalidate to re-compute → GET /api/v1/documents/jobs/{document_id} to fetch updated output. Use POST /revert at any time to undo all edits.
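Since the PUT endpoint replaces the whole transaction array, correcting one row means sending every row back. A minimal sketch of building that payload is shown below. `with_correction` is a hypothetical helper, and the sample rows and the "withdrawal" category are invented for illustration only.

```python
import copy
from typing import Any, Dict, List

Transaction = Dict[str, Any]


def with_correction(transactions: List[Transaction], index: int, **changes: Any) -> List[Transaction]:
    """Return a full copy of the list with one row corrected.

    The copy keeps the originally extracted rows untouched, and the returned
    list always contains every row, as a full replacement requires.
    """
    updated = copy.deepcopy(transactions)
    updated[index].update(changes)
    return updated


# Hypothetical extracted rows; the second one carries a wrong category.
original = [
    {"date": "01-02-2024", "description": "SALARY CREDIT", "credit": 30000.00,
     "debit": None, "balance": 80000.00, "category": "income"},
    {"date": "01-03-2024", "description": "ATM WDL", "credit": None,
     "debit": 5000.00, "balance": 75000.00, "category": "income"},
]
payload = with_correction(original, 1, category="withdrawal")
print(len(payload), payload[1]["category"])
```

The resulting `payload` is what would be passed as the `transactions` argument, followed by a revalidate call.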
Python SDK
Install with pip install kita. The SDK wraps all HTTP endpoints, handles polling, and returns typed DocumentResult objects.
Python
# Pass directly
client = KitaClient(api_key="kita_prod_...")
# Or set environment variable
# export KITA_API_KEY=kita_prod_...
client = KitaClient()
Serialize results with result.to_dict(), result.to_json(), or result.save_json(path).
Response Schemas
Philippines (PH) Document Schemas
Region-specific document types for the Philippine market. Uses the same POST /api/v1/documents endpoint — just pass the PH document_type.
Mexico (MX) Document Schemas
Region-specific document types for the Mexican market. Uses the same POST /api/v1/documents endpoint — just pass the Mexican document_type.
Indonesia (ID) Document Schemas
Region-specific document types for the Indonesian market. Uses the same POST /api/v1/documents endpoint — just pass the Indonesian document_type.
Download Export
Download Excel exports of processed documents. Two export modes are available: org-configured exports (all document types) and schema-based exports (audited_financial_statement, credit report, SLIK, and other schema-based types).
| Endpoint | Use case |
|---|---|
| /api/v1/documents/{id}/custom-export | Org-configured Excel format (requires a custom export schema on your organization) |
| /api/v1/documents/{id}/export | Schema-based multi-sheet Excel (audited_financial_statement, credit report, SLIK, etc.) |
GET /api/v1/documents/{id}/export
Download a multi-sheet Excel workbook using the document's type-specific schema. For audited_financial_statement documents, this generates a 17-sheet workbook covering company info, balance sheet, income statement, cash flow, notes, ratios, risk flags, data validation, prior year comparison, shareholders, equity, tax, and more.
Python
# Org-configured export
client.custom_export(result.document_id, "report.xlsx")
# Credit report multi-sheet export
client.custom_export(result.document_id, "credit.xlsx", export_type="credit_report")
# Schema-based export (audited_financial_statement 17-sheet workbook, credit report, SLIK, etc.)
client.custom_export(result.document_id, "financial_report.xlsx", export_type="schema")
If the document type has no export schema, the /export endpoint returns 400 with error code INVALID_EXPORT_TYPE. If the document hasn't finished processing, it returns DOCUMENT_NOT_PROCESSED.
Webhooks
Instead of polling GET /api/v1/documents/jobs/{document_id}, pass a webhook_url when uploading. Kita will POST to that URL when processing completes or fails.
webhook_url is accepted on all three input methods (file, file_base64, file_url). The upload returns immediately with document_id + status: "processing".
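As an illustration of the file_base64 input combined with a webhook, the sketch below builds a request body from raw bytes. It assumes the endpoint accepts these fields as a JSON body (the page describes the format as "JSON / multipart"). `base64_upload_body` is a hypothetical helper, not an SDK function.

```python
import base64
import json
from typing import Optional


def base64_upload_body(content: bytes, filename: str, document_type: str,
                       webhook_url: Optional[str] = None) -> str:
    """Serialize a file_base64 upload request, optionally with a webhook_url."""
    body = {
        "file_base64": base64.b64encode(content).decode("ascii"),
        "filename": filename,  # required for base64 and URL inputs
        "document_type": document_type,
    }
    if webhook_url:
        body["webhook_url"] = webhook_url
    return json.dumps(body)


# Placeholder bytes stand in for a real PDF.
body = base64_upload_body(
    b"%PDF-1.4 placeholder",
    "statement.pdf",
    "bank_statement",
    webhook_url="https://your-server.com/webhooks/kita",
)
print(sorted(json.loads(body)))
```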
Payload: document.completed
The completed payload includes extracted_data inline so you can process results immediately, plus a result_url to fetch the full result later.
JSON
{
  "event": "document.completed",
  "document_id": 12345,
  "status": "completed",
  "document_type": "bank_statement",
  "file_name": "statement.pdf",
  "processing_time_seconds": 12.5,
  "extracted_data": { "..." : "..." },
  "result_url": "/api/v1/documents/jobs/12345"
}
Payload: document.failed
JSON
{
  "event": "document.failed",
  "document_id": 12345,
  "status": "failed",
  "error": "Password-protected PDF — provide password parameter",
  "document_type": "bank_statement",
  "file_name": "statement.pdf",
  "failed_at": "2025-06-15T10:30:05Z"
}
S3 + webhook (fire-and-forget)
Python
result = client.process_url(
    "https://my-bucket.s3.amazonaws.com/docs/statement.pdf?X-Amz-...",
    "bank_statement",
    webhook_url="https://api.yourapp.com/webhooks/kita"
)
# Returns immediately with document_id + status="processing"
# Kita POSTs to your webhook_url when done
Registered HMAC webhooks
For organization-wide async delivery, register a webhook endpoint once and Kita will sign every delivery with HMAC-SHA256 of the raw request body using your webhook's secret. Verify both the X-Kita-Signature header and the t= timestamp (default 5-minute tolerance) to defend against replays.
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/webhooks | Register a webhook endpoint |
| GET | /api/v1/webhooks | List registered webhooks |
| GET | /api/v1/webhooks/{id} | Get a single webhook |
| PATCH | /api/v1/webhooks/{id} | Update URL, events, or active flag |
| DELETE | /api/v1/webhooks/{id} | Delete a webhook |
| POST | /api/v1/webhooks/{id}/test | Send a test delivery |
| POST | /api/v1/webhooks/{id}/rotate-secret | Rotate the signing secret |
| GET | /api/v1/webhooks/{id}/secret | Retrieve the current signing secret |
| GET | /api/v1/webhooks/{id}/deliveries | Delivery history (status, attempts, latency) |
| GET | /api/v1/webhooks/stats | Delivery aggregates |
| GET | /api/v1/webhooks/dlq | Dead-letter queue (deliveries that failed every retry) |
| POST | /api/v1/webhooks/validate-url | Pre-flight check a candidate URL |
| POST | /api/v1/webhooks/verify-signature | Helper to verify a sample payload |
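Signature verification on the receiving side might look like the sketch below. The exact layout of the X-Kita-Signature header is not specified on this page, so the format `t=<unix seconds>,v1=<hex digest>` is an assumption, as are the `v1` key and the sample secret. Only the HMAC-SHA256 over the raw request body and the 5-minute tolerance come from the text above. Check the real header format before relying on this.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional


def verify_signature(raw_body: bytes, header: str, secret: str,
                     tolerance: int = 300, now: Optional[float] = None) -> bool:
    """Check an assumed "t=<ts>,v1=<hex>" header against HMAC-SHA256 of the body."""
    parts: Dict[str, str] = {}
    for item in header.split(","):
        key, sep, value = item.partition("=")
        if sep:
            parts[key.strip()] = value.strip()
    try:
        timestamp = int(parts["t"])
        received = parts["v1"]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance:
        return False  # too old or too far in the future: possible replay
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())


# Self-contained demonstration with a made-up secret and timestamp.
secret = "example_webhook_secret"
body = b'{"event": "document.completed", "document_id": 12345}'
ts = 1_750_000_000
digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
header = f"t={ts},v1={digest}"
print(verify_signature(body, header, secret, now=ts + 10))
```

Verify against the raw bytes as received, before any JSON parsing, since re-serialized JSON may not match the signed body byte for byte.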
Batch Processing
Process up to 100 documents in a single request. Requires a paid plan. Use POST /api/v1/batch to create a job, then poll GET /api/v1/batch/{batch_id} for status.
HTTP API
Create a batch job with document URLs, then poll for progress and retrieve results.
cURL
curl -X POST https://api.usekita.com/api/v1/batch \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      { "file_url": "https://example.com/stmt1.pdf", "document_type": "bank_statement" },
      { "file_url": "https://example.com/stmt2.pdf", "document_type": "bank_statement" },
      { "file_url": "https://example.com/payslip.pdf", "document_type": "payslip" }
    ]
  }'
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/batch | Create batch — pass documents array with file_url + document_type |
| GET | /api/v1/batch/{batch_id} | Poll status, progress_percent, per-document state |
| GET | /api/v1/batch/{batch_id}/results | Get full extracted data for all documents |
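With a limit of 100 documents per request, larger sets need to be split across several batch jobs. The sketch below chunks a document list and serializes one request body per chunk, using the body shape from the cURL example. `batch_bodies` is a hypothetical helper and the URLs are placeholders.

```python
import json
from typing import Dict, Iterator, List

MAX_BATCH_SIZE = 100  # per-request limit stated in this section


def batch_bodies(documents: List[Dict[str, str]], size: int = MAX_BATCH_SIZE) -> Iterator[str]:
    """Yield one JSON request body per chunk of at most `size` documents."""
    for start in range(0, len(documents), size):
        yield json.dumps({"documents": documents[start:start + size]})


# 250 placeholder documents become three requests.
docs = [
    {"file_url": f"https://example.com/stmt{i}.pdf", "document_type": "bank_statement"}
    for i in range(250)
]
bodies = list(batch_bodies(docs))
print([len(json.loads(b)["documents"]) for b in bodies])  # [100, 100, 50]
```

Each body would be sent as a separate POST /api/v1/batch, and each returned batch_id polled independently.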
Python SDK
Three input modes are supported, and all return the same result structure.
From Folder
batch = client.batch_process(
    "/path/to/statements",
    "bank_statement",
    extensions=['.pdf', '.png', '.jpg'],  # Default file types
    recursive=False,                      # Search subdirectories
    max_workers=5                         # Parallel upload threads
)
results = batch.results()  # {filepath: DocumentResult}
for filepath, result in results.items():
    print(f"{filepath}: {result.status}")
    result.save_json(f"{filepath}_output.json")
From Folder accepts extensions (default ['.pdf','.png','.jpg']), recursive (default False), and max_workers (default 5). From Base64 requires file_base64, filename, and document_type per item.
Merge Documents
Combine multiple images or PDFs into a single document before processing. Useful for multi-page documents scanned as separate images (e.g. a bank statement photographed page by page). Each file in the files list can specify a file_url, file_path (SDK only, auto-converted to base64), or file_base64 — and they can be mixed freely.
HTTP API
POST /api/v1/documents/merge
Send a JSON body with an array of files to merge and process as a single document. The API is deployed on AWS (ALB) and proxied through Vercel rewrites from api.usekita.com.
| Field | Type | Required | Description |
|---|---|---|---|
| files | array | Yes | Array of objects, each with a file_url, file_path, or file_base64 key |
| document_type | string | Yes | Target document type for the merged result (e.g. bank_statement) |
| output_filename | string | No | Custom filename for the merged PDF (default: auto-generated) |
cURL
curl -X POST https://api.usekita.com/api/v1/documents/merge \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "files": [
      { "file_url": "https://example.com/page1.pdf" },
      { "file_url": "https://example.com/page2.pdf" }
    ],
    "document_type": "bank_statement",
    "output_filename": "merged_document.pdf"
  }'
Python SDK
The SDK merge_documents method accepts mixed input types — URLs, local file paths, and base64 strings — in a single call.
Python
result = client.merge_documents(
    files=[
        {"file_url": "https://example.com/cover.pdf"},
        {"file_path": "/path/to/scan.png"},
        {"file_base64": "iVBORw0KGgo...", "filename": "page3.png"},
    ],
    document_type="bank_statement",
)
The merged file is processed as a single document. Poll GET /api/v1/documents/jobs/{document_id} with the returned document_id to retrieve the extracted result.
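Callers using the HTTP endpoint directly need to do the file_path conversion themselves, since local paths are an SDK convenience. The sketch below shows one plausible way to turn path entries into file_base64 entries. `to_http_files` is a hypothetical helper and does not reflect how the SDK is actually implemented. Pairing file_base64 with a filename follows the SDK example above.

```python
import base64
import tempfile
from pathlib import Path
from typing import Dict, List


def to_http_files(files: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Replace local file_path entries with file_base64 + filename entries."""
    converted = []
    for entry in files:
        if "file_path" in entry:
            path = Path(entry["file_path"])
            converted.append({
                "file_base64": base64.b64encode(path.read_bytes()).decode("ascii"),
                "filename": entry.get("filename", path.name),
            })
        else:
            converted.append(dict(entry))  # URL and base64 entries pass through
    return converted


# Demonstration with a temporary file standing in for a scanned page.
with tempfile.TemporaryDirectory() as tmp:
    scan = Path(tmp) / "scan.png"
    scan.write_bytes(b"placeholder image bytes")
    files = to_http_files([
        {"file_url": "https://example.com/cover.pdf"},
        {"file_path": str(scan)},
    ])
print([sorted(f) for f in files])
```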
Cross-Document Verification
Run 34+ cross-document checks against documents that have already been processed. Useful for verifying that a credit application's payslip, bank statement, and ID line up with each other. Returns per-document authenticity summaries, a 0–100 cross-document consistency score, and field-by-field corroboration.
POST /api/v1/verify
| Field | Type | Required | Description |
|---|---|---|---|
| document_ids | number[] | Yes | 2–50 IDs of already-completed documents belonging to your org |
cURL
curl -X POST https://api.usekita.com/api/v1/verify \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "document_ids": [101, 102, 103] }'
POST /api/v1/verify/single returns the full fraud_detection block for a single document, no cross-document analysis.
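A client-side check of the 2–50 limit before calling /api/v1/verify can save a round trip. This is a small sketch with a hypothetical helper name (`verify_body`). How the server itself reports an out-of-range list is not covered here.

```python
import json
from typing import Iterable


def verify_body(document_ids: Iterable[int]) -> str:
    """Serialize a /verify request body, enforcing the documented 2-50 ID range."""
    ids = list(document_ids)
    if not 2 <= len(ids) <= 50:
        raise ValueError(f"expected 2-50 document IDs, got {len(ids)}")
    return json.dumps({"document_ids": ids})


print(verify_body([101, 102, 103]))
```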
Folders & Schemas
Two organizational primitives sit under v1: folders for grouping documents (e.g. by applicant), and schemas for defining custom extraction shapes that override the default vocabulary output.
Folders
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/folders | Create a folder |
| GET | /api/v1/folders | List folders |
| GET | /api/v1/folders/{id}/documents | List documents in a folder |
Custom Extraction Schemas
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/schemas | List custom schemas |
| GET | /api/v1/schemas/{id} | Get a single schema |
| POST | /api/v1/schemas | Create a custom schema |
| PUT | /api/v1/schemas/{id} | Update a schema |
| DELETE | /api/v1/schemas/{id} | Delete a schema |
Regions & Data Residency
Kita runs two production deployments. The region used for a request is determined by your organization's configured residency — not by the API host. API key auth, request shape, and response shape are identical across regions.
| Region | Default for | Storage | Database |
|---|---|---|---|
| ap-southeast-1 | All organizations unless flagged for MX residency | Supabase Storage | Supabase (managed Postgres) |
| mx-central-1 | Mexican-residency customers (regulated) | Amazon S3 (MX bucket) | Amazon RDS |
Documents uploaded by an MX-residency organization stay in mx-central-1 storage and database for their entire lifecycle. Contact support@usekita.com to enable MX residency on your organization.
Cost Reporting
Every processed document records a per-document cost in USD, captured from the underlying model provider's billed usage. Use this to attribute spend per applicant, per batch, or per document type.
| Field | Where | Description |
|---|---|---|
| total_cost_usd | document result | Total billed cost for the document (NUMERIC(10,6)) |
| cost_report | document result | Per-call breakdown (model, prompt/completion tokens, USD per call, stage) |
| total_cost_usd | batch job | Roll-up across all documents in a batch (NUMERIC(10,4)) |
Costs are recorded for both the universal pipeline (Stage 1 vision, optional Stage 2 templated extraction, Stage 3 dedup) and the legacy bank_statement / audited_financial_statement / general_information_sheet extractors. They surface on GET /api/v1/documents/jobs/{document_id} and on batch result endpoints.
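Attributing spend per document type could be done along the lines below. The sketch assumes each result dict exposes `document_type` and `total_cost_usd` at the top level, which is an assumption about the response shape. The sample amounts are invented. `Decimal` is used so that six-decimal costs sum without floating-point drift.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Any, Dict, Iterable


def cost_by_document_type(results: Iterable[Dict[str, Any]]) -> Dict[str, Decimal]:
    """Sum total_cost_usd per document_type using exact decimal arithmetic."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for result in results:
        # str() first so float inputs are not expanded to their binary value.
        totals[result["document_type"]] += Decimal(str(result["total_cost_usd"]))
    return dict(totals)


# Hypothetical results; real values come from the job result endpoint.
sample = [
    {"document_type": "bank_statement", "total_cost_usd": "0.012500"},
    {"document_type": "payslip", "total_cost_usd": "0.003100"},
    {"document_type": "bank_statement", "total_cost_usd": "0.010000"},
]
totals = cost_by_document_type(sample)
print(totals["bank_statement"], totals["payslip"])
```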
Error Handling
HTTP Status Codes
| Status | Meaning | What to do |
|---|---|---|
| 200 | Success | Parse the JSON response |
| 202 | Accepted | Document queued, poll for results |
| 400 | Bad Request | Check your request body / parameters |
| 401 | Unauthorized | Invalid or missing API key |
| 403 | Forbidden | Upgrade required, or document type not enabled for your org |
| 404 | Not Found | Document or batch ID doesn't exist |
| 429 | Rate Limited | Wait and retry. Check Retry-After header. |
| 500 | Server Error | Retry after a brief delay |
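The retry guidance in the table can be condensed into a single decision function, sketched below for direct HTTP callers. `retry_delay` and its backoff parameters are hypothetical choices, not documented behavior. It handles only the delta-seconds form of Retry-After (an HTTP-date value falls through to backoff) and expects the header under its canonical name.

```python
from typing import Mapping, Optional

RETRYABLE = {429, 500}  # the two statuses the table says to retry


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 30.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the request should not be retried."""
    if status not in RETRYABLE:
        return None
    if status == 429:
        retry_after = headers.get("Retry-After", "").strip()
        if retry_after.isdigit():
            return float(retry_after)
    # Exponential backoff, capped, for 500s and 429s without a usable header.
    return min(cap, base * 2 ** attempt)


print(retry_delay(429, {"Retry-After": "7"}, attempt=0))  # 7.0
print(retry_delay(500, {}, attempt=3))                    # 8.0
print(retry_delay(400, {}, attempt=0))                    # None
```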
Error Response Format
JSON
{
  "error": "Bad Request",
  "message": "documents array is required and must not be empty"
}
Python Exceptions
Python
from kita import (
    KitaClient,
    KitaError,
    KitaAPIError,
    KitaAuthenticationError,
    KitaRateLimitError
)
client = KitaClient()  # reads KITA_API_KEY from the environment
try:
    result = client.process("doc.pdf", "bank_statement")
except KitaAuthenticationError:
    print("Invalid API key")
except KitaRateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except KitaAPIError as e:
    print(f"API Error {e.status_code}: {e.message}")
except KitaError as e:
    print(f"SDK Error: {e}")
Questions? Email support@usekita.com
