REST + Python SDK + portal. Same credit balance across all three.
REST API

Kita API

Upload, poll, deliver. The Kita API turns scanned bank statements, payslips, IDs, and tax filings into structured, validated, fraud-checked data — across PH, MX, and ID schemas. Upload a file, pass a URL, or send base64 — get back clean JSON. Supports webhooks for async processing.

Base URL: https://api.usekita.com
Auth: Bearer token
Format: JSON / multipart

MCP Server (Claude & Codex)

Plug the full Kita docs reference — every endpoint, every response envelope, every document-type schema — straight into Claude Code, Claude Desktop, or the Codex CLI via the official MCP server (kita-docs-mcp on npm). One command and your agent can search, fetch schemas, and cite the same content that renders on this page, with no browser round-trips.

Claude Code

claude mcp add kita-docs -- npx -y kita-docs-mcp

Tools exposed: search_docs(query), get_schema(document_type), and list_document_types().

Resources: every section available as docs://schemas/<type> (response envelopes) and docs://guides/<slug> (general docs).

Restart your CLI and run /mcp (Claude) or codex mcp ls (Codex) to confirm kita-docs is connected.

Source: npm · GitHub

Quick Start

1. File Upload (multipart)

Upload a local file directly. The Python SDK handles polling automatically.

Terminal

pip install kita

Python

from kita import KitaClient

client = KitaClient(api_key="kita_prod_...")

# Process a document
result = client.process("statement.pdf", "bank_statement")
result.save_json("output.json")

2. File URL (S3 presigned URL / any public URL)

Pass a file_url and Kita fetches the file server-side. Works with S3 presigned URLs, GCS signed URLs, or any publicly accessible link.

Python

from kita import KitaClient

client = KitaClient(api_key="kita_prod_YOUR_API_KEY")

result = client.process_url(
    "https://your-bucket.s3.amazonaws.com/docs/invoice.pdf?X-Amz-...",
    document_type="sales_invoice"
)
print(result.status, result.extracted_data)

3. File URL + Webhook (fire-and-forget)

Add a webhook_url and the request returns immediately. Kita POSTs to your URL when processing completes or fails — no polling needed.

Python

from kita import KitaClient

client = KitaClient(api_key="kita_prod_YOUR_API_KEY")

# Fire-and-forget: returns immediately with document_id
result = client.process_url(
    "https://your-bucket.s3.amazonaws.com/docs/invoice.pdf?X-Amz-...",
    document_type="bank_statement",
    webhook_url="https://your-server.com/webhooks/kita"
)
# Kita POSTs to your webhook_url when processing completes

4. Poll for Results

If you're not using webhooks, poll the job endpoint until status is "completed" or "failed".

Python

# Poll for results (SDK handles this automatically with client.process)
document_id = 12345  # ID returned by the upload call
result = client.get_result(document_id)
print(result.status)          # "completed" | "processing" | "failed"
print(result.extracted_data)
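
If you call the REST endpoint directly instead of the SDK, the polling loop described above might look like the sketch below. It is a minimal illustration, not the SDK's implementation: `poll_until_done` and `fetch_job` are hypothetical names, and `fetch_job` stands in for whatever function performs the GET on the job endpoint and returns the parsed JSON. The defaults mirror the `poll_interval` and `timeout` values shown for `client.process` elsewhere on this page.

```python
import time
from typing import Any, Callable

TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch_job: Callable[[], dict[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch_job() until its status is terminal or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()  # hypothetical: GET the job and return parsed JSON
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        sleep(poll_interval)
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real waiting.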

Supported file formats: .pdf, .png, .jpg, .jpeg, .tiff, .bmp

Common document types: bank_statement, sales_invoice, credit_report, payslip, bill, general, auto_classify, and more — see Document Types.

Authentication

Every request requires an API key passed in the Authorization header. Get your key from the Kita dashboard.

HTTP Header

Authorization: Bearer kita_prod_your_key_here

Keep your API key secret — never expose it in client-side code or public repositories. Use the KITA_API_KEY environment variable on your server.
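
For direct REST calls, one way to follow this advice is to build the header from the environment at request time. This is a small sketch; `kita_auth_headers` is a hypothetical helper name, not part of the SDK.

```python
import os


def kita_auth_headers(env_var: str = "KITA_API_KEY") -> dict[str, str]:
    """Build the Authorization header from an environment variable."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early rather than sending an unauthenticated request
        raise RuntimeError(f"{env_var} is not set")
    return {"Authorization": f"Bearer {api_key}"}
```

Pass the returned dict as the headers of whichever HTTP client you use, so the key never appears in source code.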

Postman

Set up a Postman environment to test any endpoint interactively.

  1. Create an environment named Kita API with variables: baseUrl, apiKey, documentId.
  2. Set Auth type to Bearer Token, value {{apiKey}}.
  3. POST {{baseUrl}}/api/v1/documents: form-data with file (File) + document_type (Text), or JSON body with file_url.
  4. GET {{baseUrl}}/api/v1/documents/jobs/{{documentId}}: poll until status: "completed".

Environment Variables

{
  "baseUrl": "https://api.usekita.com",
  "apiKey": "kita_prod_your_key_here",
  "documentId": "12345"
}

Document Types

Pass the document_type field with any upload. Values are case-insensitive.

| Type | What it extracts |
| --- | --- |
| Legacy pipeline (full structured extraction) | |
| bank_statement | Transactions, balances, flow metrics, category breakdowns, per-row tamper detection, fraud signals |
| audited_financial_statement | AFS: balance sheet, income statement, cash flow, notes/schedules, ratios, completeness scoring, risk flags |
| general_information_sheet | PH SEC GIS: company info, stockholders, directors, officers, beneficial owners, compliance |
| Universal pipeline: Financial | |
| bank_certificate | Bank certification: account holder, balances, certification date |
| credit_card_statement | Transactions, balances, payment info |
| loan_statement | Loan details, payment schedule, balances |
| mobile_banking_screenshot | App-screenshot statements: account snapshot, recent transactions |
| passbook | Bank passbook: extraction + fraud only (no credit assessment) |
| payslip | Earnings, deductions, net pay, employer info, statutory IDs, fraud score |
| business_financials | Informal / management financials: line items, totals |
| bill | Provider, amounts, due dates, verification signals |
| receipt | Receipt (POS): items, totals, payment method |
| remittance_slip | Remittance details, amounts |
| sales_invoice | Seller/buyer, line items, totals, VAT, invoice signals |
| Universal pipeline: Identity & Employment | |
| government_id | Government-issued ID: name, ID number, dates, photo (passport, INE, KTP, driver's license, etc.) |
| certificate_of_employment | Employer, employee, position, dates, salary |
| barangay_clearance | PH Barangay Clearance: residency, purpose |
| Universal pipeline: Credit & Regional Legal | |
| credit_report | Accounts, KYC, bureau score, payment history, credit metrics; extraction + fraud only (no credit assessment) |
| slik | Indonesian SLIK (OJK) credit report: debtor profile, facilities, collateral, collectibility |
| acta_constitutiva | MX Articles of Incorporation: shareholders, capital, legal reps, notarial info |
| mx_legal | Other Mexican legal documents (poderes, actas de asamblea, constancia situación fiscal, etc.) |
| indo_legal | Indonesian legal documents (akta pendirian/perubahan, SK Kemenkumham, NIB, etc.) |
| Universal pipeline: Generic & Combined | |
| combined_document | Multi-document bundle, auto-segmented per page |
| other_document | Generic catch-all: { document_data, labeled_fields, signatories } envelope. Used as the fallback for any document_type slug without a dedicated vocabulary. |

Experimental / fallback slugs

Accepted at upload time but route through other_document; output is the generic envelope, not a type-specific schema.

| Type | What it extracts |
| --- | --- |
| bir_2303 | Generic fallback (no dedicated vocabulary yet). |
| bir_2307 | Generic fallback. |
| tin_id | Generic fallback. |
| secretarys_certificate | Generic fallback. |
| business_registration_dti | Generic fallback. |
| business_registration_sec | Generic fallback. |
| certificate_of_incorporation | Generic fallback. |
| business_permit | Generic fallback. |
| mayors_permit | Generic fallback. |
| purchase_order | Generic fallback. |
| bill_of_lading | Generic fallback. |
| proof_of_billing | Generic fallback. |
| land_title | Generic fallback. |
| vehicle_registration | Generic fallback. |
| insurance_policy | Generic fallback. |
| loan_agreement | Generic fallback. |
| income_tax_return | Generic fallback. |


Endpoints

All endpoints are authenticated via Authorization: Bearer <key>. Responses are JSON unless otherwise noted.

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/documents | Upload a file, base64, or URL for processing (supports webhook_url) |
| GET | /api/v1/documents/jobs/{document_id} | Retrieve full result by document ID |
| GET | /api/documents/{id}/summary | 48 summary metrics (bank statements) |
| GET | /api/v1/documents/{id}/custom-export | Download org-configured Excel export |
| GET | /api/v1/documents/{id}/export | Schema-based Excel export (audited_financial_statement, credit report, SLIK) |
| GET | /api/v1/documents | List documents with optional filters |
| POST | /api/v1/verify | Cross-document verification across 2–50 processed documents |
| POST | /api/v1/batch | Create a batch job or process from URL |
| GET | /api/v1/batch/{batch_id} | Check batch status and progress |
| GET | /api/v1/batch/{batch_id}/results | Get full results for all batch documents |
| POST | /api/v1/documents/merge | Merge multiple files into one document for processing |
| PUT | /api/v1/documents/{id}/transactions | Edit transactions on a processed document |
| POST | /api/v1/documents/{id}/transactions/revert | Restore original extracted transactions |
| POST | /api/v1/documents/{id}/transactions/revalidate | Re-run validation and metrics after edits |

POST /api/v1/documents

Three input methods — provide exactly one of file, file_base64, or file_url.

| Parameter | Required | Description |
| --- | --- | --- |
| file | * | Multipart file upload (PDF, PNG, JPG) |
| file_base64 | * | Base64-encoded file content (data URI prefix accepted) |
| file_url | * | Public or S3 presigned URL to fetch the file from |
| filename | † | Required with file_base64 and file_url |
| document_type | Yes | Document type (see table above) |
| password | No | PDF password if encrypted |
| webhook_url | No | URL to receive a POST when processing completes or fails |

* Provide exactly one of file, file_base64, or file_url.
† filename is inferred from multipart upload but required for base64 and URL inputs.
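
The "exactly one input" and filename rules above can be checked client-side before a request is sent. The sketch below builds a JSON body for the base64 and URL input methods only (multipart uploads are not covered). `build_document_body` is a hypothetical helper, and sending file_base64 in a JSON body is an assumption based on the "JSON / multipart" format note, not a confirmed request shape.

```python
import base64
from typing import Optional


def build_document_body(
    document_type: str,
    *,
    file_bytes: Optional[bytes] = None,
    file_url: Optional[str] = None,
    filename: Optional[str] = None,
    password: Optional[str] = None,
    webhook_url: Optional[str] = None,
) -> dict[str, str]:
    """Build a JSON body for POST /api/v1/documents (base64 or URL input)."""
    # Exactly one input method may be used per request
    if (file_bytes is None) == (file_url is None):
        raise ValueError("provide exactly one of file_bytes or file_url")
    # filename cannot be inferred for base64 and URL inputs
    if not filename:
        raise ValueError("filename is required for base64 and URL inputs")

    body = {"document_type": document_type, "filename": filename}
    if file_bytes is not None:
        body["file_base64"] = base64.b64encode(file_bytes).decode("ascii")
    else:
        body["file_url"] = file_url
    if password:
        body["password"] = password
    if webhook_url:
        body["webhook_url"] = webhook_url
    return body
```

Optional fields are left out of the body entirely when unset, so the payload only carries what the caller supplied.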

Option A: File upload (multipart)

Python

result = client.process(
    "statement.pdf",
    "bank_statement",
    wait=True,            # Wait for completion (default: True)
    poll_interval=2,      # Seconds between status checks
    timeout=600,          # Max wait time in seconds
    password=None,        # PDF password if encrypted
    show_progress=True    # Show progress spinner
)

GET /api/v1/documents/jobs/{document_id}

Python

result = client.get_result(12345)
print(result.metadata)
result.save_json("output.json")

Transaction Editing

After a document is processed, you can programmatically edit its transactions, restore the originals, or re-run validation and metrics on edited data. This is useful for correcting OCR errors, re-categorizing transactions, or adjusting values before generating reports.

Transaction editing is available for bank_statement and passbook document types. After editing, call /revalidate to re-compute metrics and fraud signals against the updated transactions.

PUT /api/v1/documents/{id}/transactions

Replace the transaction array on a processed document. Pass the full updated list — this is a full replacement, not a patch.

Python

updated = client.edit_transactions(12345, transactions=[
    {
        "date": "01-02-2024",
        "description": "SALARY CREDIT",
        "credit": 30000.00,
        "debit": None,
        "balance": 80000.00,
        "category": "income",
        "subcategory": "salary",
        "transaction_type": "credit",
    },
])
print(updated.status, updated.transactions_count)
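
Because the endpoint replaces the whole array, changing one row still means sending every row. A small sketch of that pattern follows; `with_edited_row` is a hypothetical helper that works on plain dicts shaped like the transaction above and makes no API call itself.

```python
from copy import deepcopy
from typing import Any


def with_edited_row(
    transactions: list[dict[str, Any]], index: int, **changes: Any
) -> list[dict[str, Any]]:
    """Return a full copy of the transaction list with one row updated."""
    updated = deepcopy(transactions)  # leave the caller's list untouched
    updated[index].update(changes)
    return updated
```

The returned list is the complete payload to pass as `transactions`. Sending only the changed row would drop all the others.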

POST /api/v1/documents/{id}/transactions/revert

Discard all edits and restore the original extracted transactions. This also resets metrics and validation to their original state.

Python

reverted = client.revert_transactions(12345)
print(reverted.status, reverted.transactions_count)

POST /api/v1/documents/{id}/transactions/revalidate

Re-run validation checks and re-compute all metrics (inflow/outflow, category breakdowns, extended metrics) and fraud signals against the current transaction data. Call this after editing transactions.

Python

result = client.revalidate_transactions(12345)
print(result.status)
print(result.metrics)  # re-computed metrics

Typical workflow: PUT /transactions to edit → POST /revalidate to re-compute → GET /api/v1/documents/jobs/{document_id} to fetch updated output. Use POST /revert at any time to undo all edits.

Python SDK

Install with pip install kita. The SDK wraps all HTTP endpoints, handles polling, and returns typed DocumentResult objects.

Python

# Pass directly
client = KitaClient(api_key="kita_prod_...")

# Or set environment variable
# export KITA_API_KEY=kita_prod_...
client = KitaClient()

Serialize results with result.to_dict(), result.to_json(), or result.save_json(path).


Response Schemas

Philippines (PH) Document Schemas

Region-specific document types for the Philippine market. Uses the same POST /api/v1/documents endpoint — just pass the PH document_type.

Mexico (MX) Document Schemas

Region-specific document types for the Mexican market. Uses the same POST /api/v1/documents endpoint — just pass the Mexican document_type.

Indonesia (ID) Document Schemas

Region-specific document types for the Indonesian market. Uses the same POST /api/v1/documents endpoint — just pass the Indonesian document_type.


Download Export

Download Excel exports of processed documents. Two export modes are available: org-configured exports (all document types) and schema-based exports (audited_financial_statement, credit report, SLIK, and other schema-based types).

| Endpoint | Use case |
| --- | --- |
| /api/v1/documents/{id}/custom-export | Org-configured Excel format (requires a custom export schema on your organization) |
| /api/v1/documents/{id}/export | Schema-based multi-sheet Excel (audited_financial_statement, credit report, SLIK, etc.) |

GET /api/v1/documents/{id}/export

Download a multi-sheet Excel workbook using the document's type-specific schema. For audited_financial_statement documents, this generates a 17-sheet workbook covering company info, balance sheet, income statement, cash flow, notes, ratios, risk flags, data validation, prior year comparison, shareholders, equity, tax, and more.

Python

# Org-configured export
client.custom_export(result.document_id, "report.xlsx")

# Credit report multi-sheet export
client.custom_export(result.document_id, "credit.xlsx", export_type="credit_report")

# Schema-based export (audited_financial_statement 17-sheet workbook, credit report, SLIK, etc.)
client.custom_export(result.document_id, "financial_report.xlsx", export_type="schema")

If the document type has no export schema, the /export endpoint returns 400 with error code INVALID_EXPORT_TYPE. If the document hasn't finished processing, it returns DOCUMENT_NOT_PROCESSED.

Webhooks

Instead of polling GET /api/v1/documents/jobs/{document_id}, pass a webhook_url when uploading. Kita will POST to that URL when processing completes or fails.

webhook_url is accepted on all three input methods (file, file_base64, file_url). The upload returns immediately with document_id + status: "processing".

Payload: document.completed

The completed payload includes extracted_data inline so you can process results immediately, plus a result_url to fetch the full result later.

JSON

{
  "event": "document.completed",
  "document_id": 12345,
  "status": "completed",
  "document_type": "bank_statement",
  "file_name": "statement.pdf",
  "processing_time_seconds": 12.5,
  "extracted_data": { "..." : "..." },
  "result_url": "/api/v1/documents/jobs/12345"
}

Payload: document.failed

JSON

{
  "event": "document.failed",
  "document_id": 12345,
  "status": "failed",
  "error": "Password-protected PDF — provide password parameter",
  "document_type": "bank_statement",
  "file_name": "statement.pdf",
  "failed_at": "2025-06-15T10:30:05Z"
}
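
A receiving endpoint typically branches on the event field of these two payloads. The sketch below shows one way to do that using only the fields that appear in the samples above. `summarize_kita_event` is a hypothetical name, and the shape of the returned dict is purely illustrative.

```python
import json
from typing import Any


def summarize_kita_event(raw_body: bytes) -> dict[str, Any]:
    """Parse a webhook body and reduce it to the fields a worker needs."""
    payload = json.loads(raw_body)
    event = payload.get("event")
    if event == "document.completed":
        return {
            "document_id": payload["document_id"],
            "ok": True,
            "data": payload.get("extracted_data"),
            "result_url": payload.get("result_url"),
        }
    if event == "document.failed":
        return {
            "document_id": payload["document_id"],
            "ok": False,
            "error": payload.get("error"),
        }
    # Unknown event types are surfaced rather than silently ignored
    raise ValueError(f"unhandled event type: {event!r}")
```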

S3 + webhook (fire-and-forget)

Python

result = client.process_url(
    "https://my-bucket.s3.amazonaws.com/docs/statement.pdf?X-Amz-...",
    "bank_statement",
    webhook_url="https://api.yourapp.com/webhooks/kita"
)
# Returns immediately with document_id + status="processing"
# Kita POSTs to your webhook_url when done

Registered HMAC webhooks

For organization-wide async delivery, register a webhook endpoint once and Kita will sign every delivery with HMAC-SHA256 of the raw request body using your webhook's secret. Verify both the X-Kita-Signature header and the t= timestamp (default 5-minute tolerance) to defend against replays.

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/webhooks | Register a webhook endpoint |
| GET | /api/v1/webhooks | List registered webhooks |
| GET | /api/v1/webhooks/{id} | Get a single webhook |
| PATCH | /api/v1/webhooks/{id} | Update URL, events, or active flag |
| DELETE | /api/v1/webhooks/{id} | Delete a webhook |
| POST | /api/v1/webhooks/{id}/test | Send a test delivery |
| POST | /api/v1/webhooks/{id}/rotate-secret | Rotate the signing secret |
| GET | /api/v1/webhooks/{id}/secret | Retrieve the current signing secret |
| GET | /api/v1/webhooks/{id}/deliveries | Delivery history (status, attempts, latency) |
| GET | /api/v1/webhooks/stats | Delivery aggregates |
| GET | /api/v1/webhooks/dlq | Dead-letter queue (deliveries that failed every retry) |
| POST | /api/v1/webhooks/validate-url | Pre-flight check a candidate URL |
| POST | /api/v1/webhooks/verify-signature | Helper to verify a sample payload |
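
The signature and timestamp check for registered webhooks could be sketched as follows. The exact header layout is not specified on this page, so this example assumes X-Kita-Signature looks like `t=<unix seconds>,v1=<hex digest>`. The `v1` key and the function name `verify_kita_signature` are assumptions. The digest is computed over the raw request body, as stated above; adjust the signed content if your deliveries turn out to include the timestamp in it.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_kita_signature(
    raw_body: bytes,
    signature_header: str,
    secret: str,
    tolerance_seconds: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Check timestamp freshness and the HMAC-SHA256 of the raw body."""
    # ASSUMPTION: header looks like "t=<unix seconds>,v1=<hex digest>"
    parts = dict(
        item.strip().split("=", 1)
        for item in signature_header.split(",")
        if "=" in item
    )
    try:
        timestamp = int(parts["t"])
        received = parts["v1"]
    except (KeyError, ValueError):
        return False

    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance_seconds:
        return False  # outside the tolerance window: possible replay

    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking digest prefixes via timing
    return hmac.compare_digest(expected.encode(), received.encode())
```

Verify against the bytes exactly as received. Re-serializing parsed JSON can change whitespace or key order and break the digest.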

Batch Processing

Process up to 100 documents in a single request. Requires a paid plan. Use POST /api/v1/batch to create a job, then poll GET /api/v1/batch/{batch_id} for status.

HTTP API

Create a batch job with document URLs, then poll for progress and retrieve results.

cURL

curl -X POST https://api.usekita.com/api/v1/batch \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      { "file_url": "https://example.com/stmt1.pdf", "document_type": "bank_statement" },
      { "file_url": "https://example.com/stmt2.pdf", "document_type": "bank_statement" },
      { "file_url": "https://example.com/payslip.pdf", "document_type": "payslip" }
    ]
  }'

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/batch | Create batch: pass documents array with file_url + document_type |
| GET | /api/v1/batch/{batch_id} | Poll status, progress_percent, per-document state |
| GET | /api/v1/batch/{batch_id}/results | Get full extracted data for all documents |
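
Since a batch accepts at most 100 documents, larger sets need to be split across several batch requests. A minimal sketch of that split is below; `batch_bodies` is a hypothetical helper and only builds the request bodies.

```python
from typing import Any, Iterator

MAX_BATCH_SIZE = 100  # per-request limit stated in this section


def batch_bodies(
    documents: list[dict[str, Any]], size: int = MAX_BATCH_SIZE
) -> Iterator[dict[str, Any]]:
    """Yield one POST /api/v1/batch body per chunk of at most `size` documents."""
    if not documents:
        raise ValueError("documents must not be empty")
    for start in range(0, len(documents), size):
        yield {"documents": documents[start:start + size]}
```

Each yielded body creates its own batch job, so each one has a separate batch_id to poll.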

Python SDK

Three input modes are available. All return the same result structure.

From Folder

batch = client.batch_process(
    "/path/to/statements",
    "bank_statement",
    extensions=['.pdf', '.png', '.jpg'],  # Default file types
    recursive=False,                       # Search subdirectories
    max_workers=5                          # Parallel upload threads
)

results = batch.results()  # {filepath: DocumentResult}

for filepath, result in results.items():
    print(f"{filepath}: {result.status}")
    result.save_json(f"{filepath}_output.json")

From Folder accepts extensions (default ['.pdf','.png','.jpg']), recursive (default False), and max_workers (default 5). From Base64 requires file_base64, filename, and document_type per item.

Merge Documents

Combine multiple images or PDFs into a single document before processing. Useful for multi-page documents scanned as separate images (e.g. a bank statement photographed page by page). Each file in the files list can specify a file_url, file_path (SDK only, auto-converted to base64), or file_base64 — and they can be mixed freely.

HTTP API

POST/api/v1/documents/merge

Send a JSON body with an array of files to merge and process as a single document. The API is deployed on AWS (ALB) and proxied through Vercel rewrites from api.usekita.com.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| files | array | Yes | Array of objects, each with a file_url, file_path, or file_base64 key |
| document_type | string | Yes | Target document type for the merged result (e.g. bank_statement) |
| output_filename | string | No | Custom filename for the merged PDF (default: auto-generated) |

cURL

curl -X POST https://api.usekita.com/api/v1/documents/merge \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "files": [
      { "file_url": "https://example.com/page1.pdf" },
      { "file_url": "https://example.com/page2.pdf" }
    ],
    "document_type": "bank_statement",
    "output_filename": "merged_document.pdf"
  }'

Python SDK

The SDK merge_documents method accepts mixed input types — URLs, local file paths, and base64 strings — in a single call.

Python

result = client.merge_documents(
    files=[
        {"file_url": "https://example.com/cover.pdf"},
        {"file_path": "/path/to/scan.png"},
        {"file_base64": "iVBORw0KGgo...", "filename": "page3.png"},
    ],
    document_type="bank_statement",
)

The merged file is processed as a single document. Poll GET /api/v1/documents/jobs/{document_id} with the returned document_id to retrieve the extracted result.
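
When calling the HTTP endpoint without the SDK, local files have to be converted to base64 yourself. The sketch below mirrors the file_path conversion described for the SDK. It is an illustration rather than the SDK's code, and `to_http_merge_files` is a hypothetical name.

```python
import base64
from pathlib import Path
from typing import Any


def to_http_merge_files(files: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Replace local file_path entries with file_base64 + filename."""
    converted = []
    for entry in files:
        if "file_path" in entry:
            path = Path(entry["file_path"])
            converted.append({
                "file_base64": base64.b64encode(path.read_bytes()).decode("ascii"),
                "filename": path.name,
            })
        else:
            # file_url and file_base64 entries pass through unchanged
            converted.append(dict(entry))
    return converted
```

The result can be used as the files array in the JSON body shown in the cURL example, with input order preserved.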

Cross-Document Verification

Run 34+ cross-document checks against documents that have already been processed. Useful for verifying that a credit application's payslip, bank statement, and ID line up with each other. Returns per-document authenticity summaries, a 0–100 cross-document consistency score, and field-by-field corroboration.

POST /api/v1/verify

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| document_ids | number[] | Yes | 2–50 IDs of already-completed documents belonging to your org |

cURL

curl -X POST https://api.usekita.com/api/v1/verify \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "document_ids": [101, 102, 103] }'

POST /api/v1/verify/single returns the full fraud_detection block for a single document, with no cross-document analysis.
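
The 2–50 limit on document_ids can be enforced before the request is sent. In this sketch, `build_verify_body` is a hypothetical helper, and dropping repeated IDs is an assumption about what callers want rather than documented API behavior.

```python
def build_verify_body(document_ids: list[int]) -> dict[str, list[int]]:
    """Validate the 2-50 ID limit and build the POST /api/v1/verify body."""
    unique_ids = list(dict.fromkeys(document_ids))  # drop repeats, keep order
    if not 2 <= len(unique_ids) <= 50:
        raise ValueError(
            f"need 2 to 50 distinct document IDs, got {len(unique_ids)}"
        )
    return {"document_ids": unique_ids}
```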

Folders & Schemas

Two organizational primitives sit under v1: folders for grouping documents (e.g. by applicant), and schemas for defining custom extraction shapes that override the default vocabulary output.

Folders

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/folders | Create a folder |
| GET | /api/v1/folders | List folders |
| GET | /api/v1/folders/{id}/documents | List documents in a folder |

Custom Extraction Schemas

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/v1/schemas | List custom schemas |
| GET | /api/v1/schemas/{id} | Get a single schema |
| POST | /api/v1/schemas | Create a custom schema |
| PUT | /api/v1/schemas/{id} | Update a schema |
| DELETE | /api/v1/schemas/{id} | Delete a schema |

Regions & Data Residency

Kita runs two production deployments. The region used for a request is determined by your organization's configured residency — not by the API host. API key auth, request shape, and response shape are identical across regions.

| Region | Default for | Storage | Database |
| --- | --- | --- | --- |
| ap-southeast-1 | All organizations unless flagged for MX residency | Supabase Storage | Supabase (managed Postgres) |
| mx-central-1 | Mexican-residency customers (regulated) | Amazon S3 (MX bucket) | Amazon RDS |

Documents uploaded by an MX-residency organization stay in mx-central-1 storage and database for their entire lifecycle. Contact support@usekita.com to enable MX residency on your organization.

Cost Reporting

Every processed document records a per-document cost in USD, captured from the underlying model provider's billed usage. Use this to attribute spend per applicant, per batch, or per document type.

| Field | Where | Description |
| --- | --- | --- |
| total_cost_usd | document result | Total billed cost for the document (NUMERIC(10,6)) |
| cost_report | document result | Per-call breakdown (model, prompt/completion tokens, USD per call, stage) |
| total_cost_usd | batch job | Roll-up across all documents in a batch (NUMERIC(10,4)) |

Costs are recorded for both the universal pipeline (Stage 1 vision, optional Stage 2 templated extraction, Stage 3 dedup) and the legacy bank_statement / audited_financial_statement / general_information_sheet extractors. They surface on GET /api/v1/documents/jobs/{document_id} and on batch result endpoints.
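
Attributing spend per document type can be done by summing total_cost_usd over fetched results. The sketch below works on plain dicts and assumes each result exposes document_type alongside total_cost_usd at the top level, which this page does not confirm. `cost_by_document_type` is a hypothetical name. Decimal is used so six-decimal costs add up exactly.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Any, Iterable


def cost_by_document_type(results: Iterable[dict[str, Any]]) -> dict[str, Decimal]:
    """Sum total_cost_usd per document_type using exact decimal arithmetic."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for result in results:
        cost = result.get("total_cost_usd")
        if cost is None:
            continue  # no cost recorded for this document
        # str() first so floats like 0.1 become Decimal("0.1"), not binary noise
        totals[result.get("document_type", "unknown")] += Decimal(str(cost))
    return dict(totals)
```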

Error Handling

HTTP Status Codes

| Status | Meaning | What to do |
| --- | --- | --- |
| 200 | Success | Parse the JSON response |
| 202 | Accepted | Document queued, poll for results |
| 400 | Bad Request | Check your request body / parameters |
| 401 | Unauthorized | Invalid or missing API key |
| 403 | Forbidden | Upgrade required, or document type not enabled for your org |
| 404 | Not Found | Document or batch ID doesn't exist |
| 429 | Rate Limited | Wait and retry. Check Retry-After header. |
| 500 | Server Error | Retry after a brief delay |

Error Response Format

JSON

{
  "error": "Bad Request",
  "message": "documents array is required and must not be empty"
}
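
For direct HTTP callers, the retry guidance for 429 and 500 might be wrapped as in the sketch below. `send_with_retry` and its `send` callback are hypothetical: `send` stands in for whatever function performs the request and returns a (status, headers, body) tuple. The backoff schedule is an illustrative choice, and Retry-After is assumed to carry a number of seconds.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], dict]  # (status, headers, JSON body)


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry on 429 (honoring Retry-After) and on 500 with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Other statuses (including 400/401/403/404) are returned immediately
        if status not in (429, 500) or attempt == max_attempts:
            return status, headers, body
        retry_after = headers.get("Retry-After", "")
        if status == 429 and retry_after.isdigit():
            delay = float(retry_after)  # server-provided wait, in seconds
        else:
            delay = base_delay * 2 ** (attempt - 1)
        sleep(delay)
    raise RuntimeError("unreachable")
```

Client errors are not retried because repeating the same request would produce the same result.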

Python Exceptions

Python

from kita import (
    KitaClient,
    KitaError,
    KitaAPIError,
    KitaAuthenticationError,
    KitaRateLimitError
)

client = KitaClient()  # reads KITA_API_KEY from the environment

try:
    result = client.process("doc.pdf", "bank_statement")
except KitaAuthenticationError:
    print("Invalid API key")
except KitaRateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except KitaAPIError as e:
    print(f"API Error {e.status_code}: {e.message}")
except KitaError as e:
    print(f"SDK Error: {e}")

Questions? Email support@usekita.com