AI Document Service "Kukai"

Digitize paper and PDFs at high speed with high accuracy,
and let AI handle the understanding and organization automatically.

Three reasons people choose Kukai

Easy - Drag & drop and ingestion is done
Fast - AI understands and emits JSON in as little as 30 seconds
Strong - Template-driven, so new forms are supported immediately

Feature highlights

High-precision OCR - Faded characters and handwriting alike, captured cleanly.
AI understanding & organization - Reading, reasoning, summarizing - all handled by AI.
REST API - Sync with your core systems instantly.
Sync & async processing - One page or ten thousand, the speed is the same.
Custom JSON - Just the fields you want, exactly how you want them.

System overview

This system extracts text from PDFs such as invoices and emits it as well-organized JSON. By combining OCR via Google Cloud Vision with OpenAI's Chat API, the contents of a document are structured automatically. The browser interface also provides an editable report screen.

Technology stack

Python 3 / FastAPI - Server application[^1]
Google Cloud Vision API - Extracts text from PDF images[^2]
OpenAI API - Analyzes the extracted text and produces invoice data[^2]
pdf2image / OpenCV / PyMuPDF - Converts PDFs to images and prepares them for OCR[^2]
Jinja2 - Renders reports as HTML templates[^3]
JavaScript - Browser-side highlights, zoom, and other interactions[^4]

System architecture

Data flow

1. PDF upload
1. Accepted at the /upload endpoint. PDFs are saved to a storage/{timestamp} directory[^5].
2. OCR processing
1. process_file rasterizes the PDF (300dpi) and the Vision API extracts text and positional information[^6][^7].
3. AI analysis
1. The OCR result is sent to OpenAI together with a prompt to produce invoice JSON[^8][^9].
2. OCR results come in two forms: a JSON form with coordinate information, and a plain-text form.
3. Schema mapping is delegated to OpenAI.
4. For the plain-text form, downstream code can apply hard-coded post-processing.
4. Save results
1. The generated JSON, OCR result, and page images are saved to the same directory[^10].
5. Report display
1. /report/{id} renders an HTML report. The edit form and OCR image are shown side by side[^11] [^12].
6. API usage
1. /v1/invoices:analyze returns JSON immediately when a PDF is posted[^13].
  An async version /v1/invoices:analyzeAsync is also available[^14].

FAQ

Q. Is it secure?: A. All traffic is encrypted with TLS 1.3. Data can be stored within the Japan region.
Q. Can it read handwritten slips?: A. Yes - we use Google Cloud Vision's high-end model, which is highly accurate even on handwriting.
Q. I want to customize fields to fit our workflow.: A. Edit the JSON schema visually from the admin screen, with changes applied immediately.

Feel free to reach out about adopting Kukai.

AI Document Service: Kukai