Finance & Accounting

On-Device Document Extraction

AI-powered invoice extraction system using a local LLM (Qwen 2.5 7B via Ollama), running entirely on-device.

The challenge

Why it exists

The idea was to test and demo a local LLM extracting and processing sensitive data, as opposed to a cloud LLM. This way PII never leaves the local environment, and organizations can rest assured that their data stays secure.

The approach

How it works

The system runs a local LLM (Qwen 2.5 7B via Ollama) entirely on-device, so no data leaves the machine. It uses pdfplumber for PDF-to-text conversion with column-aware extraction, a single-pass LLM prompt that extracts all invoice fields at once, and a regex post-correction layer that validates and confirms the extracted values. A confidence scoring system flags uncertain fields for human review with an amber indicator, and a Re-enforce mechanism lets users correct extractions and teach the system vendor-specific patterns that persist for future invoices.
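As a rough illustration of the single-pass step, the sketch below builds one prompt that asks for every field at once and sends it to a local Ollama server over its HTTP generate endpoint. The field names, the model tag qwen2.5:7b, and the helper names are assumptions made for the example; the actual prompt and field list used by the system are not described here.

```python
import json
import urllib.request

# Hypothetical field list; the real system's fields are not enumerated in this write-up.
FIELDS = ["vendor_name", "invoice_number", "invoice_date", "total_amount"]

# Ollama's default local endpoint; the model tag is an assumption.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "qwen2.5:7b"


def build_prompt(invoice_text: str) -> str:
    """Build one prompt that requests every field in a single pass."""
    keys = ", ".join(FIELDS)
    return (
        "Extract the following fields from the invoice text below and "
        f"reply with a single JSON object with exactly these keys: {keys}. "
        "Use null for any field that is not present.\n\n"
        f"Invoice text:\n{invoice_text}"
    )


def parse_fields(raw: str) -> dict:
    """Parse the model reply, keeping only expected keys; missing ones become None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {key: data.get(key) for key in FIELDS}


def extract_invoice(invoice_text: str) -> dict:
    """Send the prompt to the local Ollama server and return the parsed fields."""
    payload = json.dumps({
        "model": MODEL,
        "prompt": build_prompt(invoice_text),
        "format": "json",  # ask Ollama to constrain the reply to JSON
        "stream": False,   # return one complete response instead of chunks
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())
    return parse_fields(body["response"])
```

Because the request targets localhost, the invoice text stays on the machine. Calling extract_invoice requires a running Ollama instance with the model already pulled.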

Key capabilities

What it does

Fully on-device extraction: a local LLM (Qwen 2.5 7B via Ollama) processes invoices without any data leaving the machine.

Column-aware PDF-to-text conversion with pdfplumber, single-pass extraction of all invoice fields, and a regex post-correction layer that validates and confirms the extracted values.
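One way a regex post-correction layer could work is sketched below: each LLM-extracted value is compared against what a pattern finds in the source text, and is then marked confirmed, corrected, or unverified. The patterns, status labels, and function names are hypothetical, chosen only to illustrate the idea. They are not the system's actual rules.

```python
import re

# Hypothetical patterns; real invoices would need vendor- and locale-specific rules.
PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)", re.I
    ),
    "total_amount": re.compile(
        r"Total\s*(?:Due)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.I
    ),
}


def _normalize(value) -> str:
    """Ignore thousands separators and surrounding whitespace when comparing."""
    return str(value).replace(",", "").strip()


def post_correct(fields: dict, source_text: str) -> dict:
    """Check each extracted value against the text and record the outcome."""
    result = {}
    for name, value in fields.items():
        pattern = PATTERNS.get(name)
        match = pattern.search(source_text) if pattern else None
        if match is None:
            # No rule, or the rule found nothing: leave the LLM value untouched.
            result[name] = {"value": value, "status": "unverified"}
        elif value is not None and _normalize(value) == _normalize(match.group(1)):
            result[name] = {"value": value, "status": "confirmed"}
        else:
            # The text disagrees with the LLM: prefer the value found in the text.
            result[name] = {"value": match.group(1), "status": "corrected"}
    return result
```

Fields that no pattern covers are passed through as unverified rather than dropped, so a later review step can still see them.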

Confidence scoring that flags uncertain fields in amber for human review, plus a Re-enforce mechanism that lets users correct extractions and teach the system vendor-specific patterns that persist for future invoices.
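The flagging and vendor-learning ideas can be sketched roughly as follows, assuming confidence is a score between 0 and 1 and that learned patterns are kept in a local JSON file keyed by vendor. The threshold value, the file layout, and all function names are assumptions for illustration; how the system actually scores confidence or stores corrections is not specified here.

```python
import json
from pathlib import Path

# Hypothetical cutoff: scores below this are shown in amber for review.
AMBER_THRESHOLD = 0.8


def flag_fields(confidences: dict, threshold: float = AMBER_THRESHOLD) -> dict:
    """Map each field's confidence score to a display flag."""
    return {
        name: ("amber" if score < threshold else "ok")
        for name, score in confidences.items()
    }


def save_correction(store_path: Path, vendor: str, field: str, pattern: str) -> None:
    """Persist a user-taught pattern for one vendor and field."""
    store = json.loads(store_path.read_text()) if store_path.exists() else {}
    store.setdefault(vendor, {})[field] = pattern
    store_path.write_text(json.dumps(store, indent=2))


def load_vendor_patterns(store_path: Path, vendor: str) -> dict:
    """Return the patterns learned so far for a vendor, or an empty dict."""
    if not store_path.exists():
        return {}
    return json.loads(store_path.read_text()).get(vendor, {})
```

Keeping the store as a plain local file means corrections persist across sessions without retraining the model and without leaving the machine.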

Typically used by

AP / Finance team

Business impact

An estimated 15–20 minutes saved per invoice on manual data entry and verification. For a team processing 50 invoices per month, that is roughly 12.5 to 16.7 hours saved monthly, and more at higher volumes. Secondary benefit: vendor-specific learning means accuracy improves over time without retraining. Sensitive financial data stays on-premises.
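The monthly figure follows from multiplying the per-invoice estimate by invoice volume. A small helper (the function name is made up for this example) makes the arithmetic easy to rerun for other volumes; at exactly 50 invoices the range works out to about 12.5 to 16.7 hours.

```python
def monthly_hours_saved(invoices_per_month: int, minutes_per_invoice: float) -> float:
    """Convert a per-invoice time saving into hours saved per month."""
    return invoices_per_month * minutes_per_invoice / 60


low = monthly_hours_saved(50, 15)   # 750 minutes -> 12.5 hours
high = monthly_hours_saved(50, 20)  # 1000 minutes -> about 16.7 hours
print(f"{low:.1f} to {high:.1f} hours per month")
```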

Built with

Technology

Tools & Frameworks

Ollama, Qwen 2.5 7B, pdfplumber, Flask, pdf.js, Python, JavaScript, HTML/CSS

Integrations

Ollama, pdfplumber, pdf.js
