Extract and structure data from documents using computer vision and human validation

Capture information from financial documents with unparalleled accuracy using AI-based document extraction software from Ocrolus. Ocrolus transforms documents of any format into contextualized, structured data to inform lending decisions.

Ocrolus processes every document with over 99% accuracy thanks to our Human-in-the-Loop approach to document capture. Our system intelligently selects the extraction or OCR tool that yields the highest raw accuracy, then layers in proprietary pattern recognition and machine learning models. Data fields that cannot be automatically confirmed are then routed through a unique machine and human quality control workflow.

Step one: Best-in-class document and data extraction with optical character recognition and other advanced parsers

Optical Character Recognition (OCR) has been around for many years and has reached a ceiling in terms of accuracy. Rather than trying to reinvent the wheel, Ocrolus leverages a library of document extraction and OCR tools, automatically selecting the most effective data extraction technology based on the submitted document type.
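As a rough illustration of selecting a tool by document type, the sketch below uses a simple lookup table. The extractor functions, the document type names, and the fallback choice are all hypothetical placeholders, not Ocrolus components.

```python
from typing import Callable, Dict

# Hypothetical extractors; a real tool would return parsed text or fields.
def extract_digital_pdf(data: bytes) -> str:
    return "text-layer parse"

def extract_scanned_image(data: bytes) -> str:
    return "OCR parse"

# Hypothetical mapping from document type to the preferred extractor.
EXTRACTORS: Dict[str, Callable[[bytes], str]] = {
    "digital_bank_statement": extract_digital_pdf,
    "scanned_pay_stub": extract_scanned_image,
}

def select_extractor(doc_type: str) -> Callable[[bytes], str]:
    """Return the extractor registered for doc_type, falling back to OCR."""
    return EXTRACTORS.get(doc_type, extract_scanned_image)

# Pick a tool for a digitally generated statement and run it.
result = select_extractor("digital_bank_statement")(b"%PDF-")
print(result)
```

A lookup table keeps the selection rule in one place, so adding a new document type means registering one more entry rather than changing the calling code.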

Step two: Go beyond document data extraction with machine contextualization and localization

Going beyond document data extraction, Ocrolus technology uses proprietary machine learning and pattern recognition to localize each key element of a financial document and label it with the proper context. Our document and data extraction automation software is fine-tuned for unstructured and semi-structured documents, identifying the data needed to make lending decisions without the need to rely on templates or complex pre-configuration.

Step three: Comprehensive quality control

Whereas many companies offer Business Process Outsourcing (BPO) data cleanup, Ocrolus is a pioneer when it comes to marrying machines and humans for data extraction and document verification. Our IP is carefully designed to trigger human validation steps strictly on an as-needed basis, with built-in algorithmic checks to eliminate the possibility of human error.
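One common way to trigger human validation only when needed is to route fields by a confidence score. The sketch below assumes each extracted field carries a model confidence between 0 and 1; the `Field` type, the `route` function, and the 0.98 threshold are illustrative assumptions, not details of the Ocrolus workflow.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Field:
    name: str
    value: str
    confidence: float  # assumed model confidence in [0, 1]

REVIEW_THRESHOLD = 0.98  # hypothetical cutoff, chosen only for illustration

def route(fields: List[Field],
          threshold: float = REVIEW_THRESHOLD) -> Tuple[List[Field], List[Field]]:
    """Split fields into auto-confirmed and those needing human review."""
    auto = [f for f in fields if f.confidence >= threshold]
    review = [f for f in fields if f.confidence < threshold]
    return auto, review

fields = [
    Field("account_number", "12345678", 0.995),
    Field("ending_balance", "1,204.55", 0.81),
]
auto_confirmed, needs_review = route(fields)
print([f.name for f in needs_review])
```

Only the low-confidence field reaches a reviewer here, which is the general idea behind validating on an as-needed basis.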

Step four: Accurate, structured data output

Ocrolus returns accurate, clean data in a highly structured format, regardless of the original document's source or quality. Whether a statement came from a top-5 bank or a small credit union, the output schema is always identical, allowing for seamless and reliable integration of trusted data.
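To make the idea of an identical output schema concrete, the sketch below maps two differently shaped inputs onto one record type. Both source layouts, the field names, and the `Transaction` type are invented for this example and do not reflect the actual Ocrolus schema.

```python
from dataclasses import dataclass, asdict
from datetime import date
from decimal import Decimal

@dataclass(frozen=True)
class Transaction:
    posted: date
    description: str
    amount: Decimal  # positive for credits, negative for debits

def from_bank_a(row: dict) -> Transaction:
    # Hypothetical layout: ISO date and a single signed amount column.
    return Transaction(date.fromisoformat(row["date"]),
                       row["memo"].strip(),
                       Decimal(row["amount"]))

def from_credit_union_b(row: dict) -> Transaction:
    # Hypothetical layout: MM/DD/YYYY date and separate debit/credit columns.
    month, day, year = (int(part) for part in row["Posted"].split("/"))
    amount = Decimal(row["Credit"] or "0") - Decimal(row["Debit"] or "0")
    return Transaction(date(year, month, day), row["Details"].strip(), amount)

a = from_bank_a({"date": "2024-03-01", "memo": " PAYROLL ", "amount": "2500.00"})
b = from_credit_union_b({"Posted": "03/02/2024", "Details": "RENT",
                         "Debit": "1200.00", "Credit": ""})

# Both records expose exactly the same keys, whatever the source looked like.
print(asdict(a).keys() == asdict(b).keys())
```

Downstream code can then depend on one record shape instead of handling each institution's layout separately.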

Ready to go?

Schedule a demo to see how we deliver on document data extraction.

Logos: Blend, Bluevine, Brex, CrossCountry, Enova, ICE Mortgage, PayPal, Plaid, SoFi

“Ocrolus technology elevated our bank statement analysis capabilities to the next level.”

– Jim Granat, President of SMB Lending and Senior Vice President, Enova International

Frequently Asked Questions

Document data extraction is the process of identifying and extracting meaningful information from unstructured or semi-structured documents for further use or storage. Ocrolus automates document extraction using machine learning in addition to other techniques such as computer vision and micro-templates.

Yes, Ocrolus can perform data extraction from both unstructured and semi-structured documents. Our software identifies the data needed to make lending decisions without the need to rely on templates or complex pre-configuration.

When capturing fields from a document, AI-only approaches often achieve only 80-95% accuracy. Even humans may be only 93-97% accurate. Ocrolus achieves the high accuracy its customers require to make high-stakes financial decisions by combining the two. We use humans where humans are best and AI where AI is best, and use each to assist the other.
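A back-of-the-envelope calculation shows why combining two imperfect checkers can help. The sketch below assumes that AI and human errors on a field are statistically independent, which is a strong simplification (a smudged digit may fool both) and is not a model Ocrolus has published. Under that assumption, the chance that both are wrong on the same field is the product of their individual error rates. Cases where only one of them errs show up as a disagreement and can be flagged for another look.

```python
def both_wrong_rate(ai_accuracy: float, human_accuracy: float) -> float:
    """Chance that AI and human both err on one field, assuming independent errors."""
    return (1 - ai_accuracy) * (1 - human_accuracy)

# Low and high ends of the accuracy ranges quoted above.
worst = both_wrong_rate(0.80, 0.93)
best = both_wrong_rate(0.95, 0.97)
print(f"worst case: {worst:.2%}, best case: {best:.2%}")
```

With the quoted ranges, the simultaneous-error rate falls between roughly 0.15% and 1.4% under this assumption, lower than either error rate alone.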

OCR is a form of AI which attempts to identify characters in a document. Like many AI products on their own, it delivers mediocre accuracy. Ocrolus leverages OCR and other AI approaches, and combines these with humans, to achieve superior accuracy with our document extraction software.

An average of 10-30 minutes, depending on the complexity of the form type, document quality, the number of compounded forms in each document upload, and other factors.*

*Ocrolus does not guarantee turnaround time. For additional context on turnaround time and exclusions to turnaround time, see Schedule 1 of our Master Service Agreement (Service Level Agreement).

To maximize their automation, most Ocrolus customers integrate Ocrolus’ JSON API response directly into their Loan Origination Systems (LOS). Ocrolus also supports a Dashboard (web) view of its document processing, and Excel outputs for some Mortgage use cases.
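As a minimal sketch of what consuming a JSON response in a downstream system might involve, the example below parses a payload and maps it to an internal record. The payload shape, the field names, and the `to_los_record` helper are made up for illustration and are not the actual Ocrolus API response format; consult the Ocrolus API documentation for the real schema.

```python
import json
from decimal import Decimal

# Made-up payload for illustration; NOT the actual Ocrolus response format.
response_body = """
{
  "document_type": "bank_statement",
  "fields": {"account_holder": "Jane Doe", "ending_balance": "1204.55"}
}
"""

def to_los_record(body: str) -> dict:
    """Map the hypothetical payload onto a hypothetical LOS record."""
    payload = json.loads(body)
    fields = payload["fields"]
    return {
        "borrower_name": fields["account_holder"],
        # Decimal avoids binary floating-point rounding on currency values.
        "ending_balance_cents": int(Decimal(fields["ending_balance"]) * 100),
    }

record = to_los_record(response_body)
print(record)
```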