Document Information Extraction using OCR and LLMs

Our project aimed to develop a flexible and efficient solution for extracting information from a variety of document formats. By leveraging Optical Character Recognition (OCR) and Large Language Models (LLMs) such as Llama 3, we explored how the advanced contextual capabilities of LLMs can enhance the accuracy and adaptability of information extraction.

Key Features

  • Multi-format Document Support: Our approach adapts to numerous document types (PDF, PNG, JPG, DOCX), whether scanned images or digital files, without requiring format-specific pre-training or rule-based configuration for each type.

  • Optical Character Recognition (OCR) with PaddleOCR: Extracts textual data from scanned documents or images, transforming it into machine-readable content (see the sketch after this list).

  • Large Language Models (LLMs): Utilizes advanced language models to interpret and analyze the extracted text, offering contextual understanding for better information extraction.
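To illustrate the OCR step, here is a minimal sketch using PaddleOCR (2.x). The file name and language setting are placeholders, not the project's actual configuration:

```python
# Minimal OCR sketch, assuming PaddleOCR 2.x is installed
# (pip install paddlepaddle paddleocr); "scanned_document.png" is a placeholder.
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="en")  # angle classifier helps with rotated scans
result = ocr.ocr("scanned_document.png", cls=True)

# result holds one entry per page; each line is (bounding_box, (text, confidence))
lines = [text for _box, (text, _conf) in result[0]]
extracted_text = "\n".join(lines)
print(extracted_text)
```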

Innovation

  • The innovation of this method lies in its adaptability. Unlike traditional systems that require extensive rule-based settings or format-specific training, our solution can process various document formats with minimal configuration.
  • Privacy: our approach runs the LLM locally, so no document data is sent to a third-party service (a minimal sketch of this step follows below).
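The snippet below is a minimal sketch of the local-LLM step, assuming the Ollama Python client (pip install ollama) with a llama3 model pulled locally; the prompt and the fields it asks for are illustrative, not the project's actual extraction schema. It reuses the extracted_text produced by the OCR sketch above.

```python
# Local-LLM extraction sketch, assuming an Ollama server running on this machine
# with "llama3" pulled; prompt and fields are illustrative placeholders.
import ollama

document_text = extracted_text  # text produced by the OCR step above

prompt = (
    "Extract the sender, date, and total amount from the document text below. "
    "Respond with a JSON object only.\n\n" + document_text
)

response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": prompt}],
)
print(response["message"]["content"])  # output never leaves the local machine
```

Because the model is served locally, the document text and the model's output never leave the machine; any comparable local runner (for example llama.cpp) could be substituted with a similar call.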

Documentation

For more detailed documentation, please refer to the official project documentation at the following link:

Project Documentation