OCR as a Service

Turn any OCR model into an online inference API endpoint 🚀
Powered by BentoML 🍱

📖 Introduction 📖

This project demonstrates how to serve an OCR model using BentoML. The service accepts PDFs as input and returns the text they contain. It uses Microsoft's DiT (via Meta's Detectron2) for image segmentation and EasyOCR for text recognition.

🏃‍♂️ Running the Service 🏃‍♂️

Containers

The most convenient way to run this service is through containers, as the project relies on numerous external dependencies. We provide two pre-built containers optimized for CPU and GPU usage, respectively.

To run the service, you'll need a container engine such as Docker or Podman. Quickly test the service by running the appropriate container:

# cpu
docker run -p 3000:3000 ghcr.io/bentoml/ocr-as-a-service:cpu

# gpu
docker run --gpus all -p 3000:3000 ghcr.io/bentoml/ocr-as-a-service:gpu

BentoML CLI

Prerequisites 📋

✅ Python

This project requires Python 3.8 or higher.

✅ Poppler, to convert PDFs to images

On macOS, make sure to install poppler to use pdf2image:

brew install poppler

On Linux distros, install pdftoppm and pdftocairo using your package manager, e.g. with apt:

sudo apt install poppler-utils

✅ Python Development Package

To build the Detectron2 wheel, the python3-dev package is required. On Linux distros, run the following:

sudo apt install python3-dev

You may need to install a specific version of python3-dev, e.g., python3.10-dev for Python 3.10.

On macOS, the Python development package is installed by default.

Refer to the Detectron2 installation page for platform-specific instructions and further troubleshooting.
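
The prerequisites above can be checked with a short script before installing anything. This is a minimal sketch and not part of the repository: the function name missing_prerequisites is hypothetical, and it only checks the Python version and whether the two Poppler tools are on PATH. It does not verify that the Python development headers are present.

```python
import shutil
import sys

# Minimum Python version stated in the prerequisites above.
MIN_PYTHON = (3, 8)
# Poppler command-line tools named in the prerequisites above.
POPPLER_TOOLS = ("pdftoppm", "pdftocairo")


def missing_prerequisites(version=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    found = tuple(version[:2])
    if found < MIN_PYTHON:
        problems.append(
            "Python %d.%d+ required, found %d.%d" % (MIN_PYTHON + found)
        )
    for tool in POPPLER_TOOLS:
        # `which` returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append("'%s' not found on PATH (install poppler)" % tool)
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("Python version and Poppler tools look fine.")
```

The version and which parameters exist only so the check can be exercised without changing the real environment; calling missing_prerequisites() with no arguments inspects the current interpreter and PATH.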

Once you have all prerequisites installed, clone the repository and install the dependencies:

git clone https://github.com/bentoml/OCR-as-a-Service.git && cd OCR-as-a-Service

pip install -r requirements/pypi.txt

# Detectron2 depends on PyTorch, so it needs to be installed afterwards
pip install 'git+https://github.com/facebookresearch/detectron2.git'

To serve the model with BentoML:

bentoml serve

You can then open your browser at http://127.0.0.1:3000 and interact with the service through Swagger UI.
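
Loading the models can take a while, so a script that calls the service right after bentoml serve may want to wait until the server is up. The sketch below polls a readiness route using only the standard library. It assumes the server exposes /readyz, as BentoML HTTP servers commonly do; the helper names http_ok and wait_until_ready are hypothetical and not part of this project.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(base_url="http://127.0.0.1:3000", attempts=30, delay=1.0,
                     probe=http_ok, sleep=time.sleep):
    """Poll the readiness route; return True once it responds, False if attempts run out."""
    url = base_url.rstrip("/") + "/readyz"  # assumed readiness route
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_until_ready() with the defaults checks the address shown above roughly once per second for up to 30 attempts.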

🌐 Interacting with the Service 🌐

By default, BentoML serves models through an HTTP server. This section shows several ways to interact with the service:

cURL

curl -X 'POST' \
  'http://localhost:3000/image_to_text' \
  -H 'accept: application/pdf' \
  -H 'Content-Type: multipart/form-data' \
  -F file=@path-to-pdf

Replace path-to-pdf with the file path of the PDF you want to send to the service.
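
The same request can be sketched in Python with only the standard library. This mirrors the cURL command above (a multipart form with a single field named file, posted to /image_to_text) and is an illustration rather than the project's own client. The helper names build_multipart and image_to_text are hypothetical, and the response is returned as raw text because its exact format is not described here.

```python
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field, filename, data, boundary=None):
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        "--%s\r\n"
        'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
        "Content-Type: %s\r\n\r\n" % (boundary, field, filename, mime)
    ).encode("utf-8")
    tail = ("\r\n--%s--\r\n" % boundary).encode("utf-8")
    return head + data + tail, "multipart/form-data; boundary=" + boundary


def image_to_text(pdf_path, base_url="http://localhost:3000"):
    """POST a PDF to the image_to_text endpoint and return the raw response text."""
    with open(pdf_path, "rb") as handle:
        body, content_type = build_multipart(
            "file", os.path.basename(pdf_path), handle.read()
        )
    request = urllib.request.Request(
        base_url.rstrip("/") + "/image_to_text",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the service running, image_to_text("path-to-pdf") sends the file and returns whatever the endpoint responds with.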

Via BentoClient 🐍

To send requests from Python, use bentoml.client.Client. See client.py for example code.

Swagger UI

You can use Swagger UI to quickly explore the available endpoints of any BentoML service.

🚀 Deploying to Production 🚀

Move your project to production with BentoCloud, a platform for managing and deploying machine learning models.

Start by creating a BentoCloud account. Once you've signed up, log in to your BentoCloud account using the command:

bentoml cloud login --api-token <your-api-token> --endpoint <bento-cloud-endpoint>

Note: Replace <your-api-token> and <bento-cloud-endpoint> with your specific API token and the BentoCloud endpoint respectively.

Next, build your BentoML service using the build command:

bentoml build

Then, push your freshly built Bento to BentoCloud using the push command:

bentoml push <name:version>

Lastly, deploy this application to BentoCloud with a single bentoml deployment create command, following the deployment instructions.

BentoML offers a number of options for deploying and hosting online ML services in production; learn more at Deploying a Bento.

👥 Community 👥

BentoML has a thriving open source community where thousands of ML/AI practitioners are contributing to the project, helping other users and discussing the future of AI. 👉 Pop into our Slack community!
