Project: Lambda Function for Image Text Extraction with Tesseract OCR

This project provides a Python Lambda function packaged as a Docker image, so you can use Tesseract OCR to extract text from images inside AWS Lambda. You can customize the image and the function to suit your own requirements.

The easiest way to push this image to ECR is to use the pre-generated commands in the AWS console: open the ECR repository and click the "View push commands" button.

For the Lambda function, set the S3 bucket name in the environment variable 'BUCKET'. I set the memory to 10240 MB (the Lambda maximum), since Tesseract OCR may need more resources than the default, and the timeout to 30 seconds.
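The same settings can also be applied from code instead of the console. The following is a minimal sketch using boto3's `update_function_configuration`; the function name `tesseract-ocr` and the bucket name `my-image-bucket` are placeholders, not names taken from this repository.

```python
def build_config(function_name, bucket):
    """Return the settings described above as keyword arguments for boto3."""
    return {
        "FunctionName": function_name,
        "MemorySize": 10240,  # MB
        "Timeout": 30,        # seconds
        "Environment": {"Variables": {"BUCKET": bucket}},
    }


def apply_config(config):
    """Send the settings to Lambda. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so build_config works without boto3

    return boto3.client("lambda").update_function_configuration(**config)


if __name__ == "__main__":
    # Placeholder names: replace with your own function and bucket.
    apply_config(build_config("tesseract-ocr", "my-image-bucket"))
```

Note that `Environment` replaces the function's whole set of environment variables, so include any other variables you rely on.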

You can test the function by uploading an image to the S3 bucket and running a test event in the Lambda console that contains the object's key:

{
    "key": "example-image.jpg"
}
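To show how the 'BUCKET' variable and the event's "key" field fit together, here is a minimal sketch of what a handler for this event might look like. It is not the contents of app.py, which are not shown here; it assumes boto3, pytesseract and Pillow are installed in the image, and the helper names and the shape of the return value are hypothetical.

```python
import os


def download_image(bucket, key, dest):
    """Download the S3 object to a local file (assumes boto3 is available)."""
    import boto3

    boto3.client("s3").download_file(bucket, key, dest)


def extract_text(image_path):
    """Run Tesseract on the image (assumes pytesseract and Pillow)."""
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def handler(event, context, download=download_image, ocr=extract_text):
    """Hypothetical handler: fetch the image named by event["key"] and OCR it."""
    bucket = os.environ["BUCKET"]  # set in the Lambda configuration
    key = event["key"]             # e.g. "example-image.jpg"
    # /tmp is the writable directory inside the Lambda environment.
    local_path = os.path.join("/tmp", os.path.basename(key))
    download(bucket, key, local_path)
    return {"key": key, "text": ocr(local_path)}
```

The `download` and `ocr` parameters exist only so the handler can be exercised locally with stand-in functions, without AWS credentials or a Tesseract install.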

I will provide a CDK stack to deploy the solution in the future.

Repository files: Dockerfile, README.MD, app.py, requirements.txt

JairJosafath/Tesseract-OCR-Lambda
