Visual Read Assist is a project designed to aid individuals with visual impairments. The device captures an image, extracts the text from it using OCR (Optical Character Recognition), and reads that text aloud using text-to-speech. Built on the Raspberry Pi and a few additional hardware components, it aims to provide a low-cost, portable way for visually impaired individuals to access and understand textual information independently.
- Image Capture: Utilizes the Raspberry Pi camera to capture high-quality images.
- Optical Character Recognition (OCR): Converts captured images into machine-readable text using tesseract.
- Text-to-Speech Conversion: Reads out the extracted text in real time using espeak.
- Audio Feedback: Provides audio cues for user interaction, such as "Clicking picture" and "Picture clicked".
- Portable and Low-Cost: Designed to be affordable and easy to use.
- Raspberry Pi 4B+
- 5MP Pi Camera
- Speakers or Headphones
- Power Supply
- Push Button
- Raspberry Pi Case
- Python 3
- Raspbian OS
- Required Libraries:
  - subprocess
  - pytesseract
  - aplay
  - PIL (Pillow)
  - espeak
  - libcamera-still
  - tesseract-ocr
- Attach the Pi Camera to the Raspberry Pi’s camera slot.
- Connect speakers or headphones for audio output.
- Ensure the Raspberry Pi is powered on and connected to the internet.
- Update and install the required packages:
sudo apt-get update
sudo apt-get install espeak tesseract-ocr libcamera-dev python3-pip
pip install pytesseract pillow
- Verify the installation of Tesseract OCR:
tesseract --version
Ensure it displays the installed version without errors.
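The same check can be extended to the other command-line tools this project calls. The following is a minimal sketch, not part of the project's scripts: the function name `missing_tools` and the `REQUIRED_TOOLS` list are assumptions made for illustration, and the tool names are taken from the requirements listed above.

```python
import shutil

# Command-line tools named in this README; adjust the list to your setup.
REQUIRED_TOOLS = ["tesseract", "espeak", "aplay", "libcamera-still"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.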
Run the click_image.py script to capture an image:
python3 click_image.py
- The program announces "Clicking picture" via audio.
- Captures an image using the Raspberry Pi camera and saves it as test_image.jpg.
- Announces "Picture clicked" once the image is saved.
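The three steps above could be sketched roughly as follows. This is an illustrative outline, not the actual contents of click_image.py: it assumes the audio cues are spoken directly with `espeak` and that the photo is taken with `libcamera-still`; the helper names `build_commands` and `capture_image` and the one-second delay are hypothetical.

```python
import subprocess

IMAGE_PATH = "test_image.jpg"  # default filename used in this README


def build_commands(image_path=IMAGE_PATH):
    """Return the commands for the capture step, in the order they run."""
    return [
        ["espeak", "Clicking picture"],
        # -n: no preview window, -t: time in ms before capture (assumed 1000),
        # -o: output file
        ["libcamera-still", "-n", "-t", "1000", "-o", image_path],
        ["espeak", "Picture clicked"],
    ]


def capture_image(image_path=IMAGE_PATH, run=subprocess.run):
    """Announce, capture, and confirm; stop at the first failing command."""
    for command in build_commands(image_path):
        run(command, check=True)
    return image_path
```

Calling `capture_image()` on a Raspberry Pi with the camera attached would save test_image.jpg in the current directory. Because `check=True` raises on a non-zero exit code, "Picture clicked" is only announced if the capture command succeeded.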
Run the image_to_text_speech.py script to process the captured image:
python3 image_to_text_speech.py
- The program reads the captured image (test_image.jpg by default).
- Extracts text using Tesseract OCR.
- Reads the extracted text aloud using espeak.
- Prints the text in the terminal for reference.
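A rough sketch of this read, extract, speak, and print flow is shown below. It is an assumption about how such a script might look, not the actual image_to_text_speech.py: the function names `extract_text` and `image_to_speech` are hypothetical, and passing the text to `espeak` over standard input is a choice made here so that OCR output beginning with a dash is not mistaken for a command-line option.

```python
import subprocess


def extract_text(image_path):
    """Run Tesseract OCR on the image through pytesseract."""
    # Imported here so the module still loads where these packages are absent.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)


def image_to_speech(image_path="test_image.jpg", ocr=extract_text,
                    run=subprocess.run):
    """Extract text from the image, print it, and speak it with espeak."""
    text = ocr(image_path).strip()
    print(text)
    if text:  # skip espeak when OCR found nothing to read
        run(["espeak", "--stdin"], input=text, text=True, check=True)
    return text
```

Running `image_to_speech()` after the capture step would process test_image.jpg. The `ocr` and `run` parameters are there only so the OCR and speech steps can be swapped out or tested separately.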
- Captures an image using the Raspberry Pi camera.
- Provides audio feedback to indicate the image capture process.
- Converts the captured image into text using OCR (Tesseract).
- Synthesizes the extracted text into real-time audio feedback.
- click_image.py: Captures an image using the Raspberry Pi camera and provides audio cues during the process.
- image_to_text_speech.py: Processes the captured image to extract text and convert it to speech.
- test_image.jpg: The default filename for the captured image.
- Dependencies:
  - pytesseract: For OCR functionality.
  - espeak: For text-to-speech conversion.
  - libcamera-still: For image capture.
- Add support for multiple languages and handwriting recognition in OCR.
- Enable users to select voices and adjust speech rates in text-to-speech.
- Integrate with wearable devices, such as smart glasses, for hands-free operation.
- Improve OCR performance using advanced AI and machine learning models.
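As one possible starting point for the voice and speech-rate item above, `espeak` already accepts a voice with `-v` and a speed in words per minute with `-s`. The sketch below only builds the command line; the function name `espeak_command` and its defaults are assumptions for illustration and are not part of the current scripts.

```python
def espeak_command(voice="en", words_per_minute=175):
    """Build an espeak command with a chosen voice and speech rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    # -v selects the voice, -s sets the speed in words per minute,
    # --stdin makes espeak read the text to speak from standard input.
    return ["espeak", "-v", voice, "-s", str(words_per_minute), "--stdin"]
```

The returned list can be passed to `subprocess.run` together with `input=text, text=True` to speak the text at the selected rate.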
This project is open-source and available under the MIT License.