GPT OCR is a Python application that uses the OpenAI gpt-3.5-turbo model to process and proofread text documents produced by OCR, such as PDF OCR or scan OCR output. The application is designed specifically for .txt files. It can navigate a directory of text files, chunk the text for efficient processing, and communicate with the AI model interactively.
To install the required third-party libraries, use pip:
pip install openai
pip install tiktoken
pip install click
asyncio and textwrap are part of the Python standard library and do not need to be installed separately.
Run the application via the command line:
python main.py ocr --model MODEL --wrap WRAP --prompt_file PROMPT_FILE --base BASE
- MODEL: The model to use (default: gpt-3.5-turbo).
- WRAP: The prompt isolation wrap (default: '```<<PAYLOAD>>```').
- PROMPT_FILE: The input prompt file (default: prompt.txt).
- BASE: The base directory to search for text files (default: ./data/papers/).
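To illustrate the WRAP option, here is a minimal sketch. It assumes that <<PAYLOAD>> is a placeholder that is replaced by each chunk of text before the prompt is sent to the model; this README does not spell that out. The function name wrap_payload is hypothetical and not taken from the application.

```python
PLACEHOLDER = "<<PAYLOAD>>"
FENCE = "`" * 3  # three backticks, built this way to keep the example readable
DEFAULT_WRAP = FENCE + PLACEHOLDER + FENCE  # mirrors the documented default


def wrap_payload(chunk: str, wrap: str = DEFAULT_WRAP) -> str:
    """Return the wrap template with the placeholder replaced by `chunk`.

    Assumption: the application substitutes the placeholder literally.
    """
    if PLACEHOLDER not in wrap:
        raise ValueError(f"wrap template must contain {PLACEHOLDER}")
    return wrap.replace(PLACEHOLDER, chunk)
```

Isolating the text this way would help the model tell the instructions in the prompt file apart from the text to be proofread.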
Make sure you have a valid OpenAI API key stored in a file named openaiapi.key in the same directory as the main script.
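As a rough sketch of how such a key file could be read, the helper below loads the key and strips surrounding whitespace. The name load_api_key is hypothetical, and the application's actual loading code may differ.

```python
from pathlib import Path


def load_api_key(path: Path) -> str:
    """Read an API key from `path`, ignoring surrounding whitespace."""
    key = path.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{path} is empty")
    return key


# Example (assumes the key file sits next to the calling script):
# key = load_api_key(Path(__file__).with_name("openaiapi.key"))
```

Keep openaiapi.key out of version control, since it contains a secret.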
- Traverse Folder: The application can traverse a given directory, find all .txt files, and ignore any files that already include _proofread in the name or have a corresponding _proofread file.
- Prompt in Chunks: If a text file is too large, the application can process it in chunks. Each chunk is guaranteed to be within the token limit of the OpenAI model.
- Write Async: The application writes the processed and proofread content to a new file with _proofread appended to the original file's name.
- Error Handling: The application includes handling for rate limit errors and invalid request errors.
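The folder traversal and chunking features can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's own code: the names find_pending, count_tokens, and chunk_text are hypothetical, the "corresponding" output file is assumed to sit next to the input as NAME_proofread.txt, and a whitespace word count stands in for a real tokenizer such as tiktoken.

```python
from pathlib import Path
from typing import Callable, List

SUFFIX = "_proofread"


def find_pending(base: Path) -> List[Path]:
    """Return .txt files under `base` that still need proofreading."""
    pending = []
    for path in sorted(base.rglob("*.txt")):
        if SUFFIX in path.stem:
            continue  # this file is itself a proofread output
        # Assumption: the output lives beside the input as <stem>_proofread.txt
        if path.with_name(path.stem + SUFFIX + path.suffix).exists():
            continue  # an output for this file already exists
        pending.append(path)
    return pending


def count_tokens(text: str) -> int:
    """Stand-in token counter: counts whitespace-separated words."""
    return len(text.split())


def chunk_text(
    text: str,
    max_tokens: int,
    counter: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Greedily pack whole lines into chunks of at most `max_tokens` tokens."""
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        if counter(line) > max_tokens:
            # Simplification: this sketch does not split a single long line.
            raise ValueError("a single line exceeds max_tokens")
        if current and counter(current + line) > max_tokens:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Because line endings are kept, joining the chunks reproduces the original text exactly. A tokenizer-based counter can be passed as the counter argument to match the model's real token limit.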
Pull requests are welcome. Please make sure to update tests as appropriate.