Installation

OCR-D Installation on NVIDIA Jetson Nano and Xavier

Basic installations

Install Python3 module venv

It is highly recommended to use a virtual environment for all Python modules. Install the Python3 module venv to support this.

sudo apt install python3-venv

Install Python3 development package

The Python3 development package is needed to build tesserocr and possibly other Python3 modules.

sudo apt install libpython3-dev

Install TensorFlow

Several OCR-related tools use TensorFlow for Python 3. Installation can be tricky on ARM systems. The following commands install TensorFlow with NVIDIA GPU support.

Install required packages for your Linux distribution.

# Install Debian / Ubuntu packages.
sudo apt install libhdf5-dev

It is suggested to install all Python code in a virtual environment, so create and activate one.

# Create a virtual environment for Python 3.
python3 -m venv $HOME/venv
# Activate it so that the following pip commands install into it.
source $HOME/venv/bin/activate
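
Before running any pip command, it can help to confirm that a virtual environment is really active in the current shell. The following is a minimal sketch; the function name venv_status is hypothetical, and it relies only on the VIRTUAL_ENV variable that the venv activate script sets.

```shell
# Report whether a Python virtual environment is active in this shell.
# VIRTUAL_ENV is exported by the "activate" script of a venv.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: ${VIRTUAL_ENV}"
    else
        echo "inactive"
    fi
}

venv_status
```

If this prints "inactive", pip would install into the system or user site instead of $HOME/venv.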

Now install the Python modules for TensorFlow.

# Install requirements needed by tensorflow-gpu.
pip install astor google-pasta grpcio termcolor keras-applications keras-preprocessing wrapt
# Some Python modules are required with version restrictions.
pip install gast==0.2.2 "tensorboard<1.15.0" "tensorflow-estimator<1.15.0rc0"
# Install NVIDIA TensorFlow with GPU support for Nano and Xavier.
pip install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu

See https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html for more information.
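
After the installation, a short check can show whether TensorFlow imports and sees the GPU. This is a sketch, not part of the original instructions: the function name check_tensorflow is hypothetical, and it assumes python3 resolves to the interpreter of the active virtual environment. tf.test.is_gpu_available() is the TensorFlow 1.x style check that matches the version restrictions used above.

```shell
# Print the TensorFlow version and whether a GPU is visible to it.
check_tensorflow() {
    python3 - <<'EOF'
try:
    import tensorflow as tf
except ImportError:
    print("tensorflow is not installed in this environment")
else:
    print("tensorflow", tf.__version__)
    # TensorFlow 1.x style GPU check.
    print("GPU available:", tf.test.is_gpu_available())
EOF
}

check_tensorflow
```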

Install pixelwise_segmentation_SBB

This is currently incomplete and not working because the required Python module cv2 (OpenCV) is missing for ARM.

# Install requirements.
pip install keras sacred seaborn tqdm # cv2?

# Get code.
mkdir -p $HOME/src/github/qurator-spk
cd $HOME/src/github/qurator-spk
git clone https://github.com/qurator-spk/pixelwise_segmentation_SBB.git
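
Because the missing cv2 module is what blocks this step, a quick import check shows whether the OpenCV Python bindings are visible from the active environment. The helper name python_module_available is hypothetical; it only tries to import the given module with python3.

```shell
# Return success if the given Python module can be imported.
python_module_available() {
    python3 -c "import $1" >/dev/null 2>&1
}

if python_module_available cv2; then
    echo "cv2 found"
else
    echo "cv2 missing"
fi
```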

Install OCR-D components and required software

Install required packages

The Python3 module ocrd-kraken requires the Python3 module clstm. Building clstm needs the swig executable and the package libeigen3-dev. TODO: Building clstm still fails.

# Debian / Ubuntu packages needed for clstm
sudo apt install libeigen3-dev protobuf-compiler swig
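
To rule out missing build tools when the clstm build fails, the executables can be checked on the PATH first. This is a sketch with a hypothetical helper named missing_tools; swig comes from the swig package and protoc from protobuf-compiler.

```shell
# Print the name of every given tool that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing_tools swig protoc
```

No output means both executables were found.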

Install Tesseract

Either install Tesseract 4.1 from your Linux distribution or build Tesseract 5 from the sources.
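
Which major version is installed decides whether tesserocr has to be built from the latest code. The following sketch extracts the major version from the first line of `tesseract --version`; the function name tesseract_major is hypothetical, and the assumed line format is "tesseract 4.1.1" (an optional "v" before the number is tolerated).

```shell
# Print the major version from a line such as "tesseract 4.1.1".
tesseract_major() {
    echo "$1" | sed -n 's/^tesseract v\{0,1\}\([0-9][0-9]*\)\..*/\1/p'
}

if command -v tesseract >/dev/null 2>&1; then
    # Older releases print the version to stderr, so merge both streams.
    tesseract_major "$(tesseract --version 2>&1 | head -n 1)"
else
    echo "tesseract not found"
fi
```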

Install Python3 module tesserocr

# Get code and install from latest code (required for Tesseract 5).
mkdir -p $HOME/src/github/sirfz
cd $HOME/src/github/sirfz
git clone https://github.com/sirfz/tesserocr.git
cd tesserocr
pip install .

Install OCR-D components

mkdir -p $HOME/src/github/OCR-D
cd $HOME/src/github/OCR-D
git clone https://github.com/OCR-D/core.git
git clone https://github.com/OCR-D/ocrd_calamari.git
git clone https://github.com/OCR-D/ocrd_keraslm.git
git clone https://github.com/OCR-D/ocrd_kraken.git
git clone https://github.com/OCR-D/ocrd_ocropy.git
git clone https://github.com/OCR-D/ocrd_olena.git
git clone https://github.com/OCR-D/ocrd_segment.git
# Install ocrd_tesserocr from source.
git clone https://github.com/OCR-D/ocrd_tesserocr.git
cd ocrd_tesserocr
pip install .
cd ..

# Install OCR-D core and further components from PyPI.
pip install ocrd ocrd-kraken ocrd-ocropy
