Elementary introduction to deep learning through the case of convolutional neural networks (CNN)
Use case: heritage image classification
An elementary introduction to neural networks and deep learning for a non-technical audience, in two parts.
This introduction presents the formal neuron model, neural networks and convolutional neural networks.
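As a rough illustration of the two building blocks named above, here is a minimal sketch in plain Python, not taken from the course material. It assumes a sigmoid activation for the formal neuron and an unpadded, stride-1 convolution; the function names, the toy image, and the kernel are all hypothetical. As in most deep learning frameworks, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import math


def formal_neuron(inputs, weights, bias):
    """Formal neuron: weighted sum of the inputs, then a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def convolve2d(image, kernel):
    """Slide a small kernel over a 2D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 4x4 "image" with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 kernel that responds to left-to-right intensity increases.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
activation = formal_neuron([1.0, 0.0], [0.5, -0.5], -0.5)
```

The feature map is largest where the kernel overlaps the edge. In a CNN, the kernel values are not chosen by hand as they are here: they are learned during training.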
The hands-on session uses heritage images from Gallica (or any other IIIF-enabled repository) in an image classification scenario: the goal is to predict the technique or genre of each image (picture, drawing, map...) with a trained CNN model (supervised approach).
The trainees have to:
- select and download images from a digital repository (Gallica and the Wellcome Collection are used) and create a training dataset
- train a classification model (with commercial APIs or open source AI frameworks)
- apply the model to the heritage images extracted thanks to APIs (including IIIF)
- Theoretical presentation (.ppt, French)
This hands-on session (FR, EN) illustrates a basic image classification use case.
A more detailed tutorial is also available here.
The content to be classified consists of Gallica images accessed through the IIIF protocol. Document metadata and files are extracted from the Gallica digital repository using the Gallica APIs and PyGallica, a Python wrapper for those APIs. The image files are then processed with a supervised classification approach. The Wellcome Collection digital library is also used.
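To make the IIIF step more concrete, the sketch below builds an image request URL following the general IIIF Image API pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. It is an illustration only, not the session's own script: the helper name is hypothetical, `EXAMPLE_ID` is a placeholder rather than a real document, and the Gallica base URL and the `native` quality value are assumptions to check against the Gallica API documentation.

```python
def iiif_image_url(base, identifier, region="full", size="full",
                   rotation=0, quality="default", fmt="jpg"):
    """Assemble a IIIF Image API request URL from its components."""
    return (
        f"{base.rstrip('/')}/{identifier}/"
        f"{region}/{size}/{rotation}/{quality}.{fmt}"
    )


# Hypothetical request: first page of a placeholder document, resized to a
# width of 600 pixels. "EXAMPLE_ID" must be replaced by a real ARK identifier.
url = iiif_image_url(
    "https://gallica.bnf.fr/iiif",   # assumed Gallica IIIF endpoint
    "ark:/12148/EXAMPLE_ID/f1",      # placeholder identifier and page
    size="600,",                     # IIIF syntax for "width 600, keep ratio"
    quality="native",                # assumed quality keyword for Gallica
)
print(url)
```

Requesting a reduced size in this way keeps downloads small, which is usually sufficient for building a training dataset.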
The session relies on SaaS offerings (IBM Watson, Google Cloud Vision) and deep learning platforms (TensorFlow) for the processing.
- IBM Watson Studio account or Google Cloud AutoML account for all participants. See the setup document (FR, EN)
- Basic scripting and command line skills for participants wishing to go through the Python scripts.
- Use case definition: choice of the source images and the model classes; downloading of the image samples
- Training with IBM Watson Visual Recognition or Google Cloud AutoML
- Test of the model (using IBM Watson/Google Cloud platforms)
- Local test of the model with Python scripting (IBM Watson case). Launch the notebook with Binder here:
- For people with command line skills: training and test on the same dataset with TensorFlow Python scripts
- Tutorial (French, English)
- IBM Watson use case: Python 3 script for extracting documents from Gallica and running inference on images with IBM Watson + Jupyter notebook (English)
- TensorFlow use case: Python 3 scripts
- BnF GallicaPix use case. Training dataset1 (4 classes, 100 images each), dataset2 (11 classes, 1,000 images each)
- IBM Watson documentation
- Google AutoML documentation
- Convolutional neural networks:
- Library of Congress Newspaper Navigator use case