Brain Lesion Analysis and Segmentation Tool for Computed Tomography - Version 2.0.0
This repository provides our deep learning image segmentation tool for traumatic brain injuries in 3D CT scans.
Please consider citing our article when using our software:
Monteiro M, Newcombe VFJ, Mathieu F, Adatia K, Kamnitsas K, Ferrante E, Das T, Whitehouse D, Rueckert D, Menon DK, Glocker B. Multi-class semantic segmentation and quantification of traumatic brain injury lesions on head CT using deep learning – an algorithm development and multi-centre validation study. The Lancet Digital Health (2020). Monteiro and Newcombe are equal first authors. Menon and Glocker are equal senior authors.
NOTE: This software is not intended for clinical use.
The provided source code enables training and testing of our convolutional neural network designed for multi-class brain lesion segmentation in head CT. Additionally, it allows for localisation of the segmented image, i.e. calculation of the volume of lesion per brain region (list of regions in blast_ct/data/localisation_files/atlas_labels.csv). NOTE: The localisation is based on linear image registration, hence it does not allow for voxel-wise precision.
In version 2.0.0 of this tool, we also make available a model that has been trained on a set of 680 annotated CT scans obtained from multiple clinical sites.
The output of our lesion segmentation tool is a segmentation map in NIfTI format with integer values ranging from 1 to 4 representing:
- Intraparenchymal haemorrhage (IPH);
- Extra-axial haemorrhage (EAH);
- Perilesional oedema;
- Intraventricular haemorrhage (IVH).
A CSV file with the total volume of lesion calculated for each lesion class is also part of the output. If the user chooses to perform localisation of lesions, this file will also include the volume of lesion per brain region, the volume of each brain region as well as the total brain volume.
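As an illustration of how such volumes relate to the segmentation map, here is a minimal sketch, not the tool's own implementation. It assumes that the label values 1 to 4 follow the order of the list above, that 0 is background, and that volumes are reported in millilitres; the function names and the flat-list inputs are hypothetical. The per-region function further assumes an atlas region id is already available for every voxel.

```python
from collections import Counter

# Assumption: label values follow the order of the list above; 0 is background.
LESION_CLASSES = {
    1: "IPH",
    2: "EAH",
    3: "Perilesional oedema",
    4: "IVH",
}


def lesion_volumes_ml(labels, voxel_size_mm):
    """Volume in millilitres of each lesion class.

    labels: iterable of integer voxel labels (flattened segmentation map).
    voxel_size_mm: (x, y, z) voxel spacing in millimetres.
    """
    voxel_volume_mm3 = voxel_size_mm[0] * voxel_size_mm[1] * voxel_size_mm[2]
    counts = Counter(labels)
    # 1 ml equals 1000 cubic millimetres.
    return {
        name: counts.get(value, 0) * voxel_volume_mm3 / 1000.0
        for value, name in LESION_CLASSES.items()
    }


def lesion_volume_per_region_ml(labels, regions, voxel_size_mm):
    """Lesion volume (all classes pooled) per region id.

    regions: iterable of region ids, one per voxel, aligned with labels.
    """
    voxel_volume_mm3 = voxel_size_mm[0] * voxel_size_mm[1] * voxel_size_mm[2]
    counts = Counter(
        region for label, region in zip(labels, regions) if label != 0
    )
    return {
        region: n * voxel_volume_mm3 / 1000.0 for region, n in counts.items()
    }


# Toy example: 8 voxels of 0.5 x 0.5 x 5.0 mm.
toy_labels = [0, 1, 1, 2, 0, 4, 4, 4]
toy_regions = [7, 7, 7, 9, 9, 9, 9, 7]
print(lesion_volumes_ml(toy_labels, (0.5, 0.5, 5.0)))
print(lesion_volume_per_region_ml(toy_labels, toy_regions, (0.5, 0.5, 5.0)))
```

With a NIfTI reader such as nibabel, the flattened voxel array and the voxel spacing from the image header could be passed to these functions directly.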
As of the latest version, the tool resamples images internally and returns the output segmentation in the same space as the input image, so there is no need to preprocess the input.
In a fresh Python 3 virtual environment, install blast-ct via:
pip install git+https://github.com/biomedia-mira/blast-ct.git
If you are using miniconda, create a new conda environment and install PyTorch:
conda create -n blast-ct python=3
conda activate blast-ct
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
Then install blast-ct via:
pip install git+https://github.com/biomedia-mira/blast-ct.git
Please run the following in your bash console to obtain the example data that we use to illustrate the usage of our tool in the sections below:
mkdir blast-ct-example
cd blast-ct-example
svn checkout "https://github.com/biomedia-mira/blast-ct/trunk/blast_ct/data/"
To run inference on one image using our pre-trained model:
blast-ct --input <path-to-input-image> --output <path-to-output-image> --device <device-id>
- --input: path to the input image, which must be in NIfTI format (.nii or .nii.gz);
- --output: path where the prediction will be saved (with extension .nii.gz);
- --device <device-id>: the device used for computation. Can be 'cpu' (up to 1 hour per image) or an integer indexing a CUDA-capable GPU on your machine. Defaults to CPU;
- Pass --ensemble True to use an ensemble of 15 models, which improves segmentation quality but slows down inference (recommended for GPU);
- Pass --do-localisation True to localise the segmented lesion, i.e. calculate the volume of lesion per brain region;
- (Only if --do-localisation True) --num-reg-runs: how many times to run registration between the native scan and the CT template. Running it more than once guards against initialisation errors, as only the best-performing run is kept.
Run the following in the blast-ct-example directory (might take up to an hour on CPU):
blast-ct --input data/scans/scan_0/scan_0_image.nii.gz --output scan_0_prediction.nii.gz
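Because the input must be a .nii or .nii.gz file and the output must end in .nii.gz, it can be convenient to check both paths before starting a run that may take up to an hour. The following is a small sketch of such a check; the function names are hypothetical and are not part of blast-ct, and matching file extensions case-sensitively is an assumption.

```python
from pathlib import Path


def is_nifti_path(path):
    """True if the file name ends in .nii or .nii.gz."""
    name = Path(path).name
    return name.endswith(".nii") or name.endswith(".nii.gz")


def check_blast_ct_paths(input_path, output_path):
    """Return a list of problems; an empty list means the paths look fine."""
    problems = []
    if not is_nifti_path(input_path):
        problems.append(f"input must be .nii or .nii.gz: {input_path}")
    if not Path(output_path).name.endswith(".nii.gz"):
        problems.append(f"output must end in .nii.gz: {output_path}")
    return problems


# Paths taken from the example command above.
print(check_blast_ct_paths(
    "data/scans/scan_0/scan_0_image.nii.gz", "scan_0_prediction.nii.gz"
))
```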
To run inference on multiple images using our ensemble of pre-trained models:
blast-ct-inference \
--job-dir <path-to-job-dir> \
--test-csv-path <path-to-test-csv> \
--device <device-id>
- --job-dir: the path to the directory where the predictions and logs will be saved;
- --test-csv-path: the path to a CSV file containing the paths of the images to be processed;
- --device <device-id>: the device used for computation. Can be 'cpu' (up to 1 hour per image) or an integer indexing a CUDA-capable GPU on your machine. Defaults to CPU;
- Pass --overwrite True to write over an existing job-dir. Set it to False if you want to continue a previously started run;
- Pass --do-localisation True to localise the segmented lesion, i.e. calculate the volume of lesion per brain region;
- (Only if --do-localisation True) --num-reg-runs: how many times to run registration between the native scan and the CT template. Running it more than once guards against initialisation errors, as only the best-performing run is kept.
Run the following in the blast-ct-example directory (GPU example):
blast-ct-inference --job-dir my-inference-job --test-csv-path data/data.csv --device 0
NOTE: If the run breaks before all images are processed, run again with --overwrite False to resume from where the previous run stopped.
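When inference is launched from a script, the command line can be assembled programmatically, which makes the resume case explicit. The sketch below only builds the argument list using the options documented above; the helper name and its keyword arguments are hypothetical, and it does not call blast-ct itself.

```python
import shlex


def build_inference_command(job_dir, test_csv_path, device="cpu",
                            overwrite=None, do_localisation=False,
                            num_reg_runs=None):
    """Assemble a blast-ct-inference command as a list of arguments."""
    cmd = [
        "blast-ct-inference",
        "--job-dir", str(job_dir),
        "--test-csv-path", str(test_csv_path),
        "--device", str(device),
    ]
    if overwrite is not None:
        # The options above take the literal strings True or False.
        cmd += ["--overwrite", str(bool(overwrite))]
    if do_localisation:
        cmd += ["--do-localisation", "True"]
        # --num-reg-runs only applies when localisation is enabled.
        if num_reg_runs is not None:
            cmd += ["--num-reg-runs", str(num_reg_runs)]
    return cmd


first_run = build_inference_command("my-inference-job", "data/data.csv", device=0)
resume_run = build_inference_command(
    "my-inference-job", "data/data.csv", device=0, overwrite=False
)
print(shlex.join(first_run))
print(shlex.join(resume_run))
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a machine where blast-ct is installed.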
To train your own model:
blast-ct-train \
--job-dir <path-to-job-dir> \
--config-file <path-to-config-file> \
--train-csv-path <path-to-train-csv> \
--valid-csv-path <path-to-valid-csv> \
--num-epochs <num-epochs> \
--device <gpu_id> \
--random-seeds <list-of-random-seeds>
- --job-dir: the path to the directory where the predictions and logs will be saved;
- --config-file: the path to a JSON config file (see data/config.json for an example);
- --train-csv-path: the path to a CSV file containing the paths of the images, targets and sampling masks used to train the model;
- --valid-csv-path: the path to a CSV file containing the paths of the images used to keep track of the model's performance during training;
- --num-epochs: the number of epochs for which to train the model (1200 was used with the example config);
- --device <device-id>: the device used for computation ('cpu' or an integer indexing a GPU). GPU is strongly recommended;
- --random-seeds: a list of random seeds used for training. Pass more than one to train multiple models one after the other;
- Pass --overwrite True to write over an existing job-dir. Set it to False if you want to continue a previously started run.
Run the following in the blast-ct-example directory (GPU example, takes time):
blast-ct-train \
--job-dir my-training-job \
--config-file data/config.json \
--train-csv-path data/data.csv \
--valid-csv-path data/data.csv \
--num-epochs 10 \
--device 0 \
--random-seeds "1"
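The example above passes the same data/data.csv for training and validation, which is enough to check that training runs. For your own data you would normally want two disjoint files. The following sketch shows one way to split a list of CSV rows reproducibly; the function name, the 20% validation fraction and the row paths are hypothetical choices for illustration.

```python
import random


def split_train_valid(rows, valid_fraction=0.2, seed=0):
    """Shuffle rows with a fixed seed and split them into (train, valid)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one row for validation.
    n_valid = max(1, round(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]


# Hypothetical rows; the paths are placeholders, not files shipped with the tool.
rows = [
    {
        "id": f"scan_{i}",
        "image": f"scans/scan_{i}/image.nii.gz",
        "target": f"scans/scan_{i}/target.nii.gz",
    }
    for i in range(10)
]
train_rows, valid_rows = split_train_valid(rows, valid_fraction=0.2, seed=1)
print(len(train_rows), len(valid_rows))
```

Each list can then be written to its own CSV file (for example with csv.DictWriter) and passed to --train-csv-path and --valid-csv-path respectively.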
To run inference with your own models and config, use:
blast-ct-inference \
--job-dir <path-to-job-dir> \
--config-file <path-to-config-file> \
--test-csv-path <path-to-test-csv> \
--device <gpu_id> \
--saved-model-paths <list-of-paths-to-saved-models>
- --job-dir: the path to the directory where the predictions and logs will be saved;
- --config-file: the path to a JSON config file (see data/config.json for an example);
- --test-csv-path: the path to a CSV file containing the paths of the images to be processed;
- --device <device-id>: the device used for computation. Can be 'cpu' (up to 1 hour per image) or an integer indexing a CUDA-capable GPU on your machine. Defaults to CPU;
- --saved-model-paths: a list of pre-trained model paths;
- Pass --overwrite True to write over an existing job-dir. Set it to False if you want to continue a previously started run;
- Pass --do-localisation True to localise the segmented lesion, i.e. calculate the volume of lesion per brain region;
- (Only if --do-localisation True) --num-reg-runs: how many times to run registration between the native scan and the CT template. Running it more than once guards against initialisation errors, as only the best-performing run is kept.
Run the following in the blast-ct-example directory (GPU example):
blast-ct-inference \
--job-dir my-custom-inference-job \
--config-file data/config.json \
--test-csv-path data/data.csv \
--device 0 \
--saved-model-paths "data/saved_models/model_1.pt data/saved_models/model_3.pt data/saved_models/model_6.pt" \
--do-localisation True
The tool takes input from csv files containing lists of images with unique ids. Each row in the csv represents a scan and must contain:
- A column named id, which must be unique for each row (otherwise overwriting will happen);
- A column named image, which must contain the path to a NIfTI file;
- (training only) A column named target, containing the path to a NIfTI file with the corresponding labels for training;
- (training only; optional) A column named sampling_mask, containing the path to a NIfTI file with the corresponding sampling mask for training.
See data/data.csv for a working example with 10 rows/ids (even though in this example they point to the same image).
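A CSV file with the columns described above can be generated with Python's standard csv module. The sketch below writes the two columns needed for inference and rejects duplicate ids before writing anything. The function name, the column order and the example paths are assumptions made for illustration; only the column names come from the description above.

```python
import csv
import io


def write_scan_csv(handle, rows, columns=("id", "image")):
    """Write one CSV row per scan, refusing duplicate ids."""
    ids = [row["id"] for row in rows]
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    if duplicates:
        raise ValueError(f"duplicate ids: {duplicates}")
    writer = csv.DictWriter(handle, fieldnames=list(columns))
    writer.writeheader()
    for row in rows:
        # Only the requested columns are written, in the given order.
        writer.writerow({column: row[column] for column in columns})


# Hypothetical scans; replace the paths with your own NIfTI files.
scans = [
    {"id": f"scan_{i}", "image": f"scans/scan_{i}/scan_{i}_image.nii.gz"}
    for i in range(3)
]
buffer = io.StringIO()
write_scan_csv(buffer, scans)
print(buffer.getvalue())
```

For training, pass columns=("id", "image", "target") or add "sampling_mask", with the matching keys present in each row. To write to disk, open the file with newline="" as recommended by the csv module and pass the file handle instead of the StringIO buffer.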