IO Working Group Meeting Notes
The goal of the I/O working group is to define how data is read into and written out from memory in MONAI. Such input and output requires consideration of (a) the research and application workflows in which MONAI will operate, (b) the importance of effectively utilizing all available data for deep learning development and evaluation, and (c) the critical significance of understanding and preserving the physical space that is represented by a medical image so that its clinical validity is preserved.
We have defined three broad sets of requirements / use-cases for input and output for MONAI:
- research I/O,
- reproducible I/O, and
- clinical I/O.
Research I/O requirements are concerned with common image and data file formats and libraries. Reproducible I/O requirements are concerned with forming a comprehensive description of the training and testing data, parameters, and models used in an experiment, so that the experiment can be repeated. Clinical I/O requirements are concerned with interfacing MONAI with clinical systems such as PACS and health records systems. To address these requirements, we are investigating how MONAI should integrate with and contribute to existing “third-party” libraries and standards, rather than develop new solutions. Examples being considered include ITK, MLflow, PyDICOM, GDCM, XNAT, and FHIR. Our guiding principles during those considerations are (a) don’t reinvent the wheel, (b) focus MONAI's energy and contributions on deep learning methods, not on supporting I/O methods, and (c) provide clinical relevance.
- Stephen Aylward (Kitware)
- Marco Nolden (DKFZ)
- Jayashree Kalpathy-Cramer (MGH)
- Brad Genereaux (Nvidia)
- Ben Murray (Nvidia)
- Wenqi Li (Nvidia)
- Jorge Cardoso (KCL)
- Prerna Dogra (Nvidia)
The following proposals are motivated by the MONAI 0.2 release. That release supports PNG and NIfTI, but its readers have two major shortcomings. First, the readers are file-format specific: the file type must be known when the Python script is written, so reading a PNG instead of a NIfTI file means changing the code to use the PNG reader. Second, a MONAI image read from a PNG file carries different metadata than one read from a NIfTI file. NIfTI tags and PNG tags differ, and no standard dictionary is enforced when a file is read. Finding, for example, the date an image was acquired is likely to require one tag for a NIfTI file and a different tag for a PNG file. Changing the file format being read may therefore require changes throughout a Python script.
Goal: Support reading and writing common research image and data file formats: NRRD, NIfTI, DICOM (objects), JPG, PNG, TIFF, MetaIO, ...
Proposal: MONAI should offer an extensible framework for research image I/O and rely on ITK for the default handling of the file formats that it supports.
There are three components to this proposal:
The framework should have a default list of image readers and writers for files, selected by file extension, and an Experiment should be able to specify alternative image readers and writers to use, also selected by file extension.
This will allow researchers to define their own I/O methods for their own experiments, while MONAI provides a default system that is capable of handling most I/O requirements. The framework should make use of ITK for standard research image formats such as NIfTI, NRRD, GIPL, JPG, TIFF, MetaIO, IPL, VTK, Stimulate, SiemensVision, GIF, GE, and DICOM (via GDCM).
MONAI should optionally provide support for additional file formats via other toolkits such as the following:
- BIDS (Brain Imaging Data Structure)
- NWB (Neurodata Without Borders)
The framework should use the metadata dictionary terms commonly used by ITK (based on DICOM images read by GDCM).
The framework should preserve an image’s clinical relevance by maintaining its real-world properties, e.g., orientation, spacing, and origin.
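The last two components can be illustrated with a minimal sketch. It is not MONAI's or ITK's API: `TAG_ALIASES`, `normalize_metadata`, and `ImageGeometry` are hypothetical names, and the format-specific tag names on the left of the alias table are made up. The common key uses the `group|element` style in which ITK exposes DICOM tags read via GDCM (`0008|0022` is the DICOM Acquisition Date tag). The index-to-physical mapping assumes the usual convention of origin plus direction times spacing-scaled index.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple

# Hypothetical alias table: format-specific tag name -> common dictionary key.
# The left-hand names are invented for illustration only.
TAG_ALIASES: Dict[str, Dict[str, str]] = {
    "png": {"creation_time": "0008|0022"},
    "nifti": {"acq_date": "0008|0022"},
}


def normalize_metadata(fmt: str, raw: Dict[str, object]) -> Dict[str, object]:
    """Rename format-specific tags to common keys; unknown tags pass through."""
    aliases = TAG_ALIASES.get(fmt, {})
    return {aliases.get(key, key): value for key, value in raw.items()}


@dataclass
class ImageGeometry:
    """Real-world properties that must survive reading and writing."""

    spacing: Tuple[float, ...]                 # physical size of one voxel per axis
    origin: Tuple[float, ...]                  # physical position of index (0, 0, ...)
    direction: Tuple[Tuple[float, ...], ...]   # direction cosines, one row per physical axis

    def index_to_physical(self, index: Sequence[int]) -> Tuple[float, ...]:
        """Map a voxel index to a physical point: origin + direction @ (spacing * index)."""
        scaled = [i * s for i, s in zip(index, self.spacing)]
        return tuple(
            o + sum(d * v for d, v in zip(row, scaled))
            for o, row in zip(self.origin, self.direction)
        )
```

With such a dictionary, a script can ask for `"0008|0022"` regardless of whether the image came from a PNG or a NIfTI file, and the geometry record keeps the voxel grid tied to physical space.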
MONAI should define a global property file that specifies the default handlers for files. Each handler definition specifies the file suffix, the dependencies and their versions, and the function to be called.
MONAI.IO
{
FOO LoadFOO
PNG ITK/5.1/LoadITK
TIFF,TIF ITK/5.1/LoadITK
* ITK/5.1/LoadITK
}
Experiment files can optionally include a property file that defines alternative handlers:
Experiment.IO
{
PNG ITK/5.2/LoadITK
TIF LoadTIF
}
This mechanism could also be used to specify alternative implementations of other functions to be used by experiments, e.g., alternative transforms, etc.
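The default-plus-override lookup described above could be sketched as follows. This is only an illustration under stated assumptions: `resolve_loader`, `DEFAULT_HANDLERS`, and the three loader functions are hypothetical names, the loaders are placeholders that return a tuple instead of reading a file, and dependency version pinning (the `ITK/5.1` part of a handler definition) is not modelled.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

Loader = Callable[[str], object]


# Placeholder loaders (hypothetical): a real handler would read the file.
def load_with_itk(path: str) -> object:
    return ("ITK", path)


def load_foo(path: str) -> object:
    return ("FOO", path)


def load_tif(path: str) -> object:
    return ("custom TIF", path)


# Mirrors the MONAI.IO table: suffix -> handler, with "*" as the fallback.
DEFAULT_HANDLERS: Dict[str, Loader] = {
    "foo": load_foo,
    "png": load_with_itk,
    "tiff": load_with_itk,
    "tif": load_with_itk,
    "*": load_with_itk,
}


def resolve_loader(
    filename: str, experiment_handlers: Optional[Dict[str, Loader]] = None
) -> Loader:
    """Pick a loader by file suffix; experiment entries win over defaults."""
    # Only the last suffix is used, so "scan.nii.gz" resolves as "gz" and
    # falls through to the "*" default in this sketch.
    suffix = Path(filename).suffix.lstrip(".").lower()
    for table in (experiment_handlers or {}, DEFAULT_HANDLERS):
        if suffix in table:
            return table[suffix]
    return DEFAULT_HANDLERS["*"]


# Mirrors the Experiment.IO table: only TIF is overridden here.
experiment = {"tif": load_tif}
image = resolve_loader("slide.tif", experiment)("slide.tif")
```

Because the experiment table is consulted first, an experiment changes behaviour only for the suffixes it lists, and every other suffix keeps the MONAI default.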
Goal: Generate comprehensive descriptions of the training and testing data, parameters, and models used in an experiment, so that the experiment can be repeated.
Proposal: To be determined
This proposal is a work in progress. Tentatively, MONAI's dataset design may address these requirements. It may be necessary to incorporate MD5 or other checksums to ensure reproducibility.
Tools / examples to consider include: HDF5 extensions,
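As one possible reading of the checksum idea above, the sketch below records an MD5 digest per data file and later reports files that no longer match. It uses only the Python standard library; `file_md5`, `build_manifest`, and `changed_files` are hypothetical helper names and not part of MONAI.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths: Iterable[str]) -> Dict[str, str]:
    """Map each file path to its checksum, in sorted path order."""
    return {p: file_md5(p) for p in sorted(str(x) for x in paths)}


def changed_files(manifest: Dict[str, str]) -> List[str]:
    """List manifest entries that are missing or whose content has changed."""
    return [
        path
        for path, expected in manifest.items()
        if not Path(path).is_file() or file_md5(path) != expected
    ]
```

Storing such a manifest next to an experiment's parameters would let a later run confirm that it is using byte-identical training and testing data before results are compared.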
Goal: Simplify interfacing MONAI with clinical systems such as PACS and health records systems: DICOM communications, HL7, ...
Proposal (tentative): Provide examples of integrations, but do not attempt to provide a definitive / comprehensive solution.
Consider, in particular, the FHIR (Fast Healthcare Interoperability Resources) standard as a driving example.
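To make the FHIR example concrete, here is a small sketch of the kind of integration example the proposal has in mind. It assumes the FHIR R4 shape of an `ImagingStudy` resource, where each series and each instance carries a DICOM `uid`. The JSON payload is hand-written and trimmed for illustration, its UIDs are made up, and `series_instance_uids` is a hypothetical helper. How the resource is fetched from a server is out of scope here.

```python
import json
from typing import Dict, List

# Hand-written, trimmed ImagingStudy payload (assumed R4 shape, made-up UIDs).
IMAGING_STUDY_JSON = """
{
  "resourceType": "ImagingStudy",
  "status": "available",
  "subject": {"reference": "Patient/example"},
  "series": [
    {
      "uid": "1.2.3.4.1",
      "modality": {"system": "http://dicom.nema.org/resources/ontology/DCM", "code": "CT"},
      "instance": [{"uid": "1.2.3.4.1.1"}, {"uid": "1.2.3.4.1.2"}]
    },
    {
      "uid": "1.2.3.4.2",
      "modality": {"system": "http://dicom.nema.org/resources/ontology/DCM", "code": "MR"}
    }
  ]
}
"""


def series_instance_uids(resource: dict) -> Dict[str, List[str]]:
    """Map each series UID to the list of instance UIDs it references."""
    if resource.get("resourceType") != "ImagingStudy":
        raise ValueError("expected an ImagingStudy resource")
    return {
        series["uid"]: [inst["uid"] for inst in series.get("instance", [])]
        for series in resource.get("series", [])
    }


study = json.loads(IMAGING_STUDY_JSON)
uids_by_series = series_instance_uids(study)
```

The resulting UIDs are what a downstream step would need in order to request the matching DICOM objects from a PACS, which is where the research I/O readers would take over.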