nf-imagecleaner is a Nextflow pipeline that prepares images for upload by removing sensitive data. This includes `AcquisitionDate` and `StructuredAnnotations` from OME-TIFF files, label images and `Date` from SVS files, and specified metadata tags from TIFFs. It handles Synapse URIs, local file paths, and mixtures of both in its input samplesheet.
- Nextflow
- Docker
To run the pipeline with default parameters and Docker (recommended), use:
nextflow run ncihtan/nf-imagecleaner --input <path/to/samplesheet.csv> -profile docker
The input to the pipeline is a CSV file (specified with `--input`) in which the `image` column contains paths to images. If a path is a Synapse URI (starts with `syn://`), the file will be downloaded from Synapse.
For example:
image
syn://syn00123
/local/path/to/image.svs
s3://my-bucket/my-image.ome.tiff
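As a rough illustration of the samplesheet convention above, the following Python sketch separates Synapse URIs from all other paths. The function name `split_samplesheet` is hypothetical and is not part of the pipeline; how the pipeline itself parses the samplesheet, and how it treats non-Synapse entries such as `s3://` paths, is not shown here.

```python
import csv
import io

SYNAPSE_PREFIX = "syn://"


def split_samplesheet(handle):
    """Return (synapse_ids, other_paths) from a CSV with an `image` column.

    Hypothetical helper for illustration only; not the pipeline's own code.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or "image" not in reader.fieldnames:
        raise ValueError("samplesheet needs an 'image' column")

    synapse_ids, other_paths = [], []
    for row in reader:
        path = (row["image"] or "").strip()
        if not path:
            continue  # skip rows with an empty image cell
        if path.startswith(SYNAPSE_PREFIX):
            # Keep only the Synapse ID, e.g. "syn00123"
            synapse_ids.append(path[len(SYNAPSE_PREFIX):])
        else:
            # Local paths and other URIs are passed through unchanged
            other_paths.append(path)
    return synapse_ids, other_paths


example = io.StringIO(
    "image\n"
    "syn://syn00123\n"
    "/local/path/to/image.svs\n"
    "s3://my-bucket/my-image.ome.tiff\n"
)
synapse_ids, other_paths = split_samplesheet(example)
print(synapse_ids)
print(other_paths)
```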
- `outdir`: Directory for outputs (default: `outputs`)
- `outsuffix`: Suffix for output files (default: `_cleaned`)

Coming soon!

- `rm_svs_macro`: Boolean indicating whether to remove the macro image in SVS files (default: `false`)
- `rm_svs_label`: Boolean indicating whether to remove the label image in SVS files (default: `true`)
- `rm_ome_sa`: Boolean indicating whether to remove structured annotations in OME-XML files (default: `true`)
The cleaned images will be placed in the directory specified by `--outdir`.
The specific tags removed are:
for TIFFs:
- DateTime
- NDPI_ScanTime
- NDPI_WriteTime
- Artist
- HostComputer
- WangAnnotation
- WriterSerialNumber
- MDLabName
- MDPrepDate
- MDSampleInfo
- Software
for SVSs:
- Date
- Time Zone
- ScanScope ID
- User
- Time
- DSR ID
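
To make the SVS field removal concrete, here is a minimal Python sketch. It assumes the SVS metadata is held in the TIFF `ImageDescription` as a header followed by pipe-delimited `key = value` fields, which is the usual Aperio layout. The function `scrub_svs_description` and the sample string are hypothetical, and the pipeline's actual implementation may differ. Reading and writing the tag in the file is not shown.

```python
# Field names taken from the SVS list above
SENSITIVE_SVS_KEYS = {"Date", "Time Zone", "ScanScope ID", "User", "Time", "DSR ID"}


def scrub_svs_description(description, keys=SENSITIVE_SVS_KEYS):
    """Drop `key = value` fields whose key is in `keys`.

    Assumes the text before the first "|" is a header that is kept as-is.
    """
    header, *fields = description.split("|")
    kept = []
    for field in fields:
        key = field.split("=", 1)[0].strip()
        if key not in keys:
            kept.append(field)
    return "|".join([header, *kept])


# Made-up description string for illustration
example = (
    "Aperio Image Library v0.0.0\n"
    "1000x1000 (256x256) JPEG/RGB Q=70"
    "|AppMag = 20|Date = 01/02/20|Time = 10:11:12|User = someone|MPP = 0.5"
)
cleaned_description = scrub_svs_description(example)
print(cleaned_description)
```

Matching on the exact key (rather than a substring) keeps `Time` and `Time Zone` distinct and avoids dropping unrelated fields.
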
for OME-TIFFs:
- Whole StructuredAnnotations block
- Experimenter's e-mail, first name, and last name
- AcquisitionDate
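
The OME-TIFF removals listed above can be sketched on the OME-XML text itself. This example uses only Python's standard `xml.etree.ElementTree` rather than the `ome_types` library, and assumes the 2016-06 OME schema namespace. The function `scrub_ome_xml` is hypothetical and is not the pipeline's own code.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the 2016-06 OME schema namespace
OME_NS = "http://www.openmicroscopy.org/Schemas/OME/2016-06"


def _q(tag):
    """Qualify a tag name with the OME namespace."""
    return f"{{{OME_NS}}}{tag}"


def scrub_ome_xml(xml_text):
    """Return OME-XML with the sensitive elements and attributes removed."""
    ET.register_namespace("", OME_NS)  # serialize without an ns0: prefix
    root = ET.fromstring(xml_text)

    # Drop the whole StructuredAnnotations block
    for block in root.findall(_q("StructuredAnnotations")):
        root.remove(block)

    # Drop the experimenter's e-mail, first name, and last name
    for experimenter in root.iter(_q("Experimenter")):
        for attr in ("Email", "FirstName", "LastName"):
            experimenter.attrib.pop(attr, None)

    # Drop AcquisitionDate from every Image
    for image in root.iter(_q("Image")):
        for date in image.findall(_q("AcquisitionDate")):
            image.remove(date)

    return ET.tostring(root, encoding="unicode")


# Made-up OME-XML for illustration
example = (
    f'<OME xmlns="{OME_NS}">'
    '<Experimenter ID="Experimenter:0" Email="a@example.org" '
    'FirstName="A" LastName="B"/>'
    '<Image ID="Image:0">'
    "<AcquisitionDate>2020-01-02T10:11:12</AcquisitionDate>"
    '<Pixels ID="Pixels:0" DimensionOrder="XYZCT" Type="uint8" '
    'SizeX="1" SizeY="1" SizeZ="1" SizeC="1" SizeT="1"/>'
    "</Image>"
    "<StructuredAnnotations>"
    '<XMLAnnotation ID="Annotation:0"><Value/></XMLAnnotation>'
    "</StructuredAnnotations>"
    "</OME>"
)
cleaned_xml = scrub_ome_xml(example)
print(cleaned_xml)
```

This sketch does not handle `AnnotationRef` elements that point at removed annotations; a complete cleaner would likely need to drop those as well.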
- `tifftools`: for handling TIFF and OME-TIFF metadata
- `ome_types`: for handling OME-XML
- `synapseclient`: for downloading data from Synapse
This `README.md` was automatically generated by jaredcd/ai-tools and GPT-4.