Config File Refactors
As a result of the June 19 Hackathon, a number of potential changes to the config file have been identified and should be considered for implementation:
- Move preprocessing fields from 'NETWORK' to each 'Input' section
- Add 'channels' keyword to inputs
- Explicit input groups
At present, the preprocessing fields are assumed to be applied consistently across inputs, or at least the application is assumed to follow conventions under which some fields are applied to images and some to labels.
This approach is fine for network types that have a single set of inputs and a single set of labels, but it does not cover the case where a user's overall dataset is composed of a number of discrete sub-datasets that are not necessarily consistent in the processing they need. It therefore makes sense to associate the input preprocessing flags currently defined in NETWORK with each input instead.
Current layout, with the preprocessing fields in NETWORK:
[T1]
path_to_search = ./path/to/some/t1s
filename_contains = T1
filename_not_contains =
spatial_window_size = (96, 96, 96)
pixdim = (1.0, 1.0, 1.0)
axcodes=(A, R, S)
interp_order = 3
[NETWORK]
name = unet
activation_function = prelu
batch_size = 1
decay = 0
reg_type = L2
volume_padding_size = 44
histogram_ref_file = ./path/to/histograms.txt
norm_type = percentile
cutoff = (0.01, 0.99)
normalisation = True
whitening = True
normalise_foreground_only=True
foreground_type = otsu_plus
multimod_foreground_type = and
queue_length = 128
window_sampling = uniform
Proposed layout, with the preprocessing fields moved into the input section:
[T1]
path_to_search = ./path/to/some/t1s
filename_contains = T1
filename_not_contains =
spatial_window_size = (96, 96, 96)
pixdim = (1.0, 1.0, 1.0)
axcodes=(A, R, S)
interp_order = 3
histogram_ref_file = ./path/to/t1_histogram.txt
norm_type = percentile
cutoff = (0.01, 0.99)
normalisation = True
whitening = True
normalise_foreground_only=True
foreground_type = otsu_plus
multimod_foreground_type = and
[NETWORK]
name = unet
activation_function = prelu
batch_size = 1
decay = 0
reg_type = L2
volume_padding_size = 44
queue_length = 128
window_sampling = uniform
The benefit of such an approach is that a dataset can mix un-preprocessed and preprocessed data, with each kind handled by a different reader that applies its own preprocessing steps. This is currently not possible.
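A minimal sketch of how per-input preprocessing options could be read, using only Python's standard `configparser`. This is not NiftyNet's actual parser: the function name `preprocessing_for_input`, the `T1_PREPROCESSED` section and its path are hypothetical, the key list is copied from the listings above, and the fallback to [NETWORK] for older config files is an assumption rather than part of the proposal.

```python
import configparser

# Preprocessing keys copied from the listings above; the real set may differ.
PREPROCESSING_KEYS = (
    "histogram_ref_file", "norm_type", "cutoff", "normalisation",
    "whitening", "normalise_foreground_only", "foreground_type",
    "multimod_foreground_type",
)


def preprocessing_for_input(config, input_name):
    """Collect the preprocessing options that apply to one input section.

    A key set on the input section wins. Otherwise the value from
    [NETWORK] is used, so older config files keep working (an assumption).
    """
    section = config[input_name]
    network = config["NETWORK"] if config.has_section("NETWORK") else {}
    options = {}
    for key in PREPROCESSING_KEYS:
        if key in section:
            options[key] = section[key]
        elif key in network:
            options[key] = network[key]
    return options


# Hypothetical config mixing a raw input and an already preprocessed one.
EXAMPLE = """
[T1]
path_to_search = ./path/to/some/t1s
histogram_ref_file = ./path/to/t1_histogram.txt
normalisation = True

[T1_PREPROCESSED]
path_to_search = ./path/to/some/preprocessed_t1s
normalisation = False

[NETWORK]
name = unet
whitening = True
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE)
t1 = preprocessing_for_input(config, "T1")
pre = preprocessing_for_input(config, "T1_PREPROCESSED")
```

Values are returned as raw strings; converting them to booleans or tuples is left to whatever layer consumes them.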
Add a 'channels' field that lets the user select only specific channels from an image, which avoids running preprocessing steps on data that the network will not use. The value is an offset, or a comma-separated series of offsets, that specifies the subset of channels to use from an image source. Python slice syntax can also be used:
[image_t1]
path_to_search=/path/to/files
channels=0
...
[image_mris]
path_to_search=/path/to/files
channels=0,2,3
...
[image_mris]
path_to_search=/path/to/files
#get channels 1,2 & 3
channels=1:4
...
Slice syntax is slightly more problematic for non-programmers, because the upper bound is exclusive. We may wish to provide a kinder format with an inclusive range, such as:
[image_mris]
path_to_search=/path/to/files
channels=1-3
...
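The formats above could be handled by one small parser. The following is a sketch only: `parse_channels` is a hypothetical helper, not an existing NiftyNet function. It assumes non-negative offsets and fully bounded slices without a step, and it treats `1-3` as inclusive of both ends.

```python
def parse_channels(spec):
    """Turn a channels value into a list of zero-based channel offsets.

    Supported forms (which may be mixed in one comma-separated value):
      "0"      a single offset
      "0,2,3"  a series of offsets
      "1:4"    Python slice syntax, upper bound exclusive
      "1-3"    inclusive range, both ends included
    Offsets are returned in the order given, without deduplication.
    """
    channels = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            # Slice form: the stop value is excluded, as in Python.
            start, stop = (int(x) for x in part.split(":"))
            channels.extend(range(start, stop))
        elif "-" in part:
            # Inclusive range form: the last value is included.
            first, last = (int(x) for x in part.split("-"))
            channels.extend(range(first, last + 1))
        else:
            channels.append(int(part))
    return channels


single = parse_channels("0")
series = parse_channels("0,2,3")
sliced = parse_channels("1:4")
inclusive = parse_channels("1-3")
```

With this sketch, `1:4` and `1-3` select the same three channels, which is the equivalence the kinder format is meant to provide.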
Currently, readers and reader groups are only loosely defined in the config file, and the application writer still has to associate the different inputs with the correct reader. Adding a little verbosity to this part of the config should give users and application writers more control over how inputs are combined into the datasets that NiftyNet processes.
At the moment, the user can specify a named input group in the [APPLICATION] section of the config file in one of two ways.
This specifies a tuple of tensors associated with a reader:
[APPLICATION]
images = image_t1, image_fl
This specifies a tensor with multiple channels:
[APPLICATION]
images = [image_t1, image_fl]
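The difference between the two forms can be illustrated with a short sketch that reads the value through the standard `configparser`. The helper `parse_input_group` and the mode labels `"tuple"` and `"channels"` are hypothetical names chosen for illustration; they are not NiftyNet's own API.

```python
import configparser


def parse_input_group(value):
    """Split an [APPLICATION] input-group value into names and a mode.

    "a, b"   -> (["a", "b"], "tuple")     one tensor per input
    "[a, b]" -> (["a", "b"], "channels")  inputs stacked into one tensor
    """
    value = value.strip()
    stacked = value.startswith("[") and value.endswith("]")
    if stacked:
        value = value[1:-1]  # drop the surrounding brackets
    names = [name.strip() for name in value.split(",") if name.strip()]
    return names, ("channels" if stacked else "tuple")


config = configparser.ConfigParser()
config.read_string("[APPLICATION]\nimages = [image_t1, image_fl]\n")
names, mode = parse_input_group(config["APPLICATION"]["images"])

plain_names, plain_mode = parse_input_group("image_t1, image_fl")
```

Both forms name the same inputs; only the brackets decide whether they become separate tensors or channels of a single tensor.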
We can map inputs to tensors in more sophisticated ways, but doing so requires a more explicit and more configurable representation of input groups.
- The user can treat input groups interchangeably with inputs
- The user can explicitly specify sampling / partitioning to be shared between inputs
- The user can create an input group by concatenating inputs