Merge pull request #58 from MStarmans91/development
Release version 3.4.1
MStarmans91 authored May 18, 2021
2 parents 47d354c + fd656ba commit f9349c0
Showing 159 changed files with 5,515 additions and 5,433 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -132,3 +132,4 @@ WORC/external/*
WORC/exampledata/ICCvalues.csv
WORC/tests/*.png
WORC/tests/*.mat
WORC/tests/WORC_Example_STWStrategyHN_Regression
8 changes: 7 additions & 1 deletion .travis.yml
@@ -52,7 +52,13 @@ matrix:
- fastr trace /tmp/WORC_Example_STWStrategyHN/__sink_data__.json --sinks classification --samples all
- fastr trace /tmp/WORC_Example_STWStrategyHN/__sink_data__.json --sinks performance --samples all
- fastr trace /tmp/GS/DEBUG_0/tmp/__sink_data__.json --sinks output --samples id_0__0000__0000

# Change the tutorial script to also run a regression experiment,
# using the previously calculated features
- rm -r /tmp/GS/DEBUG_0
- python WORC/tests/WORCTutorialSimple_travis_regression.py
- fastr trace /tmp/WORC_Example_STWStrategyHN_Regression/__sink_data__.json --sinks classification --samples all
- fastr trace /tmp/WORC_Example_STWStrategyHN_Regression/__sink_data__.json --sinks performance --samples all
- fastr trace /tmp/GS/DEBUG_0/tmp/__sink_data__.json --sinks output --samples id_0__0000__0000
notifications:
slack:
secure: ytP+qd6Rx1m1uXYMaN7dFHnFNu+bCIcyugSnAY7BtbumJwCuEt8hbWvQ/sDoAKqxj5VYcnBlTRDn1gjg2t2shs7pBGgjdeZQpQglXyAtN4bz3suSUbQ9/RIwt+RPmbiTXkWQtoZ4q0DotydozKMnq8Cvhdy+d5pMqToER6kMq/WCC+Y/99mmnqO2VrWpvAvP6bBOWDvrk/C4u3y5m3Rp5iE7uAYR3TDTprIW9UNEntDoEYT2T+bidkDRl7DMsi8R4q4s/A6EhZpB4Tnhwz7ama155z77ywdZLhdmk5HJvngXcunVwH4v/l8DbBZU0PqMEJzaRMn/tQCCqjx1/unpyFCv+QuhmP5K4wo17R77jHlcn7SBkdzYr/CKHrilWuShmvOMCckBShpQw3H9PivcI6/G5mVA23tH+gJSQUbzZmBR683x7oQHmnK3g977yD/ufEvV6qME9HFXt3+jIzVEwsUjtJsTV/NsbHlErJfhBp8HJTpq6IRhtKcX9QS1i/APXcYcCSCFJe8tOTLN6xmAKBgONG3XOAvJwfwXbF+rmfjX0x6KMUuD5WmHLjMLhQp0dS00LV7C9s18UkFBgKydqvF2AMPUsbgIGyZ/Vz3v5nz7JiNLDfp0HxQpqAABpdwDHR3/CfuhCDcqzIXAgRgXaFrqCxqoH6OrsgRH6UxUXnM=
31 changes: 31 additions & 0 deletions CHANGELOG
@@ -6,6 +6,37 @@ All notable changes to this project will be documented in this file.
The format is based on `Keep a Changelog <http://keepachangelog.com/>`_
and this project adheres to `Semantic Versioning <http://semver.org/>`_


3.4.1 - 2021-05-18
------------------

Fixed
~~~~~
- Bugfix when PCA cannot be fitted.
- Bugfix when using LOO cross-validation in performance evaluation.
- Fixed the XGBoost version, as the newest version automatically uses multithreading,
  which is unsuitable for clusters.
- Bug in decomposition for Evaluation.
- RankedPosteriors: the posterior used in the image naming was rounded to an integer;
  it is now unrounded.
- Several fixes for regression.
- Regression in unit test.
- Several fixes for using 2D images.

Changed
~~~~~~~
- Reverted to the weighted f1-score without predict_proba for optimization, as it is
  more stable (see the scorer sketch after this list).
- Updated regressors in SimpleWORC.
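
The weighted f1-score criterion mentioned above can be built with scikit-learn's standard scorer API. A minimal sketch, not taken from the WORC codebase:

from sklearn.metrics import f1_score, make_scorer

# Weighted F1 is computed from hard class predictions only,
# so the estimator does not need to implement predict_proba.
f1_weighted_scorer = make_scorer(f1_score, average='weighted')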

Added
~~~~~
- Option to combine features from a varying number of objects per patient,
e.g. by averaging or taking the maximum.
- Logarithmic z-score scaler to be more robust to non-normal distributions
  and outliers (see the sketch after this list).
- Linear and Ridge regression.
- Precision-recall curves.
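
As an illustration of the logarithmic z-score scaler listed above, below is a minimal sketch of the general idea (shift to positive values, log-transform, then z-score); the shift strategy is an assumption and may differ from WORC's actual implementation:

import numpy as np

def log_z_score(values):
    # Shift so all values are strictly positive before taking the logarithm
    # (this particular shift is an assumption made for illustration).
    values = np.asarray(values, dtype=float)
    shifted = values - np.nanmin(values) + 1.0
    logged = np.log(shifted)
    # Standard z-scoring of the log-transformed values
    return (logged - np.nanmean(logged)) / np.nanstd(logged)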

3.4.0 - 2021-02-02
------------------

20 changes: 10 additions & 10 deletions README.md
@@ -1,4 +1,4 @@
# WORC v3.4.0
# WORC v3.4.1
## Workflow for Optimal Radiomics Classification

## Information
@@ -33,6 +33,15 @@ and support of different software languages (python, MATLAB, ruby, java etc.), w
collaboration, standardisation and comparison of different radiomics approaches. By combining this in a single framework,
we hope to find a universal radiomics strategy that can address various problems.

## License
This package is covered by the open source [APACHE 2.0 License](APACHE-LICENSE-2.0).

When using WORC, please cite this repository as follows:

``Martijn P.A. Starmans, Sebastian R. van der Voort, Thomas Phil and Stefan Klein. Workflow for Optimal Radiomics Classification (WORC). Zenodo (2018). Available from: https://github.com/MStarmans91/WORC. DOI: http://doi.org/10.5281/zenodo.3840534.``

For the DOI, visit [![][DOI]][DOI-lnk].

## Disclaimer
This package is still under development. We try to thoroughly test and evaluate every new build and function, but
bugs can of course still occur. Please contact us through the channels below if you find any and we will try to fix
@@ -86,15 +95,6 @@ Besides a Jupyter notebook with instructions, we provide there also an example s
- We are writing the paper on WORC.
- We are expanding the example experiments of WORC with open source datasets.

## License
This package is covered by the open source [APACHE 2.0 License](APACHE-LICENSE-2.0).

When using WORC, please cite this repository as follows:

``Martijn P.A. Starmans, Sebastian R. van der Voort, Thomas Phil and Stefan Klein. Workflow for Optimal Radiomics Classification (WORC). Zenodo (2018). Available from: https://github.com/MStarmans91/WORC. DOI: http://doi.org/10.5281/zenodo.3840534.``

For the DOI, visit [![][DOI]][DOI-lnk].

## Contact
We are happy to help you with any questions. Please send us a mail or open an issue on GitHub.

26 changes: 13 additions & 13 deletions README.rst
@@ -1,4 +1,4 @@
WORC v3.4.0
WORC v3.4.1
===========

Workflow for Optimal Radiomics Classification
@@ -28,6 +28,18 @@ comparison of different radiomics approaches. By combining this in a
single framework, we hope to find a universal radiomics strategy that
can address various problems.

License
-------

This package is covered by the open source `APACHE 2.0
License <APACHE-LICENSE-2.0>`__.

When using WORC, please cite this repository as follows:

``Martijn P.A. Starmans, Sebastian R. van der Voort, Thomas Phil and Stefan Klein. Workflow for Optimal Radiomics Classification (WORC). Zenodo (2018). Available from: https://github.com/MStarmans91/WORC. DOI: http://doi.org/10.5281/zenodo.3840534.``

For the DOI, visit |image5|.

Disclaimer
----------

@@ -111,18 +123,6 @@ WIP
- We are expanding the example experiments of WORC with open source
datasets.

License
-------

This package is covered by the open source `APACHE 2.0
License <APACHE-LICENSE-2.0>`__.

When using WORC, please cite this repository as follows:

``Martijn P.A. Starmans, Sebastian R. van der Voort, Thomas Phil and Stefan Klein. Workflow for Optimal Radiomics Classification (WORC). Zenodo (2018). Available from: https://github.com/MStarmans91/WORC. DOI: http://doi.org/10.5281/zenodo.3840534.``

For the DOI, visit |image5|.

Contact
-------

9 changes: 8 additions & 1 deletion WORC/IOparser/config_io_classifier.py
@@ -1,6 +1,6 @@
#!/usr/bin/env python

# Copyright 2016-2020 Biomedical Imaging Group Rotterdam, Departments of
# Copyright 2016-2021 Biomedical Imaging Group Rotterdam, Departments of
# Medical Informatics and Radiology, Erasmus MC, Rotterdam, The Netherlands
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -131,9 +131,16 @@ def load_config(config_file_path):
[int(str(item).strip()) for item in
settings['Featsel']['ReliefNumFeatures'].split(',')]

# Feature preprocessing before the whole HyperOptimization
settings_dict['FeatPreProcess']['Use'] =\
[str(settings['FeatPreProcess']['Use'])]

settings_dict['FeatPreProcess']['Combine'] =\
settings['FeatPreProcess'].getboolean('Combine')

settings_dict['FeatPreProcess']['Combine_method'] =\
str(settings['FeatPreProcess']['Combine_method'])

# Imputation
settings_dict['Imputation']['use'] =\
[str(item).strip() for item in
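The new FeatPreProcess keys read above can be illustrated with a small, self-contained configparser example; the section and key names match the diff, but the values are hypothetical and not necessarily WORC's defaults:

import configparser

config = configparser.ConfigParser()
config.read_string(
    "[FeatPreProcess]\n"
    "Use = False\n"
    "Combine = True\n"
    "Combine_method = mean\n"
)

section = config['FeatPreProcess']
use = [str(section['Use'])]                      # kept as a list of strings, as in load_config
combine = section.getboolean('Combine')          # parsed into a Python bool
combine_method = str(section['Combine_method'])
print(use, combine, combine_method)              # ['False'] True mean
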
85 changes: 77 additions & 8 deletions WORC/IOparser/file_io.py
@@ -1,6 +1,6 @@
#!/usr/bin/env python

# Copyright 2016-2020 Biomedical Imaging Group Rotterdam, Departments of
# Copyright 2016-2021 Biomedical Imaging Group Rotterdam, Departments of
# Medical Informatics and Radiology, Erasmus MC, Rotterdam, The Netherlands
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -23,8 +23,10 @@
import os


def load_data(featurefiles, patientinfo=None, label_names=None, modnames=[]):
''' Read feature files and stack the features per patient in an array.
def load_data(featurefiles, patientinfo=None, label_names=None, modnames=[],
combine_features=False, combine_method='mean'):
"""Read feature files and stack the features per patient in an array.
Additionally, if a patient label file is supplied, the features from
a patient will be matched to the labels.
@@ -44,8 +44,13 @@ def load_data(featurefiles, patientinfo=None, label_names=None, modnames=[]):
List containing all the labels that should be extracted from
the patientinfo file.
'''
combine_features: boolean, default False
Determines whether to combine the features from all samples
of the same patient or not.
combine_method: string, mean or max
If features per patient should be combined, this determines how.
"""
# Read out all feature values and labels
image_features_temp = list()
feature_labels_all = list()
@@ -138,11 +145,64 @@ def load_data(featurefiles, patientinfo=None, label_names=None, modnames=[]):
label_data = dict()
label_data['patient_IDs'] = patient_IDs

# Optionally, combine features of same patient
if combine_features:
print('Combining features of the same patient.')
feature_labels = image_features[0][1]
label_name = label_data['label_name']
new_label_data = list()
new_pids = list()
new_features = list()
pid_length = len(label_data['patient_IDs'])
print(f'\tOriginal number of samples / patients: {pid_length}.')

already_processed = list()
for pnum, pid in enumerate(label_data['patient_IDs']):
if pid not in already_processed:
# This patient has not been processed yet: count how often it occurs
occurrences = list(label_data['patient_IDs']).count(pid)

# NOTE: Assume all objects from one patient have the same label
label = label_data['label'][0][pnum]
new_label_data.append(label)
new_pids.append(pid)

# Only process patients which occur multiple times
if occurrences > 1:
print(f'\tFound {occurrences} occurrences for {pid}.')
indices = [i for i, x in enumerate(label_data['patient_IDs']) if x == pid]
feature_values_thispatient = np.asarray([image_features[i][0] for i in indices])
if combine_method == 'mean':
    feature_values_thispatient = np.nanmean(feature_values_thispatient, axis=0).tolist()
elif combine_method == 'max':
    feature_values_thispatient = np.nanmax(feature_values_thispatient, axis=0).tolist()
else:
    raise WORCexceptions.KeyError(f'{combine_method} is not a valid combination method, should be mean or max.')
features = (feature_values_thispatient, feature_labels)

# And add the new one
new_features.append(features)
else:
new_features.append(image_features[pnum])

already_processed.append(pid)

# Adjust the labels and features for further processing
label_data = dict()
label_data['patient_IDs'] = np.asarray(new_pids)
label_data['label'] = np.asarray([new_label_data])
label_data['label_name'] = label_name

image_features = new_features

pid_length = len(label_data['patient_IDs'])
print(f'\tNumber of samples / patients after combining: {pid_length}.')

return label_data, image_features


def load_features(feat, patientinfo, label_type):
''' Read feature files and stack the features per patient in an array.
def load_features(feat, patientinfo, label_type, combine_features=False,
combine_method='mean'):
"""Read feature files and stack the features per patient in an array.
Additionally, if a patient label file is supplied, the features from
a patient will be matched to the labels.
@@ -162,7 +222,14 @@ def load_features(feat, patientinfo, label_type):
List containing all the labels that should be extracted from
the patientinfo file.
'''
combine_features: boolean, default False
Determines whether to combine the features from all samples
of the same patient or not.
combine_method: string, mean or max
If features per patient should be combined, this determines how.
"""
# Check if features is a simple list, or just one string
if '=' not in feat[0]:
feat = ['Mod0=' + ','.join(feat)]
@@ -186,7 +253,9 @@
# Read the features and classification data
label_data, image_features =\
load_data(feat, patientinfo,
label_type, modnames)
label_type, modnames,
combine_features,
combine_method)

return label_data, image_features

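A usage sketch of the extended load_features signature shown above; the feature files, patient label file and label name are placeholders, so running it requires a WORC installation and real data:

from WORC.IOparser.file_io import load_features

# Two feature files for patient1 (e.g. two lesions) and one for patient2;
# the file names, label file and label name below are placeholders.
feature_files = ['features_patient1_lesion1.hdf5',
                 'features_patient1_lesion2.hdf5',
                 'features_patient2_lesion1.hdf5']

label_data, image_features = load_features(feature_files,
                                           patientinfo='pinfo.txt',
                                           label_type=['Label1'],
                                           combine_features=True,
                                           combine_method='mean')

# With combine_features=True, the two samples of patient1 are combined into a single
# feature vector (element-wise nanmean here), so the output holds one sample per patient.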