Siem de Jong
MSc thesis
View latest build »
This repository contains the source code of Siem de Jong's MSc thesis. The research is conducted in the context of deep learning on higher harmonic generation imaging at the University of Amsterdam and Vrije Universiteit Amsterdam, in the Biomedical Imaging and Photonics group.
A LaTeX distribution is required to build PDFs from source. This repo is tested on Windows with MiKTeX and on Linux with TeX Live.
- Clone the repo:

  `git clone --recurse-submodules https://github.com/siemdejong/mscthesis.git`

  The `--recurse-submodules` flag is needed to download the custom kaobook style.
- Install the local TEXMF directory as a TEXMF root directory to install kaobook.

  For MiKTeX on Windows, in the project directory, run `install_windows.bat`.
  For TeX Live on a UNIX system, run `./install_linux.sh`.

  Kaobook can also be installed manually by following the instructions for the installed TeX distribution.
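As a rough sketch of what a manual installation could look like (the `texmf/` directory name and paths below are assumptions based on a typical local-TEXMF setup, not taken from this repo's install scripts):

```sh
# If the repo was cloned without --recurse-submodules, fetch the kaobook submodule first.
git submodule update --init --recursive

# TeX Live (Linux): copy the local tree into the per-user TEXMF root so kaobook is found.
cp -r texmf/* "$(kpsewhich -var-value TEXMFHOME)/"

# MiKTeX (Windows): register the local tree as an additional TEXMF root instead.
initexmf --register-root="C:\path\to\mscthesis\texmf"
```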
To compile the output, run `pdflatex`, `pdflatex`, `biber`, `pdflatex` with your preferred optional arguments.
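For example, assuming the root file is called `main.tex` (the actual file name in this repo may differ):

```sh
pdflatex main.tex
pdflatex main.tex
biber main        # biber takes the basename without the .tex extension
pdflatex main.tex
```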
Because the thesis reports on the development and validation of two diagnostic prediction models, its outline follows a TRIPOD-AI-like checklist, shown below. The checklist has yet to be fully adapted to this study.
- Title page
- Abstract
- General introduction
- Link skin and brain project
- Mention TRIPOD-AI
- Theoretical background of convolutional neural networks
- Skinstression
- Abstract
- Introduction
- Background (diagnostic + rationale for dev/val + purpose)
- Objectives (development + validation)
- Methods
- Sources of data
- source of data of training/val/test
- origin of data
- dates of data collection
- Participants (study setting + eligibility + no specific treatment)
- study setting: tertiary care, VUmc
- eligibility for participants or data sources
- treatment received
- Data preparation
- stress-strain curves
- images
- data augmentation
- Outcome of model
- What is predicted?
- How is prediction assessed?
- (Why choose this outcome measure if alternatives exist?)
- Predictors
- Alternatives for predictors
- three parameters + how they are measured
- source of predictors + known biases
- Sample size
- Missing data
- sex and age
- Statistical analysis methods
- Diagram of analytical process
- handling of predictors
- Pre-selection of predictors prior to model building (results for exp/pca/logistic)
- rescaling/transformation on predictors (LDS + reweighting)
- type of model, building model + predictor selection + internal validation
- model ensembling techniques (if used)
- detailed model description
- initialization of model parameters
- training approaches (hyperparameters, number of models trained, used datasets)
- Measures to assess model performance + model comparison
- model updating arising from validation
- how final model is selected
- explainability and interpretability
- software used
- Results
- Participants (flow, demographics, comparison train/val/test (predictor distributions and images))
- Model development and per-participant outcomes in:
- Hyperparameter tuning
- Training
- Testing
- Model specification (present model + explain how it must be used)
- Model performance
- accuracy WITH confidence interval
- results of analysis on performance errors
- Model updating (performance per update)
- Usability
- how and when in the clinical pathway to use the prediction AI
- how will the AI be integrated into the target setting + requirements (on-/offsite)
- how will poor data be assessed when implementing AI model
- any human interaction needed for data to be used with the model + expertise of users
- Sensitivity analysis?
- Discussion
- Limitations
- Interpretation (dev/val data performance + overall interpretation considering objectives/limitations/similar study results/other evidence)
- Implications
- potential use (also in a general way)
- how will clinical practice be different if using the AI and how will it be used
- Supplementary information
- Data?
- Code
- Funding?
- References
- Pediatric brain tumors
- Abstract
- Introduction
- Background (diagnostic + rationale for dev/val + purpose)
- Objectives (development + validation)
- Theory
- Feature extraction
- MIL
- Classical
- DeepMIL
- VarMIL
- Model performance
- ROC Curve
- PR Curve
- PRG Curve
- IoU
- Methods
- Sources of data
- source of data of training/val/test
- origin of data
- dates of data collection
- Participants (study setting + eligibility + no specific treatment)
- study setting: tertiary care, Princess Máxima Center
- eligibility for participants or data sources
- treatment received
- Data preparation
- targets (from text to numbers)
- images
- getting images from raw data
- scaling overview images
- masking
- tiling
- (optionally) denoising
- ...
- data augmentation
- Masking (mini study)
- Outcome of model
- What is predicted?
- How is prediction assessed?
- (Why choose this outcome measure if alternatives exist?)
- Predictors
- Alternatives for predictors
- pathologist decision
- genetic marker
- how does pathologist make decision?
- source of predictors + known biases
- age
- location
- ...
- Sample size
- Missing data
- Statistical analysis methods
- Diagram of analytical process
- handling of predictors
- Pre-selection of predictors prior to model building
- rescaling/transformation on predictors
- type of model, building model + predictor selection + internal validation
- detailed model description
- initialization of model parameters
- SimCLR pretraining
- ImageNet
- training approaches (hyperparameters, number of models trained, used datasets)
- hyperparameters trained on one split
- 5 splits, 5 models
- Measures to assess model performance + model comparison
- AUPR
- AUPRG
- SimCLR init vs ImageNet init vs ...
- model updating arising from validation
- how final model is selected
- best F1 per split
- explainability and interpretability
- multiply attention vector with input tiles
- software used
- Ray
- Optuna
- PyTorch (Lightning)
- setup used
- Results
- Participants (flow, demographics, comparison train/val/test (predictor distributions and images))
- Model specification (present model + explain how it must be used)
- Model performance
- AUPRG WITH confidence interval over splits
- results of analysis on performance errors
- Attention maps
- Loss curves
- nearest neighbours (SimCLR)
- t-SNE (SimCLR)
- Usability
- how and when in the clinical pathway to use the prediction AI
- how will the AI be integrated into the target setting + requirements (on-/offsite)
- how will poor data be assessed when implementing AI model
- any human interaction needed for data to be used with the model + expertise of users
- Discussion
- Limitations
- bad data? noise exclusion
- overfitting fold 1
- Interpretation (dev/val data performance + overall interpretation considering objectives/limitations/similar study results/other evidence)
- Implications
- potential use (also in a general way)
- how will clinical practice be different if using the AI and how will it be used
- Supplementary information
- Data?
- Code
- References
- Discussion and conclusion
- Discussion
- Conclusion
- All references
- Acknowledgments
See the open issues for a list of discussions (and known issues).
Diagrams are made with Mermaid (mermaid.cli) and PlantUML. Their compiled outputs are already included in the repo.
Run `mmdc -i mermaid/input.mmd -o mermaid/output.pdf -f` to compile Mermaid diagrams, and run `java -jar plantuml.jar input.puml` to compile PlantUML diagrams.
Move the diagrams to the `mermaid` or `plantuml` folder and import the PDF/SVG with `\includegraphics`/`\includesvg`.
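For instance (the input and output file names below are hypothetical):

```sh
# Compile a Mermaid diagram straight into the mermaid/ folder.
mmdc -i mermaid/pipeline.mmd -o mermaid/pipeline.pdf -f

# Compile a PlantUML diagram to SVG and move it into the plantuml/ folder.
java -jar plantuml.jar -tsvg pipeline.puml
mv pipeline.svg plantuml/
```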
This work is licensed under a Creative Commons Attribution 4.0 International License.
Siem de Jong - siemdejong
Skinstression: siemdejong/shg-strain-stress