
Retraining vs Fine-tuning in nnUNetv2 #36

Open
rohanbanerjee opened this issue Mar 26, 2024 · 2 comments
Labels
question Further information is requested

Comments

@rohanbanerjee
Collaborator

rohanbanerjee commented Mar 26, 2024

I am opening this issue to discuss retraining vs fine-tuning models across different active learning rounds.

Context:

I have trained a baseline model (ref #34) and am now moving on to my next round of training (called round 1 from now on), for which I will use 30 subjects. I have tried two different strategies for this round of re-training:

  1. Retraining: Adding the subjects to the baseline round training set and training a model from scratch.

Pros:

  • Good for reproducibility (i.e. if someone wants to retrain the model from scratch, this would be easier to reproduce)
  • Final model (after all the active learning rounds) would be able to learn from all the subjects --> no catastrophic forgetting

Cons:

  • The specific behaviours (e.g., not segmenting the first or last slice in my case) that the model learnt in the baseline round still persist even after round 1 of active learning.
    Explanation
    After the baseline model was trained, I ran inference on a set of test images. I observed that in a few cases the model was not segmenting the first and the last slice. I corrected these images (drew the first and last slices) and used them for training. The issue of not segmenting the first/last slice still persists. For example:
[Screenshot 2024-03-26 at 5 04 23 PM: sub-nwMW07 (from Northwestern Motor Weber), first/last slice left unsegmented]
  2. Fine-tuning: Using the pretrained weights from the baseline training to initialize the round 1 training, with only the new 30 subjects.

Pros:

  • More precise segmentations overall
  • Improves the shortcomings observed in the inference of the baseline models

[Screenshot 2024-03-26 at 5 07 27 PM: the problem observed with retraining is resolved by fine-tuning]

Cons:

  • Not as reproducible (the flip side of the reproducibility Pro listed under Retraining)

I gave one example of where I observe fine-tuning doing better than retraining, but I would like to hear if anyone has had a different experience, or anything else I should keep in mind.
P.S. I do understand that this strategy depends on the type of region of interest, but I still wanted inputs.
tagging @valosekj @naga-karthik @plbenveniste @Nilser3 @hermancollin

@hermancollin

@rohanbanerjee I would like to fine-tune as well, mostly because it would be more resource-efficient (fewer epochs). When you fine-tune, do you mean training on the bigger updated dataset or only on the new data?

Also, I don't see why the fine-tuning strategy is not reproducible. You could simply provide the initial weights and anyone could re-train your model, no?
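For reference, nnUNetv2 supports exactly this through the `-pretrained_weights` flag of `nnUNetv2_train`. A minimal sketch of assembling that call; the dataset name, fold, and checkpoint path below are placeholders for illustration, not values from this issue:

```python
def build_finetune_cmd(dataset, configuration, fold, pretrained_checkpoint):
    """Assemble an nnUNetv2_train invocation that initializes training
    from an existing checkpoint instead of random weights."""
    return [
        "nnUNetv2_train", dataset, configuration, str(fold),
        "-pretrained_weights", pretrained_checkpoint,
    ]

# Placeholder dataset/checkpoint names, for illustration only.
cmd = build_finetune_cmd(
    "Dataset501_Round1", "3d_fullres", 0,
    "/path/to/baseline/checkpoint_final.pth",
)
print(" ".join(cmd))
# nnUNetv2_train Dataset501_Round1 3d_fullres 0 -pretrained_weights /path/to/baseline/checkpoint_final.pth
```

As far as I understand, nnUNet's pretraining docs also describe matching the plans between source and target datasets when fine-tuning across datasets; within the same active-learning project the plans should already be compatible.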

@rohanbanerjee rohanbanerjee changed the title Retraining vs Fine-tuning in nnUNetv2 **in progress** Retraining vs Fine-tuning in nnUNetv2 Mar 27, 2024
@naga-karthik

Thanks @rohanbanerjee for opening a discussion on this!

I observed that in a few cases the model was not segmenting the first and the last slice.

Wait, is this also the case with the nnUNet model? Because I am seeing something similar with the contrast-agnostic model's inference. Could you please confirm that you have trained and tested using the nnUNet model?
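One quick way to confirm this across a batch of predictions: a sketch assuming the predicted masks are loaded as numpy arrays (e.g. via nibabel) with the slice dimension along `axis`; `empty_edge_slices` is a hypothetical helper, not part of nnUNet:

```python
import numpy as np

def empty_edge_slices(mask: np.ndarray, axis: int = 0) -> list:
    """Report which of the first/last slices along `axis` contain no foreground."""
    mask = np.moveaxis(mask, axis, 0)
    empties = []
    if not mask[0].any():
        empties.append("first")
    if not mask[-1].any():
        empties.append("last")
    return empties

# Example: a prediction whose first slice was left unsegmented.
pred = np.ones((10, 32, 32), dtype=np.uint8)
pred[0] = 0
print(empty_edge_slices(pred))  # ['first']
```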

Improves the shortcomings observed in the inference of the baseline models

By this, you mean that the model that was used for finetuning (initialized with the baseline model's weights) produces better segmentations at test time? i.e. first and last slices are properly segmented?

Not reproducible (as mentioned in the Pro of the Retraining)

I don't think I agree with this. As Armand suggested, it is much easier to fine-tune on the new data (given the weights of your pretrained model) rather than collecting all the data (i.e. from the various active learning rounds) and re-training everything. Plus, a benefit is that fine-tuning takes fewer epochs than re-training everything from scratch.

BUT, before concluding that fine-tuning is the way to proceed in your case, consider this experiment: fix a test set (call it Test Set A) and compare the performance of: (1) the baseline model on Test Set A, (2) a fine-tuned model (trained on Train Set B, initialized with the baseline model's weights) on Test Set A, and (3) a new model retrained on Train Set A and Train Set B together.

If the fine-tuned model (2) performs at least as well as the model re-trained on both Train Sets A and B (3), then you can proceed with fine-tuning for your future rounds of active learning. Let me know if this makes sense, I'd be happy to clarify further!
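That comparison is easy to script once predictions from all three models are available on the fixed test set. A toy sketch with a plain Dice implementation; the three synthetic arrays below stand in for real predictions on Test Set A:

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Soerensen-Dice coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

# Synthetic ground truth: a 4-slice volume with a square foreground region.
gt = np.zeros((4, 8, 8), dtype=np.uint8)
gt[:, 2:6, 2:6] = 1

# Stand-ins for the three models' predictions on Test Set A:
baseline = gt.copy(); baseline[0] = 0    # (1) misses the first slice
finetuned = gt.copy()                    # (2) segments every slice
retrained = gt.copy(); retrained[0] = 0  # (3) also misses the first slice

for name, pred in [("baseline", baseline), ("fine-tuned", finetuned),
                   ("retrained", retrained)]:
    print(f"{name}: Dice = {dice(pred, gt):.3f}")
```

In practice you would average the per-subject Dice over all of Test Set A for each model and compare the three means.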
