Don't you accumulate the validation gradients too while training? #57

Open
burak43 opened this issue Feb 24, 2020 · 2 comments

Comments


burak43 commented Feb 24, 2020

In train_i3d.py, you call loss.backward() in both the train and val phases. Doesn't that accumulate gradients for the validation loss too, regardless of putting the model in eval mode (since eval mode only affects the behaviour of some layers such as dropout and batch norm)? Is there something specific to PyTorch 0.3.0 that blocks validation gradient accumulation?
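
As a minimal sketch of what I mean (a toy linear model, not the repo's I3D code), eval mode alone does not stop backward() from filling the .grad buffers:

```python
# Toy sketch (not train_i3d.py): model.eval() only changes layer behaviour
# such as dropout/batch norm; loss.backward() still accumulates gradients.
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
x, y = torch.randn(8, 4), torch.randn(8, 2)

model.eval()                                  # eval mode
loss = nn.functional.mse_loss(model(x), y)
loss.backward()                               # gradients are still computed

print(model.weight.grad is not None)          # True: .grad was populated
```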

piergiaj (Owner) commented

These lines:
https://github.com/piergiaj/pytorch-i3d/blob/master/train_i3d.py#L115-L119

only apply the gradient step when in training mode. Combined with https://github.com/piergiaj/pytorch-i3d/blob/master/train_i3d.py#L86, the gradients from the validation step are never applied.

For efficiency, the loss.backward() could be removed from the validation step, but since those gradients are never applied, it does not impact model accuracy.
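
As a runnable toy sketch of that structure (the num_steps_per_update name and the train/val phase loop follow train_i3d.py; the model, data, and loss are stand-ins):

```python
# Toy sketch of the loop structure described above (num_steps_per_update and
# the train/val phase loop follow train_i3d.py; model, data, loss are stand-ins).
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = [(torch.randn(8, 4), torch.randn(8, 2)) for _ in range(6)]
num_steps_per_update = 4

for phase in ['train', 'val']:
    model.train(phase == 'train')              # eval mode during validation
    optimizer.zero_grad()                      # clears any leftover gradients
    num_iter = 0
    for x, y in data:
        loss = nn.functional.mse_loss(model(x), y) / num_steps_per_update
        loss.backward()                        # runs in both phases, as in the repo
        num_iter += 1
        if num_iter == num_steps_per_update and phase == 'train':
            optimizer.step()                   # gradient step reached only while training
            optimizer.zero_grad()
            num_iter = 0
    # in the 'val' phase the accumulated gradients are never applied; they are
    # discarded by the zero_grad() call at the start of the next phase/epoch
```

Wrapping the validation forward pass in torch.no_grad() (or dropping the backward() call there) would save compute and memory, but as noted it would not change the learned weights.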


burak43 commented Feb 25, 2020

> These lines:
> https://github.com/piergiaj/pytorch-i3d/blob/master/train_i3d.py#L115-L119
>
> only apply the gradient step when in training mode. Combined with https://github.com/piergiaj/pytorch-i3d/blob/master/train_i3d.py#L86, the gradients from the validation step are never applied.
>
> For efficiency, the loss.backward() could be removed from the validation step, but since those gradients are never applied, it does not impact model accuracy.

I see. Then, as I said in #44 (comment), when len(dataloader) is not a multiple of num_steps_per_update, the leftover accumulated training gradients are zeroed before optimizer.step() is ever called on them, when the phase changes from training to validation. As a result, the losses from those leftover training forward passes are never used.
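
A hypothetical fix (just a sketch on a toy model, not something in the repo) would be to flush the remainder before leaving the training phase:

```python
# Hypothetical remedy (toy sketch, not code from the repo): flush any partially
# accumulated gradients so the leftover len(dataloader) % num_steps_per_update
# batches still contribute an optimizer step.
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = [(torch.randn(8, 4), torch.randn(8, 2)) for _ in range(6)]  # 6 batches
num_steps_per_update = 4                                           # 6 % 4 -> 2 leftover

optimizer.zero_grad()
num_iter = 0
for x, y in data:
    loss = nn.functional.mse_loss(model(x), y) / num_steps_per_update
    loss.backward()
    num_iter += 1
    if num_iter == num_steps_per_update:
        optimizer.step()
        optimizer.zero_grad()
        num_iter = 0

if num_iter > 0:              # the two leftover batches in this toy example
    optimizer.step()          # apply the partial accumulation instead of dropping it
    optimizer.zero_grad()
```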
