While training a model in the webapp, the training accuracy reported by the TensorFlow.js fit method can differ wildly from evaluation values, especially when the model contains batch normalization layers (as MobileNet does).
For example, calling model.evaluateDataset on the training dataset after each epoch can show diverging trends: the tfjs accuracy logs rise to 1 while the manual evaluation stays around random accuracy or even drops to 0.
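For reference, here is a minimal sketch of such a per-epoch comparison. It assumes the model was compiled with an 'accuracy' metric and that trainingDataset is a finite tf.data.Dataset of {xs, ys} batches (both names are placeholders, not part of the webapp code):

```ts
import * as tf from '@tensorflow/tfjs';

// Sketch: compare the accuracy reported by fitDataset with a manual
// evaluation of the current weights on the full training set.
async function trainWithManualEvaluation(
  model: tf.LayersModel,
  trainingDataset: tf.data.Dataset<{ xs: tf.Tensor; ys: tf.Tensor }>,
  epochs: number
): Promise<void> {
  await model.fitDataset(trainingDataset, {
    epochs,
    callbacks: {
      onEpochEnd: async (epoch, logs) => {
        // Accuracy reported by fitDataset (aggregated over the epoch's batches).
        console.log(`epoch ${epoch}: fit acc = ${logs?.acc}`);

        // Accuracy of the current weights, re-evaluated on the whole training set.
        const result = (await model.evaluateDataset(trainingDataset)) as tf.Scalar[];
        const acc = (await result[1].data())[0]; // index 1 = accuracy metric
        console.log(`epoch ${epoch}: evaluateDataset acc = ${acc}`);
        result.forEach(t => t.dispose());
      },
    },
  });
}
```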
Similarly, using the webapp to test a model we just trained, with the same training set selected as the test set, yields different results from what is reported in the training board (which shows the tfjs training logs).
The difference stays small for small networks, but the accuracies diverge completely when doing transfer learning with a pre-trained model such as MobileNet.
A small difference is expected given how the tfjs fit method computes accuracy: the model is updated after each batch, so the reported accuracy is an aggregate over many model versions rather than one model evaluated on the whole training set (e.g., if per-batch accuracies over an epoch are 0.2, 0.5 and 0.9, fit reports roughly their average, 0.53, not the accuracy of the final weights on the full set).
This is related to keras-team/keras#6977, which points at dropout and batch normalization layers. I did not manage to mitigate the issue by applying the fixes mentioned in that issue (sometimes because TensorFlow.js does not allow certain operations).
This Stack Overflow post reports a similar issue during transfer learning and solves it by retraining all batch normalization layers so that their statistics fit the new dataset.
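A minimal sketch of that workaround adapted to tfjs, assuming the pre-trained model is available as a tf.LayersModel (the model URL, optimizer and loss below are placeholders, not the webapp's actual configuration):

```ts
import * as tf from '@tensorflow/tfjs';

// Sketch of the Stack Overflow workaround: unfreeze only the BatchNormalization
// layers so their statistics can adapt to the new dataset during transfer learning.
async function unfreezeBatchNorm(modelUrl: string): Promise<tf.LayersModel> {
  const base = await tf.loadLayersModel(modelUrl);

  for (const layer of base.layers) {
    // Freeze every layer except batch normalization.
    layer.trainable = layer.getClassName() === 'BatchNormalization';
  }

  // Recompile so the updated trainable flags take effect.
  base.compile({
    optimizer: tf.train.adam(1e-4),
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy'],
  });
  return base;
}
```

Whether this is enough to reconcile fit logs and evaluation in our setting is untested; it is only the direction suggested by the post.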
Main points:
If the webapp training board (i.e. the tfjs fit training logs) shows a certain accuracy, it does not mean that evaluating the model on the same training set will yield the same accuracy.
Empirically, large discrepancies only seem to occur when models contain batch normalization. I did not manage to mitigate the issue (mostly because the known fixes are Python-specific and tfjs limits our options), but it is worth investigating further. Models without batch norm show only a small, expected difference.