If you only want to use our trained checkpoints for inference or fine-tuning, here is the collection of models.
Models with 3 experts are the standard choice, offering a good trade-off between computation/model size and accuracy. Note that models with 2 experts sometimes have even lower computational cost than the baseline models. We will also release models that reach higher accuracy, such as 6-expert models, which can serve as teacher models for distilling other models. Some checkpoints were trained with an older config format, so their configs may not match the current code; if you cannot load a checkpoint, please let us know (a loading sketch that tolerates such mismatches follows below).
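For reference, here is a minimal sketch of loading a released checkpoint into a model instance built from this repo's configs. The checkpoint layout assumed here (an optional `state_dict` key, a `module.` prefix left by `DataParallel`) is a guess based on common PyTorch training frameworks, not a confirmed detail of these files; `strict=False` lets loading proceed despite key mismatches caused by the older config format.

```python
import torch

def load_ride_checkpoint(model, path):
    """Load a released checkpoint into `model` (a sketch; adjust keys to the actual file)."""
    checkpoint = torch.load(path, map_location="cpu")
    # Some training frameworks wrap the weights under a "state_dict" key.
    state_dict = checkpoint.get("state_dict", checkpoint)
    # Strip the "module." prefix added by DataParallel, if present.
    state_dict = {
        k[len("module."):] if k.startswith("module.") else k: v
        for k, v in state_dict.items()
    }
    # strict=False tolerates key mismatches from older config formats;
    # inspect the reported keys to verify nothing important was skipped.
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    if missing or unexpected:
        print(f"missing keys: {missing}\nunexpected keys: {unexpected}")
    model.eval()  # inference mode: disables dropout/BN updates
    return model
```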
Imbalanced CIFAR-100 / CIFAR-100-LT (100 epochs)
CE and Decouple: baseline results for cross-entropy training and the Decouple methods (cRT / tau-norm / LWS)
RIDE: ResNet32 backbone, without distillation, with EA
RIDE + Distill: ResNet32 backbone, with distillation, with EA
Teacher Model: ResNet32 backbone, 6 experts, without EA; serves as the teacher when training RIDE with knowledge distillation (see the sketch after this list)
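To illustrate how the 6-expert teacher is used, below is a generic knowledge-distillation loss in PyTorch: the frozen teacher's softened outputs guide the student alongside the usual cross-entropy term. This is a standard KD sketch under assumed hyperparameters (`temperature`, `alpha`), not RIDE's exact distillation objective; see the paper and training code for the actual formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """Standard KD loss: KL between temperature-softened teacher/student
    distributions, blended with cross-entropy on the true labels."""
    # Soften both distributions; T^2 rescales gradients to the usual magnitude.
    soft_targets = F.softmax(teacher_logits / temperature, dim=1)
    log_probs = F.log_softmax(student_logits / temperature, dim=1)
    kd = F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1 - alpha) * ce

# Usage sketch: the teacher stays frozen, so its forward pass needs no gradients.
# with torch.no_grad():
#     teacher_logits = teacher_model(images)
# loss = distillation_loss(student_model(images), teacher_logits, labels)
```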