
The loss has converged at an early stage? #22

Open
sinsauzero opened this issue Apr 11, 2023 · 0 comments

@sinsauzero

I used the default vits-16 config to train on ImageNet-1k end to end, but I found that the loss converged to 2.492 after one epoch. Is that normal?
If so, how does the performance improve, given that the loss does not seem to decrease any further over the next several hundred epochs?
If not, is there anything I did wrong? The config I used is as follows:

```yaml
criterion:
  ent_weight: 0.0
  final_sharpen: 0.25
  me_max: true
  memax_weight: 1.0
  num_proto: 1024
  start_sharpen: 0.25
  temperature: 0.1
  batch_size: 32
  use_ent: true
  use_sinkhorn: true
data:
  color_jitter_strength: 0.5
  pin_mem: true
  num_workers: 10
  image_folder: /gruntdata6/xinshulin/data/imagenet/new_train/1
  label_smoothing: 0.0
  patch_drop: 0.15
  rand_size: 224
  focal_size: 96
  rand_views: 1
  focal_views: 10
  root_path: /gruntdata6/xinshulin/data/imagenet/new_train
logging:
  folder: checkpoint/msn_os_logs4/
  write_tag: msn-experiment-1
meta:
  bottleneck: 1
  copy_data: false
  drop_path_rate: 0.0
  hidden_dim: 2048
  load_checkpoint: false
  model_name: deit_small
  output_dim: 256
  read_checkpoint: null
  use_bn: true
  use_fp16: false
  use_pred_head: false
optimization:
  clip_grad: 3.0
  epochs: 800
  final_lr: 1.0e-06
  final_weight_decay: 0.4
  lr: 0.001
  start_lr: 0.0002
  warmup: 15
  weight_decay: 0.04
```
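As a rough sanity check on whether 2.492 is really a plateau, one can compare it against the chance-level value of a cross-entropy over the prototype assignments. The sketch below is only a minimal illustration, assuming the reported loss is dominated by a cross-entropy term over the `num_proto: 1024` prototypes from the criterion block; the `chance_level_ce` helper is made up for this example, and the me-max and entropy terms in the criterion are ignored.

```python
import math

def chance_level_ce(num_proto: int) -> float:
    # Cross-entropy of a uniform prediction over `num_proto` prototypes
    # is ln(num_proto); values well below this are no longer at chance.
    return math.log(num_proto)

print(f"chance level: {chance_level_ce(1024):.3f}")  # ~6.931 vs. the observed 2.492
```

Under that assumption, the observed value is already well below chance after one epoch; whether it should keep decreasing also depends on the sharpening schedule and the me-max regularizer, which this sketch ignores.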
