Supernet Training with Constraints #16

Open
betterhalfwzm opened this issue Mar 17, 2020 · 0 comments

Thanks for your excellent work!
When I train the supernet with constraints using the following script, I hit an error during validation:

export MXNET_SAFE_ACCUMULATION=1

python train_imagenet.py \
    --rec-train /data3/wangzhaoming/mxnet_imagenet/rec/train.rec --rec-train-idx /data3/wangzhaoming/mxnet_imagenet/rec/train.idx \
    --rec-val /data3/wangzhaoming/mxnet_imagenet/rec/val.rec --rec-val-idx /data3/wangzhaoming/mxnet_imagenet/rec/val.idx \
    --mode imperative --lr 1.3 --wd 0.00004 --lr-mode cosine --dtype float16 \
    --num-epochs 120 --batch-size 128 --num-gpus 8 -j 48 \
    --label-smoothing --no-wd --warmup-epochs 5 --use-rec \
    --model ShuffleNas \
    --epoch-start-cs 60 --cs-warm-up --channels-layout OneShot \
    --save-dir params_shufflenas_supernet --logging-file ./logs/shufflenas_supernet.log \
    --train-upper-constraints flops-160-params-2.5 --train-bottom-constraints flops-90-params-1.4 \
    --train-constraint-method evolution

Epoch[0] Batch [49] Speed: 322.095226 samples/sec accuracy=0.000605 lr=0.010393
Epoch[0] Batch [99] Speed: 492.513575 samples/sec accuracy=0.000791 lr=0.020787
Epoch[0] Batch [149] Speed: 457.981573 samples/sec accuracy=0.000937 lr=0.031180
Epoch[0] Batch [199] Speed: 688.650089 samples/sec accuracy=0.000903 lr=0.041573
Epoch[0] Batch [249] Speed: 465.918790 samples/sec accuracy=0.000957 lr=0.051967
Epoch[0] Batch [299] Speed: 490.846376 samples/sec accuracy=0.000957 lr=0.062360
Epoch[0] Batch [349] Speed: 606.910845 samples/sec accuracy=0.000977 lr=0.072753
Epoch[0] Batch [399] Speed: 567.445527 samples/sec accuracy=0.000986 lr=0.083147
Epoch[0] Batch [449] Speed: 618.184875 samples/sec accuracy=0.000990 lr=0.093540
Epoch[0] Batch [499] Speed: 593.677446 samples/sec accuracy=0.000982 lr=0.103933
Epoch[0] Batch [549] Speed: 631.991306 samples/sec accuracy=0.000978 lr=0.114327
Epoch[0] Batch [599] Speed: 614.757373 samples/sec accuracy=0.000985 lr=0.124720
Epoch[0] Batch [649] Speed: 568.749700 samples/sec accuracy=0.000975 lr=0.135114
Epoch[0] Batch [699] Speed: 610.768222 samples/sec accuracy=0.000961 lr=0.145507
Epoch[0] Batch [749] Speed: 659.102106 samples/sec accuracy=0.000961 lr=0.155900
Epoch[0] Batch [799] Speed: 563.044769 samples/sec accuracy=0.000964 lr=0.166294
Epoch[0] Batch [849] Speed: 572.482835 samples/sec accuracy=0.000959 lr=0.176687
Epoch[0] Batch [899] Speed: 611.510812 samples/sec accuracy=0.000969 lr=0.187080
Epoch[0] Batch [949] Speed: 585.310555 samples/sec accuracy=0.000970 lr=0.197474
Epoch[0] Batch [999] Speed: 586.269362 samples/sec accuracy=0.000970 lr=0.207867
Epoch[0] Batch [1049] Speed: 584.871140 samples/sec accuracy=0.000973 lr=0.218260
Epoch[0] Batch [1099] Speed: 580.345403 samples/sec accuracy=0.000976 lr=0.228654
Epoch[0] Batch [1149] Speed: 604.746532 samples/sec accuracy=0.000979 lr=0.239047
Epoch[0] Batch [1199] Speed: 425.625182 samples/sec accuracy=0.000976 lr=0.249440
Epoch[0] Batch [1249] Speed: 673.577257 samples/sec accuracy=0.000977 lr=0.259834
Traceback (most recent call last):
  File "train_imagenet.py", line 738, in <module>
    main()
  File "train_imagenet.py", line 734, in main
    train(context)
  File "train_imagenet.py", line 710, in train
    err_top1_val, err_top5_val = test(ctx, val_data, epoch)
  File "train_imagenet.py", line 439, in test
    ignore_first_two_cs=opt.ignore_first_two_cs)
  File "/data3/wangzhaoming/Single-Path-One-Shot-NAS-MXNet/oneshot_nas_network.py", line 248, in random_channel_mask
    channel_choice = random.randint(channel_scale_start, len(self.candidate_scales) - 1)
  File "/data3/wangzhaoming/anconda3/lib/python3.7/random.py", line 222, in randint
    return self.randrange(a, b+1)
  File "/data3/wangzhaoming/anconda3/lib/python3.7/random.py", line 200, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (68,10, -58)
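
Reading the last line of the traceback: randint(a, b) calls randrange(a, b+1), so the reported (68, 10, -58) means channel_scale_start was 68 while len(self.candidate_scales) was 10, i.e. the lower bound is far above the largest valid index 9. Where the 68 comes from is not visible in the traceback. Below is a minimal sketch that reproduces the failure in isolation and shows one possible bounds check. The helper pick_channel_choice and its parameter names are hypothetical and are not part of the repository; only the random.randint call is taken from the traceback.

```python
import random


def pick_channel_choice(channel_scale_start, num_candidate_scales):
    """Hypothetical stand-in for the failing line in random_channel_mask.

    Validates the lower bound before sampling, so an out-of-range start
    index is reported with both values instead of a bare randrange error.
    """
    upper = num_candidate_scales - 1
    if not 0 <= channel_scale_start <= upper:
        raise ValueError(
            "channel_scale_start=%d is outside the valid index range [0, %d]"
            % (channel_scale_start, upper)
        )
    return random.randint(channel_scale_start, upper)


if __name__ == "__main__":
    # Same arguments as in the traceback: start=68, 10 candidate scales.
    try:
        random.randint(68, 10 - 1)
    except ValueError as err:
        print("reproduced:", err)

    # With a start index inside the range, sampling works as expected.
    print("valid choice:", pick_channel_choice(0, 10))
```

A check like this does not fix the underlying problem; it only makes it easier to see which of the two values is wrong when validation starts.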
