further-pretraining #16

Open
yuyi5187 opened this issue Jun 23, 2021 · 1 comment
yuyi5187 commented Jun 23, 2021

I got the warning shown below when running further pre-training.

My environment:
OS: Ubuntu 18.04.4 LTS (GNU/Linux 5.4.0-74-generic x86_64)
GPU: 2080 Ti

I used the following command:
python run_pretraining.py \
  --input_file=./tmp/tf_AGnews.tfrecord \
  --output_dir=./uncased_L-12_H-768_A-12_AGnews_pretrain \
  --do_train=True \
  --do_eval=True \
  --bert_config_file=./uncased_L-12_H-768_A-12/bert_config.json \
  --init_checkpoint=./uncased_L-12_H-768_A-12/bert_model.ckpt \
  --train_batch_size=8 \
  --max_seq_length=128 \
  --max_predictions_per_seq=20 \
  --num_train_steps=100000 \
  --num_warmup_steps=10000 \
  --save_checkpoints_steps=10000 \
  --learning_rate=5e-5

I got the following message, and further pre-training does not work.
How can I fix this problem?

WARNING:tensorflow:It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 62 vs previous value: 62. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
W0622 17:33:44.304897 140418054317888 basic_session_run_hooks.py:724] It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 62 vs previous value: 62. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.

xuyige (Owner) commented Jun 30, 2021

Hello,
thank you for your interest in our repo.
Regarding this problem: does your environment satisfy the requirements listed in README.md, that is, tensorflow==1.1x? Or are you using TensorFlow 2.x?
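
One way to check which TensorFlow version is installed is a small script like the sketch below. The helper name `is_tf1` is hypothetical and not part of this repo. It only inspects the major version number, so it cannot confirm the exact 1.1x release that README.md asks for.

```python
def is_tf1(version):
    """Return True if a version string such as '1.15.0' has major version 1."""
    major = int(version.split(".")[0])
    return major == 1


if __name__ == "__main__":
    try:
        # Imported here so the helper above stays usable without TensorFlow.
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        # tf.__version__ is the installed version string.
        print("TensorFlow {}: 1.x = {}".format(tf.__version__, is_tf1(tf.__version__)))
```

If this reports a 2.x version, running the same command in an environment with a 1.x release installed would be the first thing to try.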
