bert_bilstm_crf_adv:ValueError: Shape must be rank 2 but is rank 1 for 'task1_msra/crf_layer/Slice_2' (op: 'Slice') with input shapes: [?], [2], [2]. #10

Open
LinJingOK opened this issue Mar 23, 2022 · 6 comments

Comments

@LinJingOK

LinJingOK commented Mar 23, 2022

No description provided.

@LinJingOK LinJingOK changed the title from "@ZR5932 I'm not sure about the first question; the word embedding you downloaded may be in binary format. If it is in glove format, try setting binary=True in the KeyedVectors.load_word2vec_format call that loads the word vectors in glove_2_wv. For word enhance, see this blog post: https://www.cnblogs.com/gogoSandy/p/14965711.html" to "bert_bilstm_crf_adv:ValueError: Shape must be rank 2 but is rank 1 for 'task1_msra/crf_layer/Slice_2' (op: 'Slice') with input shapes: [?], [2], [2]." Mar 23, 2022
@LinJingOK
Author

Error message:

`During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "E:/workspace/PycharmProjects/nlp/EmilyNER/test/ChineseNER-main/main.py", line 212, in
singletask_train(args)
File "E:/workspace/PycharmProjects/nlp/EmilyNER/test/ChineseNER-main/main.py", line 83, in singletask_train
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 473, in train_and_evaluate
return executor.run()
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 613, in run
return self.run_local()
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 714, in run_local
saving_listeners=saving_listeners)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 367, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1158, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1188, in _train_model_default
features, labels, ModeKeys.TRAIN, self.config)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1146, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "E:\workspace\PycharmProjects\nlp\EmilyNER\test\ChineseNER-main\tools\train_utils.py", line 150, in model_fn
loss, pred_ids = build_graph(features=features, labels=labels, params=params, is_training=is_training)
File "E:\workspace\PycharmProjects\nlp\EmilyNER\test\ChineseNER-main\model\bert_bilstm_crf.py", line 30, in build_graph
trans, log_likelihood = crf_layer(logits, label_ids, seq_len, params['label_size'], is_training)
File "E:\workspace\PycharmProjects\nlp\EmilyNER\test\ChineseNER-main\tools\layer.py", line 126, in crf_layer
sequence_lengths=150 # [batch_size] [32]
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\contrib\crf\python\ops\crf.py", line 257, in crf_log_likelihood
transition_params)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\contrib\crf\python\ops\crf.py", line 116, in crf_sequence_score
false_fn=_multi_seq_fn)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\layers\utils.py", line 202, in smart_cond
pred, true_fn=true_fn, false_fn=false_fn, name=name)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\framework\smart_cond.py", line 56, in smart_cond
return false_fn()
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\contrib\crf\python\ops\crf.py", line 106, in _multi_seq_fn
transition_params)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\contrib\crf\python\ops\crf.py", line 332, in crf_binary_score
truncated_masks = array_ops.slice(masks, [0, 1], [-1, -1])
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\ops\array_ops.py", line 733, in slice
return gen_array_ops.slice(input, begin, size, name=name)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 10488, in _slice
"Slice", input=input, begin=begin, size=size, name=name)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\framework\ops.py", line 2027, in __init__
control_input_ops)
File "D:\environment\Anaconda3\envs\ChineseNER-main\lib\site-packages\tensorflow\python\framework\ops.py", line 1867, in _create_c_op
raise ValueError(str(e))
ValueError: Shape must be rank 2 but is rank 1 for 'crf_layer/Slice_2' (op: 'Slice') with input shapes: [?], [2], [2].

Process finished with exit code 1
`
python main.py --model bert_bilstm_crf --data msra
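The single-task run also raises this error, so I wonder whether the problem is in data processing. Here is my data processing workflow in detail.
Following the readme, I first downloaded the Google BERT files and put them in pretrain_model/ch_google, then ran data/msra/preprocess.py, which generated the tfrecord files for BERT and the other models. For BERT there are three: bert_giga_valid.tfrecord, bert_giga_train.tfrecord, bert_giga_predict.tfrecord. I then ran python main.py --model bert_bilstm_crf --data msra
and got the error "Shape must be rank 2 but is rank 1 for 'crf_layer/Slice_2' (op: 'Slice') with input shapes: [?], [2], [2]." Thanks for your help.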
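
A note on the traceback above: the frame at tools/layer.py line 126 shows `sequence_lengths=150`, a scalar, while the inline comment on that same line expects a `[batch_size]` vector. The failing op slices `masks` with a two-element begin `[0, 1]`, which needs a rank-2 input, and the reported input shape `[?]` is rank 1. TensorFlow documents the output shape of `tf.sequence_mask` as the shape of `lengths` plus a trailing `maxlen` dimension, so a scalar length would give a rank-1 mask. Whether that is what happened here is an assumption, not something confirmed in this thread. The sketch below mimics the mask construction in plain Python; `sequence_mask` and `rank` are hypothetical stand-ins written for illustration, not TensorFlow functions.

```python
def sequence_mask(lengths, maxlen):
    """Hypothetical stand-in: output shape is shape(lengths) + [maxlen]."""
    if isinstance(lengths, int):
        # Scalar length -> a single row, i.e. a rank-1 mask.
        return [1.0 if i < lengths else 0.0 for i in range(maxlen)]
    # Vector of lengths -> one row per sequence, i.e. a rank-2 mask.
    return [sequence_mask(n, maxlen) for n in lengths]


def rank(x):
    """Number of nested list levels (0 for a plain number)."""
    r = 0
    while isinstance(x, list):
        r += 1
        x = x[0]
    return r


max_seq_len = 150
scalar_mask = sequence_mask(150, max_seq_len)           # a hard-coded scalar length
batch_mask = sequence_mask([150, 37, 92], max_seq_len)  # one length per example

print(rank(scalar_mask))  # rank of the mask built from a scalar
print(rank(batch_mask))   # rank of the mask built from a length vector
```

If this reading applies, passing the per-example length tensor (shape `[batch_size]`) instead of a constant would restore the rank-2 mask the CRF code slices.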

@DSXiangLi
Owner

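@LinJingOK The problem is in data generation. giga and bert are two different tokenizers: the former works at word granularity, the latter at token granularity. BERT models all use the bert tokenizer, so their tfrecord file is bert_train.tfrecord; the other, non-BERT models use giga_train.tfrecord, and the lexicon-enhancement files are named like giga_softword.tfrecord.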
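
The naming rule described above can be turned into a quick pre-flight check before launching training. This is a minimal sketch under stated assumptions: the helpers `expected_tfrecords` and `missing_tfrecords`, the test for "bert" in the model name, and the `train`/`valid`/`predict` split names (taken from the file names mentioned in this thread) are hypothetical illustrations, not code from the repository.

```python
import os
import tempfile


def expected_tfrecords(model_name, splits=("train", "valid", "predict")):
    """Return the tfrecord file names a model is expected to read (assumed naming)."""
    # Assumption: BERT-based models are recognisable by 'bert' in the model name.
    prefix = "bert" if "bert" in model_name.lower() else "giga"
    return ["{}_{}.tfrecord".format(prefix, split) for split in splits]


def missing_tfrecords(data_dir, model_name):
    """List the expected files that are absent from data_dir."""
    return [name for name in expected_tfrecords(model_name)
            if not os.path.isfile(os.path.join(data_dir, name))]


# Demo on a temporary directory holding the wrongly named files from the report.
with tempfile.TemporaryDirectory() as data_dir:
    for name in ("bert_giga_train.tfrecord", "bert_giga_valid.tfrecord",
                 "bert_giga_predict.tfrecord"):
        open(os.path.join(data_dir, name), "w").close()
    missing = missing_tfrecords(data_dir, "bert_bilstm_crf")
    print(missing)  # the three bert_*.tfrecord names are reported as missing
```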

@LinJingOK
Author

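> @LinJingOK The problem is in data generation. giga and bert are two different tokenizers: the former works at word granularity, the latter at token granularity. BERT models all use the bert tokenizer, so their tfrecord file is bert_train.tfrecord; the other, non-BERT models use giga_train.tfrecord, and the lexicon-enhancement files are named like giga_softword.tfrecord.

Hello, and thank you. This problem is solved: changing the BERT path to an absolute path fixed it. The three files you mentioned, bert_train.tfrecord, bert_valid.tfrecord and bert_predict.tfrecord, are now generated. I set epoch_size to 1 in config.py and ran python main.py --model bert_bilstm_crf --data msr. The project runs and GPU memory is in use, but a single epoch has been training for two hours without finishing or printing any prediction output. Apart from the parameter printout, the terminal log contains only warnings and nothing else. Is this kind of training run normal, and roughly how long does it take to finish? I see your default number of epochs is 50; how long did you train?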
==========TRAIN PARAMS==========
{'dtype': tf.float32, 'lr': 5e-06, 'log_steps': 100, 'pretrain_dir': './pretrain_model/ch_google', 'batch_size': 32, 'epoch_size': 1, 'warmup_ratio': 0.1, 'early_stop_ratio': 1, 'cell_type': 'lstm', 'cell_size': 1, 'hidden_units_list': [128], 'keep_prob_list': [0.8], 'rnn_activation': 'relu', 'diff_lr_times': {'crf': 500, 'logit': 500, 'lstm': 100}, 'n_sample': 86918, 'max_seq_len': 150, 'label_size': 7, 'tag2idx': {'[PAD]': 0, 'B': 1, 'I': 2, 'E': 3, 'S': 4, '[CLS]': 5, '[SEP]': 6}, 'idx2tag': {0: '[PAD]', 1: 'B', 2: 'I', 3: 'E', 4: 'S', 5: '[CLS]', 6: '[SEP]'}, 'step_per_epoch': 2716, 'num_train_steps': 2716}
==========RUN PARAMS==========
{'summary_steps': 10, 'log_steps': 100, 'save_steps': 500, 'keep_checkpoint_max': 3, 'allow_growth': True, 'pre_process_gpu_fraction': 0.8, 'log_device_placement': True, 'allow_soft_placement': True, 'inter_op_parallel': 2, 'intra_op_parallel': 2}
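
The step counts in the TRAIN PARAMS printout can be cross-checked by hand, which helps when judging how far a silent run has progressed. The sketch below assumes `step_per_epoch` is the floor of `n_sample / batch_size`, which is consistent with the printed 2716, though the repository's exact formula is not shown in this thread. `eta_hours` is a hypothetical helper, and the per-step time passed to it is an illustrative placeholder that you would replace with a value read from your own logs or TensorBoard.

```python
n_sample = 86918   # from the printed TRAIN PARAMS
batch_size = 32
epoch_size = 1

step_per_epoch = n_sample // batch_size      # assumed floor division
num_train_steps = step_per_epoch * epoch_size


def eta_hours(total_steps, seconds_per_step):
    """Rough wall-clock estimate from a measured per-step time (hypothetical helper)."""
    return total_steps * seconds_per_step / 3600.0


print(step_per_epoch, num_train_steps)  # 2716 2716, matching the printout
print(eta_hours(num_train_steps, 3.0))  # 3 s/step is a placeholder, not a measurement
```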

@DSXiangLi
Owner

@LinJingOK The corresponding ckpt files are written to the checkpoint directory; you can run tensorboard --logdir ./checkpoint/your_model_path to check the model's current training progress.

@LinJingOK
Author

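> @LinJingOK The corresponding ckpt files are written to the checkpoint directory; you can run tensorboard --logdir ./checkpoint/your_model_path to check the model's current training progress.

Sorry to bother you again. Single-task training took a long time, but the program finished normally, ran evaluation, and can output predictions. I am now running bert_bilstm_crf_adv.py with the command python main.py --model bert_bilstm_crf_adv --data msra,msr, with batch=16 and epoch=1. The program ran normally for about an hour and then raised an error. The last file in the generated folder ner_msra_msr_bert_bilstm_crf_adv is model.ckpt-7500, and the loss in tensorboard is still around 2. Since the error is low-level, I first checked my environment versions; the important dependencies all match yours. The error message is as follows: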
`Traceback (most recent call last):
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
return fn(*args)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: slice index 0 of dimension 0 out of bounds.
[[{{node strided_slice_2}}]]
(1) Invalid argument: slice index 0 of dimension 0 out of bounds.
[[{{node strided_slice_2}}]]
[[gradients/task2_msr/bilstm_layer/bidirectional_rnn/bw/bw/transpose_grad/InvertPermutation/_6090]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "F:/linjing/workspace/ChineseNER-main/main.py", line 149, in
multitask_train(args)
File "F:/linjing/workspace/ChineseNER-main/main.py", line 103, in multitask_train
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 473, in train_and_evaluate
return executor.run()
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 613, in run
return self.run_local()
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 714, in run_local
saving_listeners=saving_listeners)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 367, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1158, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1192, in _train_model_default
saving_listeners)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1484, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\training\monitored_session.py", line 754, in run
run_metadata=run_metadata)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1252, in run
run_metadata=run_metadata)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1353, in run
raise six.reraise(*original_exc_info)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\six.py", line 719, in reraise
raise value
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1338, in run
return self._sess.run(*args, **kwargs)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1411, in run
run_metadata=run_metadata)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1169, in run
return self._sess.run(*args, **kwargs)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
run_metadata_ptr)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
run_metadata)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: slice index 0 of dimension 0 out of bounds.
[[node strided_slice_2 (defined at F:\linjing\workspace\ChineseNER-main\tools\train_utils.py:199) ]]
(1) Invalid argument: slice index 0 of dimension 0 out of bounds.
[[node strided_slice_2 (defined at F:\linjing\workspace\ChineseNER-main\tools\train_utils.py:199) ]]
[[gradients/task2_msr/bilstm_layer/bidirectional_rnn/bw/bw/transpose_grad/InvertPermutation/_6090]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'strided_slice_2':
File "F:/linjing/workspace/ChineseNER-main/main.py", line 149, in
multitask_train(args)
File "F:/linjing/workspace/ChineseNER-main/main.py", line 103, in multitask_train
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 473, in train_and_evaluate
return executor.run()
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 613, in run
return self.run_local()
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 714, in run_local
saving_listeners=saving_listeners)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 367, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1158, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1188, in _train_model_default
features, labels, ModeKeys.TRAIN, self.config)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1146, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "F:\linjing\workspace\ChineseNER-main\tools\train_utils.py", line 199, in model_fn
tokens = tf.boolean_mask(features['tokens'], mask1, axis=0)[0,:]
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\ops\array_ops.py", line 680, in _slice_helper
name=name)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\ops\array_ops.py", line 846, in strided_slice
shrink_axis_mask=shrink_axis_mask)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 12096, in strided_slice
shrink_axis_mask=shrink_axis_mask, name=name)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "D:\anaconda\anaconda3\envs\dong_chineseNER\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()

Process finished with exit code 1`
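If you have time, could you take a look? Also, about the order of steps: I first generate the tfrecord files for msra and msr, then run the adv command. Is that order correct? And is dataset.py needed?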

@weiambt

weiambt commented Apr 24, 2024

@LinJingOK Hello, has this problem been solved? I seem to have hit the same issue recently with bert_bilstm_crf_adv.py: it raises InvalidArgumentError, and my epoch_size is also set to 1.

tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 0 of dimension 0 out of bounds.
	 [[node strided_slice_4 (defined at /share/home/MP2209128/ChineseNER/ChineseNER-local/tools/train_utils.py:204) ]]
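
LinJingOK's traceback points at a line in tools/train_utils.py that indexes a `tf.boolean_mask(...)` result with `[0,:]`, and the second report points at a nearby line in the same file. "slice index 0 of dimension 0 out of bounds" is the error you get when asking for row 0 of a result that has no rows, so one plausible cause, stated here as an assumption since the thread contains no confirmed answer, is a batch in which no example belongs to the masked task. The plain-Python sketch below reproduces that situation; `boolean_mask`, `first_row` and the sample data are hypothetical stand-ins, not the repository's code.

```python
def boolean_mask(rows, mask):
    """Keep the rows whose mask entry is True (stand-in for a boolean mask op)."""
    return [row for row, keep in zip(rows, mask) if keep]


def first_row(rows, mask, default=None):
    """Return the first selected row, or `default` when the mask selects nothing."""
    selected = boolean_mask(rows, mask)
    return selected[0] if selected else default


tokens = [["我", "爱", "北", "京"], ["天", "安", "门"]]

# Mixed batch: one example belongs to the masked task, so row 0 exists.
print(first_row(tokens, [False, True]))

# Batch with no example from the masked task: an unguarded `selected[0]` would
# raise IndexError, the plain-Python analogue of the out-of-bounds slice.
print(first_row(tokens, [False, False]))
```

In graph code the equivalent guard would be a conditional on the size of the masked tensor; whether that fits the repository's logic is for the maintainer to decide.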
