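In NER, a common source of errors is that after tokenization the sentence no longer matches the original input in length or in token positions. The code in tokenizer.py does not appear to handle the mismatch between the tokenizer's input and output lengths. For example, after loading the coarse-grained NER corpus with sents_src, sents_tgt = read_corpus(data_path), the entries sents_src[3] and sents_tgt[3] no longer have the same length once the tokenizer has run, which raises an error.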
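I mentioned this in the group chat a while ago: there is a pitfall here. Think about how to line the two sequences up; it is not hard.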
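A minimal sketch of one way to line the labels up with the tokens, assuming the corpus carries one label per character of the source sentence. Nothing here is taken from the repository: toy_tokenize is a hypothetical stand-in for a BERT-style tokenizer (each CJK character is one token, a run of ASCII letters or digits is merged into one token, whitespace is dropped), and align_labels is an illustrative helper. The idea is to keep each token's character offset and give the token the label of its first character, so both sequences end up the same length.

```python
from typing import List, Tuple


def toy_tokenize(text: str) -> List[Tuple[str, int, int]]:
    """Hypothetical stand-in for a BERT-style tokenizer (not the repo's code).

    Returns (token, start, end) triples, where start/end are character
    offsets into `text`. Whitespace is dropped, runs of ASCII letters or
    digits become a single lowercased token, and every other character
    (CJK, punctuation) is its own token.
    """
    spans = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1  # whitespace produces no token, which shortens the output
        elif ch.isascii() and ch.isalnum():
            j = i
            while j < len(text) and text[j].isascii() and text[j].isalnum():
                j += 1
            spans.append((text[i:j].lower(), i, j))  # several chars, one token
            i = j
        else:
            spans.append((ch, i, i + 1))
            i += 1
    return spans


def align_labels(text: str, char_labels: List[str]) -> Tuple[List[str], List[str]]:
    """Project per-character labels onto tokens via character offsets."""
    if len(text) != len(char_labels):
        raise ValueError("expected exactly one label per character")
    spans = toy_tokenize(text)
    tokens = [tok for tok, _, _ in spans]
    # Each token inherits the label of the first character it covers.
    labels = [char_labels[start] for _, start, _ in spans]
    return tokens, labels


# Assumed example data: 9 characters and 9 labels, but only 6 tokens.
text = "小明在IBM 工作"
char_labels = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "I-ORG", "O", "O", "O"]
tokens, token_labels = align_labels(text, char_labels)
print(tokens)
print(token_labels)
```

With a real tokenizer the offsets would have to come from the tokenizer itself, or be recovered by matching tokens back to the source text. Any special tokens added at the start or end of the sequence would also need a placeholder label.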