Is a separator added between the different texts concatenated during pretraining? #108

Open
fengcai24 opened this issue Oct 12, 2023 · 0 comments

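Looking at the code logic: it first iterates over the existing batches (self._inputs), computes the remaining space of each batch (space), and checks whether the new data (input_ids.shape[0]) fits into that remaining space. If it fits, the code updates the best_fit and best_fit_space variables, so that it ends up with the batch that has the least remaining space and can still hold the new data.
Question: suppose one pack contains several original samples. Different samples belong to different tasks, yet they are all fed into the model together. Is a separator inserted between the concatenated texts?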
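
To make the question concrete, here is a minimal sketch of the best-fit packing as I understand it. This is not the repository's actual code: the function names, the plain-list representation of a pack, and the optional sep_id separator are all assumptions for illustration. Whether the real implementation appends anything like sep_id is exactly what I am asking.

```python
from typing import List, Optional


def best_fit_index(packs: List[List[int]], new_len: int, max_len: int) -> Optional[int]:
    """Return the index of the pack with the least remaining space that can
    still hold new_len tokens, or None if no existing pack has room."""
    best_fit = None
    best_fit_space = None
    for i, pack in enumerate(packs):
        space = max_len - len(pack)  # remaining room in this pack
        if new_len <= space and (best_fit_space is None or space < best_fit_space):
            best_fit = i
            best_fit_space = space
    return best_fit


def add_sample(
    packs: List[List[int]],
    input_ids: List[int],
    max_len: int,
    sep_id: Optional[int] = None,
) -> List[List[int]]:
    """Place one sample into the best-fitting pack, or open a new pack.

    sep_id is hypothetical: if given, it is appended after the sample so that
    boundaries between samples stay visible to the model. Samples longer than
    max_len are not truncated in this sketch.
    """
    tokens = list(input_ids)
    if sep_id is not None:
        tokens.append(sep_id)
    idx = best_fit_index(packs, len(tokens), max_len)
    if idx is None:
        packs.append(tokens)
    else:
        packs[idx].extend(tokens)
    return packs


packs: List[List[int]] = []
add_sample(packs, [1, 2, 3], max_len=8, sep_id=0)
add_sample(packs, [4, 5, 6, 7, 8], max_len=8, sep_id=0)
add_sample(packs, [9], max_len=8, sep_id=0)
print(packs)  # [[1, 2, 3, 0], [4, 5, 6, 7, 8, 0, 9, 0]]
```

In this sketch the third sample goes into the second pack, because that pack has the smaller remaining space (2 versus 4). Without sep_id the two samples in that pack would be indistinguishable once concatenated.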
