[Bug]: nlpaug could generate an indefinite number of augmented samples. #75

Closed
3 tasks done
HYLcool opened this issue Nov 15, 2023 · 0 comments · Fixed by #76
Labels: bug (Something isn't working)

HYLcool (Collaborator) commented Nov 15, 2023

Before Reporting

  • I have pulled the latest code of the main branch and run it again, and the bug still exists.

  • I have read the README carefully and no error occurred during the installation process. (Otherwise, we recommend asking a question using the Question template.)

Search before reporting

  • I have searched the Data-Juicer issues and found no similar bugs.

OS

all

Installation Method

from source

Data-Juicer Version

latest

Python Version

3.8

Describe the bug

During the FT-Ranker competition, a user ran nlpaug_en_mapper to augment the dataset, and an error occurred during processing:

[image: screenshot of the error]

The error shows that nlpaug generated no augmented samples for some input samples, so the number of generated texts is not fixed. The alignment of the fields other than text should be changed to account for this, similar to what nlpcda_zh_mapper does.
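The alignment fix described above could look roughly like the sketch below. It is only an illustration, not the actual Data-Juicer code: the helper name `expand_sample`, its parameters, and the dict-of-lists output layout are assumptions made for this example. The idea is to replicate every non-text field according to the number of texts actually produced, rather than according to the number of augmentations requested.

```python
from copy import deepcopy


def expand_sample(sample, aug_texts, text_key="text", keep_original=True):
    """Hypothetical helper: build a batched (dict-of-lists) result in which
    every column has the same length as the text column.

    `aug_texts` may hold any number of strings, including zero, because the
    augmenter is not guaranteed to return a fixed number of results.
    """
    # Collect the texts that will really be emitted for this sample.
    texts = ([sample[text_key]] if keep_original else []) + list(aug_texts)
    n = len(texts)

    out = {text_key: texts}
    for key, value in sample.items():
        if key != text_key:
            # Replicate the other fields n times (not the requested count),
            # so all columns stay aligned even when aug_texts is empty.
            out[key] = [deepcopy(value) for _ in range(n)]
    return out
```

With this shape, a sample for which the augmenter returns nothing yields only the original row (or no rows at all when `keep_original=False`), and the column lengths still match.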

To Reproduce

Run nlpaug_en_mapper on the FT-Ranker competition dataset raw_data_en.jsonl.

Configs

Logs

Screenshots

Additional
