Code for the ACL 2023 Findings paper: Boosting Text Augmentation via Hybrid Instance Filtering Framework
You can find examples of augmenting text classification and aspect-term sentiment classification datasets at https://github.com/yangheng95/PyABSA/tree/v2/examples-v2/text_augmentation
This tool depends on PyABSA and is integrated with ABSADatasets.
To augment your own dataset, prepare it in the format used by ABSADatasets. Refer to the ABSADatasets instructions to process or annotate your dataset.
git clone https://github.com/yangheng95/BoostTextAugmentation
cd BoostTextAugmentation
pip install .
- If augmentation training is terminated accidentally, or you want to rerun augmentation, set rewrite_cache=True in augmentation.
- If you have many datasets, run augmentation for different datasets in separate folders; otherwise, IO operations may corrupt other datasets.
This is draft code, so do not perform cross-boosting on different datasets in the same folder; doing so will raise an exception.
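One way to follow the folder advice above is to give each dataset its own working directory before augmentation starts. The following is a minimal sketch using only the Python standard library; it is not part of BoostTextAugmentation. The names `dataset_workdir`, `run_isolated`, and `augment_fn` are hypothetical, and `augment_fn` stands in for whatever augmentation call you use.

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def dataset_workdir(root, dataset_name):
    """Create root/dataset_name, switch into it, and restore the old cwd on exit."""
    workdir = Path(root).resolve() / dataset_name
    workdir.mkdir(parents=True, exist_ok=True)
    previous = Path.cwd()
    os.chdir(workdir)
    try:
        yield workdir
    finally:
        # Always switch back, even if the augmentation call raised.
        os.chdir(previous)


def run_isolated(root, dataset_names, augment_fn, **kwargs):
    """Call augment_fn(name, **kwargs) once per dataset, each in its own folder.

    augment_fn is a placeholder (an assumption of this sketch) for your own
    augmentation routine; files it writes relative to the current working
    directory land in a folder that no other dataset shares.
    """
    results = {}
    for name in dataset_names:
        with dataset_workdir(root, name):
            results[name] = augment_fn(name, **kwargs)
    return results
```

For example, `run_isolated("runs", ["dataset_a", "dataset_b"], my_augment, rewrite_cache=True)` would call a user-defined `my_augment` once inside `runs/dataset_a` and once inside `runs/dataset_b`, forwarding `rewrite_cache=True` each time. This only isolates files written relative to the current working directory; absolute paths are unaffected.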
@inproceedings{yang-li-2023-boosting,
title = "Boosting Text Augmentation via Hybrid Instance Filtering Framework",
author = "Yang, Heng and
Li, Ke",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.findings-acl.105",
pages = "1652--1669",
}