tabular data/ noisy instances /new data #7

Open

nazaretl opened this issue May 9, 2022 · 1 comment

nazaretl commented May 9, 2022

Hi,
thanks for sharing your implementation. I have some questions about it:

Does it also work on tabular data?
Is the code tailored to the datasets used in the paper or can one apply it to any data?
Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

H-Jamieu commented May 17, 2024

I am not one of the authors, so this answer reflects my own understanding and may not be accurate.

Possible after modification.
As far as I understand, loss.py should be applicable to any data whose loss function is cross-entropy (CE).
In this method, filtering out noisy IDs is no different from using the small-loss trick. Just rank the samples by loss and flag the ones with the largest loss (e.g. the worst 5%) as possibly noisy. You can aggregate the noisy candidates over epochs and analyze which samples frequently have a large loss.
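
A minimal sketch of that ranking and aggregation in plain Python, in case it helps. Everything here is hypothetical and not taken from the repository's loss.py: the function names, the `fraction` default of 5%, and the `min_share` threshold are assumptions for illustration, and the per-sample losses are assumed to be collected once per epoch in a fixed sample order.

```python
import math
from collections import Counter


def per_sample_cross_entropy(logits, labels):
    """Cross-entropy of each sample (no averaging), computed from raw logits."""
    losses = []
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        losses.append(log_sum_exp - row[label])
    return losses


def large_loss_candidates(losses, fraction=0.05):
    """Indices of the `fraction` of samples with the largest loss."""
    k = max(1, round(fraction * len(losses)))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:k]


def frequent_candidates(losses_per_epoch, fraction=0.05, min_share=0.5):
    """Sample ids flagged as large-loss in at least `min_share` of the epochs."""
    counts = Counter()
    for epoch_losses in losses_per_epoch:
        counts.update(large_loss_candidates(epoch_losses, fraction))
    threshold = min_share * len(losses_per_epoch)
    return sorted(i for i, c in counts.items() if c >= threshold)
```

The returned ids are only candidates: a sample with a consistently large loss may be mislabeled, but it may also just be a hard example, so it is worth inspecting them before dropping anything.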
