tabular data/ noisy instances /new data #7

Open
nazaretl opened this issue May 9, 2022 · 1 comment

Comments

@nazaretl

nazaretl commented May 9, 2022

Hi,
thanks for sharing your implementation. I have some questions about it:

  1. Does it also work on tabular data?
  2. Is the code tailored to the datasets used in the paper or can one apply it to any data?
  3. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

@H-Jamieu

I am not one of the authors, so this answer reflects my own understanding and may not be correct.

  1. Possible after modification.
  2. As far as I understand, loss.py should be applicable to any data whose loss function is cross-entropy (CE).
  3. In this method, filtering out noisy IDs is no different from using the small-loss trick. Just rank the samples by loss and label the ones with the largest loss (e.g. the worst 5%) as possibly noisy. You can aggregate the noisy candidates over epochs and analyse which samples frequently have a large loss.
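
To make point 3 concrete, here is a minimal sketch of the ranking and aggregation steps. It is not taken from the repository: the function names `flag_noisy_candidates` and `aggregate_over_epochs`, the 5% default, and the `(epochs, samples)` loss array are all assumptions for illustration. It assumes you have already collected one loss value per training sample per epoch (in PyTorch, for example, cross-entropy with `reduction="none"` gives per-sample losses).

```python
import numpy as np

def flag_noisy_candidates(per_sample_loss, fraction=0.05):
    """Return indices of the `fraction` of samples with the largest loss.

    Hypothetical helper: the 5% default is only an example threshold.
    """
    losses = np.asarray(per_sample_loss, dtype=float)
    k = max(1, int(round(fraction * losses.size)))
    # argsort is ascending, so the last k entries have the largest losses
    return np.argsort(losses)[-k:]

def aggregate_over_epochs(loss_per_epoch, fraction=0.05):
    """Count how many epochs each sample was flagged as a noisy candidate.

    `loss_per_epoch` is assumed to have shape (num_epochs, num_samples).
    """
    loss_per_epoch = np.asarray(loss_per_epoch, dtype=float)
    counts = np.zeros(loss_per_epoch.shape[1], dtype=int)
    for epoch_losses in loss_per_epoch:
        counts[flag_noisy_candidates(epoch_losses, fraction)] += 1
    return counts

# Synthetic example: 3 epochs, 20 samples, sample 3 always has a large loss
rng = np.random.default_rng(0)
demo_losses = rng.random((3, 20))
demo_losses[:, 3] = 10.0
demo_counts = aggregate_over_epochs(demo_losses, fraction=0.05)
```

Samples with a high count in `demo_counts` are the frequent large-loss ones and so the likeliest noisy candidates; the remaining indices can be treated as a tentative clean set. The right fraction depends on the expected noise rate of your data.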
