Different results and accuracy down to 10% with PandasParallelLFApplier vs PandasLFApplier in Snorkel 0.9.5 #1587
Comments
Hi @durgeshiitj, apologies for the delayed response here! This is likely due to using an unsorted index with PandasParallelLFApplier.
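A minimal sketch of the kind of workaround that comment points at, assuming an unsorted index is in fact the cause: reset the DataFrame to a plain RangeIndex before the parallel apply. The `apply_parallel_sorted` helper, `df`, and `lfs` names are placeholders, not from the thread.

```python
import pandas as pd
from snorkel.labeling.apply.dask import PandasParallelLFApplier


def apply_parallel_sorted(df: pd.DataFrame, lfs, n_parallel: int = 4):
    """Apply LFs in parallel on a DataFrame with a clean, contiguous index."""
    # A non-contiguous or unsorted index can change how rows line up across
    # partitions; resetting to a RangeIndex removes that ambiguity.
    df = df.reset_index(drop=True)
    applier = PandasParallelLFApplier(lfs=lfs)
    return applier.apply(df, n_parallel=n_parallel)
```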
Hi Henry,
Hi @durgeshiitj, thanks for reporting; we'll look into version compatibility on our side!
I didn't get any update on this issue.
This issue is stale because it has been open 90 days with no activity. Remove the stale label or comment, or this will be closed in 7 days.
Issue description
I ran Snorkel (v0.9.5) on a dataset using PandasParallelLFApplier and, to my surprise, got 10% accuracy where I was expecting about 90%. I then used PandasLFApplier to cross-verify and got 90% accuracy. When I compared the label matrices, they were not equal.
I was previously on 0.9.3 and never faced this problem. To cross-verify, I ran the same dataset on a different system with version 0.9.3 using both PandasParallelLFApplier and PandasLFApplier, and in 0.9.3 both yield the same label matrix, the same accuracy, and the same LFAnalysis. A sketch of the comparison is shown below.
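For reference, a minimal self-contained sketch of the cross-check described above; the toy DataFrame and labeling functions are hypothetical stand-ins for the actual dataset and LFs, which are not included in the report.

```python
import numpy as np
import pandas as pd
from snorkel.labeling import LFAnalysis, PandasLFApplier, labeling_function
from snorkel.labeling.apply.dask import PandasParallelLFApplier

# Toy stand-ins for the real dataset and labeling functions (hypothetical).
df = pd.DataFrame({"text": ["good movie", "bad movie", "great film", "awful plot"]})

@labeling_function()
def lf_positive(x):
    return 1 if ("good" in x.text or "great" in x.text) else -1

@labeling_function()
def lf_negative(x):
    return 0 if ("bad" in x.text or "awful" in x.text) else -1

lfs = [lf_positive, lf_negative]

# Apply the same LFs sequentially and in parallel, then compare the matrices.
L_seq = PandasLFApplier(lfs=lfs).apply(df)
L_par = PandasParallelLFApplier(lfs=lfs).apply(df, n_parallel=2)

print("Label matrices equal:", np.array_equal(L_seq, L_par))  # matched on 0.9.3 per the report
print(LFAnalysis(L_seq, lfs).lf_summary())
print(LFAnalysis(L_par, lfs).lf_summary())
```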
Expected behavior
Both LF appliers should yield the same label matrix and accuracy.
Screenshots
I'm attaching screenshots for your reference (images not reproduced here): the LFAnalysis output and label-matrix comparison for PandasLFApplier vs. PandasParallelLFApplier on v0.9.5, and the same comparison on v0.9.3.
System info
Additional context
Please look into this asap.