Training Result #20
Comments
Before merging the NER labels, this is the label distribution:
Good work! So it looks like around 70% is where we're going to be for now. Can you get per-class accuracy, too? We don't really care so much about MISC and it could be that that one is harder than the rest.
@ahalterman
1. augmented_for_training_1: MISC filtered out (trained on PERSON, ORG and GPE), evaluated only on GPE:
2. augmented_for_training_2: MISC filtered out (trained on PERSON, ORG and GPE), evaluated only on PERSON:
3. augmented_for_training_3: MISC filtered out (trained on PERSON, ORG and GPE), evaluated only on ORG:
This is the overall accuracy including all classes:
@ahalterman Hey Andy, check these training results. All runs were trained on data without MISC and evaluated on an individual tag class. Accuracy is fairly even across classes. The first picture for each run shows how many records it was evaluated on.
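For reference, a per-class breakdown like the one requested above could be computed along these lines. This is only a minimal sketch, not the script used for these runs: it assumes gold and predicted entities are available as `(start, end, label)` tuples per example, it scores exact span matches, and it reports precision, recall and F1 per label rather than whatever accuracy figure the training tool prints. The function name `per_class_scores` and the sample data are hypothetical.

```python
from collections import defaultdict


def per_class_scores(gold, pred, ignore=("MISC",)):
    """Per-label precision, recall and F1 over exact-match entity spans.

    gold, pred: lists (one entry per example) of sets of (start, end, label).
    Labels listed in `ignore` are dropped from both sides before scoring.
    """
    tp = defaultdict(int)  # predicted span also in gold
    fp = defaultdict(int)  # predicted span not in gold
    fn = defaultdict(int)  # gold span that was not predicted
    for g, p in zip(gold, pred):
        g = {s for s in g if s[2] not in ignore}
        p = {s for s in p if s[2] not in ignore}
        for span in p:
            if span in g:
                tp[span[2]] += 1
            else:
                fp[span[2]] += 1
        for span in g - p:
            fn[span[2]] += 1

    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        prec = tp[label] / n_pred if n_pred else 0.0
        rec = tp[label] / n_gold if n_gold else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[label] = {"precision": prec, "recall": rec, "f1": f1, "support": n_gold}
    return scores


# Made-up data, only to show the expected input shape.
gold = [{(0, 5, "PERSON"), (10, 16, "GPE")}, {(0, 3, "ORG"), (8, 12, "MISC")}]
pred = [{(0, 5, "PERSON"), (10, 16, "ORG")}, {(0, 3, "ORG")}]
scores = per_class_scores(gold, pred)
for label in sorted(scores):
    print(label, scores[label])
```

The `support` field plays the same role as the record counts in the screenshots: it shows how many gold entities each per-class number is based on.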
@khaledJabr @ahalterman
1. With pretrained pruned vectors and the pretrained spaCy NER model, updating the model only with the Prodigy-labeled data (about 800 tokens), we get this (NER classes not merged yet):
2. Without the pretrained model, everything else the same as case 1, we get this (NER classes not merged yet). So yes, the pretrained model does help.
3. Trained in Prodigy with only the LDC data, with the pretrained spaCy NER model; otherwise the same as case 1.
4. Prodigy data plus rehearsal data at 23 times the Prodigy data size; otherwise the same as case 3.
Since we have 18670 tokens (OntoNotes) and 801 tokens (Prodigy-labeled), we use 23 as the multiplier so that nearly all of the data gets used: 18670/801 ≈ 23.3.
With 18670 training samples, 4122 with empty spans were removed.
5. With merged NER classes; otherwise the same as case 4.
6. With Khaled's cleaned data; otherwise the same as case 5.
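The multiplier arithmetic in case 4 can be checked with a short sketch. This is an illustration under assumptions, not the code used for these runs: `mix_with_rehearsal` is a hypothetical helper, the integer lists stand in for real examples, and it treats the 18670 and 801 counts as example counts for simplicity.

```python
import random


def mix_with_rehearsal(new_examples, rehearsal_pool, multiplier, seed=0):
    """Combine new examples with up to multiplier * len(new_examples)
    rehearsal examples, sampled without replacement, then shuffle."""
    rng = random.Random(seed)
    n_rehearsal = min(len(rehearsal_pool), multiplier * len(new_examples))
    mixed = list(new_examples) + rng.sample(list(rehearsal_pool), n_rehearsal)
    rng.shuffle(mixed)
    return mixed


n_onto = 18670    # rehearsal (OntoNotes) count from the comment above
n_prodigy = 801   # Prodigy-labeled count from the comment above

multiplier = n_onto // n_prodigy        # integer part of 18670 / 801
n_used = multiplier * n_prodigy         # rehearsal items actually drawn
print(multiplier, n_used, n_onto - n_used)

# Placeholder integers stand in for real training examples.
prodigy_data = list(range(n_prodigy))
onto_data = list(range(100000, 100000 + n_onto))
mixed = mix_with_rehearsal(prodigy_data, onto_data, multiplier)
print(len(mixed))
```

With a multiplier of 23, 23 × 801 = 18423 rehearsal items are drawn, so 247 of the 18670 are left unused. Whether the 4122 empty-span removals happen before or after this mixing is not stated above, so the sketch ignores them.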