Add process to check for major drops in data between updates #16

Open
2 tasks done
andrewtavis opened this issue Jun 8, 2024 · 4 comments
Labels
feature New feature or request help wanted Extra attention is needed question Further information is requested

Comments

@andrewtavis
Member

andrewtavis commented Jun 8, 2024

Terms

Description

Based on scribe-org/Scribe-Data#68, we need to keep in mind that there will be cases where a property on Wikidata changes such that there is a large drop in data. In the referenced issue, Portuguese verbs use a non-standard past perfect PID that could be combined with the more widely used one at some point.

This issue would look into ways of diffing the current data coverage against the new data coming in, which could be as simple as comparing total keys and total non-null values of keys of sub-objects. We could then discuss a viable cutoff, and trigger some kind of warning or a Scribe-Data issue if coverage falls below it 😊
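To make the idea concrete, here's a minimal Go sketch of the two metrics mentioned above (total keys and total non-null sub-object values) plus a cutoff check. Everything in it is an assumption for discussion: the `Dataset` shape, the names `Coverage`, `Measure`, `dropRatio` and `CheckDrop`, and the `maxDrop` threshold are all hypothetical and not taken from Scribe-Server or Scribe-Data. The actual cutoff is still to be decided.

```go
package main

import "fmt"

// Coverage holds the two metrics suggested in the issue: the number of
// top-level keys and the number of non-null values in their sub-objects.
type Coverage struct {
	Keys    int
	NonNull int
}

// Dataset is a hypothetical shape for one data file:
// entry key -> sub-object of field name -> value. A nil value stands for null.
type Dataset map[string]map[string]any

// Measure counts the top-level keys and the non-null sub-object values.
func Measure(d Dataset) Coverage {
	c := Coverage{Keys: len(d)}
	for _, fields := range d {
		for _, v := range fields {
			if v != nil {
				c.NonNull++
			}
		}
	}
	return c
}

// dropRatio returns the fraction lost going from prev to next.
// It returns 0 when prev is not positive or when nothing was lost.
func dropRatio(prev, next int) float64 {
	if prev <= 0 || next >= prev {
		return 0
	}
	return float64(prev-next) / float64(prev)
}

// CheckDrop reports whether either metric dropped by more than maxDrop
// (a fraction, e.g. 0.2 for 20%) and returns a short summary line.
func CheckDrop(current, incoming Coverage, maxDrop float64) (bool, string) {
	keyDrop := dropRatio(current.Keys, incoming.Keys)
	valueDrop := dropRatio(current.NonNull, incoming.NonNull)
	exceeded := keyDrop > maxDrop || valueDrop > maxDrop
	summary := fmt.Sprintf(
		"keys %d -> %d (drop %.1f%%), non-null values %d -> %d (drop %.1f%%)",
		current.Keys, incoming.Keys, keyDrop*100,
		current.NonNull, incoming.NonNull, valueDrop*100,
	)
	return exceeded, summary
}
```

An update step could call `Measure` on the stored data and on the incoming data, pass both results to `CheckDrop`, and send the summary string to whatever warning channel gets chosen when the first return value is `true`.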

Contribution

Would be happy to discuss! Could also help implement, but might be better if others get to this eventually as I'm a long way off on Go :)

@andrewtavis andrewtavis added help wanted Extra attention is needed question Further information is requested feature New feature or request labels Jun 8, 2024
@andrewtavis
Member Author

Even just an email or a Matrix bot with a summary of the changes with some coverage metrics would be great 😊

@daveads
Contributor

daveads commented Jun 26, 2024

cool i will take a look at this... @andrewtavis

@wkyoshida
Member

Hey @daveads - thank you for the interest in this issue!
Just a quick FYI though that this might be a little ways away, since this is likely dependent on having the CLI for Scribe-Data polished up (i.e. the ongoing GSoC project) and then likely at least v1 of Scribe-Server implemented as well. But once we get there, we can definitely have you take this on!

@daveads
Contributor

daveads commented Jun 30, 2024

> Hey @daveads - thank you for the interest in this issue! Just a quick FYI though that this might be a little ways away, since this is likely dependent on having the CLI for Scribe-Data polished up (i.e. the ongoing GSoC project) and then likely at least v1 of Scribe-Server implemented as well. But once we get there, we can definitely have you take this on!

oh okay
