Create Persian data process queries #400

Open
7 tasks done
catreedle opened this issue Oct 17, 2024 · 8 comments
Assignees
Labels
data Relates to data or Wikidata good first issue Good for newcomers hacktoberfest Included as a part of Hacktoberfest help wanted Extra attention is needed

Comments

@catreedle
Contributor

catreedle commented Oct 17, 2024

Terms

Languages

Persian

Description

This issue looks into expanding the src/scribe_data/language_data_extraction/Persian files with as much data as possible from what is currently on Wikidata. We can start from the code used to get data for other languages, then check the Persian data on Wikidata to see which conjugations are available. We would likely need to filter for fa (Farsi). We can then expand each query with optional selections of specific forms, as is done in the other SPARQL queries. Queries can be tried out in the Wikidata Query Service UI during development :)

Data types to include:

  • Nouns
  • Verbs
  • Adjectives
  • Adverbs
  • Prepositions
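As a rough illustration of the approach described above, here is a minimal sketch that assembles a lexeme query with optional form selections. The helper `build_lexeme_query` is hypothetical and not part of Scribe-Data, and the QIDs (Q9168 for Persian, the lexical category QIDs, and Q146786 for plural in the usage note) are assumptions that should be checked against Wikidata before use. The prefixes `dct:`, `wikibase:`, `ontolex:` and `wd:` are predefined by the Wikidata Query Service.

```python
# Hypothetical helper for illustration only; not part of Scribe-Data.
# All QIDs below are assumptions to verify on Wikidata.
PERSIAN_QID = "Q9168"

CATEGORY_QIDS = {
    "nouns": "Q1084",
    "verbs": "Q24905",
    "adjectives": "Q34698",
    "adverbs": "Q380057",
    "prepositions": "Q4833830",
}


def build_lexeme_query(language_qid, category_qid, optional_forms=None):
    """Return a SPARQL query string for lexemes of one language and category.

    optional_forms maps an output variable name to the QID of a single
    grammatical feature, e.g. {"plural": "Q146786"}.
    """
    optional_forms = optional_forms or {}

    lines = [
        "SELECT",
        '  (REPLACE(STR(?lexeme), "http://www.wikidata.org/entity/", "") AS ?lexemeID)',
        "  ?lemma",
    ]
    # One output column per optional form.
    lines += [f"  ?{var}" for var in optional_forms]

    lines += [
        "",
        "WHERE {",
        f"  ?lexeme dct:language wd:{language_qid} ;",
        f"    wikibase:lexicalCategory wd:{category_qid} ;",
        "    wikibase:lemma ?lemma .",
    ]

    # Each form is OPTIONAL so lexemes lacking it are still returned.
    for var, feature_qid in optional_forms.items():
        lines += [
            "",
            "  OPTIONAL {",
            f"    ?lexeme ontolex:lexicalForm ?{var}Form .",
            f"    ?{var}Form ontolex:representation ?{var} ;",
            f"      wikibase:grammaticalFeature wd:{feature_qid} .",
            "  }",
        ]

    lines.append("}")
    return "\n".join(lines)
```

Calling `print(build_lexeme_query(PERSIAN_QID, CATEGORY_QIDS["nouns"], {"plural": "Q146786"}))` gives a query that can be pasted into the Wikidata Query Service UI. Which forms are worth selecting depends on what the Persian lexemes actually carry.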

Contribution
Happy to work on this and help anyone interested 😊

@catreedle catreedle added the data Relates to data or Wikidata label Oct 17, 2024
@andrewtavis andrewtavis added help wanted Extra attention is needed good first issue Good for newcomers hacktoberfest Included as a part of Hacktoberfest labels Oct 17, 2024
@VNW22
Contributor

VNW22 commented Oct 17, 2024

hey @catreedle can I work with you on this?

@catreedle
Contributor Author

hey @VNW22, sure can! :)
I have not started any work on this. We can comment here to keep track of which data type we're working on? What do you think?

@andrewtavis
Member

Keep in mind that we'll need a language filter for Farsi via "fa" as is done in the Hindustani queries :) Not 100% sure, but worth checking.

@VNW22
Contributor

VNW22 commented Oct 18, 2024

> hey @VNW22, sure can! :) I have not started any work on this. We can comment here to keep track of which data type we're working on? What do you think?

I can do verbs and prepositions

@catreedle
Contributor Author

okay @VNW22, I'll start on nouns and adjectives :)

@catreedle
Contributor Author

> Keep in mind that we'll need a language filter for Farsi via "fa" as is done in the Hindustani queries :) Not 100% sure, but worth checking.

hi @andrewtavis, how do we decide/check whether a language needs a filter?

catreedle added a commit to catreedle/Scribe-Data that referenced this issue Oct 18, 2024
@andrewtavis
Member

Check to see if the forms have more than one language on them :) For example, on a Hindustani lexeme you can see that all the forms have hi and ur equivalents. If there's nothing like that for Persian, then maybe we don't have to worry 😊
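One way to run this check across all lexemes at once, rather than by eye, is to count form representations per language tag. The sketch below is an assumption about how that could be done, not something taken from Scribe-Data: both function names are hypothetical, and Q9168 as the QID for Persian should be verified on Wikidata.

```python
# Hypothetical helpers for illustration only; not part of Scribe-Data.
PERSIAN_QID = "Q9168"  # assumed QID for Persian; verify on Wikidata


def build_form_language_query(language_qid):
    """Return a SPARQL query that counts form representations per language tag."""
    return "\n".join(
        [
            "SELECT ?formLang (COUNT(?form) AS ?formCount)",
            "WHERE {",
            f"  ?lexeme dct:language wd:{language_qid} ;",
            "    ontolex:lexicalForm ?form .",
            "  ?form ontolex:representation ?rep .",
            "  BIND(LANG(?rep) AS ?formLang)",
            "}",
            "GROUP BY ?formLang",
            "ORDER BY DESC(?formCount)",
        ]
    )


def needs_language_filter(form_counts_by_lang):
    """Return True when forms appear under more than one language tag.

    form_counts_by_lang maps a language tag to the number of form
    representations carrying it, as read off the query results.
    """
    tags_in_use = [lang for lang, count in form_counts_by_lang.items() if count > 0]
    return len(tags_in_use) > 1
```

The query string from `build_form_language_query(PERSIAN_QID)` can be pasted into the Wikidata Query Service UI. If the results show a single language tag, a filter along the lines of `FILTER(LANG(?rep) = "fa")` is probably unnecessary. If several tags show up, as with hi and ur on Hindustani lexemes, a filter is likely needed.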

@catreedle
Contributor Author

I'll work on adverbs :) @VNW22

andrewtavis added a commit that referenced this issue Oct 22, 2024
* add Persian query adjectives #400

* fix comment language qid

* remove filter fa for persian query

* Persian adverbs query

* Minor query formatting

---------

Co-authored-by: Andrew Tavis McAllister <[email protected]>
Projects
Status: Todo
Development

No branches or pull requests

3 participants