Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand Norwegian/Bokmal verb query #455

Merged
merged 14 commits into from
Oct 23, 2024

Conversation

VNW22
Copy link
Contributor

@VNW22 VNW22 commented Oct 21, 2024

Contributor checklist


Description

This pull request expands bokmal verbs query, and have been tested.

Related issue

  • #ISSUE_NUMBER

Copy link

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist

  • The linting and formatting workflow within the PR checks do not indicate new errors in the files changed

  • The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)

@andrewtavis
Copy link
Member

andrewtavis commented Oct 22, 2024

Can you handle the merge conflict here, @VNW22? :)

@VNW22
Copy link
Contributor Author

VNW22 commented Oct 22, 2024

Hey @andrewtavis can you help me understand what i need to fix here?
Data/src/scribe_data/language_data_extraction/Norwegian/Bokmål/verbs/query_verbs_1.sparql:

  • masculineFeminineSingularIndefinitePastParticiple : QIDs Q110786 , Q12717679 , Q1775415 , Q499327 not included in metadata
  • SingularneuterIndefinitePastParticiple : QIDs Q110786 , Q12717679 , Q1775461 not included in metadata
  • singularDefinitePastParticiple : QIDs Q110786 , Q12717679 not included in metadata
  • pluralPastParticiple : QID Q12717679 not included in metadata

Data/src/scribe_data/language_data_extraction/Norwegian/Bokmål/verbs/query_verbs.sparql:

  • presentPerfect : perfectPresentPerfectPerfectPresentPerfect
  • infinitiveActive : QID Q179230 not included in metadata
  • presentActive : QID Q192613 not included in metadata
  • infinitivePassive : QID Q179230 not included in metadata
  • presentPassive : QID Q192613 not included in metadata
  • activePresent : QID Q192613 not included in metadata

@andrewtavis andrewtavis self-requested a review October 22, 2024 19:29
@andrewtavis andrewtavis added the hacktoberfest-accepted Accepted as a part of Hacktoberfest label Oct 22, 2024
@andrewtavis
Copy link
Member

Sorry I just improved the error output of this, @VNW22 :) You need to add the form to lexeme_form_metadata.json in an appropriate place so that we can check if the label is appropriate. perfectPresentPerfectPerfectPresentPerfect looks like a bug though 🤔 Do you want to look into this, or should I try to support a bit more?

@VNW22
Copy link
Contributor Author

VNW22 commented Oct 23, 2024

Sorry I just improved the error output of this, @VNW22 :) You need to add the form to lexeme_form_metadata.json in an appropriate place so that we can check if the label is appropriate. perfectPresentPerfectPerfectPresentPerfect looks like a bug though 🤔 Do you want to look into this, or should I try to support a bit more?

sorry, just seeing this now,
I was trying to do so but some forms are alredy in the lexeme_form_metadata.json like masculine, feminine and most of them. So i was kinda unsure of whta to do next.
I will look into the perfectPresentPerfectPerfectPresentPerfect bug :)

@andrewtavis
Copy link
Member

This should be figured out if you pull from main, @VNW22. Big thing is that you should format your query a bit, so please put a period before each . and ; and run the check again :) Might be good if there was a specific check for this that would suggest that someone formats the query if it's found that there are non whitespace characters before periods and semicolons.

@VNW22
Copy link
Contributor Author

VNW22 commented Oct 23, 2024

Got it, thanks! . A specific check for this would definitely be helpful to catch these formatting issues, I can look into it.

@andrewtavis
Copy link
Member

andrewtavis commented Oct 23, 2024

Would be really great, @VNW22! You can do a regex check for something like \s+[.;], i.e. any number of non space characters followed by period or semicolon. From there have the message added to the error output be that the form needs to be formatted and say that spaces and semicolons need spaces before them. This isn't necessary for the query, but it is coding standard for SPARQL :)

@andrewtavis
Copy link
Member

So rather than Invalid query formatting found for these ones add something more specific like Invalid query formatting found - please put spaces before all periods and semicolons :)

@VNW22
Copy link
Contributor Author

VNW22 commented Oct 23, 2024

I added the check under def check_query_forms() -> None:
elif query_form_check_dict[k]["form_rep_match"] is False:
incorrect_query_labels.append(
(k, "Form and representation labels don't match")
)

                # Additional check for formatting
                elif re.search(r"\S+[.;]", k):
                    incorrect_query_labels.append(
                        (k, "Invalid query formatting found - please put spaces before all periods and semicolons :)")
                    )

is this okay?

@andrewtavis
Copy link
Member

Let's give it a try :) I'll check it all after work, but if that's providing the errors to you then looks good to me 😊

@VNW22
Copy link
Contributor Author

VNW22 commented Oct 23, 2024

I'm thinking of adding a check that would suggest that someone formats the query if it's found that there are whitespace characters before commas after the form QIDs, coz i've realise that was the reason why my check was failing

@VNW22
Copy link
Contributor Author

VNW22 commented Oct 23, 2024

do i open a PR for this?

@andrewtavis
Copy link
Member

Yes sure, I'll bring this one in and you can open a new PR for the change to the check 😊

Copy link
Member

@andrewtavis andrewtavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great work here, @VNW22! Thanks for fixing all the form names 😊

@andrewtavis andrewtavis merged commit 122c9ab into scribe-org:main Oct 23, 2024
5 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hacktoberfest-accepted Accepted as a part of Hacktoberfest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants