Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(utils.py): move language data to JSON file (resolves #52) #54

Merged
merged 1 commit into from
Oct 30, 2023

Conversation

m-charlton
Copy link
Contributor

Contributor checklist


Description

Moves language meta data from python source code into an external JSON resource which is loaded when the utils module is imported.

Notes

language-meta-data.json

Please place attention to the comment properties: "description", "iso", "qid" & "remove-words".

utils.py

Couldn't think of a better name for the _find function - suggestions welcome.

@github-actions
Copy link

github-actions bot commented Oct 18, 2023

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. It'd be great to have you!

Maintainer checklist

  • The commit messages for the remote branch should be checked to make sure the contributor's email is set up correctly so that they receive credit for their contribution

    • The contributor's name and icon in remote commits should be the same as what appears in the PR
    • If there's a mismatch, the contributor needs to make sure that the email they use for GitHub matches what they have for git config user.email in their local Scribe-Data repo
  • The CHANGELOG has been updated with a description of the changes for the upcoming release (if necessary)

@@ -230,7 +241,7 @@ def get_path_from_format_file() -> str:
return "../../../../../.."


def get_path_from_load_dir() -> str:
def get_path_from_load_dir():
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing return type annotation: str



def get_language_words_to_remove(language: str) -> list[str]:
"""
Returns the words that should not be included as autosuggestions for the given language.
Returns the words that should not be included as autosuggestions for the given
language.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new line (combine with previous line)

The words that should not be included as autosuggestions for the given language as values of a dictionary.
list[str]
The words that should not be included as autosuggestions for the given
language
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new line (combine with previous line)

The words that should not be included as autosuggestions for the given language as values of a dictionary.
list[str]
The words that should not be included as autosuggestions for the given
language
Copy link
Contributor Author

@m-charlton m-charlton Oct 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new line (combine with previous line)

@andrewtavis : This function has the same description as get_language_words_to_remove on line 189. This could be an error. One this is rectified the entry on line 9 of src/scribe_data/resources/language_meta_data.json can be updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Fixed locally and will send it shortly :)

@@ -349,7 +363,8 @@ def check_and_return_command_line_args(
all_args, first_args_check=None, second_args_check=None
):
"""
Checks command line arguments passed to Scribe-Data files and returns them if correct.
Checks command line arguments passed to Scribe-Data files and returns them if
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new line (combine with previous line)

@@ -365,7 +380,8 @@ def check_and_return_command_line_args(
Returns
-------
first_args, second_args: list(str)
The subset of possible first and second arguments that have been verified as being valid.
The subset of possible first and second arguments that have been verified
as being valid.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new line (combine with previous line)

"Spanish",
"Swedish",
]
return sorted(entry["language"].capitalize() for entry in _languages)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice :)

)

return language_qid_dict[language]
return _find(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really cool 🤩

)

return language_iso_dict[language]
return _find(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I'm talking about in #55. We could basically get rid of this function and use langcodes, thus allowing us to add languages without needing to track their ISOs :)

raise ValueError(f"{iso.upper()} is currently not a supported ISO language.")

return iso_language_dict[iso]
return _find(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly for this, we could remove it with #55 :)

@@ -255,7 +266,8 @@ def get_ios_data_path(language: str) -> str:

Returns
-------
The path to the data json for the given language.
str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these 🙏 Definitely is good standard :)

Copy link
Member

@andrewtavis andrewtavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great, @m-charlton! Thanks for all this :) I'll do a quick sweep to get any minor fixes that remain 😊

@andrewtavis andrewtavis merged commit ad07bb1 into scribe-org:main Oct 30, 2023
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants