Simplify project's emoji keyword functionality #359

andrewtavis · 2024-10-15T00:01:25Z

Terms

I have searched open and closed feature requests
I agree to follow Scribe-Data's Code of Conduct

Description

Something that should be changed about the project is the way the emoji keyword functionality works. Basically all of the files in question are the same except for a few variables, and CLI arguments are already being passed to these files. We do want the structure of the package to determine the functionality of the project, but this is a case where there's really no benefit to the repetition, unlike the queries, where the repeated files serve as a record of how to get data from Wikidata via SPARQL.

Some ideas:

We could leave the __init__.py files in the emoji keyword directories, which would still serve to include this functionality for those languages that it works for
We could then move all of the functionality to the src/scribe_data/unicode directory, and this one file would be called from the CLI

Contribution

Thoughts on this would be very appreciated! Happy to review and work with people on this 😊

andrewtavis · 2024-10-15T00:02:46Z

CC @DeleMike, @catreedle, @KesharwaniArpita and @VNW22 for the initial discussion 😊

KesharwaniArpita · 2024-10-15T00:32:28Z

Sounds good to me. The emoji keyword functionality is essentially the same for all languages, so centralizing it makes sense. @andrewtavis, can I be assigned?

DeleMike · 2024-10-15T06:44:02Z

This is a great issue, @andrewtavis! I imagine that if we wanted to update something related to emojis, we would have to go through all the directories!

As you suggested, having a single point of call for the emojis is important. I went through the files, and I believe gen_emoji_lexicon in src/scribe_data/unicode/process_unicode.py will be an important function.

This is just a shallow thought for now. I would love to contribute in any way to resolve this issue.

andrewtavis · 2024-10-15T07:49:19Z

Let's leave this issue for a bit and continue the conversation on it. We all have a lot being worked on right now, so let's close some current issues and then we can plan the work from there :)

Thanks for your interest in helping!

catreedle · 2024-10-15T08:14:59Z

I agree with centralizing the functionality, it’ll certainly save a lot of repetitive work!
I'm curious, will the emoji generation remain uniform across languages, or is there a plan to account for different linguistic forms, such as gender variations?

andrewtavis · 2024-10-15T09:22:04Z

Gendered emojis should already be coming out in the current setup, as it's Unicode's words associated with a given emoji, ordered by their usage, with the top X (usually 3) then selected :) We can keep this in mind for later though 😊
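For readers following along, here is a minimal sketch of one reading of that selection step: emojis are grouped by the keywords attached to them, ordered by a usage rank, and cut to the top few per keyword. The function name, the shape of `annotations` and `emoji_ranks`, and the sample data are all assumptions for illustration, not the project's actual code or CLDR's actual format.

```python
from collections import defaultdict
from typing import Dict, List


def top_emojis_per_keyword(
    annotations: Dict[str, List[str]],
    emoji_ranks: Dict[str, int],
    emojis_per_keyword: int = 3,
) -> Dict[str, List[str]]:
    """Map each keyword to its most-used emojis.

    annotations: emoji -> keywords attached to it (hypothetical stand-in
        for the annotation data).
    emoji_ranks: emoji -> usage rank, where 1 is the most used
        (hypothetical stand-in for a usage ranking).
    """
    # Invert the data: collect every emoji that carries a given keyword.
    keyword_to_emojis: Dict[str, List[str]] = defaultdict(list)
    for emoji, keywords in annotations.items():
        for keyword in keywords:
            keyword_to_emojis[keyword.lower()].append(emoji)

    # Emojis without a known rank sort after all ranked ones.
    unranked = float("inf")
    return {
        keyword: sorted(emojis, key=lambda e: emoji_ranks.get(e, unranked))[
            :emojis_per_keyword
        ]
        for keyword, emojis in keyword_to_emojis.items()
    }
```

Because the keywords come straight from the annotation data, a gendered keyword would simply appear as its own key in the result; nothing in the selection step is language-specific.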

catreedle · 2024-10-15T09:53:39Z

I see. Then I think there shouldn't be any issue with centralizing it :)

Ekikereabasi-Nk · 2024-10-15T09:54:19Z

Hi @andrewtavis @catreedle @KesharwaniArpita @VNW22 @DeleMike I'm interested in joining the team

catreedle · 2024-10-15T10:13:55Z

Welcome! I think we're still very much in the discussion phase. Looking forward to collaborating with you! :)

Ekikereabasi-Nk · 2024-10-15T10:28:02Z

Thank you so much. Has the discussion started? Is it happening on the Element app?

andrewtavis · 2024-10-15T10:35:07Z

I was going to suggest this to you, @Ekikereabasi-Nk :) We're discussing it in the issue right now. Can you look at the emoji keyword functionality and make a suggestion on how to centralize this functionality into the src/scribe_data/unicode directory? :)

Ekikereabasi-Nk · 2024-10-15T12:32:07Z

To centralize the functionality, I suggest the following steps:

1. Move the shared logic in generate_emoji_keyword.py to src/scribe_data/unicode, keeping all of the code but removing the variable definitions LANGUAGE = "English", emojis_per_keyword = 3 and DATA_TYPE = "emoji-keywords". Instead, we will have a centralized function that takes the language-specific parameters (LANGUAGE and emojis_per_keyword) as arguments.
2. Next, modify the emoji file for each language to import the centralized function from step 1, which simplifies the code there.
3. We will also need to make some modifications in process_unicode.py.
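The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `generate_emoji_keywords`, `placeholder_builder`, the CLI flag names, and the output layout are hypothetical, and the real lexicon builder (presumably something like the gen_emoji_lexicon function mentioned earlier in the thread, whose signature is not assumed here) would replace the placeholder.

```python
import argparse
import json
from pathlib import Path
from typing import Callable, Dict, List, Optional

DATA_TYPE = "emoji-keywords"  # identical for every language, so it stays a constant

# A builder maps (language, emojis_per_keyword) to {keyword: [emojis]}.
LexiconBuilder = Callable[[str, int], Dict[str, List[str]]]


def placeholder_builder(language: str, emojis_per_keyword: int) -> Dict[str, List[str]]:
    """Hypothetical stand-in for the real lexicon builder."""
    return {"smile": ["😀", "🙂", "😄"][:emojis_per_keyword]}


def generate_emoji_keywords(
    language: str,
    emojis_per_keyword: int = 3,
    output_dir: Path = Path("."),
    builder: LexiconBuilder = placeholder_builder,
) -> Optional[Path]:
    """Build and save emoji keywords for one language.

    The former per-file constants (LANGUAGE, emojis_per_keyword) are now
    arguments, so a single function serves every language.
    """
    lexicon = builder(language, emojis_per_keyword)
    if not lexicon:
        return None  # nothing to write for languages without emoji data

    output_path = Path(output_dir) / language / f"{DATA_TYPE}.json"
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(
        json.dumps(lexicon, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return output_path


def main(argv: Optional[List[str]] = None) -> Optional[Path]:
    """CLI entry point; the flag names are illustrative only."""
    parser = argparse.ArgumentParser(description="Generate emoji keywords.")
    parser.add_argument("--language", required=True)
    parser.add_argument("--emojis-per-keyword", type=int, default=3)
    parser.add_argument("--output-dir", type=Path, default=Path("."))
    args = parser.parse_args(argv)
    return generate_emoji_keywords(
        args.language, args.emojis_per_keyword, args.output_dir
    )
```

With something like this in place, a call such as `main(["--language", "English", "--emojis-per-keyword", "3"])` would stand in for running the English-specific file, and the per-language files would no longer need any logic of their own.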

So, how do you all see this suggestion? @andrewtavis

KesharwaniArpita · 2024-10-15T13:16:54Z

Hi @andrewtavis, @DeleMike, @catreedle, @Ekikereabasi-Nk. Do you think we should modify the __init__.py files to import and call the centralized function, passing in the appropriate variables as arguments (e.g., language and emoji-specific variables)? That would also cater to the grouped languages (SA Hindustani, Norwegian, etc.) and any other required customization.

andrewtavis · 2024-10-15T15:22:30Z

I'm generally thinking that we follow @Ekikereabasi-Nk's suggestions here and maybe keep the empty __init__.py files as a means of keeping the functionality derived from the project structure, with the full process being done in the unicode directory. I'm actually not sure what languages Unicode has support for, so maybe that's something that we could explore a bit, i.e. what languages are included in the CLDR dataset. There's no better source of this information, and with it we'd know to put an __init__.py file only in the directories for those languages that we find have emoji support. What's more, a check could be written to find which languages do have support and make sure that each of them, and only them, has an __init__.py file :)
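A check along those lines might look like the sketch below. It assumes a layout of `<language>/emoji_keywords/__init__.py` under some language data directory and takes the set of CLDR-supported languages as an input; the function name, the package directory name, and that layout are assumptions, and how the supported set is actually derived from CLDR is left open.

```python
from pathlib import Path
from typing import Dict, Iterable, Set


def check_emoji_support(
    language_data_dir: Path,
    supported_languages: Iterable[str],
    package_name: str = "emoji_keywords",  # hypothetical directory name
) -> Dict[str, Set[str]]:
    """Compare languages with emoji support against __init__.py markers.

    Returns the languages that should have a marker but don't, and the
    languages that have one without being in the supported set.
    """
    supported = {name.lower() for name in supported_languages}

    # Every <language>/<package_name>/__init__.py marks that language
    # as having emoji keyword functionality.
    marked = {
        init_file.parent.parent.name.lower()
        for init_file in Path(language_data_dir).glob(f"*/{package_name}/__init__.py")
    }

    return {
        "missing_init": supported - marked,
        "unexpected_init": marked - supported,
    }
```

Both sets being empty would mean the project structure and the supported languages agree; anything else could be reported as a failed check.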

andrewtavis · 2024-10-15T15:24:04Z

A basic thing is that the __init__.py files should remain empty as this is Python packaging convention. They should make it easier to load something with a different name or do nothing, as I understand it.

KesharwaniArpita · 2024-10-15T15:32:38Z

Thanks for the feedback! I agree with the idea of keeping the __init__.py files as a Python packaging convention, especially to maintain the project's structure and potentially assist with language-specific functionality loading.

Regarding the suggestion of using the CLDR dataset to check which languages have emoji support, that sounds like a great idea. It will ensure we're only including relevant languages in the directories.

VNW22 · 2024-10-15T19:11:51Z

heyy, I'm kinda late but i'd like to join in the discussion :)

andrewtavis · 2024-10-15T19:42:25Z

By all means, @VNW22! Let's try to get to this soon :) @Ekikereabasi-Nk, do you want to open a PR for this and the others can review?

VNW22 · 2024-10-15T19:53:52Z

I fully support the plan to centralize the emoji-keyword functionality by moving the shared logic to src/scribe_data/unicode—this will streamline the process and reduce redundancy. It seems like a solid solution has emerged from the discussion so far, but I’d be happy to assist with any part of the refactoring or the exploration of the CLDR dataset to ensure we cover all relevant languages.

andrewtavis · 2024-10-15T20:10:11Z

Do you want to look into the script to check that we have emoji support for all languages that we can and don't for those that we shouldn't, @VNW22? You'd need to do the setup for CLDR, which is difficult to do on Windows (if that's your operating system, then you'll likely need WSL to run the emoji programs on a Linux machine).

Let us know!

Ekikereabasi-Nk · 2024-10-15T20:17:01Z

Alright @andrewtavis

KesharwaniArpita · 2024-10-16T00:47:49Z

@Ekikereabasi-Nk and @andrewtavis, I wanted to rewrite the code for the language emoji files. I think we can start collaborating on the code. While @Ekikereabasi-Nk is working on the centralized script, is it alright if I start working on the function call for the languages? We can make the minor changes later too.

Ekikereabasi-Nk · 2024-10-16T02:31:09Z

Sure @KesharwaniArpita, I'm also through with the centralized function.

andrewtavis · 2024-10-16T05:40:20Z

Feel free to send along PRs and we'll see on both ends :)

VNW22 · 2024-10-16T06:32:10Z

Is the CLDR setup possible on Mac?

VNW22 · 2024-10-16T08:24:39Z

Okay, I'll be working on it.

andrewtavis · 2024-10-16T10:56:25Z

Sorry I was planning on sending along an explanation here, @VNW22, but got caught up with things :)

It's much easier on Mac and Linux. Specifically, we have a guide for this here. Let me know if anything is confusing and we can update the guide!

Thanks for looking into this 😊

Ekikereabasi-Nk · 2024-10-17T02:57:04Z

Thanks @KesharwaniArpita for the work here #397

VNW22 · 2024-10-17T06:25:35Z

no worries :) looking into it

andrewtavis added the feature, help wanted and hacktoberfest labels Oct 15, 2024

andrewtavis assigned DeleMike, catreedle, KesharwaniArpita, Ekikereabasi-Nk and VNW22 Oct 15, 2024

andrewtavis mentioned this issue Oct 15, 2024: Estonian verb data query #345 (Open, 1 task)

Ekikereabasi-Nk mentioned this issue Oct 16, 2024: Centralizing the emoji keyword generation logic #379 (Open, 1 task)

KesharwaniArpita mentioned this issue Oct 17, 2024: Centralized Emoji Keyword Functionality call for All Languages #397 (Merged, 1 task)
