
[2024-06-24]: [Alphabetical ordering for Ukrainian and Portuguese languages] #724

Closed
olexandr-konovalov opened this issue Jun 24, 2024 · 9 comments · Fixed by #748
Labels
bug Something isn't working lang: pt issues and PR for Portuguese entries lang: ua issues and PR for Ukrainian entries

Comments

@olexandr-konovalov
Collaborator

The right navigation bar with letters at https://glosario.carpentries.org/uk/ does not follow the order of letters in the Ukrainian alphabet: it places "І" last. Is there a way to enforce a fixed order of the letters?

[Image: Screenshot 2024-06-24 at 18 53 38]

@olexandr-konovalov olexandr-konovalov added bug Something isn't working lang: ua issues and PR for Ukrainian entries labels Jun 24, 2024
@olexandr-konovalov
Collaborator Author

olexandr-konovalov commented Jul 25, 2024

It seems to me the issue is not with the navigation bar, but with the terms being sorted incorrectly in the first place. I have traced this to "Find terms defined in this language and sort alphabetically":

https://github.com/carpentries/glosario/blob/b7d10b315e6e2b34572dfab4c5692187ec5fb225/_includes/glossary.html#L9C1-L9C61

but I don't know where to go from here.

@YehorBoiar

Could it be because we don't have terms for every letter? So it only orders the letters that actually have entries.

@olexandr-konovalov
Collaborator Author

@YehorBoiar we can of course try to push for more translations to cover more Ukrainian letters - new PRs welcome :)

But I don't think that "І" being in last place depends on that. It is more likely caused by sorting on Unicode code points. There is a similar problem for Portuguese at https://glosario.carpentries.org/pt/, where an accented variant of A appears not after A but at the very end. I don't know Portuguese well enough to comment on this, but https://portuguese.stackexchange.com/questions/5797/do-diacritical-marks-accents-affect-alphabetical-sorting-of-words suggests that this is not how it should be.

@olexandr-konovalov olexandr-konovalov added the lang: pt issues and PR for Portuguese entries label Jul 25, 2024
@olexandr-konovalov olexandr-konovalov changed the title [2024-06-24]: [Ordering alphabetically for Ukrainian language] [2024-06-24]: [Alphabetical ordering for Ukrainian and Portuguese languages] Jul 25, 2024
@olexandr-konovalov
Collaborator Author

P.S. Indeed, the Unicode code points seem to be relevant: compare the Ukrainian alphabet by position at https://en.wikipedia.org/wiki/Ukrainian_alphabet with the Ukrainian character table at https://character-table.netlify.app/ukrainian/. So the question remains as before: how can we sort in properly alphabetical order?
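
To make the code point effect concrete, here is a minimal Python sketch (not the site's Liquid code). It assumes the terms are compared in lowercase; the name `UKRAINIAN_ALPHABET` and the variables are illustrative only. Sorting the alphabet by code point moves є, і, ї and ґ after я, which matches what the navigation bar shows for "І".

```python
# Ukrainian alphabet in its conventional order, lowercase (33 letters).
UKRAINIAN_ALPHABET = "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя"

# Plain sorted() compares strings by Unicode code point, which is the
# behaviour suspected of the site's sort step (an assumption here).
by_code_point = sorted(UKRAINIAN_ALPHABET)

# Letters whose code point is higher than that of "я" end up after it.
after_ya = [ch for ch in by_code_point if ord(ch) > ord("я")]

for ch in after_ya:
    # Print each displaced letter with its code point.
    print(ch, f"U+{ord(ch):04X}")

print("".join(by_code_point))
```

For uppercase letters the code points are arranged differently, so the misplacement would look different if the terms were compared in uppercase.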

@froggleston
Contributor

This will be down to how Liquid and Jekyll implement their sort filter.

It seems that they sort by Unicode code point rather than by the natural ordering of the specific language:

Relevant other info:

@froggleston
Contributor

Related to #254

@olexandr-konovalov
Collaborator Author

Would it be possible to provide the correct sort order for each language explicitly (and to require it as part of adding a new language)?
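
As a rough illustration of that idea, here is a Python sketch rather than Liquid. Everything in it is hypothetical: the `ALPHABETS` table, the `make_sort_key` helper, and the sample words, which are not taken from the glossary. Each language supplies its alphabet as a string, and languages without an entry fall back to code point order.

```python
# Hypothetical per-language table: language code -> alphabet in order.
ALPHABETS = {
    "uk": "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя",
}


def make_sort_key(lang):
    """Return a sort key for `lang`; fall back to code point order."""
    alphabet = ALPHABETS.get(lang)
    if alphabet is None:
        return str.lower
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(term):
        # Letters of the alphabet sort by their position in it; any other
        # character (space, digit, Latin letter) sorts after them by
        # code point. That tie-breaking rule is an arbitrary choice.
        return [(0, rank[ch]) if ch in rank else (1, ord(ch))
                for ch in term.lower()]

    return key


terms = ["ядро", "індекс", "аргумент", "змінна"]  # illustrative words only
by_code_point = sorted(terms)
by_alphabet = sorted(terms, key=make_sort_key("uk"))
print(by_code_point)
print(by_alphabet)
```

With this approach, adding a language would mean adding one string to the table, which a native speaker can review without reading the sorting code.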

@froggleston
Contributor

I believe I've implemented this correctly in the PR above, but because I don't know many of the languages we have in the glossary, it's hard for me to verify! @olexandr-konovalov has been kind enough to verify the Ukrainian sorting as correct.

Remaining issues:

  • Portuguese: It has split "A", "Â" and "A" and I'm not sure why. I don't think the split is linguistic, if that makes sense, but rather technical. So I'll go back and check the template rendering process.
  • Amharic: It seems to have produced a long list of "letter" entries which, from my cursory reading about the fascinating language of Ethiopia, are phonetic constructs rather than base letters. Is this the correct thing to do?
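
For the Portuguese case, one possible way to keep accented and unaccented initials under a single heading is to strip diacritics before choosing the heading. The sketch below is Python, not the template code from the PR, and it assumes that Portuguese headings should treat "Â" as "A", as the Stack Exchange link earlier in the thread suggests. The helper names and sample words are hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks, e.g. turn an accented 'a' into plain 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_key(term):
    # Compare on the unaccented form first; use the original as tie-breaker.
    return (strip_accents(term).casefold(), term)


def group_by_letter(terms):
    """Map each unaccented, uppercased initial to its sorted terms."""
    groups = {}
    for term in sorted(terms, key=sort_key):
        heading = strip_accents(term)[0].upper()
        groups.setdefault(heading, []).append(term)
    return groups


# Illustrative words only, not taken from the glossary.
terms = ["árvore", "banco de dados", "algoritmo", "âmbito", "zero"]
groups = group_by_letter(terms)
for heading, members in groups.items():
    print(heading, members)
```

This would not suit every language: where an accented letter is a separate letter of the alphabet, it needs its own heading, so the behaviour would have to be chosen per language.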

@olexandr-konovalov
Collaborator Author

Excellent - thanks @froggleston!


3 participants