An issue on Searching Japanese words. #1875
Comments
#79 discusses the page search, which relies on Chrome's search functionality. Chrome does not support it.
Are my messages currently being deleted? I swear this is the 3rd time I've posted a message here; last time I even attached a video. This issue is getting old. Adding morphology doesn't fix the problem, and even when searching for kanjinized words you get the same results: tons of unrelated entries from all dictionaries in the search group. goldendict_wrong_entries.mp4
が has two representations: as a standalone character, or as a combination of a base letter and a voiced sound mark. Then, per table 8 of the Unicode normalization report, both NFKD and NFKC will merge the various combinations of hiragana (?) letters and voiced sound marks. https://www.unicode.org/reports/tr15/tr15-56.html
So, the root issue appears to be the normalization process, which is considered a feature in some other languages.
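To make the two representations concrete, here is a minimal Python sketch using only the standard `unicodedata` module. It is an illustration of the Unicode behaviour, not goldendict-ng code; the variable and function names are made up for this example.

```python
import unicodedata

# Two encodings of the same visible character が.
precomposed = "\u304c"      # HIRAGANA LETTER GA, a single code point
combined = "\u304b\u3099"   # HIRAGANA LETTER KA + COMBINING VOICED SOUND MARK


def normalized_forms(text):
    """Return the text under each of the four Unicode normalization forms."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


for label, text in (("precomposed", precomposed), ("combined", combined)):
    for form, result in normalized_forms(text).items():
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in result)
        print(f"{label:12} {form:5} {code_points}")
```

The decomposed forms (NFD, NFKD) always yield the base letter followed by the mark, and the composed forms (NFC, NFKC) always yield the single code point, whichever encoding the input used.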
In other words, "we're not fixing this".
Can you provide the dictionary for testing?
Maybe this issue will be solved in the future.
JMDict Furigana, JMDict+: https://jd4gd.com/jmdictplus.html That's 4.
Did it and nothing changed. Also, I've deleted all these dicts from the directory and I still have the same issue with the rest of the dicts.
The fix is changing Unicode normalization strategy from goldendict-ng/src/common/folding.cc Line 25 in dda91a3
The technical reason behind this is shown in this table: https://unicode.org/reports/tr15/#NFKD_And_NFKC_Applied_Table
Not sure what to do, because changing that value will probably break other languages that consider characters with these marks to be the same characters (also not sure if this is true). It may also require all dicts to be reindexed.
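The trade-off can be sketched in Python. This assumes the folding step decomposes text with NFKD and then drops combining marks, which is a guess at the mechanism and not a copy of what `folding.cc` does. Both function names and the `KANA_VOICING_MARKS` set are hypothetical.

```python
import unicodedata

# Combining (semi-)voiced sound marks used by kana: U+3099 and U+309A.
KANA_VOICING_MARKS = {"\u3099", "\u309a"}


def fold_stripping_marks(text):
    """Assumed current behaviour: decompose, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_keeping_kana_marks(text):
    """Hypothetical alternative: drop combining marks except kana voicing marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    kept = "".join(
        ch
        for ch in decomposed
        if not unicodedata.combining(ch) or ch in KANA_VOICING_MARKS
    )
    # Recompose so that か + U+3099 becomes the single code point が again.
    return unicodedata.normalize("NFC", kept)


for word in ("あがる", "あかる", "café"):
    print(word, fold_stripping_marks(word), fold_keeping_kana_marks(word))
```

Under the first function あがる and あかる fold to the same key, which matches the reported symptom. The second keeps them apart while still folding Latin accents, but any change of this kind would alter the folded keys, so existing indexes would presumably need rebuilding, as noted above.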
This has not explained why search |
BTW, you do not have to delete the other dictionaries; just disable them, or enable only the tested dictionary.
Well, the issue is that after removing those dicts, the problem still didn't get fixed. This is how it looks now (only showing three dicts from a bigger list):
An issue on Searching Japanese words.
When searching for Japanese words, there is no distinction between unvoiced consonants (清音) and voiced consonants (濁音).
For example, if you search for the word "Agaru (あがる)", both "Agaru (あがる)" and "Akaru (あかる)" are returned at the same time.
This makes the search results very cluttered.