hanzi.to_pinyin delimiter is ignored #28

Open
glowinthedark opened this issue Aug 1, 2020 · 2 comments

Comments

glowinthedark commented Aug 1, 2020

Summary

The delimiter parameter of hanzi.to_pinyin() has no effect on the output.

Example:

hanzi.to_pinyin("我猕猴桃过敏。", delimiter='.')
# ACTUAL OUTPUT:
#     'wǒmíhóutáoguòmǐn。'

# EXPECTED OUTPUT:
#     'wǒ.míhóutáo.guòmǐn。'

The default delimiter, a single space ' ', is not applied either:

hanzi.to_pinyin("我猕猴桃过敏。")
# ACTUAL OUTPUT:
#     'wǒmíhóutáoguòmǐn。'

# EXPECTED OUTPUT:
#     'wǒ míhóutáo guòmǐn。'
Owner

tsroten commented Aug 1, 2020

Hello @glowinthedark! The delimiter argument of hanzi.to_pinyin() applies to the Chinese character source string, not to the output. It's used to partition the string by words rather than characters, allowing for a more accurate Pinyin reading.

delimiter is the character used to indicate word boundaries in s. This is used to differentiate between words and characters so that a more accurate reading can be returned.
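
To illustrate why word boundaries in the input matter, here is a minimal toy sketch. It is not the library's implementation: the function toy_to_pinyin and both lookup tables are hypothetical and exist only for this example. It shows a character with several readings (行) being read correctly only when the input marks 银行 as one word, and it also shows that the delimiter does not appear in the output.

```python
# Hypothetical toy tables for illustration only; a real library ships
# far larger dictionaries.
CHAR_READINGS = {"我": "wǒ", "去": "qù", "银": "yín", "行": "xíng"}
WORD_READINGS = {"银行": "yínháng"}


def toy_to_pinyin(s, delimiter=" "):
    """Split the INPUT on `delimiter`, then read each chunk as a word if known.

    The delimiter only marks word boundaries in `s`; it is never written
    to the result.
    """
    chunks = s.split(delimiter) if delimiter else list(s)
    out = []
    for chunk in chunks:
        if chunk in WORD_READINGS:
            # Whole-word lookup gives the context-aware reading.
            out.append(WORD_READINGS[chunk])
        else:
            # Fall back to reading character by character.
            out.append("".join(CHAR_READINGS.get(ch, ch) for ch in chunk))
    return "".join(out)


unsegmented = toy_to_pinyin("我去银行")    # no word boundaries given
segmented = toy_to_pinyin("我 去 银行")    # spaces mark word boundaries
print(unsegmented)
print(segmented)
```

In this toy model the unsegmented call falls back to the per-character reading of 行, while the segmented call finds 银行 in the word table.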

Being able to format the output of the function makes sense though 👍

Author

glowinthedark commented Aug 1, 2020

@tsroten: Understood. Thanks for clarifying. But then delimiter is really misleading, because in other libraries delimiter or separator signifies the string used as a delimiter in the generated output, not a hint about the format of the input string. The semantics would better fit a name like input_delimiter or source_delimiter than delimiter.

How, then, can I generate something like 'wǒ míhóutáo guòmǐn。'?
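
One possible workaround, sketched under assumptions: segment the text into words yourself, convert each word separately, and join the results with whatever output separator you want. The helper join_readings, its output_delimiter parameter, and the stand-in converter toy_convert are all hypothetical names made up for this example; none of them are part of the library.

```python
from typing import Callable, Iterable


def join_readings(
    words: Iterable[str],
    convert: Callable[[str], str],
    output_delimiter: str = " ",
) -> str:
    """Convert each pre-segmented word and join the readings.

    `convert` is any callable mapping one word to its Pinyin string.
    """
    return output_delimiter.join(convert(word) for word in words)


# Hypothetical stand-in converter so the sketch runs without any
# third-party package installed.
TOY_READINGS = {"我": "wǒ", "猕猴桃": "míhóutáo", "过敏": "guòmǐn"}


def toy_convert(word: str) -> str:
    # Keep trailing Chinese punctuation attached to the converted word.
    core = word.rstrip("。，！？")
    tail = word[len(core):]
    return TOY_READINGS.get(core, core) + tail


words = ["我", "猕猴桃", "过敏。"]  # segmentation is assumed to be done already
print(join_readings(words, toy_convert))
print(join_readings(words, toy_convert, output_delimiter="."))
```

In principle a real converter could be passed in place of toy_convert, assuming it accepts a single word and returns its reading as a string; how the words get segmented in the first place is left open here.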
