4 minor changes to list of phonemes in na.py #204

Open
alexis-michaud opened this issue Oct 13, 2018 · 1 comment

alexis-michaud commented Oct 13, 2018

(This issue is specific to Yongning Na: preprocessing the XML files)

Taking a (belated) look at persephone/persephone/datasets/na.py, I wonder why the nasal vowels 'ĩ', 'õ', 'ẽ' appear in UNI_PHNS, the set of unitary phonemes ('mono-graphs'): it looks like a (small) mistake. They also appear in BI_PHNS, the set of bi-graphic phonemes composed of 2 symbols ('di-graphs'), where they belong. So I guess they should simply be removed from UNI_PHNS.
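One way to spot such entries is to look for mono-graph entries that decompose into more than one code point, or that also occur in the di-graph set. The sketch below uses small made-up sample sets, not the actual contents of na.py, and the helper name `misplaced_monographs` is hypothetical.

```python
import unicodedata

# Hypothetical sample sets for illustration; NOT the real contents of na.py.
UNI_SAMPLE = {"a", "i", "o", "i\u0303", "o\u0303"}
BI_SAMPLE = {"i\u0303", "o\u0303", "e\u0303"}


def misplaced_monographs(uni, bi):
    """Return mono-graph entries that span several code points (after NFD)
    or that are also listed in the di-graph set."""
    return sorted(
        p for p in uni
        if len(unicodedata.normalize("NFD", p)) > 1 or p in bi
    )


for phoneme in misplaced_monographs(UNI_SAMPLE, BI_SAMPLE):
    # Print the code points so precomposed and decomposed forms can be told apart.
    print(repr(phoneme), [f"U+{ord(c):04X}" for c in phoneme])
```

Decomposing with NFD first means a precomposed nasal vowel (a single code point) is flagged as well as a decomposed one.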

Also, something that's for me to correct: 'ɻ̃' (in BI_PHNS) and "ɻ̩̃" (in TRI_PHNS) need to be merged to 'ɻ̍̃'.
Explanation: occurrences of 'ɻ̃' without the diacritic indicating syllabic status are mistakes, where I was lazy and forgot to add the diacritic. When finalizing the book, I chose to put the syllabicity diacritic above the letter (superscript) rather than below it (subscript), for clarity: it is a convention of the International Phonetic Alphabet that diacritics normally placed below can be placed above when the main character has a descender.

Likewise, in BI_PHNS, 'ɻ̩' and "ɻ̍" need to be merged into just "ɻ̍".
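For text that still carries the old variants, the two merges could be sketched as plain string replacements. This is only an illustration: the function name `merge_retroflex_variants` is hypothetical, and the code assumes the diacritics are stored as separate combining characters, with the syllabic mark preceding the nasal tilde. The actual files may order them differently.

```python
RETROFLEX = "\u027b"       # LATIN SMALL LETTER TURNED R WITH HOOK
SYLLABIC_BELOW = "\u0329"  # COMBINING VERTICAL LINE BELOW
SYLLABIC_ABOVE = "\u030d"  # COMBINING VERTICAL LINE ABOVE
NASAL = "\u0303"           # COMBINING TILDE


def merge_retroflex_variants(text):
    """Rewrite variants of the syllabic retroflex to the 'mark above' form.

    Assumes combining marks directly follow the base letter (an assumption,
    not something verified against the corpus).
    """
    # Move the syllabic mark from below to above (covers the nasal case too).
    text = text.replace(RETROFLEX + SYLLABIC_BELOW, RETROFLEX + SYLLABIC_ABOVE)
    # Restore the syllabic mark where it was omitted before the nasal tilde.
    text = text.replace(RETROFLEX + NASAL, RETROFLEX + SYLLABIC_ABOVE + NASAL)
    return text
```

Text that already uses the target forms passes through unchanged, so the function can be applied repeatedly without side effects.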

These conventions have now been worked into the current version of the online texts (on GitHub), through this commit.

Finally (for now), double coding of "ṽ̩", "ṽ̩" can now hopefully be taken care of through NFC Unicode normalization: Issue #125
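As a minimal sketch of why NFC helps here: two canonically equivalent encodings of the same symbol compare as different strings until both are normalized. The two code point sequences below are assumed for illustration; the encodings actually found in the corpus may differ.

```python
import unicodedata

# Two encodings of a nasalized syllabic "v" (assumed for illustration).
precomposed = "\u1e7d\u0329"  # v-with-tilde as one code point + vertical line below
decomposed = "v\u0303\u0329"  # v + combining tilde + vertical line below


def nfc(text):
    """Apply Unicode NFC normalization."""
    return unicodedata.normalize("NFC", text)


# The raw strings differ, but their NFC forms are identical.
print(precomposed == decomposed)
print(nfc(precomposed) == nfc(decomposed))
```

Normalizing both the phoneme inventory and the transcriptions in the same way should make the duplicate entry unnecessary.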

Collaborator

oadams commented Oct 14, 2018

Thanks for these clarifications. The good news is that because those nasal vowels were in the BI_PHNS set, that takes precedence and they were always treated as such.
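To illustrate the kind of precedence described, here is a greedy longest-match segmenter that tries tri-graphs, then di-graphs, then mono-graphs. It is a sketch under assumptions, not Persephone's actual implementation: the sample inventories and the function name `segment` are hypothetical.

```python
# Hypothetical sample inventories; NOT the real contents of na.py.
UNI_SAMPLE = {"a", "i", "t", "i\u0303"}  # nasal vowel (wrongly) listed here too
BI_SAMPLE = {"i\u0303", "t\u0255"}
TRI_SAMPLE = {"t\u0255\u02b0"}


def segment(text, tri=TRI_SAMPLE, bi=BI_SAMPLE, uni=UNI_SAMPLE):
    """Greedy longest-match segmentation over code points."""
    phonemes = []
    pos = 0
    while pos < len(text):
        # Longer candidates are tried first, so di-graphs win over mono-graphs.
        for size, inventory in ((3, tri), (2, bi), (1, uni)):
            chunk = text[pos:pos + size]
            if chunk in inventory:
                phonemes.append(chunk)
                pos += len(chunk)
                break
        else:
            raise ValueError(f"unknown symbol at position {pos}: {text[pos]!r}")
    return phonemes
```

Because the two-code-point candidate is checked before any single code point, the stray mono-graph entry never gets a chance to match, which is consistent with the nasal vowels always having been treated as di-graphs.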

@oadams oadams added this to the 0.4.0 milestone Oct 14, 2018
2 participants