(This issue is specific to Yongning Na: preprocessing the XML files)
Taking a (belated) look at persephone/persephone/datasets/na.py, I wonder why the nasal vowels 'ĩ', 'õ', 'ẽ' appear in the set of unitary phonemes ('mono-graphs'), UNI_PHNS: it looks like a (small) mistake. They also appear in the set of bi-graphic phonemes, composed of 2 symbols ('di-graphs'), BI_PHNS, where they belong. So I guess they should be removed from UNI_PHNS, and that should be that.
Also, something that's for me to correct: 'ɻ̃' (in BI_PHNS) and "ɻ̩̃" (in TRI_PHNS) need to be merged to 'ɻ̍̃'.
Explanation: cases of 'ɻ̃', without the diacritic indicating syllabic status, are mistakes: cases where I was lazy and forgot to add the diacritic. When finalizing the book, I chose to put the syllabicity diacritic above the letter rather than below it, for clarity. It's a convention of the International Phonetic Alphabet that diacritics that normally go below can be placed on top when the main character has a descender.
Likewise, in BI_PHNS, 'ɻ̩' and "ɻ̍" need to be merged into just "ɻ̍".
These conventions have now been worked into the current version of the online texts (on GitHub), through this commit.
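For older copies of the texts that predate that commit, the merge could be sketched roughly as below. This is only an illustration, not code from na.py: the function name `merge_syllabic_retroflex` is made up, and I am assuming the target spelling is the base letter, then the vertical line above (U+030D), then the tilde (U+0303).

```python
import re

RETROFLEX = "\u027b"        # LATIN SMALL LETTER TURNED R WITH HOOK
NASAL = "\u0303"            # COMBINING TILDE
SYLLABIC_BELOW = "\u0329"   # COMBINING VERTICAL LINE BELOW
SYLLABIC_ABOVE = "\u030d"   # COMBINING VERTICAL LINE ABOVE

# The retroflex approximant followed by one or more of the three marks, in any order.
_MARKED_RETROFLEX = re.compile(
    RETROFLEX + "[" + NASAL + SYLLABIC_BELOW + SYLLABIC_ABOVE + "]+"
)


def merge_syllabic_retroflex(text: str) -> str:
    """Rewrite every marked variant to the 'syllabic mark above' spelling.

    A tilde with no syllabic mark is treated as a forgotten diacritic,
    so it also receives the mark above. A bare retroflex approximant
    with no combining marks is left untouched.
    """
    def _rewrite(match: re.Match) -> str:
        marks = match.group(0)[1:]
        # Assumed target order: base, vertical line above, then tilde if nasal.
        return RETROFLEX + SYLLABIC_ABOVE + (NASAL if NASAL in marks else "")

    return _MARKED_RETROFLEX.sub(_rewrite, text)
```

With this in place, BI_PHNS would only need the mark-above form and TRI_PHNS only the mark-above-plus-tilde form.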
Finally (for now), double coding of "ṽ̩", "ṽ̩" can now hopefully be taken care of through NFC Unicode normalization: Issue #125
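A quick way to check that NFC does collapse the two codings, using Python's standard `unicodedata` module. The two code-point sequences below are my assumption about what the double coding looks like (precomposed v-with-tilde plus the syllabic mark, versus plain v plus two combining marks); the actual sequences in the files may differ.

```python
import unicodedata

# Assumed coding 1: precomposed v-with-tilde (U+1E7D) + vertical line below (U+0329)
precomposed = "\u1e7d\u0329"
# Assumed coding 2: plain "v" + combining tilde (U+0303) + vertical line below (U+0329)
decomposed = "v\u0303\u0329"


def to_nfc(text: str) -> str:
    """Return the NFC-normalized form of text."""
    return unicodedata.normalize("NFC", text)


# The raw strings differ, so a set or dict lookup treats them as two phonemes.
print(precomposed == decomposed)
# After NFC they are the same code-point sequence.
print(to_nfc(precomposed) == to_nfc(decomposed))
```

The normalization would have to be applied both to the transcriptions and to the phoneme sets themselves, otherwise the lookups could still miss.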
Thanks for these clarifications. The good news is that because those nasal vowels were in the BI_PHNS set, that takes precedence and they were always treated as such.
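To illustrate why the duplicate entries were harmless, here is a minimal sketch of longest-match-first segmentation. It is not the actual code in na.py: the `segment` function and the tiny inventories are hypothetical, and the nasal vowel is assumed to be stored as a base vowel plus a combining tilde (2 code points).

```python
# Toy inventories for illustration only; the real sets in na.py are much larger.
TRI_PHNS = set()
BI_PHNS = {"i\u0303", "o\u0303", "e\u0303"}        # vowel + combining tilde
UNI_PHNS = {"m", "i", "o", "e", "i\u0303"}         # last entry is the stray duplicate


def segment(text: str) -> list:
    """Split text into phonemes, always trying the longest inventory first."""
    phonemes = []
    pos = 0
    while pos < len(text):
        for size, inventory in ((3, TRI_PHNS), (2, BI_PHNS), (1, UNI_PHNS)):
            chunk = text[pos:pos + size]
            if len(chunk) == size and chunk in inventory:
                phonemes.append(chunk)
                pos += size
                break
        else:
            # No inventory matched at this position.
            raise ValueError(f"unknown symbol at position {pos}: {text[pos]!r}")
    return phonemes


# "m" followed by nasal i: the 2-symbol match wins before UNI_PHNS is consulted.
print(segment("mi\u0303"))
```

Because the 2-symbol lookup runs before the 1-symbol lookup, the stray entry in the toy UNI_PHNS is never reached, so removing it changes nothing in the output.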