"Compound" neumes are incorrectly labeled by the MEI parser #910

Closed
dchiller opened this issue Sep 23, 2024 · 1 comment · Fixed by #913

dchiller commented Sep 23, 2024

This gets classified as a "compound" neume:
[image]
It's just a punctum (or a neume component with no neume name).

dchiller commented

Here's the neume in question:
[image]

The problem is in the MEI tokenizer, in this area:

    if neume_start and largest_num_neumes < self.max_ngram:
        min_wanted_ngram_length = max(largest_num_neumes + 1, self.min_ngram)
        for wanted_ngram_length in range(
            min_wanted_ngram_length, self.max_ngram + 1
        ):
            ngram_neume_names: List[NeumeName] = []
            ngram_num_pitches = 0
            # We'll add pitches to our ngram until we have the
            # number of neumes we want in our ngram or we reach
            # the end of the file.
            while (len(ngram_neume_names) <= wanted_ngram_length) and (
                start_idx + ngram_num_pitches < len(pitches)
            ):
                if (
                    name_at_pitch := neume_names[start_idx + ngram_num_pitches]
                ) is not None and len(ngram_neume_names) < wanted_ngram_length:
                    ngram_neume_names.append(name_at_pitch)
                ngram_num_pitches += 1
                if len(ngram_neume_names) == wanted_ngram_length:
                    break
            # We'll only add this ngram if we've actually gotten to
            # the desired number of neumes (if we didn't, it means
            # we reached the end of the file)
            if len(ngram_neume_names) == wanted_ngram_length:
                ngram_pitches = pitches[
                    start_idx : start_idx + ngram_num_pitches
                ]
                doc = self._create_document_from_neume_components(ngram_pitches)
                doc["neume_names"] = "_".join(ngram_neume_names)
                ngram_docs.append(doc)

Here we only add the pitch at the beginning of the neume with the neume name we've collected on line 208.
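To make the behaviour easier to see, here is a minimal standalone sketch of the loop quoted above. It assumes `neume_names` holds a name at the first pitch of each neume and `None` at every following pitch of that neume, which is what the `is not None` check suggests. The function names `span_as_quoted` and `span_whole_neumes` are hypothetical and not part of the tokenizer. The second function is only one possible way to keep the last neume whole, and is not necessarily the change made in #913.

```python
from typing import List, Optional, Tuple


def span_as_quoted(
    neume_names: List[Optional[str]], start_idx: int, wanted: int
) -> Tuple[List[str], int]:
    """Hypothetical helper mirroring the quoted loop.

    Returns the collected neume names and the number of pitches consumed.
    """
    names: List[str] = []
    num_pitches = 0
    while len(names) <= wanted and start_idx + num_pitches < len(neume_names):
        name = neume_names[start_idx + num_pitches]
        if name is not None and len(names) < wanted:
            names.append(name)
        num_pitches += 1
        # Breaks as soon as the last name is collected, i.e. right after
        # the FIRST pitch of the last neume.
        if len(names) == wanted:
            break
    return names, num_pitches


def span_whole_neumes(
    neume_names: List[Optional[str]], start_idx: int, wanted: int
) -> Tuple[List[str], int]:
    """Hypothetical alternative: keep consuming pitches until the next
    neume starts (or the input ends), so the last neume stays whole."""
    names: List[str] = []
    num_pitches = 0
    while start_idx + num_pitches < len(neume_names):
        name = neume_names[start_idx + num_pitches]
        if name is not None:
            if len(names) == wanted:
                break  # a new neume begins here; stop before it
            names.append(name)
        num_pitches += 1
    return names, num_pitches


# Assumed layout: a three-pitch "compound" neume followed by a punctum.
example = ["compound", None, None, "punctum"]
quoted_result = span_as_quoted(example, 0, 1)
whole_result = span_whole_neumes(example, 0, 1)
```

With this input, the quoted loop consumes a single pitch but still labels it "compound", which matches the neume shown in the image. The alternative consumes all three pitches of the compound neume. As in the original code, the caller would still need to check that the number of collected names equals the wanted n-gram length before using the result.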
