Avoid a new line break at the hyphen in "Co-op" in labels #5028

Open
pkoby opened this issue Oct 26, 2024 · 3 comments

@pkoby

pkoby commented Oct 26, 2024

Expected behavior

For such a short word, I would not expect "Co-op" to be broken across two lines, especially given the weak vowel in the unstressed second syllable. I think it should be kept as one word, if an exception to the rule could be made. I couldn't find where that would happen in the code, or I would have looked into it myself.

Actual behavior

It is split after the hyphen in some instances.

Screenshots with links illustrating the problem

See https://www.openstreetmap.org/#map=18/40.053459/-75.171885
(screenshot)
Also here: https://www.openstreetmap.org/node/11260100599#map=19/40.046520/-75.195960
And here: https://www.openstreetmap.org/node/2760113374#map=19/40.075227/-75.205517

@imagico
Collaborator

imagico commented Oct 26, 2024

Implementing line breaks correctly across multiple languages is a hard problem, especially since we use the same Unicode characters in different languages where different line breaking conventions apply.

The most reliable way to ensure correct line breaks is for mappers to differentiate in tagging between hyphens that allow a line break and those that do not. Quoting from the Unicode Line Break Algorithm, which Mapnik uses by default:

The rules for treating hyphens in line breaking vary by language. In many instances, these rules are not supported as such in the algorithm, but the correct appearance can be realized by using a non-breaking hyphen.

If there is no consensus to document this differentiation in the tagging of names (and you could indeed argue against that), other solutions would be to

  • maintain a dictionary of words in which the hyphen should not allow a line break, and substitute such hyphens with non-breaking hyphens in data processing.
  • maintain a set of per-language rules that substitute hyphens with non-breaking hyphens based on context (like the short-word aspect here). This requires knowledge of the language of the name tag (which we could also use otherwise, see "What about Han unification?" #2208).
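
To make the two options above concrete, here is a minimal sketch of what such a substitution step could look like. It is not taken from this project's code: the function name `protect_hyphens`, the exception list, and the per-language length limit are all hypothetical and only illustrate the idea of swapping U+002D for the non-breaking hyphen U+2011 before the text reaches the renderer.

```python
import re

NON_BREAKING_HYPHEN = "\u2011"

# Hypothetical exception dictionary (option 1): words, compared
# case-insensitively, whose hyphen should never allow a line break.
NO_BREAK_WORDS = {"co-op"}

# Hypothetical per-language rule (option 2): hyphenated words no longer
# than this many characters are kept on one line.
SHORT_WORD_LIMIT = {"en": 5}

# Matches runs of word characters joined by plain hyphens, e.g. "Co-op".
_HYPHENATED = re.compile(r"\w+(?:-\w+)+")


def protect_hyphens(name, lang=None):
    """Return name with selected hyphens replaced by U+2011."""
    limit = SHORT_WORD_LIMIT.get(lang, 0)

    def _swap(match):
        word = match.group(0)
        if word.casefold() in NO_BREAK_WORDS or len(word) <= limit:
            return word.replace("-", NON_BREAKING_HYPHEN)
        return word

    return _HYPHENATED.sub(_swap, name)
```

With this sketch, `protect_hyphens("Weavers Way Co-op")` would protect the hyphen through the dictionary, while a longer name such as "Stratford-upon-Avon" is left untouched, so the line breaking algorithm can still break it. The length rule only applies when the language of the name is passed in, which mirrors the point that option 2 needs to know the language.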

@imagico imagico added the text label Oct 26, 2024
@pkoby
Author

pkoby commented Oct 27, 2024

I had no idea that a non-breaking hyphen was an option (or even existed), and I have no problem using it for my uses. However, it seems like a non-breaking hyphen might not be recognized as a hyphen in searches. Thoughts on that?
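
To illustrate the search concern: U+2011 (non-breaking hyphen) and U+002D (hyphen-minus) are different code points, so a plain string comparison treats "Co‑op" and "Co-op" as different names. A search tool would have to fold the variants together before matching. The sketch below shows one possible way to do that; the function name `search_key` is hypothetical, and whether any particular search service does something like this is not established in this thread.

```python
# Map hyphen variants to the plain ASCII hyphen-minus before comparing.
# U+2010 is HYPHEN, U+2011 is NON-BREAKING HYPHEN.
_HYPHEN_FOLD = str.maketrans({"\u2010": "-", "\u2011": "-"})


def search_key(text):
    """Return a case-insensitive key with hyphen variants folded."""
    return text.translate(_HYPHEN_FOLD).casefold()


tagged = "Co\u2011op"   # name tagged with a non-breaking hyphen
typed = "co-op"         # what a user is likely to type

print(tagged.casefold() == typed)            # raw comparison
print(search_key(tagged) == search_key(typed))  # folded comparison
```

Note that Unicode NFKC normalization alone would not be enough here, since it maps U+2011 to U+2010 rather than to the ASCII hyphen-minus.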

@imagico
Collaborator

imagico commented Oct 27, 2024

As I said, there are valid arguments against differentiating that in tagging. I am not aware that this has ever been discussed more broadly in the mapper community.
