Backup font for missing characters when drawing text #4808

Open
Markxy opened this issue Jul 22, 2020 · 30 comments · May be fixed by #6926

Comments

@Markxy

Markxy commented Jul 22, 2020

Description of the request

I want to use a specific font but also draw characters that are not available in that font, replacing them with characters from another "generic" font, such as Arial-Bold.
This is similar to how web browsers handle characters missing from a font: the OS fills the gap from a default font.

from PIL import Image, ImageFont, ImageDraw


temp_canvas = Image.new("RGBA", (1200, 300), (255, 255, 255, 255))
draw_canvas = ImageDraw.Draw(temp_canvas, "RGBA")

font = ImageFont.truetype(r"C:\fonts\BarlowSemiCondensed-Bold.ttf", size=150)
text_string = "hello ಠಠ world"

draw_canvas.text((100, 100), text_string, fill="#000000", font=font)

temp_canvas.show()

The output is:
image

Proposed solution

Add an argument such as backup_font to ImageDraw.Draw().text(), which would be used when the first font doesn't have a character specified in text_string. Like in this example.

@nulano
Contributor

nulano commented Jul 22, 2020

I don't think FreeType (the library Pillow uses to load fonts) is able to combine multiple typefaces into one. This change would be simple with basic layout (Pillow can try loading each glyph from a list of fonts until one succeeds), but Raqm (used for complex scripts) seems to require passing a single typeface to be used for the whole string. So for complex layout this would require a change in Raqm, either upstream or by including a vendored change in Pillow.

The latter could help with some of the distribution issues that have appeared in the past (the dynamic loading would be moved to FriBiDi; users would only need to install LGPL-licensed FriBiDi to enable complex text, not Raqm which is sometimes available in an outdated and unsupported version in some linux distributions). It could also potentially help #3066. Edit: #3066 is now fixed by including a vendored build of Raqm in binary wheels.

As for the API to use here, I would propose ImageFont.truetype_family(font1, font2, font3, ...) to create a compound font, which would get special handling in the C code.

@LateusBetelgeuse

This can be especially helpful with emojis. Many emoji fonts contain only emojis, and the ones that also contain regular glyphs have styles that can break the image/poster paradigm one wants to achieve. However, this can be tricky because all the color emoji fonts I've tested so far require a size of exactly 109, which would require super-sampling followed by smooth resizing.
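A minimal sketch of the super-sample-then-resize idea, assuming a color emoji font that only renders at a fixed size (109 here, taken from the comment above). The names `fit_size`, `render_emoji`, and `native_size` are hypothetical helpers, not part of Pillow, and the font path is whatever emoji font you have available.

```python
def fit_size(src_size, target_height):
    """Scale (width, height) to target_height, keeping the aspect ratio."""
    width, height = src_size
    scale = target_height / height
    return max(1, round(width * scale)), target_height


def render_emoji(font_path, emoji, target_height, native_size=109):
    """Render `emoji` at the font's fixed size, then downscale smoothly.

    Assumes the font contains a glyph for `emoji`; an empty bounding box
    is not handled in this sketch.
    """
    # Imported here so fit_size() stays usable without Pillow installed.
    from PIL import Image, ImageDraw, ImageFont

    font = ImageFont.truetype(font_path, native_size)
    left, top, right, bottom = font.getbbox(emoji)

    # Draw onto a transparent canvas cropped to the glyph's bounding box.
    glyph = Image.new("RGBA", (right - left, bottom - top), (0, 0, 0, 0))
    ImageDraw.Draw(glyph).text((-left, -top), emoji, font=font, embedded_color=True)

    # LANCZOS is a high-quality filter suited to downscaling.
    return glyph.resize(fit_size(glyph.size, target_height), Image.Resampling.LANCZOS)
```

The returned RGBA image can then be pasted onto the target image with itself as the mask, for example `img.paste(glyph, (x, y), glyph)`.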

@voussoir

Hi, I'm interested in this problem too.

At the moment I'm using a workaround based on this stackoverflow answer.

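import logging
import unicodedata

from fontTools.ttLib import TTFont

log = logging.getLogger(__name__)

def has_glyph(font, glyph):
    # True if any cmap subtable maps this character's code point to a glyph.
    for table in font['cmap'].tables:
        if ord(glyph) in table.cmap:
            return True
    log.debug('%s does not have %s', font, glyph)
    return False

def remove_control_characters(text):
    # Stand-in (an assumption) for the stringtools helper used originally,
    # which is not shown here: drop Unicode control characters (category Cc).
    return ''.join(c for c in text if unicodedata.category(c) != 'Cc')

def determine_font(text):
    # Return the first font file that covers every character in the text.
    text = remove_control_characters(text)
    font_options = [
        'C:\\Windows\\Fonts\\NotoSansKR-Bold.otf',
        'C:\\Windows\\Fonts\\NotoSansSC-Bold.otf',
        'C:\\Windows\\Fonts\\NotoSansJP-Bold.otf',
    ]
    for font_name in font_options:
        font = TTFont(font_name)
        if all(has_glyph(font, c) for c in text):
            return font_name
    raise Exception(f'No suitable font for {text}.')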

However, this still doesn't work when none of the fonts contains all of the glyphs. I learned that Noto Arabic doesn't contain the ASCII letters!

I don't know anything about how web browsers or file explorers handle font stacking, but it would be great if we could get some of that behavior by default in PIL. Anyway just thought I'd share that snippet.
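One way around the "no single font covers everything" limitation is to choose a font per character instead of per string. The sketch below is only an illustration of that idea: `load_codepoints` and `assign_fonts` are hypothetical names, and the coverage sets are assumed to come from fontTools' `getBestCmap()`.

```python
def load_codepoints(font_path):
    """Return the set of Unicode code points a font file maps to glyphs."""
    # Imported here so assign_fonts() can be used without fontTools.
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    try:
        # getBestCmap() returns a {code point: glyph name} dict, or None.
        return set(font.getBestCmap() or {})
    finally:
        font.close()


def assign_fonts(text, coverage):
    """Pair each character with the first font that covers it.

    coverage: list of (font_name, set_of_code_points), in order of preference.
    Characters that no font covers are paired with None.
    """
    result = []
    for char in text:
        match = next((name for name, cps in coverage if ord(char) in cps), None)
        result.append((char, match))
    return result
```

A character paired with `None` is one that would still render as the "missing glyph" box, so the result doubles as a coverage report for a given font list.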

@khaledmsm

After searching the internet, I've got a workaround by merging font files into a single font file.

By reading merge_fonts.py, I think the core code about merging font files in python is the following (you may have to install fontTools).

from fontTools import ttLib, merge

def make_font(font_list, output_to):
    merger = merge.Merger()
    font = merger.merge(font_list)
    metrics = read_line_metrics(ttLib.TTFont(font_list[0]))
    set_line_metrics(font, metrics)
    font.save(output_to)
    font.close()

Do you have the actual code? This one isn't clear.

@bai-yi-bai

bai-yi-bai commented Jan 23, 2023

I apologize for replying to a closed issue, but I also struggled with finding a merged monospace notosans font file. Pillow's documentation could be improved by providing some hints on how to troubleshoot font issues... I may write an article on this, but I am not an expert on how fonts work, how they are stored, or how Pillow uses them, nor do I expect to invest the time now that I solved my issue.

Many of the Noto Sans fonts contain only the minimum set of glyphs needed to support a specific language. For example, NotoSansThai-Regular.ttf doesn't contain any ASCII characters, such as punctuation marks. This results in Pillow adding the 'missing character' glyph to an image. I thought this had something to do with not having libraqm installed correctly until I checked the character map utility (based on the linked issue) and discovered the glyphs weren't there.

In addition, the built-in nototools scripts merge_fonts.py/merge_noto.py seem to be broken in their current state, resulting in an error being raised (see the bottom of this post).

Here are the steps I was able to use to successfully merge ~20 fonts together into one file:

  1. Manually clone the nototools project:
    https://github.com/googlefonts/nototools.git
    This is a large repo, with 1,634 .ttf files.
  2. Set up your Python environment to run nototools (venv/requirements.txt)
  3. Move the undesired .ttf files out of the root directory. I started from scratch by moving all .ttf files to /backup_moved_fonts_from_root
  4. Move the fonts you want to merge into the /root folder.
  5. Create a script merge_noto_diy.py with this content:
from fontTools import ttLib, merge
from nototools.substitute_linemetrics import read_line_metrics, set_line_metrics
import os
font_list = []
for a_file in os.listdir(os.getcwd()):
    if a_file.endswith('.ttf'):
        font_list.append(a_file)
        print(a_file)

def make_font(font_list, output_to):
    merger = merge.Merger()
    font = merger.merge(font_list)
    metrics = read_line_metrics(ttLib.TTFont(font_list[0]))
    set_line_metrics(font, metrics)
    font.save(output_to)
    font.close()

make_font(font_list=font_list, output_to='NotoSansCombined-Regular.ttf')
  6. Run the script.

I suggest adding fonts one-by-one to make sure the process completes successfully. I mention this because I had trouble with NotoSansGurmukhi-Regular.ttf and NotoSansThaana-Regular.ttf. Python returned this error:

AssertionError: Expected all items to be equal: [1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 2048, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000]

I will have to scour the notofont files to see if I can find suitable replacements for these two fonts.
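The values in that AssertionError (all 1000 except one 2048) look like each font's unitsPerEm, which the merger appears to require to be identical across inputs; that reading is an assumption. Under it, a quick pre-check can point at the odd font before merging. `read_units_per_em` and `find_mismatched` are hypothetical helper names.

```python
from collections import Counter


def read_units_per_em(font_paths):
    """Map each font path to the unitsPerEm value in its 'head' table."""
    # Imported here so find_mismatched() can be used without fontTools.
    from fontTools.ttLib import TTFont

    result = {}
    for path in font_paths:
        font = TTFont(path)
        result[path] = font["head"].unitsPerEm
        font.close()
    return result


def find_mismatched(units_per_em):
    """Return the fonts whose unitsPerEm differs from the most common value."""
    if not units_per_em:
        return {}
    common, _ = Counter(units_per_em.values()).most_common(1)[0]
    return {name: upem for name, upem in units_per_em.items() if upem != common}
```

Running `find_mismatched(read_units_per_em(font_list))` before `make_font` would list the files to leave out or rescale first.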

The only other tips I would like to share are that I was also able to combine the JetBrainsMono-Regular.ttf font with 19 notofonts files to produce a decent monospace font with a large coverage (I will eventually post my project to github demonstrating why I needed a font(s) with 6,000+ glyphs, check my profile), but I could not combine it with the ChineseJapaneseKorean font sarasa-fixed-cl-regular.ttf.

@Yay295
Contributor

Yay295 commented Jan 23, 2023

There's actually an issue on the Noto Fonts GitHub about this: notofonts/noto-fonts#167

tl;dr

The technical limit is in the font format. That's one reason there isn't a single font with everything, but another reason is that it would be a very big file which would be slow to work with. And most people only need a couple of these fonts so it makes more sense to keep them separate from a design/production standpoint as well as a delivery/use standpoint.

The font format technical limit (65,535 glyphs in one file) is probably why you can't combine sarasa-fixed-cl-regular.ttf with anything.

So being able to use fallback fonts definitely seems like the better option.
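To see ahead of time whether a merge can fit under that limit, the glyph counts of the input fonts can be added up. This is a rough sketch: the sum is only an estimate, since a merger may drop or share some glyphs, and `count_glyphs` and `fits_in_one_file` are hypothetical helper names.

```python
MAX_GLYPHS = 65535  # glyph limit of a single font file, as quoted above


def count_glyphs(font_path):
    """Return the number of glyphs declared in a font's 'maxp' table."""
    # Imported here so fits_in_one_file() can be used without fontTools.
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    count = font["maxp"].numGlyphs
    font.close()
    return count


def fits_in_one_file(glyph_counts):
    """Return (fits, total) for the combined glyph count of several fonts."""
    total = sum(glyph_counts)
    return total <= MAX_GLYPHS, total
```

For example, `fits_in_one_file(count_glyphs(p) for p in font_list)` gives a quick yes/no before attempting a long merge.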

@bai-yi-bai

Thank you for linking me to that issue, I saw it before and knew there was a limitation on the number of glyphs in a font file. I was able to use pyftsubset and glyphhanger to reduce the size of the font files I need.

Back to the main topic, I agree having a backup font solution in Pillow would be preferable. Being able to provide a list of fonts in order of preference would be even better [highest, lowest].

Are there any proposals on how to build this functionality? I don't see any follow-up on this comment: #4808 (comment)

Looking around in the Pillow source code, I cannot determine how the glyph/ideograph not found character U+25A1 is rendered to be the 'fallback' when a glyph doesn't exist in a given font file. For example, when ImageDraw.py textbbox tries to generate a bitmap, does it use the default font built into Pillow from ImageFont load_default to generate this glyph? This contains a font encoded in base64.
I guess my question is at what point in the process could this 'fallback to U+25A1' code be expanded to perform a search (try/except) statement on each provided font file? Or alternatively, could multiple instances of the class FreeTypeFont be combined together to provide greater coverage?

@nulano
Contributor

nulano commented Jan 24, 2023

Pillow does not currently handle fallback at all; it only uses FreeType's default behaviour: if a font is missing support for some Unicode code point, it is rendered using the font's "missing glyph", which is usually a rectangle or question mark.

To add fallback font support in Pillow, two things are required:

  • Detect when FreeType returns a "missing glyph" and figure out which font can be used instead,
  • Somehow decide which parts of the input string should use which font (AFAIK web browsers split text at word boundaries. Edit: at least Chromium and Firefox seem to be splitting by clusters).

This is not too difficult for basic layout, but basic layout is not very good for non-English text, where fallback fonts are most useful. However, detecting which characters are not supported with Raqm layout is more tricky because complex text layout can reorder or even completely replace the input characters.

I am not aware of anyone currently working on this, feel free to implement it and open a pull request.

@owocado

owocado commented Jan 24, 2023

Thanks for the insightful response. Though until Pillow supports this feature in future, I am temporarily using https://github.com/nathanielfernandes/imagetext-py which adds fallback fonts and wraps around Pillow, if this helps anyone in finding a temporary solution. 👍

@nulano nulano linked a pull request Feb 2, 2023 that will close this issue
4 tasks
@nulano
Contributor

nulano commented Feb 2, 2023

I needed this functionality myself yesterday so I've created a proof-of-concept implementation in #6926 (I've included a few sample pictures there).

Edit: For the OP, the result is:

from PIL import Image, ImageFont, ImageDraw


temp_canvas = Image.new("RGBA", (1200, 300), (255, 255, 255, 255))
draw_canvas = ImageDraw.Draw(temp_canvas, "RGBA")

font = ImageFont.truetype(r"C:\Users\Nulano\AppData\Local\Microsoft\Windows\Fonts\BarlowSemiCondensed-Bold.ttf", size=150)
backup_font = ImageFont.truetype("Nirmala.ttf", size=150)
font_family = ImageFont.FreeTypeFontFamily(font, backup_font)

text_string = "hello ಠಠ world"

draw_canvas.text((100, 100), text_string, fill="#000000", font=font_family)

temp_canvas.show()
temp_canvas.save("E:\\4808.png")

4808

@nissansz

nissansz commented Jul 16, 2023

@nulano How to install this module?

AttributeError: module 'PIL.ImageFont' has no attribute 'FreeTypeFontFamily'

@radarhere
Member

An answer to the above question of how to install the proof-of-concept can be found at #6926 (comment)

@pengzhendong

@nulano How to install this module?

AttributeError: module 'PIL.ImageFont' has no attribute 'FreeTypeFontFamily'

The PR is not merged yet.

@TheWalkingSea

I solved this using masks

from PIL import Image, ImageFont, ImageDraw

def getEmojiMask(font: ImageFont, emoji: str, size: tuple[int, int]) -> Image:
    """ Makes an image with an emoji using AppleColorEmoji.ttf, this can then be pasted onto the image to show emojis
    
    Parameter:
    (ImageFont)font: The font with the emojis (AppleColorEmoji.ttf); Passed in so font is only loaded once
    (str)emoji: The unicoded emoji
    (tuple[int, int])size: The size of the mask
    
    Returns:
    (Image): A transparent image with the emoji
    
    """

    mask = Image.new("RGBA", (160, 160), color=(255, 255, 255, 0))
    draw = ImageDraw.Draw(mask)
    draw.text((0, 0), emoji, font=font, embedded_color=True)
    mask = mask.resize(size)

    return mask

def getDimensions(draw: ImageDraw, text: str, font: ImageFont) -> tuple[int, int]:
    """ Gets the size of text using the font
    
    Parameters:
    (ImageDraw): The draw object of the image
    (str)text: The text you are getting the size of
    (ImageFont)font: The font being used in drawing the text
    
    Returns:
    (tuple[int, int]): The width and height of the text
    
    """
    left, top, right, bottom = draw.multiline_textbbox((0, 0), text, font=font)
    return (right-left), (bottom-top)

def addEmojis(img: Image, text: str, box: tuple[int, int], font: ImageFont, emojiFont: ImageFont) -> None:
    """ Adds emojis to the text
    
    Parameters:
    (Image)img: The image to paste the emojis onto
    (tuple[int, int])box: The (x,y) pair where the textbox is placed
    (ImageFont)font: The font of the text
    (ImageFont)emojiFont: The emoji's font
    
    """
    draw = ImageDraw.Draw(img)
    width, height = box
    # Now add any emojis that weren't embedded correctly
    text_lines = text.split("\n")
    for i, line in enumerate(text_lines):
        for j, char in enumerate(line):
            if (not char.isascii()):
                
                # Get the height of the text ABOVE the emoji
                aboveText = "\n".join(text_lines[:i])
                _, aboveTextHeight = getDimensions(draw, aboveText, font)

                # The height that we paste at is aboveTextHeight + height + (Some error)
                y = aboveTextHeight + height + 5

                # Get the length of the text on the line up to the emoji
                beforeLength, _ = getDimensions(draw, line[:j], font)

                # The x position is beforeLength + width
                x = width + beforeLength

                # Create the mask; You might want to adjust the size parameter
                emojiMask = getEmojiMask(emojiFont, char, (30, 30))

                # Paste the mask onto the image
                img.paste(emojiMask, (int(x), int(y)), emojiMask)

The code above pastes the emojis onto the image; you can copy and paste it.

To use it:

img = Image.new("RGB", (200, 200), (255, 255, 255))

font = ImageFont.truetype("./fonts/Poppins-Regular.ttf", 25)

# Ref: https://github.com/samuelngs/apple-emoji-linux/releases
emojiFont = ImageFont.truetype(r"fonts\AppleColorEmoji.ttf", 137)

draw = ImageDraw.Draw(img)
draw.text((0, 0), "Hello \U0001f4a4", fill=(0, 0, 0), font=font)

addEmojis(img,  "Hello \U0001f4a4", (0, 0), font, emojiFont)
img.show()

@TrueMyst

Hey @Markxy @nulano

I've worked on something similar that fixes this issue. You can check it out here. It's efficient and works really well out of the box.

I don't really like imagetext-py, since it cannot handle other languages that well. You can correct me if I'm wrong. The one I made can easily be used with Pillow.

Due to the lack of features, I made this tool for my project.

Though my tool contains bugs, I do intend to fix them as soon as possible.

Let me know your feedback!
Cheers ❤️

@TrueMyst

@aclark4life @Markxy @nulano

It seems like I've found an easier way to fix this issue. Right now I'm using a language model which really complicates it. I'll let you know if it works!

Cheers ❤️

@TrueMyst

@aclark4life @nulano @Markxy Update time!

Hey everyone, I just want to let you know that I did end up finding a good solution that doesn't use a language model.

This time, I'm using fontTools.

Here is how it works: in writing.py, the load_fonts function loads the font files specified by their paths into memory, storing them as font objects in a dictionary.

Next, the has_glyph function checks whether a given font contains a glyph for a specified character.

Then, the merge_chunks function optimizes font lookup by merging consecutive characters that use the same font into clusters, with the help of the has_glyph function. Finally, the draw_text_v2 function uses these fonts to draw text on an image.
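The chunk-merging step can be sketched in a few lines. This is not the project's actual code, only an illustration of the idea under the assumption that some `pick_font` callable already maps a character to a font; `chunk_by_font` is a hypothetical name.

```python
from itertools import groupby


def chunk_by_font(text, pick_font):
    """Group consecutive characters that resolve to the same font.

    pick_font: callable taking one character and returning a font key
    (any hashable value, e.g. a font name or path).
    Returns a list of (font_key, substring) pairs in text order.
    """
    return [(font, "".join(chars)) for font, chars in groupby(text, key=pick_font)]
```

Each resulting chunk can then be drawn with a single `draw.text` call, advancing the x position by the chunk's measured width, so the number of draw calls depends on font changes rather than on text length.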

I've renamed the functions so that they don't conflict with Pillow's.

As for the time it takes to render text on the image, here you go.

Current Solution:
image

Previous Solution (If you use the entire language model):
image

They both give out the same result.
image

The code looks fairly simple, and it is heavily inspired by @nulano's proof of concept.

from PIL import Image, ImageDraw
from fontfallback import writing

text_0 = """
My time - Bo en
おやすみ おやすみ
Close your, eyes and you'll leave this dream
おやすみ おやすみ
I know that it's hard to do
"""

text_1 = """
English Text: That's amazing
Arabic Text: هذا مذهل
Korean Text: 그 놀라운
Chinese Simplified: 太棒了
Japanese: すごいですね
"""

fonts = writing.load_fonts(
    "./fonts/Oswald/Oswald-Regular.ttf",
    "./fonts/NotoSansJP/NotoSansJP-Regular.ttf",
    "./fonts/NotoSansKR/NotoSansKR-Regular.ttf",
    "./fonts/NotoSansSC/NotoSansSC-Regular.ttf",
    "./fonts/NotoSansArabic/NotoSansArabic-Regular.ttf",
)

image = Image.new("RGB", (500, 350), color=(255, 255, 255))
draw = ImageDraw.Draw(image)

writing.draw_multiline_text_v2(draw, (40, 10), text_0, (0, 0, 0), fonts, 20)
writing.draw_multiline_text_v2(draw, (40, 150), text_1, (0, 0, 0), fonts, 20)

image.show()

I tried to optimize it as much as I can, but if you have any good suggestions, let me know.
I hope you're happy with the results, you can check it out here. PillowFontFallBack

Cheers ❤️

@nissansz

which version pillow to use for above script?

@TrueMyst

which version pillow to use for above script?

Latest Release :))

@nissansz

dev. version pillow? or any version?

@TrueMyst

dev. version pillow? or any version?

the one on pypi, that'll work :))

@TrueMyst

@aclark4life @nissansz Does it work properly?

@nissansz

I still use pillow dev 10.4

@TrueMyst

TrueMyst commented Apr 21, 2024

I still use pillow dev 10.4

It doesn't matter, the script can be run separately

@nulano
Contributor

nulano commented Apr 21, 2024

@TrueMyst I haven't tried it, but I'm not sure if your approach will work with composed glyphs. For example, country flag emoji are composed of two unicode code points which render as a single glyph. Also, if you try to make it compatible with the current ImageDraw API (by implementing an object with a getmask2 method), I expect you'll run into the same issue that ultimately made me stop working on it - limitations in the current line spacing calculation caused by the current API.
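To illustrate the composed-glyph concern: before choosing a font, the text can be split into clusters so that sequences such as flag emoji stay together. The sketch below is a deliberately simplified approximation, not the full Unicode segmentation rules; it only handles combining marks, variation selectors, skin-tone modifiers, zero-width joiners, and regional-indicator pairs. `split_clusters` and its helpers are hypothetical names.

```python
import unicodedata

ZWJ = "\u200d"  # zero-width joiner, glues emoji into one sequence


def is_regional_indicator(ch):
    """True for the letters that pair up to form flag emoji."""
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def is_extender(ch):
    """True for code points that attach to the preceding character."""
    cp = ord(ch)
    return (
        unicodedata.combining(ch) != 0   # combining marks, e.g. accents
        or 0xFE00 <= cp <= 0xFE0F        # variation selectors
        or 0x1F3FB <= cp <= 0x1F3FF      # emoji skin-tone modifiers
    )


def split_clusters(text):
    """Split text into approximate clusters that should share one font."""
    clusters = []
    for ch in text:
        prev = clusters[-1] if clusters else ""
        joins_previous = prev and (
            is_extender(ch)
            or ch == ZWJ
            or prev.endswith(ZWJ)
            # A second regional indicator completes a flag pair.
            or (is_regional_indicator(ch) and len(prev) == 1
                and is_regional_indicator(prev))
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters
```

A fallback scheme would then need a font that covers every code point in a cluster, rather than checking each code point on its own.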

@TrueMyst

@nulano

I've somewhat fixed things. My code definitely supports multiple languages. I think the most ideal solution would be to get an image for the emoji from the internet based on the Unicode using emojipedia. Some emoji fonts don't have great support for emoji. Not the best, I would say, but if you're interested, we can get it up and running. A little bit of help and optimization could make it work.

@TheWalkingSea

TheWalkingSea commented Apr 22, 2024

@nulano Try the mask-based solution from my earlier comment above (getEmojiMask, getDimensions, addEmojis); it supports emojis with modifier Unicode characters.

@TrueMyst

@TheWalkingSea there are so many problems with this code, especially with the types. You mind me fixing them?

@TheWalkingSea

TheWalkingSea commented Apr 23, 2024

It works perfectly for me and the types are correct. Let me know if you have any specific issues and I'll make sure to update it.
