
Add webfont and script to subset a font. #35

Open
wants to merge 9 commits into
base: master


@Szunti Szunti commented Sep 22, 2018

How about this for #34?

  • A Python script (Python 3.6+, but easy to backport) that uses the fontTools package to subset multiple font files until every code point can be displayed. It is configured by a JSON file.
  • Also an @font-face rule and a subsetted HanaMinA webfont.

The Hanazono font was chosen over Han Nom because, for Han Nom:

  1. I couldn't find a license.
  2. I don't know if it matters, but it's a Vietnamese font; as far as I know, Hanazono is Japanese.

Do you want anything different?

make_woff.py --help
usage: ./make_woff.py conffile

conffile is in JSON format. Example with explanatory comments (don't include
these comments in the JSON file):
{
  // List of files or globs relative to the config file's dir.
  "dataFiles": [
    "../data/pages/en-GB/**/*.txt"
  ],
  // List of individual code points and ranges (list with start and end).
  "excludeCodepoints": [
    [0, 127]
  ],
  // The order of fonts is important: fonts will be checked for every code
  // point until one has a glyph. First element is the input, second is the
  // output.
  "fonts": [
    ["HanaMinA.ttf", "../HanaMinA.woff"],
    ["HanaMinB.ttf", "../HanaMinB.woff"]
  ],
  // Missing characters will be written to this file in UTF-8, relative to
  // the config file's location.
  "missingOutput": "missing.txt",
  // if true, only print which fonts are used and missing characters
  "justCheck": false
}

Subsetted HanaMinA also included.
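
The configuration semantics described in the help text can be sketched roughly as follows. This is a minimal illustration, not the code in make_woff.py: the function names `expand_codepoints` and `assign_to_fonts` are hypothetical, ranges are assumed to be inclusive at both ends, and each font's coverage is passed in as a plain set of code points (with fontTools, such a set could be taken from the keys of `TTFont(path).getBestCmap()`).

```python
def expand_codepoints(spec):
    """Expand a list like [9, [0, 127]] into a set of code points.

    Each entry is either a single code point or a [start, end] pair;
    the range is assumed to be inclusive at both ends.
    """
    result = set()
    for entry in spec:
        if isinstance(entry, int):
            result.add(entry)
        else:
            start, end = entry
            result.update(range(start, end + 1))
    return result


def assign_to_fonts(codepoints, font_coverage):
    """Assign every code point to the first font that has a glyph for it.

    font_coverage is an ordered list of (name, set_of_codepoints) pairs.
    Returns (per_font, missing): per_font maps each font name to the code
    points it should keep, and missing holds code points no font covers.
    """
    per_font = {name: set() for name, _ in font_coverage}
    missing = set()
    for cp in codepoints:
        for name, coverage in font_coverage:
            if cp in coverage:
                per_font[name].add(cp)
                break
        else:
            # No font in the list had a glyph for this code point.
            missing.add(cp)
    return per_font, missing


# Toy data standing in for the text files and the two fonts.
text = "A日本𠀀\u3400"
excluded = expand_codepoints([[0, 127]])
wanted = {ord(ch) for ch in text} - excluded
coverage = [
    ("HanaMinA", {ord("日"), ord("本")}),
    ("HanaMinB", {ord("𠀀")}),
]
per_font, missing = assign_to_fonts(wanted, coverage)
```

With this toy data, the two kanji land in the first font, the supplementary-plane character falls through to the second, and U+3400 ends up in `missing`, which corresponds to what the script would write to the `missingOutput` file.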
style.css Outdated
.nrGrammardata {
background: white;
font-family: serif, HanaMinA;
Owner

This will always pick serif because it will always work as per the CSS spec, so you'll have to swap these around. The generic family class keyword always has to come last.

Author

The idea was that if the serif font has CJK glyphs, then the reader probably likes the default font and doesn't really need the HanaMinA substitute. But if the default serif doesn't have CJK glyphs, which I guess is common for learners, they keep their nice readable Latin font and only the CJK glyphs fall back to HanaMinA.

The alternative is to have a good-looking Latin font before HanaMinA and serif at the end.

Owner

But I have no intention of honoring the reader's locally installed fonts: the reason we're subsetting is to ensure that everyone sees the same thing, with a font that covers exactly what is necessary =)

Author

HanaMinA's Latin is really ugly here. Added Times first.

style.css Outdated
@@ -52,7 +58,7 @@
 height: 500px;
 text-align: right;
 font-size: 13px;
-font-family: Arial;
+font-family: Arial, HanaMinA;
Owner

Probably good to add a ", sans-serif" at the end, since we're touching this rule anyway.



if sys.version_info[0:2] < (3, 6):
    error("This script requires Python version 3.6")
Owner

3.6 or higher?
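
One way the check and its message could be made to agree is sketched below. The helper name `version_error` and the exact wording are assumptions for illustration, not what the PR ended up using.

```python
import sys

MIN_VERSION = (3, 6)


def version_error(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return an error message if the interpreter is too old, else None."""
    found = tuple(version_info[:2])
    if found < minimum:
        return "This script requires Python {} or higher (found {}).".format(
            ".".join(str(part) for part in minimum),
            ".".join(str(part) for part in found),
        )
    return None


message = version_error()
if message is not None:
    sys.exit(message)
```

The message is built with `str.format` rather than an f-string so that the file still parses on interpreters older than 3.6 and can actually reach the check.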

    import fontTools.merge
    import fontTools.misc.loggingTools
except ImportError:
    error("Please install the fontTools python module.")
Owner

It feels like we should be able to invoke pip here.

# ask() and runPipInstalls() are placeholders, not existing functions
accepted = ask("The fontTools dependency is missing, would you like to install it now? [Y/n]")
if accepted:
    runPipInstalls()
else:
    error("Dependencies not installed, exiting...")
    exit(1)

Author

There are just so many possibilities. Do you want it to be installed with the distribution's package manager or with pip? Installed globally (which probably also needs root permission), only for the current user, or maybe in a virtual environment? But I can make something if it's really necessary.

Owner

Offering the choice only makes it more usable. We can add the option to use pip (which is the de facto package manager these days), and if someone says "no" then they get exactly as much functionality as if we just exited immediately. So worst case: things stay the same. Best case: the script takes care of it and just continues.

Let's add in a pip install (and if it complains about permissions, we can make it say "could not install dependencies, please install the following packages manually:" and then a list of the dependencies necessary).
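
A minimal sketch of the behaviour described above, under stated assumptions: `pip_command` and `ensure_installed` are hypothetical names, the prompt text is illustrative, and pip is invoked through the interpreter that is running the script. Whether to pass `--user` is left as an option, since that choice was not settled in this thread (and `--user` is rejected inside a virtual environment).

```python
import subprocess
import sys


def pip_command(packages, user=False):
    """Build a pip invocation that uses the interpreter running this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if user:
        command.append("--user")
    return command + list(packages)


def ensure_installed(packages, ask=input, run=subprocess.call):
    """Offer to install packages with pip; return True if it succeeded.

    ask and run are injectable so the prompt and the pip call can be
    replaced in tests. Any answer other than "n" or "no" counts as yes,
    matching the [Y/n] default in the prompt.
    """
    prompt = "Dependencies are missing: {}. Install them now? [Y/n] ".format(
        ", ".join(packages))
    if ask(prompt).strip().lower() in ("n", "no"):
        print("Dependencies not installed, exiting...")
        return False
    # A non-zero exit status covers permission errors and any other failure.
    if run(pip_command(packages)) != 0:
        print("Could not install dependencies, please install the "
              "following packages manually:")
        for name in packages:
            print("  " + name)
        return False
    return True
```

A caller would wrap the failing import, for example `if not ensure_installed(["fonttools"]): sys.exit(1)`, and retry the import afterwards. Declining leaves the user exactly where the current script does.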


class BaseFilter():
    def feed(self, codepoint):
        """Let the filter handle the codepoint.
Owner

@Pomax Pomax Sep 23, 2018

"Let a subclassing filter handle the codepoint", because without overriding, this function literally breaks Python.
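
To make the point about overriding concrete, here is a rough sketch of a base class whose `feed` fails loudly unless a subclass replaces it. Only the names `BaseFilter`, `UniqueFilter`, `feed` and `add_filter` appear in the excerpts in this review; the method bodies, the `Pipeline` class, and the contract that `feed` returns True to keep a code point are assumptions made for illustration.

```python
class BaseFilter:
    def feed(self, codepoint):
        """Let a subclassing filter handle the codepoint.

        Assumed contract: return True to keep the code point and pass it
        on, False to drop it. Subclasses must override this method.
        """
        raise NotImplementedError(
            "{} must override feed()".format(type(self).__name__))


class UniqueFilter(BaseFilter):
    """Keep only the first occurrence of each code point."""

    def __init__(self):
        self.seen = set()

    def feed(self, codepoint):
        if codepoint in self.seen:
            return False
        self.seen.add(codepoint)
        return True


class Pipeline:
    """Run code points through the filters in the order they were added."""

    def __init__(self):
        self.filters = []

    def add_filter(self, filt):
        self.filters.append(filt)

    def run(self, codepoints):
        # all() short-circuits, so later filters never see dropped values.
        return [cp for cp in codepoints
                if all(f.feed(cp) for f in self.filters)]
```

Raising `NotImplementedError` in the base method turns a forgotten override into an explicit error at the first call instead of a silent `None` return.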

uniq_filt = UniqueFilter()
pipeline.add_filter(uniq_filt)

excl_filt = ExcludeFilter()
Owner

The ASCII range takes up near-as-makes-no-difference nothing compared to the rest of the data that gets loaded, and having the Hanazono-specific Latin characters in the font means that stretches that end up being marked as "use Japanese typesetting, even for the Latin bits" will look better later.

Let's take out the exclusion option.

return False


class ExcludeFilter(BaseFilter):

"dataFiles": [
"../data/pages/en-GB/**/*.txt"
],
"excludeCodepoints": [
Owner

@Pomax Pomax left a comment

Left some review comments as normal comments.

Owner

Pomax commented Sep 23, 2018

Having both PHP and Python as build tools might cause some problems when it comes to adding CI to this, but that can be a later concern. I've left a few comments but overall this looks fine to me. Thank you for helping get this implemented.

Useful to ignore missing TAB and LFD (line feed); those don't have glyphs in font files.
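
As a small illustration of that note, the missing-character report could skip code points that never have glyphs. The names `IGNORED_WHEN_MISSING` and `reportable_missing` are hypothetical; in the script itself the same effect would presumably come from listing 9 and 10 under `excludeCodepoints` in the config file.

```python
# TAB (U+0009) and line feed (U+000A) appear in the text files but are
# control characters, so no font is expected to carry glyphs for them.
IGNORED_WHEN_MISSING = frozenset({0x09, 0x0A})


def reportable_missing(missing, ignored=IGNORED_WHEN_MISSING):
    """Return the missing code points that are worth reporting."""
    return set(missing) - ignored
```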
style.css Outdated
@@ -1,11 +1,12 @@
 @font-face {
 font-family: HanaMinA;
-src: url(HanaMinA.woff);
+src: local("HanaMinA")
Owner

@Pomax Pomax Sep 24, 2018

The idea is still to offer exactly the font necessary so that everyone sees exactly the same, so let's remove local(), because there's no guarantee their local font is the same font version as the one used to build the woff subset font.

If the initial .woff is too big up front, let's use woff2 instead and slice it so that it loads incrementally based on what's necessary per chapter.
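
If slicing were pursued, each slice would need its own @font-face rule with a `unicode-range` descriptor so the browser only downloads the slices a page actually uses. The sketch below shows one way to generate those rules from a set of code points; it is hypothetical, nothing in this PR implements it, and the helper names and the file name in the usage line are invented for illustration.

```python
def to_ranges(codepoints):
    """Collapse code points into sorted, inclusive (start, end) runs."""
    ranges = []
    for cp in sorted(set(codepoints)):
        if ranges and cp == ranges[-1][1] + 1:
            # Extend the current run by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            ranges.append((cp, cp))
    return ranges


def unicode_range(codepoints):
    """Format code points as the value of a CSS unicode-range descriptor."""
    parts = []
    for start, end in to_ranges(codepoints):
        if start == end:
            parts.append("U+{:X}".format(start))
        else:
            parts.append("U+{:X}-{:X}".format(start, end))
    return ", ".join(parts)


def font_face_rule(family, url, codepoints):
    """Build one @font-face rule for a slice of the font."""
    return (
        "@font-face {{\n"
        "  font-family: {family};\n"
        "  src: url({url}) format(\"woff2\");\n"
        "  unicode-range: {ranges};\n"
        "}}"
    ).format(family=family, url=url, ranges=unicode_range(codepoints))


# Hypothetical slice for one chapter.
rule = font_face_rule("HanaMinA", "HanaMinA-ch1.woff2", {0x4E00, 0x4E01, 0x65E5})
```

All slices share the same `font-family` name, so the existing `font-family` declarations in style.css would not need to change.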

Author

323K as woff, 241K as woff2

Owner

Brotli showing off its power. Very nice!

Author

Szunti commented Sep 25, 2018

Unless slicing for sections is needed, I think I'm done.
