Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Windows zlib with zlib-ng #8495

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

dofuuz
Copy link

@dofuuz dofuuz commented Oct 24, 2024

Changes proposed in this pull request:

  • Replacing zlib with zlib-ng makes huge speedup for PNG encoding.

Here are my quick and dirty benchmark.
(on x64 Windows, read/write PNG 10 times)

Original (with zlib):

PIL.__version__='11.1.0.dev0'
read PNG: 0.092544 (sec)
on-memory PNG: 1.034144 (sec)
write PNG: 1.054776 (sec)
PNG size 1352945 bytes

With zlib-ng:

PIL.__version__='11.1.0.dev0'
read PNG: 0.089149 (sec)
on-memory PNG: 0.515073 (sec)
write PNG: 0.527402 (sec)
PNG size 1394511 bytes

It's about 2x speed-up with little effort.

Comparison & benchmark of zlib forks (JAN 2021):
https://aws.amazon.com/blogs/opensource/improving-zlib-cloudflare-and-comparing-performance-with-other-zlib-forks/

@nulano
Copy link
Contributor

nulano commented Oct 24, 2024

The changes look OK (although we might want to consider switching to the CMake build at some point as it seems to be more actively maintaned).

We probably want to add a flag in _imaging.c/PIL.features similar to the libjpeg-turbo flag:

Pillow/src/_imaging.c

Lines 4352 to 4368 in 822ec3d

PyObject *have_libjpegturbo;
#ifdef LIBJPEG_TURBO_VERSION
have_libjpegturbo = Py_True;
{
#define tostr1(a) #a
#define tostr(a) tostr1(a)
PyObject *v = PyUnicode_FromString(tostr(LIBJPEG_TURBO_VERSION));
PyDict_SetItemString(d, "libjpeg_turbo_version", v ? v : Py_None);
Py_XDECREF(v);
#undef tostr
#undef tostr1
}
#else
have_libjpegturbo = Py_False;
#endif
Py_INCREF(have_libjpegturbo);
PyModule_AddObject(m, "HAVE_LIBJPEGTURBO", have_libjpegturbo);

Pillow/src/PIL/features.py

Lines 299 to 302 in 822ec3d

if name == "jpg":
libjpeg_turbo_version = version_feature("libjpeg_turbo")
if libjpeg_turbo_version is not None:
v = "libjpeg-turbo " + libjpeg_turbo_version

It is currently reporting zlib version 1.3.1.zlib-ng:

--- JPEG support ok, compiled for libjpeg-turbo 3.0.4
--- OPENJPEG (JPEG2000) support ok, loaded 2.5.2
--- ZLIB (PNG/ZIP) support ok, loaded 1.3.1.zlib-ng

Here are my quick and dirty benchmark.

Can you provide the benchmark itself? (I imagine its only a couple of lines, but it would make it easier to compare with others. Using one of our test images could make it even simpler.)

It looks like zlib-ng produces a slightly larger file. This is probably fine, but I would be curious what effect changing the compress_level parameter has.

Also, do note that the FreeType library also uses zlib for decompressing font files, although I do not expect any issues there.
Edit: It is also used in libtiff.

@dofuuz
Copy link
Author

dofuuz commented Oct 25, 2024

ARM Windows build fails, so I think we should switch to CMake build.
OR just use pre-compiled release from zlib-ng like https://github.com/zlib-ng/zlib-ng/releases/download/2.2.1/zlib-ng-win-x86-64-compat.zip

For version number, zlib-ng reports compatible zlib version in zlib-compat API.

Here is the bench script:
import os
import timeit

from PIL import Image


REPEAT = 10

print(f'{Image.__version__ = }')
print(f'{Image.core.zlib_version = }')

def read_png():
    img = Image.open('example.png')
    img.load()

t = timeit.timeit(read_png, number=REPEAT)
print(f'read PNG: {t:f} (sec)')

img = Image.open('example.png')
img.load()

fp = open(os.devnull, "wb")
t = timeit.timeit(lambda: img.save(fp, format='png'), number=REPEAT)
print(f'on-memory encoding PNG: {t:f} (sec)')

t = timeit.timeit(lambda: img.save('pil_png.png'), number=REPEAT)
print(f'write PNG: {t:f} (sec)')

print(f'PNG size {os.path.getsize("pil_png.png")} bytes')

@hugovk
Copy link
Member

hugovk commented Oct 25, 2024

Comparison & benchmark of zlib forks (JAN 2021):
https://aws.amazon.com/blogs/opensource/improving-zlib-cloudflare-and-comparing-performance-with-other-zlib-forks/

Is it worth re-benchmarking these other forks or finding some newer results?

@dofuuz
Copy link
Author

dofuuz commented Oct 25, 2024

@hugovk Maybe yes. But it's quite hard problem and I don't want to go deeper.

There are some newer benchmarks:
zlib-ng/zlib-ng#1486 (May 2023)
https://github.com/simonis/zlib-bench/tree/master/graphs/ (Oct 2022)
But they use text corpus, so it may not applicable to PNG. And I couldn't find benchmark with image corpus.
It's not trivial to make benchmark properly, so I gave up with further investigation.

Anyway, the purpose of this PR is free speedup without much effort. So I considered about easy application and portability.

There are several alternative to zlib:

  • zlib-ng (nice drop-in replacement for zlib)
  • zlib-cloudflare (portability issue)
  • libdeflate (not a drop-in replacement)

So I choose zlib-ng.

@nulano
Copy link
Contributor

nulano commented Oct 25, 2024

For version number, zlib-ng reports compatible zlib version in zlib-compat API.

My point being that we should report the zlib-ng version when using it, not the version of the zlib-compat API. This is to make it clear which version is used in tests and bug reports.

@nulano
Copy link
Contributor

nulano commented Oct 25, 2024

I have created #8500 as an alternative using the CMake build instead, as we can use a newer version that way. I also added a feature flag so that the use of zlib-ng is clear from running python3 -m PIL.report.

Here is the bench script

Thanks for that, I've adjusted it slightly to also test at various compression levels. It seems that in compression, zlib-ng usually produces a slightly larger file but in much less time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants