Add support for inflate flush points #254

Open
mxmlnkn opened this issue Aug 24, 2023 · 0 comments

Hi,

The gzip/deflate decompression speed of this library is amazing. That's why I'm trying to get it to work in as many places as possible in my multi-threaded gzip decompression tool rapidgzip.

However, to use it in as many places as possible (without modifying the assembly routines), I need flush points to be implemented. Each decompression thread relies on them to decide where it can stop decompressing.

Zlib has this description for inflate:

The flush parameter of inflate() can be Z_NO_FLUSH, Z_SYNC_FLUSH, Z_FINISH,
Z_BLOCK, or Z_TREES. Z_SYNC_FLUSH requests that inflate() flush as much
output as possible to the output buffer. Z_BLOCK requests that inflate()
stop if and when it gets to the next deflate block boundary. When decoding
the zlib or gzip format, this will cause inflate() to return immediately
after the header and before the first block. When doing a raw inflate,
inflate() will go ahead and process the first block, and will return when it
gets to the end of that block, or when it runs out of data.
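
For context, zlib.h also documents that inflate() reports where it stopped through strm->data_type: the low bits hold the number of unused bits of the last input byte taken, 64 is added while the last block is being decoded, 128 is added when inflate() returned right after an end-of-block code or the header, and 256 is added when a Z_TREES call returned right after a block header. A minimal sketch of decoding that value follows. The struct and function names are hypothetical helpers, not part of zlib or ISA-L.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of z_stream.data_type as described in zlib.h. */
struct stop_info {
    int  unused_bits;        /* low bits: unused bits of the last input byte taken */
    bool in_last_block;      /* +64: currently decoding the final deflate block */
    bool at_block_boundary;  /* +128: stopped after end-of-block code or header */
    bool after_block_header; /* +256: Z_TREES stop right after a block header */
};

static struct stop_info decode_data_type(int data_type)
{
    assert(data_type >= 0);
    struct stop_info info;
    info.unused_bits        = data_type & 63;
    info.in_last_block      = (data_type & 64) != 0;
    info.at_block_boundary  = (data_type & 128) != 0;
    info.after_block_header = (data_type & 256) != 0;
    return info;
}

/* Bit offset into the compressed input, given z_stream.total_in
 * (bytes consumed so far) and the data_type value from the same call. */
static uint64_t compressed_bit_offset(uint64_t total_in, int data_type)
{
    uint64_t unused = (uint64_t)(data_type & 63);
    assert(total_in * 8 >= unused);
    return total_in * 8 - unused;
}
```

With zlib, a caller would pass strm.total_in and strm.data_type after each inflate(&strm, Z_BLOCK) call. An equivalent way to query this state is what this request asks for in ISA-L.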

I would need support for Z_BLOCK and Z_TREES. As a proof of concept, I tried to implement this in my fork. However, it would be nicer to have this implemented upstream by someone who knows the code inside out. For example, I'm unsure when exactly the tmp_in_buffer and tmp_in_size state members hold valid values. This matters to me because I need more than ISA-L stopping at those flush points. After it has stopped there, I also need to infer the exact bit position in the encoded stream and the byte position in the decoded stream from read_in_length and avail_in.
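
To make the position arithmetic concrete, here is a rough sketch of how I would expect the encoded bit position to be derived. It rests on assumptions that would need confirming: that read_in_length counts bits already fetched from the input but not yet decoded, and that nothing is pending in tmp_in_buffer (tmp_in_size is zero) at the stop point. The function name and the input_size parameter (total bytes handed to the decoder so far, tracked by the caller) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bits fetched from the input but still buffered in the
 * decoder's bit buffer (read_in_length of them) have not been decoded
 * yet, so they are subtracted from the bits consumed from the input. */
static uint64_t encoded_bit_position(uint64_t input_size,
                                     uint32_t avail_in,
                                     int32_t  read_in_length)
{
    assert(avail_in <= input_size);
    assert(read_in_length >= 0 && read_in_length <= 64);

    /* Bits the decoder has pulled out of the caller's input so far. */
    uint64_t fetched_bits = (input_size - avail_in) * 8;
    assert(fetched_bits >= (uint64_t)read_in_length);

    return fetched_bits - (uint64_t)read_in_length;
}
```

For example, with 100 input bytes supplied, 40 still available and 13 buffered bits, the stop point would be at bit 60 * 8 - 13 = 467 of the encoded stream. If tmp_in_size can be non-zero at a flush point, this formula would need an extra term, which is exactly the part I'm unsure about.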
