
How to enable igzip multi-threaded run to improve compression throughput, -T parameter does not work #250

Open
xyajungo opened this issue Jul 19, 2023 · 11 comments


@xyajungo

I downloaded the isa-l source code and, after compiling, opened the directory programs/.libs, where I found two binaries, igzip and lt-igzip. I ran them as follows:

first: time ./igzip -z /raid0/data/fq/SRR9613620_1.fq -o /raid0/data/fq/9613620_00.gz

real 0m41.708s
user 0m36.434s
sys 0m5.177s

second: time ./igzip -T 10 -z /raid0/data/fq/SRR9613620_1.fq -o /raid0/data/fq/9613620_000.gz

real 0m40.680s
user 0m36.335s
sys 0m4.249s

The times of the first and second runs are nearly identical, which suggests that the -T parameter has no effect.

Thank you for your help.

@gbtucker
Contributor

Threading is not enabled by default in the igzip utility; you need to add HAVE_THREADS to the build options. Please try the following.

$ ./autogen.sh
$ ./configure
$ make D="-DHAVE_THREADS"
$ sudo make install
$ time igzip -T 4 file

@xyajungo
Author

xyajungo commented Jul 21, 2023

Thank you for your guidance. I tried it the way you said, and I ran into a new problem.
...
CCLD programs/igzip
programs/igzip_cli.o: In function `pool_create':
/root/isa-l-2.30.0/programs/igzip_cli.c:556: undefined reference to `pthread_create'
programs/igzip_cli.o: In function `pool_quit':
/root/isa-l-2.30.0/programs/igzip_cli.c:570: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status

...

And in the makefile (make.inc) there is a check that links the thread library, but the compile errors persist.
...
# Check for pthreads
have_threads ?= $(shell printf "#include <pthread.h>\nint main(void){return 0;}\n" | $(CC) -x c - -o /dev/null -lpthread && echo y )
THREAD_LD_$(have_threads) := -lpthread
THREAD_CFLAGS_$(have_threads) := -DHAVE_THREADS

progs: $(bin_PROGRAMS)
$(bin_PROGRAMS): CFLAGS += -DVERSION="$(version)"
$(bin_PROGRAMS): LDLIBS += $(THREAD_LD_y)
$(bin_PROGRAMS): CFLAGS += $(THREAD_CFLAGS_y)
...

Expressions such as "have_threads ?= ..." are hard to understand.
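For what it's worth, `?=` is GNU make's conditional assignment (it sets the variable only if it is not already set), and `THREAD_LD_$(have_threads)` splices the probe result into the variable *name*, so only the `_y` variant is ever consumed later. A minimal standalone demo (assumes GNU make 3.82+ for .RECIPEPREFIX; the /tmp path is arbitrary):

```shell
cat > /tmp/demo.mk <<'EOF'
.RECIPEPREFIX = >
# ?= assigns only when the variable is not already set.
have_threads ?= y
# The probe result becomes part of the variable NAME:
# THREAD_LD_y when the probe printed "y", THREAD_LD_ otherwise.
THREAD_LD_$(have_threads) := -lpthread
all:
>@echo "THREAD_LD_y = $(THREAD_LD_y)"
EOF
make -f /tmp/demo.mk                 # probe "succeeded": prints -lpthread
make -f /tmp/demo.mk have_threads=   # probe "failed": THREAD_LD_y is empty
```

So when the pthread probe fails, `-lpthread` lands in a variable nobody reads, and the build silently proceeds without thread support.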

@xyajungo
Author

xyajungo commented Jul 24, 2023

@gbtucker One more thing to check is whether decompression supports multithreading. As the help output shows, the -T parameter supports compression only.
...
-T, --threads use n threads to compress if enabled
...

@rhpvorderman
Contributor

rhpvorderman commented Jul 24, 2023

As deflate blocks are interdependent, multithreaded decompression is impossible. Pigz technically supports it, but it actually only runs the decompression and the checksumming in different threads. The result is a marginally quicker wall-clock time (10%) for a moderate increase in CPU resources (30%). In my opinion it is not a very worthwhile endeavour to put into ISA-L, though it is not my say, as I am just a user of the project, not a developer.

I only managed to get threading built using make -f Makefile.unx and the appropriate variables, but YMMV. It seems as if the autotools build doesn't set the proper values to enable it (at least when building the stable 2.30 version).

@mxmlnkn

mxmlnkn commented Aug 9, 2023

As deflate blocks are interdependent, multithreaded decompression is impossible.

Rapidgzip is able to fully parallelize decompression. See also the accompanying paper. (Disclaimer: I am the developer of rapidgzip).

@rhpvorderman
Contributor

Rapidgzip is able to fully parallelize decompression

Very interesting! It needs 4 cores though to be on par with igzip. That is quite a cost, unless wall clock time is the only thing to worry about. If you have 200 files to decompress it is 4 times more efficient to do 200 files in parallel rather than using rapidgzip.
Do you plan to integrate ISA-L decoding into rapidgzip to mitigate this disadvantage?

@mxmlnkn

mxmlnkn commented Aug 9, 2023

Rapidgzip is able to fully parallelize decompression

Very interesting! It needs 4 cores though to be on par with igzip. That is quite a cost, unless wall clock time is the only thing to worry about. If you have 200 files to decompress it is 4 times more efficient to do 200 files in parallel rather than using rapidgzip. Do you plan to integrate ISA-L decoding into rapidgzip to mitigate this disadvantage?

Yes, it would be ideal if I could integrate ISA-L for that. That's also why I was looking into the ISA-L source code more deeply. My thought was that I might be able to leverage only the Huffman decoder as a first step, in the hope that it is easier to integrate with my custom-written inflate implementation, and on the assumption that most of the performance is tied to the Huffman decoder anyway.

I have already integrated ISA-L for decompression when an index is known (when --import-index is used or when decompressing bgzip-generated files). This use case might not be very common when using rapidgzip as a command-line tool, but it is very common when using rapidgzip as a library in ratarmount.

@rhpvorderman
Contributor

Awesome project! I regularly decompress very big gzip compressed files (an occupational hazard), which is why I frequently roam this part of the internet. I wish you all the best!

@mxmlnkn

mxmlnkn commented Aug 13, 2023

My thought was that I might be able to leverage only the Huffman decoder as a first step, in the hope that it is easier to integrate with my custom-written inflate implementation, and on the assumption that most of the performance is tied to the Huffman decoder anyway.

I have integrated the ISA-L Huffman decoder and it sped up the Silesia benchmark by ~40%, thanks to the LUT also including the backreference length, but the base64 test case is only ~5% faster. Both results are still almost twice as slow as igzip. It seems that most of the remaining performance gains sit in the handcrafted assembly routine for decode_huffman_code_block_stateless_base. It will be more cumbersome to leverage that. In general, I would "only" have to change the output to be 16-bit instead of 8-bit, but that is easier said than done. It's still on my todo list, though.

@rhpvorderman
Contributor

@mxmlnkn Random base64 is not really worth optimizing for, though; it is not representative of real-world data that gets gzipped. I would suggest testing on data that matters to you, your company, and your projects. That way your optimizations will at least make someone happy. I personally test on big compressed FASTQ data (millions of short DNA fragments).

@xbj110825

xbj110825 commented Dec 8, 2023

Thank you for your guidance. I tried it the way you said, and I ran into a new problem.

CCLD programs/igzip
programs/igzip_cli.o: In function `pool_create':
/root/isa-l-2.30.0/programs/igzip_cli.c:556: undefined reference to `pthread_create'
programs/igzip_cli.o: In function `pool_quit':
/root/isa-l-2.30.0/programs/igzip_cli.c:570: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status

And in the makefile (make.inc) there is a check that links the thread library, but the compile errors persist.

I found that make.inc is included in Makefile.unx.

 make -f Makefile.unx
