[Usage recommendation request] Decompressing large EXRs in real-time. #1755
Comments
As a baseline, do you have metrics for uncompressed scanline and tiled reads?
Sure, I should have included some more metrics initially - here's what you asked for:
Those results were taken with calls to Any thoughts appreciated.
This is fantastic data, thank you. Are you aware of the
I wasn't aware of that branch - thanks for pointing it out. I've given it a test run and here's the equivalent data:
The speedups for uncompressed data are quite incredible! Unfortunately, as things currently stand, it seems like DWA compressions have a slight decompression performance hit on my system. I've double checked everything and tried to include a brief look at where my CPU is spending time on
A slightly finer grained, but still brief, look at
I notice that, as you mentioned, the
How do these timings line up with what you'd expect from the branch? I don't believe my CPU is making use of a lot of optimisations in Aside from that, if you have any more suggestions to try, they'd be appreciated. Though, if this is the best we'll get on CPU, that's fine. It feels like real-time is quite far away, and I realise I'm asking for long shots.
Hi,
We're doing a virtual production based research project at Foundry. As part of that, we're investigating OpenEXR based solutions that meet the following criteria:
It's fine to presume we have the PCIe bandwidth to perform transfers at rate, as well as good processing power on both the CPU and GPU front.
We're aware that this is very much "having one's cake, and eating it", however lots of OpenEXR-based options seem close, and we're wondering if the EXR experts can see a trick we've missed.
We're fine with lossy compression, which opens the doors to B44[A] and DWA[A/B]. B44 satisfies criteria 1 & 3, but produces relatively large files when compared to DWA. DWA satisfies criteria 1 & 2, but we would need better performance on the decompression.
With 16 threads, for 8k (8192x4096) scanline DWA I'm seeing around 60ms to read a frame, and for 16k I'm seeing around 230ms. Are there any missed tricks on either the encoding or decoding side which could be used to speed up this process? We've thought about a GPU implementation of DWA decoding, but from what we can tell, a combination of Huffman, RLE, deflate and zip are used for DWA's entropy encoding, none of which are particularly GPU friendly, and all together they sound very GPU unfriendly.
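For a rough sense of scale, the read times quoted above can be turned into frame rates and pixel throughput. This is only a back-of-the-envelope sketch: the helper `frame_stats` is a hypothetical name, and the 16k resolution (16384x8192) is an assumption, since only the 8k dimensions are stated.

```python
# Back-of-the-envelope throughput from the read times quoted above.
# Assumption: "16k" means 16384x8192; only the 8k size is given in the post.

def frame_stats(width, height, read_ms):
    """Return (frames per second, megapixels per second) for one frame read."""
    fps = 1000.0 / read_ms
    mpix_per_s = width * height / 1e6 * fps
    return fps, mpix_per_s

timings = {
    "8k": (8192, 4096, 60.0),
    "16k": (16384, 8192, 230.0),  # assumed dimensions
}

for name, (w, h, ms) in timings.items():
    fps, mpix = frame_stats(w, h, ms)
    print(f"{name}: {fps:.1f} fps, {mpix:.0f} Mpix/s")
```

Under these assumptions the reads come out at roughly 17 fps for 8k and 4 fps for 16k, with pixel throughput of a similar order in both cases.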
Alternatively, are we missing a trick with regards to any of the other compression methods which could help us meet our criteria?
Many thanks,
George