-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementing CPU Support for Comprimato J2K Compress and Decompress #375
base: master
Are you sure you want to change the base?
Conversation
Hi, I would have some remarks to the pull request. Most importantly, the pull request is a bit huge compared to functional changes - it almost rewrite both files (2/3 of code is touched), even the unrelated parts of code. Also the commits can be easily squashed into one, because there is first big commit and then just smaller changes of this one. As there are together functional changes, refactoring and code movement, it is really hard to check what has been exactly changed, so it almost cannot be audited for regressions etc. (I personally usually separate even refactoring and functional commits.) Would it be possible to reduce the extent? You can look (or use) eg. this commit - it modifies your changes in a way that it at least leaves the code in place and reverts few unrelated changes. The advantage should be obvious if you look at git diff, that your changes are more readable and can be reviewed. For me will be acceptable if you'd squash all to a single commit (including the suggested changes). If you'd then like to do some further changes, I'd perhaps prefer separate pull requests. I'll sum up my opinions about the non-essential changes:
Last, I see problematic the conditional compilation with HAVE_CUDA - in the original code the only CUDA depending code is the CUDA host allocator. Actually the proposed pull request requires CUDA toolkit for the CUDA codec variant to be used. That was actually not required before - without CUDA, the codec worked (just the buffers was not in the DMA transfer capable region). I don't know exactly our uses' use cases, but I can imagine that someone compiles UG without having the CUDA toolkit - it may not introduce much performance penalty. But don't worry about this particular case, I can fix this after the merge by myself. |
Hi @MartinPulec! Thanks for taking the time to review this pull request. This is my first one, so my apologies if it's a bit unconventional. I'll be sure to keep your notes in mind, regarding separating functional and refactoring changes, as well as squashing minor changes into one commit, for the next PR. I read through your commit and agree, it's significantly more readable and rolls back some of the non-essential changes. I'm happy to use that as my starting point to re-review, test, and submit if you'd prefer I do that. I had one question that I wanted your input on since I was going back-and-forth with myself about it. Do you agree with Regarding the In terms of the non-essential changes, here's some of the reasoning with why I made those changes. I'm happy to submit separate pull requests if you think any of them really make much of a difference -- don't want to waste your time submitting non-important PRs.
|
src/video_compress/cmpto_j2k.cpp
Outdated
static void usage() { | ||
col() << "J2K compress platform support:\n"; | ||
col() << "\tCPU .... yes\n"; | ||
#ifdef HAVE_CUDA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of insisting on HAVE_CUDA, I think it is better to use cmpto_version::technology bit array
It is totally fine, I believe that your code is great, but the only problem is really that I want to leave the things traceable, for which it is good to separate functional and non-functional changes, I believe. For the other changes it depends - some are ones that I'd agree, the others I would as well, but I would change it just when changing that code anyways.
I believe that the keyword platform is totally fine. I've just looked into the headers and at Comprimato, they denote this as "technology" (CMPTO_TECHNOLOGY_{CPU,CUDA}. But both words sound more or less equivalent in this content.
I have no problem doing it by myself but I don't insist on that. Anyways, if you decide to do it by yourself, you can look at the above code comment.
Sure - I've no particular objections against that. If you wish, I have no problem but it will be nicer if it was in a separate commit. Anyways, just in a case, it doesn't need the
Ok, well, no serious problem with this, indeed. But just a small explanation - I've actually started using I understand that using the macro(s) is slightly controversial in C++ (even these MOD_NAME, which clang-tidy wants to be replaced with a constexpr var), so I don't insist on that, either.
Sure, I've consulted it with a colleague on Friday and he also advocated the use of the identifiers including the std namespace. We may consult it later to unify the style but it just haven't been so far (UltraGrid is written in C historically and the C++ coding standards are not much defined). To sum up the points 1-3, as there is no rule defined, feel free to choose. I also understand that you may want to make this consistent across the file. Provided that it will be in a separate commit, we'd accept it. |
Allow the MOD_NAME to be a variable (like (constexpr const char *)). Using non-standard extension, the standard one would be __VA_OPT__. Although it is supported with MSVC 2019/2022, it requires the compiler flag /Zc:preprocessor. This version doesn't require that so use it for now. The MSVC is used to compile the CUDA code and AJA wrapper so not to complicate the things now. This syntax is supported for both GNU and MSVC: 1. https://stackoverflow.com/a/78185169 2. https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html refer to GH-375
be9e24a
to
525a0f4
Compare
525a0f4
to
5df9aae
Compare
…er have_cuda conditionals j2k_compress_platform now uses CMPTO_TECHNOLOGY_{CPU,CUDA} instead of 1, 2 bool supports_cmpto_technology(int) function created for checking if supported technology version is supported on system Added NoCmptoTechnologyFound exception for error reporting
…over cuda conditionals Remove #ifdef HAVE_CUDA j2k_decompress_platform now uses CMPTO_TECHNOLOGY_{CPU,CUDA} instead of 1, 2 bool supports_cmpto_technology(int) function created for checking if supported technology version is supported on system
This matches cpu_allocator naming and helps to be explicit about what allocator is being used during video_frame_pool creation during `bool state_video_compress_j2k::initialize_j2k_enc_ctx()`
Hi @MartinPulec! I've made the requested changes, if you'd like to review them and provide feedback. Here's a summary of what has been done.
One thing I noticed during the
Thanks! |
Implementing Commit [9577372](CESNET@9577372) from master
Implement commit [9e05752](CESNET@9e05752)
Mirror [779021b](CESNET@779021b) from master
Implementation of [c2e7811}(CESNET@c2e7811) from master
Implementing [ca71f59](CESNET@ca71f59) from master
Implementing [4061f8d](CESNET@4061f8d) from master
Implementing [c2cebd3](CESNET@c2cebd3) from master
Implement [af5d584](CESNET@af5d584) and [3adb9a4](CESNET@3adb9a4)
Implement [930abe5](CESNET@930abe5) from master
Implement [39c9c40](CESNET@39c9c40) from master
Implement [cc6b820](CESNET@cc6b820) from master
Implement [1cffc72](CESNET@1cffc72) from master
Implement [7b91ebb](CESNET@7b91ebb) from master
Implement [94afd6c](CESNET@94afd6c) from master
Implement [9304717](CESNET@9304717) from master
Implement the following from master [fa93411](CESNET@fa93411) [4f3add7](CESNET@4f3add7) [876870f](CESNET@876870f) [ad7929b](CESNET@ad7929b) [95dea89](CESNET@95dea89)
Remove duplicate #include, variables, functions, etc Class member 'pool' is a unique_ptr. Changed calls to pool from . to -> Renamed req_tile_limit and req_mem_limit to cuda_tile_limit and cuda_mem_limit to match class member name
…resolution Merge temp branch used to resolve remaining sync conflicts. See commit c307088 for details.
j2k_decompress_reconfigure now checks state_decompress_j2k class member platform to see if it is set to j2k_decompress_platform::CPU or j2k_decompress_platform::CUDA prior to calling cmpto_j2k_dec_ctx_cfg_add_cuda_device / cmpto_j2k_dec_ctx_cfg_add_cpu
ea3e356
to
28b98a9
Compare
Allow the MOD_NAME to be a variable (like (constexpr const char *)). Using non-standard extension, the standard one would be __VA_OPT__. Although it is supported with MSVC 2019/2022, it requires the compiler flag /Zc:preprocessor. This version doesn't require that so use it for now. The MSVC is used to compile the CUDA code and AJA wrapper so not to complicate the things now. This syntax is supported for both GNU and MSVC: 1. https://stackoverflow.com/a/78185169 2. https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html refer to CESNETGH-375
Allow the MOD_NAME to be a variable (like (constexpr const char *)). Using non-standard extension, the standard one would be __VA_OPT__. Although it is supported with MSVC 2019/2022, it requires the compiler flag /Zc:preprocessor. This version doesn't require that so use it for now. The MSVC is used to compile the CUDA code and AJA wrapper so not to complicate the things now. This syntax is supported for both GNU and MSVC: 1. https://stackoverflow.com/a/78185169 2. https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html refer to CESNETGH-375
Allow the MOD_NAME to be a variable (like (constexpr const char *)). Using non-standard extension, the standard one would be __VA_OPT__. Although it is supported with MSVC 2019/2022, it requires the compiler flag /Zc:preprocessor. This version doesn't require that so use it for now. The MSVC is used to compile the CUDA code and AJA wrapper so not to complicate the things now. This syntax is supported for both GNU and MSVC: 1. https://stackoverflow.com/a/78185169 2. https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html refer to CESNETGH-375
Allow the MOD_NAME to be a variable (like (constexpr const char *)). Using non-standard extension, the standard one would be __VA_OPT__. Although it is supported with MSVC 2019/2022, it requires the compiler flag /Zc:preprocessor. This version doesn't require that so use it for now. The MSVC is used to compile the CUDA code and AJA wrapper so not to complicate the things now. This syntax is supported for both GNU and MSVC: 1. https://stackoverflow.com/a/78185169 2. https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html refer to GH-375
This pull request implements CPU compression and decompression for the Comprimato J2K Codec.
In the interest of leaving the current implementation unchanged for users already using Comprimato with CUDA, CUDA is still the default platform selected for compression and decompression when CUDA libraries are detected during UltraGrid compilation. This means that
-c cmpto_j2k
will default to using CUDA for compression. A system decompressing a cmpto_j2k stream will default to using CUDA for decompression. This has been guaranteed by the use of#ifdef HAVE_CUDA
directives.If UltraGrid is compiled without CUDA,
-c cmpto_j2k
will default to using CPU for compression. A system compiled without CUDA will also default to CPU for decompressing a cmpto_j2k stream.To allow for the selection of either CUDA or CPU, a new :platform= option has been included when CUDA libraries are detected and UltraGrid is compiled. If no CUDA libraries are detected at compile time (ex. MacOS), the :platform= option will be removed and CPU will be the assumed default with no option for the user to change.
Platform default (CUDA or CPU) is determined at compile time using
#ifdef HAVE_CUDA
directives in thestate_video_compress_j2k
andstate_video_decompress_j2k
structs.Additions to Compression Options and Decompression Parameters
To support CPU usage for compression and decompression, additional options and parameters have been added.
J2K Compression Options Syntax | CUDA
J2K Compression Options Syntax | CPU
New J2K Compression Options:
:platform=<cuda,cpu>
-c cmpto_j2k
will imply-c cmpto_j2k:platform=cuda
. Calling-c cmpto_j2k:platform=cpu
will explicitly select CPU compression over CUDA.:thread_count=
bool initialize_j2k_enc_ctx()
:img_limit=
bool initialize_j2k_enc_ctx()
:lossless
New J2K Decompression params:
j2k-dec-cpu-thread-count=
bool initialize_j2k_dec_ctx()
.j2k-dec-img-limit=
bool initialize_j2k_dec_ctx()
and will set img-limit = cpu-thread-count if exceeded.bool initialize_j2k_dec_ctx()
.j2k-dec-use-cpu
j2k-dec-use-cuda
-c cmpto_j2k:help
print out when CUDA is Present-c cmpto_j2k:help
print out when only CPU is Present--param help
when CUDA is Present--param help
when only CPU is PresentChanges to
state_video_compress_j2k
andstate_video_decompress_j2k
construction and struct membersIn addition to adding CPU support for compression and decompression, I've made changes to the way
state_video)compress_j2k
andstate_video_decompress_j2k
classes are constructed, throwing exceptions if they are unable to be initialized. This has removed the need forgoto
statements and explicitdelete
of the state_j2ks by thej2k_compress_init()
andj2k_decompress_init()
functions.Parsing of compression options have been offloaded to private member,
void parse_fmt(const char*)
instruct state_video_compress_j2k
.Parsing of decompression parameters has been offloaded to private member
void parse_params()
instruct state_video_decompress_j2k
.If help is called during
-c cmpto_j2k:help
, theHelpRequested()
exception will throw and be caught byj2k_compress_init()
, resulting instatic_cast<module*>(INIT_NOERR)
Once parsing has completed, private members
initialize_j2k_enc_ctx()
andinitialize_j2k_dec_ctx()
are called and attempt to complete the final initialization of the J2K context.Any errors during this process will result in a thrown exception that will be caught by
j2k_compress_init()
andj2k_decompress_init()
, resulting in aNULL
return.Compress Exceptions Include:
HelpRequested()
InvalidArgument()
UnableToCreateJ2KEncoderCTX()
Decompress Exceptions Include:
UnableToCreateJ2KDecoderCTX()
To account for differences in the video_frame_pool initialization when using CPU and CUDA platforms during cmpto_j2k compression, the
video_frame_pool pool
has been changed tostd::unique_ptr<video_frame_pool> pool
to allow for reconfiguration during state_video_compress_j2k construction.Changes to
j2k_compress_init
andj2k_decompress_init
Since the j2k_compress and j2k_decompress constructors will throw if there is an error with any of the options or parameters, _init functions are responsible for catching those errors and returning what is needed by the module.
New j2k_compress_init implementation
New j2k_decompress_init implementation
Additions of enums and structs to allow easy selection of platform to use and future expansion
enum j2k_compress_platform
andenum j2k_decompress_platform
have been created to allow easy selection of the platform to use. These options currently include:NONE
CPU
CUDA
struct j2k_compress_platform_info_t
has been created to hold a friendly name (ex. "cpu") and its corresponding j2k_compress_platform type.The function
j2k_compress_platform get_platform_from_name(std::string)
has also been added with the purpose of searchingconstexpr auto compress_platforms = array{};
for friendly names and returning an associated platform.These enums and struct are built with the consideration that OpenCL, since it's also supported by Comprimato, can be implemented in the future.
General Changes
struct Codec
and codec information for both compress and decompress have been changed to constexpr arrays.<algorithm>
header has been included andfind_if
is now used to search for matching codec rather than range-based for loop with if statements inj2k_decompress_reconfigure()
andconfigure_with()
.log_msg()
for message reporting, rather than MSG()Removed some C-style casts and replaced with
static_cast<T>
Made some
#define
statementsconstexpr
Removed some
using
directivesRefactored
void print_dropped
to be platform-aware in reporting hintsMoved #include, #define, constexpr, structs, functions, predeclarations around.
struct Opts
created instead of using anonymous struct and split into General, CUDA, and CPU options. Availability of these options are determined with `#ifdef HAVE_CUDA# directives.Testing
Testing of this implementation has been done on: