Implementing CPU Support for Comprimato J2K Compress and Decompress #375

ATrivialAtomic · 2024-03-20T20:41:33Z

This pull request implements CPU compression and decompression for the Comprimato J2K Codec.

In the interest of leaving the current implementation unchanged for users already using Comprimato with CUDA, CUDA is still the default platform selected for compression and decompression when CUDA libraries are detected during UltraGrid compilation. This means that -c cmpto_j2k will default to using CUDA for compression. A system decompressing a cmpto_j2k stream will default to using CUDA for decompression. This has been guaranteed by the use of #ifdef HAVE_CUDA directives.

If UltraGrid is compiled without CUDA, -c cmpto_j2k will default to using CPU for compression. A system compiled without CUDA will also default to CPU for decompressing a cmpto_j2k stream.

To allow selection of either CUDA or CPU, a new :platform= option is available when CUDA libraries are detected at compile time. If no CUDA libraries are detected at compile time (e.g. macOS), the :platform= option is not offered and CPU is the only platform, with no option for the user to change it.

Platform default (CUDA or CPU) is determined at compile time using #ifdef HAVE_CUDA directives in the state_video_compress_j2k and state_video_decompress_j2k structs.
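A minimal sketch of how such a compile-time default can be expressed. Only the HAVE_CUDA macro comes from this PR; platform_sketch, default_platform_sketch and platform_state_sketch are hypothetical names used for illustration and are not the members of the real structs.

```cpp
// Hypothetical names; only HAVE_CUDA is taken from the PR description.
enum class platform_sketch { CPU, CUDA };

// Picks the default platform when the translation unit is compiled.
constexpr platform_sketch default_platform_sketch() {
#ifdef HAVE_CUDA
        return platform_sketch::CUDA;  // CUDA libraries were detected at build time
#else
        return platform_sketch::CPU;   // CPU-only build
#endif
}

// Stand-in for a state struct whose member is default-initialized
// from the compile-time choice above.
struct platform_state_sketch {
        platform_sketch platform = default_platform_sketch();
};
```

In a build without HAVE_CUDA the member defaults to CPU, so no runtime option is needed to select it.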

Additions to Compression Options and Decompression Parameters

To support CPU usage for compression and decompression, additional options and parameters have been added.

J2K Compression Options Syntax | CUDA

-c cmpto_j2k:platform=cuda[:mem_limit=<m>][:tile_limit=<t>][:rate=<r>][:lossless][:quality=<q>][:pool_size=<p>][:mct] [--cuda-device <c_index>]

J2K Compression Options Syntax | CPU

-c cmpto_j2k:platform=cpu[:thread_count=<t>][:img_limit=<i>][:rate=<r>][:lossless][:quality=<q>][:pool_size=<p>][:mct]

New J2K Compression Options:

:platform=<cuda,cpu>

Option is only available if UltraGrid was compiled with CUDA libraries.
Current options are CUDA or CPU.
CUDA is the default platform if compiled with CUDA. Calling -c cmpto_j2k implies -c cmpto_j2k:platform=cuda. Calling -c cmpto_j2k:platform=cpu explicitly selects CPU compression over CUDA.

:thread_count=

Threads to use on the CPU. Default is CMPTO_J2K_ENC_CPU_DEFAULT (0), which is threads equal to all cores.
Comprimato defines a range of 0-64, but system hardware may lower the maximum below 64. If the value is rejected, context creation reports an error, which is caught by CHECK_OK() in bool initialize_j2k_enc_ctx().

:img_limit=

Number of images compressed by the CPU at once. Default is 0, which matches the CMPTO default.
Comprimato defines a range of 0-64, but system hardware may lower the maximum below 64. If the value is rejected, context creation reports an error, which is caught by CHECK_OK() in bool initialize_j2k_enc_ctx().

:lossless

Enable lossless encoding
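The colon-separated option syntax above could be parsed roughly as follows. This is a sketch only, not the PR's parse_fmt(): cpu_opts_sketch and parse_opts_sketch are hypothetical names, only a subset of the options is handled, and the input is assumed to be the text that follows "-c cmpto_j2k:".

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical container for a subset of the options listed above.
struct cpu_opts_sketch {
        std::string platform = "cuda";  // default in a CUDA-enabled build, per the description
        int thread_count = 0;           // 0 = all cores
        int img_limit = 0;              // 0 = Comprimato default
        bool lossless = false;
};

// Splits "key=value:key=value:flag" on ':' and fills the struct.
// Throws std::invalid_argument for unknown keys or non-numeric values.
cpu_opts_sketch parse_opts_sketch(const std::string &opts) {
        cpu_opts_sketch out;
        std::istringstream in(opts);
        std::string item;
        while (std::getline(in, item, ':')) {
                if (item.empty()) {
                        continue;  // tolerate "a::b"
                }
                const auto eq = item.find('=');
                const std::string key = item.substr(0, eq);
                const std::string val =
                    eq == std::string::npos ? "" : item.substr(eq + 1);
                if (key == "platform") {
                        out.platform = val;
                } else if (key == "thread_count") {
                        out.thread_count = std::stoi(val);
                } else if (key == "img_limit") {
                        out.img_limit = std::stoi(val);
                } else if (key == "lossless") {
                        out.lossless = true;
                } else {
                        throw std::invalid_argument("unknown option: " + key);
                }
        }
        return out;
}
```

Throwing on an unknown key mirrors the exception-based error reporting described later in this PR, where the init function turns the exception into a NULL return.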

New J2K Decompression params:

j2k-dec-cpu-thread-count=

Threads to use on the CPU. Default is CMPTO_J2K_DEC_CPU_DEFAULT (0), which is threads equal to all cores.
Comprimato defines a range of 0-64, but system hardware may lower the maximum below 64. If the value is rejected, context creation reports an error, which is caught by CHECK_OK() in bool initialize_j2k_dec_ctx().

j2k-dec-img-limit=

Number of images decompressed by the CPU at once. Default is 0, which matches the CMPTO default.
img-limit cannot exceed thread-count unless thread-count=0. bool initialize_j2k_dec_ctx() checks this and sets img-limit = cpu-thread-count if it is exceeded.
Comprimato defines a range of 0-64, but system hardware may lower the maximum below 64. If the value is rejected, context creation reports an error, which is caught by CHECK_OK() in bool initialize_j2k_dec_ctx().

j2k-dec-use-cpu

Param is only available if UltraGrid was compiled with CUDA libraries.
Explicitly use the CPU to decompress images.

j2k-dec-use-cuda

Param is only available if UltraGrid was compiled with CUDA libraries.
Explicitly use CUDA to decompress images.
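The img-limit rule described for j2k-dec-img-limit (it may not exceed thread-count unless thread-count is 0) can be sketched as a small helper. clamp_img_limit_sketch is a hypothetical name; the actual check lives inside initialize_j2k_dec_ctx() and may be written differently.

```cpp
// Hypothetical helper illustrating the clamping rule from the text above.
// thread_count == 0 means "use all cores", so no upper bound is known
// at this point and img_limit is passed through unchanged.
constexpr int clamp_img_limit_sketch(int thread_count, int img_limit) {
        if (thread_count != 0 && img_limit > thread_count) {
                return thread_count;  // img-limit = cpu-thread-count if exceeded
        }
        return img_limit;
}
```

For example, a thread count of 4 with an image limit of 8 yields 4, while a thread count of 0 leaves the requested limit untouched.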

`-c cmpto_j2k:help` print out when CUDA is Present

[Cmpto J2K enc.] Using Codec version: <INFO>
J2K compress platform support:
	CPU .... yes
	CUDA ... yes
J2K compress usage:
	-c cmpto_j2k:platform=cuda[:mem_limit=<m>][:tile_limit=<t>][:rate=<r>][:quality=<q>][:pool_size=<p>][:mct][:lossless] [--cuda-device <c_index>]
	-c cmpto_j2k:platform=cpu[:thread_count=<t>][:img_limit=<i>][:rate=<r>][:quality=<q>][:pool_size=<p>][:mct][:lossless]
where:
	<p> - Platform device for the encoder to use, default: cuda
	<m> - CUDA device memory limit (in bytes), default: 1000000000ULL
	<t> - Number of tiles encoded at one moment by GPU (less to reduce latency, more to increase performance, 0 means infinity). default: 1
	<c_index> - CUDA device(s) to use (comma separated)
	<t> - Number of threads to use on the CPU. 0 is all available. default: 0
	<i> - Number of images which can be encoded at one moment by CPU. Maximum allowed limit is thread_count. 0 is default limit. default: 0
	<r> - Target bitrate
	<q> - Quality in range [0-1]. default: 0.7
	<p> - Total number of frames encoder can hold at one moment. Should be greater than tile_limit or img_limit. default: 4
	mct - Use MCT
	lossless - Enable lossless compression. default: disabled

`-c cmpto_j2k:help` print out when only CPU is Present

[Cmpto J2K enc.] Using Codec version: <INFO>
J2K compress platform support:
	CPU .... yes
	CUDA ... no
J2K compress usage:
	-c cmpto_j2k[:thread_count=<t>][:img_limit=<i>][:rate=<r>][:quality=<q>][:pool_size=<p>][:mct][:lossless]
where:
	<t> - Number of threads to use on the CPU. 0 is all available. default: 0
	<i> - Number of images which can be encoded at one moment by CPU. Maximum allowed limit is thread_count. 0 is default limit. default: 0
	<r> - Target bitrate
	<q> - Quality in range [0-1]. default: 0.7
	<p> - Total number of frames encoder can hold at one moment. Should be greater than img_limit. default: 8
	mct - Use MCT
	lossless - Enable lossless compression. default: disabled
Exit

`--param help` when CUDA is Present

* j2k-dec-use-cuda
  use CUDA to decode images

* j2k-dec-mem-limit=<limit>
  J2K max memory usage in bytes.

* j2k-dec-tile-limit=<limit>
  number of tiles decoded at moment (less to reduce latency, more to increase performance, 0 unlimited)

* j2k-dec-use-cpu
  use the CPU to decode images

* j2k-dec-cpu-thread-count=<threads>
  number of threads to use on the CPU (0 means number of threads equal to all cores)

* j2k-dec-img-limit=<limit>
  number of images which can be decoded at one moment (0 means default, thread-count is maximum limit)

* j2k-dec-queue-len=<len>
  max queue len

* j2k-dec-encoder-queue=<len>
  max number of frames held by encoder

`--param help` when only CPU is Present

* j2k-dec-cpu-thread-count=<threads>
  number of threads to use on the CPU (0 means number of threads equal to all cores)

* j2k-dec-img-limit=<limit>
  number of images which can be decoded at one moment (0 means default, thread-count is maximum limit)

* j2k-dec-queue-len=<len>
  max queue len

* j2k-dec-encoder-queue=<len>
  max number of frames held by encoder

Changes to `state_video_compress_j2k` and `state_video_decompress_j2k` construction and struct members

In addition to adding CPU support for compression and decompression, I've changed the way the state_video_compress_j2k and state_video_decompress_j2k classes are constructed: they now throw exceptions if they cannot be initialized. This removes the need for goto statements and for explicit deletion of the state_j2ks in the j2k_compress_init() and j2k_decompress_init() functions.

Parsing of compression options has been offloaded to the private member void parse_fmt(const char*) in struct state_video_compress_j2k.

Parsing of decompression parameters has been offloaded to private member void parse_params() in struct state_video_decompress_j2k.

If help is requested via -c cmpto_j2k:help, a HelpRequested() exception is thrown and caught by j2k_compress_init(), which returns static_cast<module*>(INIT_NOERR).

Once parsing has completed, private members initialize_j2k_enc_ctx() and initialize_j2k_dec_ctx() are called and attempt to complete the final initialization of the J2K context.

Any errors during this process will result in a thrown exception that will be caught by j2k_compress_init() and j2k_decompress_init(), resulting in a NULL return.

Compress Exceptions Include:

HelpRequested()
InvalidArgument()
UnableToCreateJ2KEncoderCTX()

Decompress Exceptions Include:

UnableToCreateJ2KDecoderCTX()

To account for differences in the video_frame_pool initialization when using CPU and CUDA platforms during cmpto_j2k compression, the video_frame_pool pool has been changed to std::unique_ptr<video_frame_pool> pool to allow for reconfiguration during state_video_compress_j2k construction.
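A rough sketch of why a std::unique_ptr member helps here: the pool can be created, or replaced, once the platform is known during construction. frame_pool_sketch and pool_state_sketch are hypothetical stand-ins; the real video_frame_pool has a different interface and takes an allocator.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical stand-in for video_frame_pool.
struct frame_pool_sketch {
        std::size_t max_frames;
        bool cuda_host_memory;  // true when a CUDA host allocator would back the frames
};

struct pool_state_sketch {
        // Owning pointer: assigning a new pool destroys the previous one.
        std::unique_ptr<frame_pool_sketch> pool;

        void configure_pool(bool use_cuda, std::size_t pool_size) {
                pool = std::make_unique<frame_pool_sketch>(
                    frame_pool_sketch{pool_size, use_cuda});
        }
};
```

Because the member is a pointer, call sites use pool-> rather than pool. to reach the pool.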

Changes to `j2k_compress_init` and `j2k_decompress_init`

Since the j2k_compress and j2k_decompress constructors will throw if there is an error with any of the options or parameters, _init functions are responsible for catching those errors and returning what is needed by the module.

New j2k_compress_init implementation

static struct module * j2k_compress_init(struct module *parent, const char *opts) {
        try {
                auto *s = new state_video_compress_j2k(parent, opts);
                return &s->module_data;
        } catch (HelpRequested const& e) {
                return static_cast<module*>(INIT_NOERR);
        } catch (InvalidArgument const& e) {
                return NULL;
        } catch (UnableToCreateJ2KEncoderCTX const& e) {
                return NULL;
        } catch (...) {
                return NULL;
        }
}

New j2k_decompress_init implementation

static void * j2k_decompress_init(void) {
        try {
                auto *s = new state_video_decompress_j2k();
                return s;
        } catch (...) {
                return NULL;
        }
}

Additions of enums and structs to allow easy selection of platform to use and future expansion

enum j2k_compress_platform and enum j2k_decompress_platform have been created to allow easy selection of the platform to use. These options currently include:

NONE
CPU
CUDA

enum j2k_compress_platform {
        NONE = 0,
        CPU = 1,
#ifdef HAVE_CUDA
        CUDA = 2,
#endif // HAVE_CUDA
};

enum j2k_decompress_platform {
        NONE = 0,
        CPU = 1,
#ifdef HAVE_CUDA
        CUDA = 2,
#endif // HAVE_CUDA
};

struct j2k_compress_platform_info_t has been created to hold a friendly name (ex. "cpu") and its corresponding j2k_compress_platform type.

struct j2k_compress_platform_info_t {
	const char* name;
	j2k_compress_platform platform;
};

auto cpu = j2k_compress_platform_info_t{"cpu", j2k_compress_platform::CPU};

The function j2k_compress_platform get_platform_from_name(std::string) has also been added with the purpose of searching constexpr auto compress_platforms = array{}; for friendly names and returning an associated platform.

// Supported Platforms for Compressing J2K
constexpr auto compress_platforms = std::array {
    j2k_compress_platform_info_t{"none", j2k_compress_platform::NONE},
    j2k_compress_platform_info_t{"cpu", j2k_compress_platform::CPU},
#ifdef HAVE_CUDA
    j2k_compress_platform_info_t{"cuda", j2k_compress_platform::CUDA}
#endif
};

static j2k_compress_platform get_platform_from_name(std::string name) {
    std::transform(name.cbegin(), name.cend(), name.begin(), [](unsigned char c) { return std::tolower(c); });

    auto matches = [&name](const auto& p) { return name.compare(p.name) == 0; };

    if (const auto& it = std::find_if(compress_platforms.begin(), compress_platforms.end(), matches) ; it != compress_platforms.end()) {
        return it->platform;
    }

    return j2k_compress_platform::NONE;
}

These enums and struct are built with the consideration that OpenCL, since it's also supported by Comprimato, can be implemented in the future.

General Changes

Previously anonymous structs containing information about the ug_codec and associated CMPTO_ codec have been named struct Codec and codec information for both compress and decompress have been changed to constexpr arrays.

// example from video_compress/cmpto_j2k.cpp
struct Codec {
        codec_t ug_codec;
        enum cmpto_sample_format_type cmpto_sf;
        void (*convert)(unsigned char *dst_buffer, unsigned char *src_buffer, unsigned int width, unsigned int height);
};

constexpr auto codecs = std::array{
        Codec{UYVY, CMPTO_422_U8_P1020, nullptr},
        Codec{v210, CMPTO_422_U10_V210, nullptr},
        Codec{RGB, CMPTO_444_U8_P012, nullptr},
        Codec{BGR, CMPTO_444_U8_P210, nullptr},
        Codec{RGBA, CMPTO_444_U8_P012Z, nullptr},
        Codec{R10k, CMPTO_444_U10U10U10_MSB32BE_P210, nullptr},
        Codec{R12L, CMPTO_444_U12_MSB16LE_P012, rg48_to_r12l},
};

The <algorithm> header has been included, and find_if is now used to search for a matching codec in j2k_decompress_reconfigure() and configure_with(), rather than a range-based for loop with if statements.

auto matches = [&](const Codec& codec) { return codec.ug_codec == desc.color_spec; };

if (const auto& codec = std::find_if(codecs.begin(), codecs.end(), matches) ; codec != codecs.end()) {
	/// rest of code
}

Standardized on using log_msg() for message reporting, rather than MSG()

log_msg(LOG_LEVEL_ERROR, "%s Failed to find suitable pixel format\n", MOD_NAME);

Removed some C-style casts and replaced with static_cast<T>
Made some #define statements constexpr
Removed some using directives
Refactored void print_dropped to be platform-aware in reporting hints

void print_dropped(unsigned long long int dropped, const j2k_decompress_platform& platform);

Moved #include, #define, constexpr, structs, functions, predeclarations around.
struct Opts created instead of using an anonymous struct, and split into General, CUDA, and CPU options. Availability of these options is determined with `#ifdef HAVE_CUDA` directives.

Testing

Testing of this implementation has been done on:

Ubuntu 22.04LTS using Comprimato SDK 2.8.0 and 2.8.1
MacBook Pro (M1), running MacOS 14.4 using Comprimato SDK v2.8.0.

MartinPulec · 2024-05-28T14:53:49Z

Hi, I would have some remarks to the pull request.

Most importantly, the pull request is quite large compared to its functional changes - it almost rewrites both files (2/3 of the code is touched), even the unrelated parts of the code. Also, the commits can easily be squashed into one, because there is a first big commit and then just smaller changes to it.

As functional changes, refactoring and code movement are mixed together, it is really hard to check what exactly has been changed, so it almost cannot be audited for regressions etc. (I personally usually separate even refactoring and functional commits.) Would it be possible to reduce the extent? You can look at (or use) e.g. this commit - it modifies your changes so that it at least leaves the code in place and reverts a few unrelated changes. The advantage should be obvious if you look at git diff: your changes are more readable and can be reviewed.

For me it would be acceptable if you squashed everything into a single commit (including the suggested changes). If you'd then like to make further changes, I'd perhaps prefer separate pull requests. I'll sum up my opinions about the non-essential changes:

struct Codec - actually I don't see a reason for creating a named type, other than for use with std::array. Well, a slightly ugly thing about std::array is that, for the template arguments to be deduced, the aggregate type name usually must be given explicitly in the initializer list
find_if - why not; on the other hand, there is no big advantage to the change
log_msg - well, as for a functional change, I'd perhaps see it as more important to consistently use the module prefix string; I don't see much wrong with MSG() - it actually enforces the use of that prefix
c-style casts - well, maybe, but it is not worth making the change
moving the stuff around - well, from my perspective definitely not if done "by the way", mixed with other changes and not in a separate commit, because it hides the actual changes
remove using directives - well, maybe; I'd agree for using the whole std namespace, but I see using std::something as a matter of opinion rather than a problem that should be fixed

Last, I see the conditional compilation with HAVE_CUDA as problematic - in the original code the only CUDA-dependent code is the CUDA host allocator. The proposed pull request requires the CUDA toolkit for the CUDA codec variant to be used. That was not required before - without CUDA, the codec worked (just the buffers were not in the DMA-transfer-capable region). I don't know our users' exact use cases, but I can imagine that someone compiles UG without having the CUDA toolkit - it may not introduce much of a performance penalty. But don't worry about this particular case, I can fix this after the merge myself.

ATrivialAtomic · 2024-05-29T17:57:39Z

Hi @MartinPulec!

Thanks for taking the time to review this pull request. This is my first one, so my apologies if it's a bit unconventional. I'll be sure to keep your notes in mind, regarding separating functional and refactoring changes, as well as squashing minor changes into one commit, for the next PR.

I read through your commit and agree, it's significantly more readable and rolls back some of the non-essential changes. I'm happy to use that as my starting point to re-review, test, and submit if you'd prefer I do that.

I had one question that I wanted your input on since I was going back-and-forth with myself about it. Do you agree with :platform=<cpu,cuda> or should it be named something else? I'm not tied to the platform keyword, but it's what made sense to me when I was testing out how best to name and implement.

Regarding the HAVE_CUDA conditional, thanks for catching that. Looking back, all of my testing on Linux used systems with the CUDA toolkit, so I missed that part of my testing. If you'd prefer to fix that after the merge, I can leave all those conditionals as-is.

In terms of the non-essential changes, here's some of the reasoning with why I made those changes. I'm happy to submit separate pull requests if you think any of them really make much of a difference -- don't want to waste your time submitting non-important PRs.

find_if - No specific reason other than I thought it was easier to read than the range-based for loop when combined with a lambda. Agreed, not particularly important!
log_msg - I used libavcodec.cpp in compress and decompress as my reference since it was the most recently updated at the time of me working on this PR. I noticed that log_msg was used more than MSG(), so I standardized on that. I'm not particularly tied to this change, just wanted to try and use something that seemed a bit more recent and standardized. Using %s for the module prefix is nice because it reduced chances I'd mistype and put [CMPTO J2K enc] instead of [CMPTO J2K dec], but I can definitely see benefits to enforcing the prefix usage.
using std::* - Other than the namespace std, there was no reason I removed these other than it being a matter of opinion. Not particularly important either!

MartinPulec · 2024-06-03T12:06:56Z

src/video_compress/cmpto_j2k.cpp

+static void usage() {
+        col() << "J2K compress platform support:\n";
+        col() << "\tCPU .... yes\n";
+#ifdef HAVE_CUDA

Instead of insisting on HAVE_CUDA, I think it is better to use the cmpto_version::technology bit array.
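A minimal sketch of a runtime check against a technology bit field, assuming one bit per technology. The constants below are hypothetical placeholders, not the values from the Comprimato headers; in real code the bits would come from CMPTO_TECHNOLOGY_{CPU,CUDA} and the mask from the version information reported by the SDK.

```cpp
// Hypothetical bit values for illustration only; the real ones are defined
// by the Comprimato SDK headers and may differ.
constexpr unsigned TECH_CPU_SKETCH = 1u << 0;
constexpr unsigned TECH_CUDA_SKETCH = 1u << 1;

// Returns true when every bit in `wanted` is present in `available`.
constexpr bool supports_technology_sketch(unsigned available, unsigned wanted) {
        return (available & wanted) == wanted;
}
```

With such a check the help output and the default platform can be decided at run time from what the installed SDK reports, rather than from how UltraGrid was compiled.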

MartinPulec · 2024-06-03T14:35:59Z

It is totally fine, I believe that your code is great; the only problem is that I want to keep things traceable, for which I believe it is good to separate functional and non-functional changes. For the other changes it depends - some are ones that I'd agree with outright; the others I would agree with as well, but I would make them only when changing that code anyway.

> I had one question that I wanted your input on since I was going back-and-forth with myself about it. Do you agree with :platform=<cpu,cuda> or should it be named something else? I'm not tied to the platform keyword, but it's what made sense to me when I was testing out how best to name and implement.

I believe that the keyword platform is totally fine. I've just looked into the headers, and at Comprimato they denote this as "technology" (CMPTO_TECHNOLOGY_{CPU,CUDA}). But both words sound more or less equivalent in this context.

> Regarding the HAVE_CUDA conditional, thanks for catching that. Looking back, all of my testing on Linux used systems with the CUDA toolkit, so I missed that part of my testing. If you'd prefer to fix that after the merge, I can leave all those conditionals as-is.

I have no problem doing it by myself but I don't insist on that. Anyways, if you decide to do it by yourself, you can look at the above code comment.

> find_if

Sure - I have no particular objections to that. If you wish, I have no problem with it, but it would be nicer if it were in a separate commit. Anyway, just in case: it doesn't necessarily need std::array; std::begin/std::end with the C array as the argument would work as well.
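The point about std::begin/std::end can be illustrated with a plain C array. This is a self-contained sketch: codec_entry_sketch, codecs_sketch and find_codec_sketch are hypothetical names, and the integer fields stand in for codec_t and the Comprimato sample format.

```cpp
#include <algorithm>
#include <iterator>

// Hypothetical entry type; ints stand in for the real codec identifiers.
struct codec_entry_sketch {
        int ug_codec;
        int cmpto_sf;
};

// Plain C array, no std::array needed.
static const codec_entry_sketch codecs_sketch[] = {
        {1, 10},
        {2, 20},
        {3, 30},
};

// Returns a pointer to the matching entry, or nullptr when none matches.
const codec_entry_sketch *find_codec_sketch(int ug_codec) {
        const auto matches = [ug_codec](const codec_entry_sketch &c) {
                return c.ug_codec == ug_codec;
        };
        const auto *it = std::find_if(std::begin(codecs_sketch),
                                      std::end(codecs_sketch), matches);
        return it != std::end(codecs_sketch) ? it : nullptr;
}
```

For a C array, std::begin and std::end return raw pointers, so the result of std::find_if can be returned directly.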

> log_msg - I used libavcodec.cpp in compress and decompress as my reference since it was the most recently updated at the time of me working on this PR. I noticed that log_msg was used more than MSG(), so I standardized on that. I'm not particularly tied to this change, just wanted to try and use something that seemed a bit more recent and standardized. Using %s for the module prefix is nice because it reduced chances I'd mistype and put [CMPTO J2K enc] instead of [CMPTO J2K dec], but I can definitely see benefits to enforcing the prefix usage.

Ok, well, no serious problem with this, indeed. But just a small explanation - I've actually started using MSG() later on - it is actually a macro and if you look at its definition, it actually enforces the module name implicitly. ~~On the other hand, it also forces the MOD_NAME to be the C string literal.~~ fixed

I understand that using the macro(s) is slightly controversial in C++ (even these MOD_NAME, which clang-tidy wants to be replaced with a constexpr var), so I don't insist on that, either.

> using std::* - Other than the namespace std, there was no reason I removed these other than it being a matter of opinion. Not particularly important either!

Sure, I consulted a colleague about it on Friday and he also advocated using identifiers with the std namespace included. We may discuss it later to unify the style, but that just hasn't happened so far (UltraGrid is historically written in C and the C++ coding standards are not well defined).

To sum up the points 1-3, as there is no rule defined, feel free to choose. I also understand that you may want to make this consistent across the file. Provided that it will be in a separate commit, we'd accept it.

Allow the MOD_NAME to be a variable (like (constexpr const char *)). Using non-standard extension, the standard one would be __VA_OPT__. Although it is supported with MSVC 2019/2022, it requires the compiler flag /Zc:preprocessor. This version doesn't require that so use it for now. The MSVC is used to compile the CUDA code and AJA wrapper so not to complicate the things now. This syntax is supported for both GNU and MSVC: 1. https://stackoverflow.com/a/78185169 2. https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html refer to GH-375

…er have_cuda conditionals. j2k_compress_platform now uses CMPTO_TECHNOLOGY_{CPU,CUDA} instead of 1, 2. bool supports_cmpto_technology(int) function created for checking whether a technology is supported on the system. Added NoCmptoTechnologyFound exception for error reporting.

…over cuda conditionals. Remove #ifdef HAVE_CUDA. j2k_decompress_platform now uses CMPTO_TECHNOLOGY_{CPU,CUDA} instead of 1, 2. bool supports_cmpto_technology(int) function created for checking whether a technology is supported on the system.

ATrivialAtomic · 2024-06-18T18:27:11Z

Hi @MartinPulec!

I've made the requested changes, if you'd like to review them and provide feedback. Here's a summary of what has been done.

Squashed all previous commits to Initial cmpto implementation
Pulled your commit into my branch and used that as a starting point for all changes moving forward.
Implemented cmpto_version::technology checks via a new function bool supports_cmpto_technology(int) and replaced #ifdef HAVE_CUDA directives in all cases except for cmpto_j2k_enc_cuda_host_buffer_data_allocator
Standardized on using MSG() instead of log_msg() for message reporting in most cases.
Renamed using allocator = in video_compress file to using cuda_allocator = to match using cpu_allocator = naming and to be explicit about what type of allocator is being used when std::unique_ptr<video_frame_pool> is initialized.

One thing I noticed during the MSG() refactor is that fmt is using "%s", which means that MSG() calls will require a space at the head of the fmt string to print a space between MOD_NAME and the rest of the arguments. Should I add space to the head of all the MSG() calls, or is there a chance "%s" will be changed to "%s " in the MSG() call?

MSG(INFO, "no space at head") = [Cmpto J2K enc.]no space at head
MSG(INFO, " space at head") = [Cmpto J2K enc.] space at head
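The behaviour shown in the two examples above follows from string-literal concatenation of "%s" with the caller's format. A simplified sketch, assuming a macro of the same shape: MSG_SKETCH, format_sketch and MOD_NAME_SKETCH are hypothetical, the real MSG() also takes a log level and forwards to log_msg(), and ##__VA_ARGS__ is the non-standard GNU/MSVC extension mentioned in the commit message.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical prefix, held in a variable rather than a string literal.
constexpr const char *MOD_NAME_SKETCH = "[Cmpto J2K enc.]";

// printf-style helper that returns the formatted text instead of logging it.
inline std::string format_sketch(const char *fmt, ...) {
        char buf[256];
        va_list ap;
        va_start(ap, fmt);
        std::vsnprintf(buf, sizeof buf, fmt, ap);
        va_end(ap);
        return buf;
}

// "%s" and fmt are adjacent string literals, so they are concatenated at
// compile time; the prefix is always passed as the first argument.
// ##__VA_ARGS__ drops the trailing comma when no extra arguments are given.
#define MSG_SKETCH(fmt, ...) \
        format_sketch("%s" fmt, MOD_NAME_SKETCH, ##__VA_ARGS__)
```

Since nothing separates "%s" from the caller's format, MSG_SKETCH("no space at head") yields the prefix immediately followed by the text; either the call sites or the macro ("%s " instead of "%s") has to supply the space.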

Thanks!

j2k_decompress_reconfigure now checks state_decompress_j2k class member platform to see if it is set to j2k_decompress_platform::CPU or j2k_decompress_platform::CUDA prior to calling cmpto_j2k_dec_ctx_cfg_add_cuda_device / cmpto_j2k_dec_ctx_cfg_add_cpu

MartinPulec reviewed Jun 3, 2024

ATrivialAtomic force-pushed the wip-cmpto-j2k-cpu branch from be9e24a to 525a0f4 Compare June 13, 2024 19:20

Initial cmpto implementation

5df9aae

ATrivialAtomic force-pushed the wip-cmpto-j2k-cpu branch from 525a0f4 to 5df9aae Compare June 13, 2024 21:56

MartinPulec and others added 5 commits June 14, 2024 10:45

reduce merge request diff

4fb6d5f

Standardize on MSG() over log_msg() for most cases of message reporting

95b9a7e

Use cuda_allocator naming over allocator for using statement

fbdeef3

This matches cpu_allocator naming and helps to be explicit about what allocator is being used during video_frame_pool creation during `bool state_video_compress_j2k::initialize_j2k_enc_ctx()`

ATrivialAtomic added 16 commits September 6, 2024 15:26

Free data when enc fails

08969b5

Implementing Commit [9577372](CESNET@9577372) from master

Fixed mct comparison

1ea3b5a

Implement commit [9e05752](CESNET@9e05752)

Mirror Commit 779021b

9ae3bb2

Mirror [779021b](CESNET@779021b) from master

Implement Commit c2e7811 from master

aea5ffa

Implementation of [c2e7811](CESNET@c2e7811) from master

Implementing ca71f59 from master

10b4c69

Implementing [ca71f59](CESNET@ca71f59) from master

Implementing 4061f8d from master

c133739

Implementing [4061f8d](CESNET@4061f8d) from master

Implementing c2cebd3 from master

0b57147

Implementing [c2cebd3](CESNET@c2cebd3) from master

Implement af5d584 and 3adb9a4 from master

bf7e84d

Implement [af5d584](CESNET@af5d584) and [3adb9a4](CESNET@3adb9a4)

Implement 930abe5 from master

2bd4c09

Implement [930abe5](CESNET@930abe5) from master

Implement 39c9c40 from master

cc0d31c

Implement [39c9c40](CESNET@39c9c40) from master

Implement cc6b820 from master

1c4c1df

Implement [cc6b820](CESNET@cc6b820) from master

Implement 1cffc72 from master

c9e111f

Implement [1cffc72](CESNET@1cffc72) from master

Implement 7b91ebb from master

a612bbe

Implement [7b91ebb](CESNET@7b91ebb) from master

Implement 94afd6c from master

0fdabb4

Implement [94afd6c](CESNET@94afd6c) from master

Implement 9304717 from master

bd44e8a

Implement [9304717](CESNET@9304717) from master

Implement fa93411, 4f3add7, 876870f, ad7929b, 95dea89

f893b43

Implement the following from master [fa93411](CESNET@fa93411) [4f3add7](CESNET@4f3add7) [876870f](CESNET@876870f) [ad7929b](CESNET@ad7929b) [95dea89](CESNET@95dea89)

ATrivialAtomic and others added 6 commits September 6, 2024 19:00

Implement b1ff4c6, e37e58c, 94afd6c from master

a7eba41

[b1ff4c6](CESNET@b1ff4c6) [e37e58c](CESNET@e37e58c) [94afd6c](CESNET@94afd6c)

Merge branch 'wip-cmpto-j2k-cpu' into wip-cmpto-j2k-cpu-conflict-resolution

7aeac4c

Resolve remaining merge issues

c307088

Remove duplicate #include, variables, functions, etc. Class member 'pool' is a unique_ptr; changed calls to pool from . to ->. Renamed req_tile_limit and req_mem_limit to cuda_tile_limit and cuda_mem_limit to match class member name.

Merge pull request #2 from ATrivialAtomic/wip-cmpto-j2k-cpu-conflict-resolution

392b3b4

Merge temp branch used to resolve remaining sync conflicts. See commit c307088 for details.

Merge branch 'CESNET:master' into wip-cmpto-j2k-cpu

d3f1ae4

ATrivialAtomic force-pushed the wip-cmpto-j2k-cpu branch from ea3e356 to 28b98a9 Compare September 10, 2024 22:48
