Does the ISAL library have an API compatible with Rocksdb CRC32C #292

Open
damoncui1993 opened this issue Jul 24, 2024 · 6 comments

@damoncui1993

Our product has been using the CRC32C implementation from the Rocksdb library for data validation, but I found that the ISA-L library performs better, so I am considering switching to it for the speedup. However, the value computed by ISA-L's crc32_gzip_refl function does not match the original CRC32C result, which makes it unusable for us. Does ISA-L have an API that is compatible with Rocksdb's CRC32C function?

@damoncui1993
Author

damoncui1993 commented Jul 24, 2024

#include <boost/crc.hpp>
#include <iostream>
#include <vector>
#include <chrono>
#include <cstdlib>
#include <ctime>
#include <cstdint>
#include <memory>
#include "isa-l.h"
#include "base/crc32c.h"

// Compute CRC32 with ISA-L's crc32_gzip_refl(), which is declared in isa-l.h.

int main() {
    std::srand(std::time(nullptr));
    std::vector<uint8_t> data(100000000); 
    for (auto& byte : data) {
        byte = std::rand() % 256; 
    }

    // Rocksdb CRC32C
    auto start_custom = std::chrono::high_resolution_clock::now();
    uint32_t crc_custom = base::crc32c::Extend(0, reinterpret_cast<const char*>(data.data()), data.size());
    auto end_custom = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> elapsed_custom = end_custom - start_custom;
    std::cout << "Rocksdb CRC32: " << crc_custom << ", Time: " << elapsed_custom.count() << " ms" << std::endl;

    // ISA-L crc32_gzip
    auto start_isal_gzip = std::chrono::high_resolution_clock::now();
    uint32_t crc_isal_gzip = crc32_gzip_refl(0, data.data(), data.size());
    auto end_isal_gzip = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> elapsed_isal_gzip = end_isal_gzip - start_isal_gzip;
    std::cout << "ISA-L CRC32_GZIP: " << crc_isal_gzip << ", Time: " << elapsed_isal_gzip.count() << " ms" << std::endl;

    return 0;
}

Result:
Rocksdb CRC32: 500236219, Time: 0.125811 ms
ISA-L CRC32_GZIP: 517025312, Time: 0.019541 ms
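
The two results above are expected to differ regardless of input, because the gzip CRC-32 and CRC32C (Castagnoli) are different checksums built on different polynomials. As a minimal sketch (a slow bitwise routine written for illustration, not ISA-L or Rocksdb code; the function and constant names are hypothetical), the difference can be reproduced on the standard check string "123456789":

```cpp
#include <cstddef>
#include <cstdint>

// Reflected forms of the two generator polynomials.
constexpr uint32_t kGzipPolyReflected = 0xEDB88320u;        // gzip / zlib CRC-32
constexpr uint32_t kCastagnoliPolyReflected = 0x82F63B78u;  // CRC32C (iSCSI)

// Illustrative bitwise CRC-32 over a reflected polynomial, with the usual
// initial value 0xFFFFFFFF and a final bit inversion.
inline uint32_t crc32_reflected(uint32_t reflected_poly,
                                const unsigned char* buf, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; ++bit) {
            // If the low bit is set, shift and XOR in the polynomial.
            crc = (crc >> 1) ^ (reflected_poly & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

With this sketch, the gzip polynomial yields 0xCBF43926 for "123456789" and the Castagnoli polynomial yields 0xE3069283, so no choice of seed makes a gzip CRC routine reproduce stored CRC32C values.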

@rhpvorderman
Contributor

Did you also check crc32_ieee?

@rhpvorderman
Contributor

And does it have to be backwards-compatible? Otherwise I recommend using XXHash for data validation. (Note: I am not an ISA-L maintainer, just a compression enthusiast.)

@damoncui1993
Author

> And does it have to be backwards-compatible? Otherwise I recommend using XXHash for data validation. (Note: I am not an ISA-L maintainer, just a compression enthusiast.)

Thank you very much for your suggestions. Unfortunately, our business has already recorded CRC values for existing data, so we need a CRC implementation that is fully compatible with the previous one; otherwise the stored checksums will no longer match.

@pablodelara
Contributor

Hi @damoncui1993. I think you need to use the crc32_iscsi() function, but you'll need to invert both the initial value (if it is 0, pass 0xFFFFFFFF) and the output value:
uint32_t res_crc = ~crc32_iscsi(buf, len, 0xFFFFFFFF);
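
The double inversion described above can be sketched as an Extend-style wrapper. The block below does not call ISA-L: raw_crc32c is a hypothetical software stand-in that assumes the behaviour attributed to crc32_iscsi() in this comment (the seed is used as given and the result is not inverted), and crc32c_extend is an illustrative name, not an API of either library.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a "raw" CRC32C routine: no inversion of the seed
// and no inversion of the result. Argument order mirrors the call above.
inline uint32_t raw_crc32c(const unsigned char* buf, size_t len, uint32_t seed) {
    uint32_t crc = seed;
    for (size_t i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; ++bit) {
            // 0x82F63B78 is the reflected Castagnoli polynomial.
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc;
}

// Extend-style wrapper: invert the incoming CRC before the raw routine and
// invert the result afterwards, so a running CRC can be passed back in.
inline uint32_t crc32c_extend(uint32_t init_crc,
                              const unsigned char* buf, size_t len) {
    return ~raw_crc32c(buf, len, ~init_crc);
}
```

Starting from 0, crc32c_extend returns the standard CRC32C check value 0xE3069283 for "123456789", and feeding the result of a first chunk back in as init_crc for the next chunk gives the same value as a single pass. If the assumption about crc32_iscsi() holds, replacing raw_crc32c with it should give the same wrapper on top of ISA-L; note that its length parameter is assumed here to be an int, so very large buffers may need to be processed in chunks.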

@damoncui1993
Author

> Hi @damoncui1993. I think you need to use the crc32_iscsi() function, but you'll need to invert both the initial value (if it is 0, pass 0xFFFFFFFF) and the output value: uint32_t res_crc = ~crc32_iscsi(buf, len, 0xFFFFFFFF);

Thank you very much for your suggestion. This function works perfectly!! ):
