Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Host function generic_hash() with blake2b/blake3 support. #4412

Closed

Conversation

koxu1996
Copy link

@koxu1996 koxu1996 commented Nov 13, 2023

In order to support Zero Knowledge software, cryptography related utils should be implemented as a host functions, otherwise execution cost would be too high. For example running Risc0 VM proof verification for simple sum of squares takes more than 6000 CSPR. The underlying hash used: sha256.

This PR adds host function called generic_hash(input, type) with support for the following types:

  • HashAlgoType::Blake2b - used existing blake2b implementation,
  • HashAlgoType::Blake3b - blake3 library was introduced.

We expect to add more algorithms, i.e. sha256, keccak, poseidon.

Example usage

Smart contract:

#![no_std]
#![no_main]

extern crate alloc;

use alloc::format;
use casper_contract::contract_api::{crypto, runtime};
use casper_types::HashAlgoType;

#[no_mangle]
pub extern "C" fn call() {
    let input = "casper".as_bytes();
    runtime::print(&format!("Input: {:?}", input));

    let blake2b = crypto::generic_hash(input, HashAlgoType::Blake2b);
    runtime::print(&format!("Blake2b hash: {:?}", blake2b));

    let blake3 = crypto::generic_hash(input, HashAlgoType::Blake3);
    runtime::print(&format!("Blake3 hash: {:?}", blake3));
}

Node output:

Input: [99, 97, 115, 112, 101, 114]
Blake2b hash: [163, 40, 63, 203, 149, 170, 125, 252, 231, 127, 90, 198, 44, 250, 234, 42, 1, 199, 20, 158, 150, 188, 25, 205, 21, 106, 255, 40, 204, 65, 45, 196]
Blake3 hash: [122, 158, 116, 51, 175, 213, 20, 50, 237, 63, 251, 230, 53, 2, 233, 86, 36, 159, 195, 109, 254, 108, 189, 210, 233, 1, 93, 219, 226, 1, 50, 217]

Considerations

1. AssemblyScript support

Is AssemblyScript binding required? I am pretty sure no one uses it.

Update: AssemblyScript is broken since a long time. I see no reason to include bindings for something that does not work.

2. Gas cost

For now, I used the same cost for generic_hash() as defined for blake2() - default fixed value 200. I can do benchmarks for WASM bytecode execution, but is there any conversion ratio between CPU time and execution cost?

Update: There will be new execution engine that will affect (lower) gas usage. It will be the better to make benchmarks then.

3. Place for core hashes implementation

Blake2b is implemented in types package, which does not seem to be reasonable. I added blake3 to new module crypto in execution engine, but I am open to discussing its location.

4. Single vs multiple host function

Do we want to introduce separate host functions for each hash algorithm?

5. Place for generic_hash() in contract API

I was thinking of using runtime, but finally decided to create new crypto module.

Will be used to introduce hashing method for runtime.
Placed in new `crypto` module under `runtime`.
Will be used to switch between different hashing algorithms.
Note: no gas charging was introduced yet.
Inspired by `FunctionIndex::Blake2b`, which is also not researched
in terms of execution cost.
Note: new module `crypto` was introduced.
execution_engine/Cargo.toml Outdated Show resolved Hide resolved
execution_engine/src/core/runtime/externals.rs Outdated Show resolved Hide resolved
smart_contracts/contract/src/contract_api/crypto.rs Outdated Show resolved Hide resolved
Blake3 = 1,
}

impl ToBytes for HashAlgoType {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can skip ToBytes/FromBytes for this type as it's easier to pass it by value. Same with impl AsRef I think.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed both, and introduced TryFrom<u8> - see 2c70942.


Btw. handling error was crazy, as compiler could not determine that casper_types::bytesrepr::Error should be converted into casper_execution_engine::core::execution::error, and then into casper_wasmi_core::trap::Trap. Finally I found working line:

let hash_algo_type = HashAlgoType::try_from(hash_algo_type as u8).map_err(|e| Error::from(e))?;

types/src/crypto.rs Outdated Show resolved Hide resolved
koxu1996 and others added 3 commits November 15, 2023 18:53
Improved type documentation.

Co-authored-by: Michał Papierski <[email protected]>
We can avoid working with bytes and serialization,
as `HashAlgoType` enum is represented by simple `u8`.
After passing `hash_algo_type` directly, we have one less parameter.
Copy link
Collaborator

@mpapierski mpapierski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two remaining comments and looks good otherwise 👍

type Error = bytesrepr::Error;

fn try_from(value: u8) -> Result<Self, bytesrepr::Error> {
FromPrimitive::from_u8(value).ok_or(bytesrepr::Error::Formatting)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably use ApiError::InvalidArgument here to make it more obvious, since Formatting error may be confusing as there's no serialization involved

Copy link
Author

@koxu1996 koxu1996 Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I agree Formatting error might be confusing. Why I did that? I based on ExecutionResultTag and few other enums in execution_result.rs, but now I see those TryFrom's are always used internally inside ExecutionResult::from_bytes().

  2. I think ApiError should not be present in base Casper types, as no API is involved here. Therefore I think we can just keep it simple and make () the error - 85885d0:

image

  1. We can return ApiError::InvalidArgument inside function handler - it makes sense - but it complicates the code a little bit. Nevertheless, done in d064e2c.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional, please take in mind that user will not able to face this error, because using HashAlgoType enum as an argument forces the correct value.

}
})?;

let result = if digest.len() > out_size as usize {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something I just noticed - we've done hashing already, but we're checking the size after the fact. Since the output of the hash is constant, we could also move this check above the checked_memory_slice line.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though the output is always expected to be 32 bytes, those hashes use different constants for their digest size:

  • blake2: BLAKE2B_DIGEST_LENGTH
  • blake3: BLAKE3_DIGEST_LENGTH

Unless we implement all the necessary hash functions (e.g., sha256, keccak, poseidon) and generalize it with something like GENERIC_HASH_DIGEST_LENGTH, I would prefer to avoid using hardcoded pre-checks.

It can be simple `()` instead of `bytesrepr::Error::Formatting`
(the one commonly used for serialization related issues).
Previous deserializiation error was confusing.

NOTE: The code seems too complex for such a basic thing like
returning API error.
@koxu1996
Copy link
Author

I just confirmed that host function is still working correctly after all changes and fixes.

Detected with `cargo fmt`.
@EdHastingsCasperAssociation
Copy link
Collaborator

EdHastingsCasperAssociation commented Feb 27, 2024

Michal's approval is sufficient for you to move forward with this at the functional level. As far as inclusion into 1.6, @marc-casperlabs is the principal engineer on that rel; I've had no direct involvement with it.

If it can't go into 1.6 for timeline reasons, we can repoint it to 2.0 and then I'll review it.

@EdHastingsCasperAssociation
Copy link
Collaborator

You should repoint this to feat-2.0.

@EdHastingsCasperAssociation
Copy link
Collaborator

@mpapierski should this be repointed to feat-2.0 ?

@mpapierski
Copy link
Collaborator

mpapierski commented Sep 6, 2024

@mpapierski should this be repointed to feat-2.0 ?

I talked to @asladeofgreen and @jonas089 and for now, there's no concrete use case for generic hash functions, but we can revisit this once we have use for this internally or externally from the community. My understanding is that this PR was done as part of research.

@mpapierski mpapierski closed this Sep 6, 2024
@devendran-m
Copy link
Contributor

Related issue (created to take this forward) #4892

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants