Runtime sha256 should work for arbitrary strings even with length >= 128 #1056

Open
anton-trunov opened this issue Nov 18, 2024 · 8 comments
Labels
feature: builtins Builtin functions kind: bug Something isn't working or isn't right

@anton-trunov
Member

anton-trunov commented Nov 18, 2024

sha256 at run-time uses the SHA256U TVM instruction, which only works on data of up to 1023 bits (127 whole bytes), so strings of 128 bytes or more effectively get truncated. This is confusing for smart-contract authors because it breaks the abstraction of the String type. Since we allow strings with more than 127 characters, sha256 should work on those too.

See also #1085.
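
To make the truncation concrete, here is a rough off-chain model in Python. It is not how the compiler or TVM is implemented: it simply assumes that the run-time hash only sees the 127 whole bytes that fit into the first cell, and the helper names are made up for illustration.

```python
import hashlib

# Assumption: one cell holds at most 1023 data bits, i.e. 127 whole bytes.
MAX_BYTES_PER_CELL = 1023 // 8  # 127


def full_sha256(data: bytes) -> str:
    """Hash of the whole string, which is what authors expect."""
    return hashlib.sha256(data).hexdigest()


def first_cell_sha256(data: bytes) -> str:
    """Model of hashing only the data that fits into the first cell."""
    return hashlib.sha256(data[:MAX_BYTES_PER_CELL]).hexdigest()


short = b"a" * 127
long_ = b"a" * 128
print(full_sha256(short) == first_cell_sha256(short))  # True
print(full_sha256(long_) == first_cell_sha256(long_))  # False
```

Under this model, any two strings sharing the same first 127 bytes get the same hash, which is the surprising behavior described above.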

@anton-trunov anton-trunov added kind: bug Something isn't working or isn't right feature: builtins Builtin functions labels Nov 18, 2024
@anton-trunov anton-trunov added this to the v1.6.0 milestone Nov 18, 2024
@novusnota
Member

novusnota commented Nov 18, 2024

Statically known strings are one story: we can accurately compute their hashes with sha256_sync() no matter their length, and FunC can do the same with its "..."H strings. Run-time strings are a different case.

Do you mean to check the length of the string at runtime somehow?

@anton-trunov
Member Author

> Do you mean to check the length of the string at runtime somehow?

The implementation should traverse all the chunks of the string.
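
A minimal sketch of what "traverse all the chunks" means, written as an off-chain Python model. The `Chunk` class, the 127-byte chunk size, and the function names are assumptions made for illustration; they are not the compiler's data structures.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

CHUNK_BYTES = 127  # assumption: 1023 bits per cell -> 127 whole bytes


@dataclass
class Chunk:
    """Hypothetical stand-in for one cell of a long string."""
    data: bytes
    next: Optional["Chunk"] = None


def to_chunks(data: bytes) -> Chunk:
    """Split bytes into a linked chain of chunks, first chunk first."""
    pieces = [data[i:i + CHUNK_BYTES] for i in range(0, len(data), CHUNK_BYTES)] or [b""]
    head = None
    for piece in reversed(pieces):
        head = Chunk(piece, head)
    return head


def sha256_all_chunks(head: Chunk) -> str:
    """Feed every chunk into the hash, following the chain to its end."""
    hasher = hashlib.sha256()
    node = head
    while node is not None:
        hasher.update(node.data)
        node = node.next
    return hasher.hexdigest()
```

With this traversal, the result matches the hash of the whole string regardless of how many chunks it spans.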

@anton-trunov
Member Author

related issue: #1085

@imartemy1524

imartemy1524 commented Dec 8, 2024

Maybe we should implement sha256 with the HASHEXT_SHA256 TVM instruction so that it handles strings of length >= 128?

This would solve all the related issues.

@imartemy1524

imartemy1524 commented Dec 9, 2024

BTW, writing a function in Tact that supports hashing strings of any length is not that hard. Here is an example of a simple Fift function that computes the sha256 hash of a string of any length:

asm fun sha256(data: String): Int {
    1 PUSHINT      // pushing the counter of the references to the stack
                   // s0 - counter, s1 - slice
    WHILE:<{
        OVER       // copying the last slice to the top of the stack
                   // s0 - slice (copy), s1 - counter, s2 - slice
        SREFS      // counting the refs in the slice s0 and putting it (the counter) to s0
                   // s0 - number (of refs), s1 - counter, s2 - slice
        0 NEQINT   // comparing the number of refs, if 0 then exit the loop
                   // s0 - bool, s1 - counter, s2 - slice
    }>DO<{
                   // s0 - counter, s1 - slice
        OVER       // copying the slice from s1 to the top of the stack (to s0)
                   // s0 - slice, s1 - counter, s2 - slice
        LDREF      // loading the reference from s0 (the copy of the current slice); now s0 - the rest of that slice, s1 - the loaded cell
                   // s0 - slice, s1 - cell (ref), s2 - counter, s3 - slice
        s0 POP     // removing the original slice from the stack (we don't need it anymore, only ref needed)
                   // s0 - cell (ref), s1 - counter, s2 - slice
        CTOS       // convert s0 the cell to slice (still in s0)
                   // s0 - slice (ref), s1 - counter, s2 - slice
        s0 s1 XCHG // putting to s1 the value of current slice (it was in s0 prev), and counter to s0
                   // s0 - counter, s1 - slice (ref), s2 - slice
        INC        // increment the counter (in s0)
    }>

    HASHEXT_SHA256 // call the sha256 function
}

P.S. I used Fift because I don't know an easy way to do manual stack manipulation with Tact/FunC.
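
For readers who don't read Fift, the stack discipline of the function above can be mirrored in Python. This is only an illustrative model: the `Slice` class is hypothetical, and it assumes HASHEXT_SHA256 pops the counter, then that many slices, and hashes their data concatenated deepest-first.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slice:
    """Hypothetical model of a slice: some bytes plus at most one reference."""
    data: bytes
    ref: Optional["Slice"] = None


def sha256_via_stack(data: Slice) -> str:
    stack: list = [data]               # the function argument starts on the stack
    stack.append(1)                    # 1 PUSHINT: counter of slices collected so far
    while stack[-2].ref is not None:   # OVER, SREFS, 0 NEQINT
        child = stack[-2].ref          # OVER, LDREF, s0 POP, CTOS
        counter = stack.pop()          # s0 s1 XCHG and INC, folded together
        stack.append(child)
        stack.append(counter + 1)
    count = stack.pop()                # HASHEXT_SHA256 takes the counter...
    slices = stack[-count:]            # ...and that many slices, deepest first
    return hashlib.sha256(b"".join(s.data for s in slices)).hexdigest()


chain = Slice(b"hello ", Slice(b"long ", Slice(b"string")))
print(sha256_via_stack(chain) == hashlib.sha256(b"hello long string").hexdigest())  # True
```

The point of the model is the ordering: the first chunk ends up deepest on the stack and the last chunk sits just under the counter, so the concatenation comes out in the original order.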

@novusnota
Member

novusnota commented Dec 9, 2024

> BTW, writing a function in Tact that supports hashing strings of any length is not that hard. Here is an example of a simple Fift function that computes the sha256 hash of a string of any length:
>
> // ... code above ...
>
> P.S. I used Fift because I don't know an easy way to do manual stack manipulation with Tact/FunC.

Neat approach! However, Fift-specific constructs like WHILE:<{ … }>DO<{ … }> (and similar) are probably going to be removed in future Tact versions, as we're thinking about our own flavor of assembly for TVM.

Option 1

So, one option would be to move that function over to FunC and then bind to it from Tact:

int sha256(slice data) impure asm """
    1 PUSHINT      // pushing the counter of the references to the stack
                   // s0 - counter, s1 - slice
    WHILE:<{
        OVER       // copying the last slice to the top of the stack
                   // s0 - slice (copy), s1 - counter, s2 - slice
        SREFS      // counting the refs in the slice s0 and putting it (the counter) to s0
                   // s0 - number (of refs), s1 - counter, s2 - slice
        0 NEQINT   // comparing the number of refs, if 0 then exit the loop
                   // s0 - bool, s1 - counter, s2 - slice
    }>DO<{
                   // s0 - counter, s1 - slice
        OVER       // copying the slice from s1 to the top of the stack (to s0)
                   // s0 - slice, s1 - counter, s2 - slice
        LDREF      // loading the reference from s0 (the copy of the current slice); now s0 - the rest of that slice, s1 - the loaded cell
                   // s0 - slice, s1 - cell (ref), s2 - counter, s3 - slice
        s0 POP     // removing the original slice from the stack (we don't need it anymore, only ref needed)
                   // s0 - cell (ref), s1 - counter, s2 - slice
        CTOS       // convert s0 the cell to slice (still in s0)
                   // s0 - slice (ref), s1 - counter, s2 - slice
        s0 s1 XCHG // putting to s1 the value of current slice (it was in s0 prev), and counter to s0
                   // s0 - counter, s1 - slice (ref), s2 - slice
        INC        // increment the counter (in s0)
    }>

    HASHEXT_SHA256 // call the sha256 function
""";

// Tact side: import the FunC file and bind to the function defined there
import "./sha256.fc";

@name(sha256)
native onchainSha256(data: String): Int;

Option 2

Another approach would be to replace those WHILE:<{ things with something like this:

/// Main function that combines it all together
fun onchainSha256(data: String): Int {
    shaPush(data);
    while (shaShallProceed()) {
        shaOperate();
    }
    return shaHashExt();
}

/// Gets the data onto stack, then pushes the 1 there too
/// Since there's no return value bound, both will remain on stack
asm fun shaPush(data: String) { 1 PUSHINT }

/// The while loop clause, which captures the Bool from the top of the stack
asm fun shaShallProceed(): Bool { OVER SREFS 0 NEQINT }

/// The body of the while loop
asm fun shaOperate() {
    OVER LDREF s0 POP CTOS s0 s1 XCHG INC
}

/// HASHEXT_SHA256
asm fun shaHashExt(): Int { HASHEXT_SHA256 }

Conclusion of sorts

In any case, neither these nor any other options will appear in user code directly, only in the emitted/generated code. That's because sha256() is handled at compile-time: currently, we simply emit FunC code that handles it for us iff we cannot compute the SHA256 right away.

P.S.: Didn't test either of those options yet, just quickly wrote them here directly as a showcase :)
P.P.S.: Both actually work, wow

@novusnota
Member

The onchainSha256() example is now featured in docs: https://docs.tact-lang.org/book/assembly-functions/#onchainsha256

@i582
Contributor

i582 commented Jan 22, 2025

But if we use such a function (like onchainSha256), it seems we will increase gas consumption due to the additional instructions 🤔
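
To get a feel for the overhead, here is a back-of-the-envelope count of the instructions the loop-based listing above would execute, as a Python sketch. It counts instructions only, taken straight from that listing. It does not model actual gas prices, cell-loading costs, or the overhead of the loop primitive itself, so it is an estimate of shape and not a measurement.

```python
def instruction_count(chunks: int) -> int:
    """Instructions executed by the loop-based listing for a string of `chunks` cells."""
    if chunks < 1:
        raise ValueError("a string occupies at least one cell")
    setup = 1                # 1 PUSHINT
    condition = 3 * chunks   # OVER, SREFS, 0 NEQINT: checked once per slice
    body = 6 * (chunks - 1)  # OVER, LDREF, s0 POP, CTOS, s0 s1 XCHG, INC: once per reference followed
    final = 1                # HASHEXT_SHA256
    return setup + condition + body + final


for n in (1, 2, 4):
    print(n, instruction_count(n))
# 1 5
# 2 14
# 4 32
```

So even a short, single-cell string would pay for five instructions where the current approach issues a single SHA256U, and the count then grows linearly with the number of cells.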
