Skip to content

Protocols

Thomas H. Austin edited this page Oct 12, 2022 · 7 revisions

File Hash

There are 3 hashes for a file on blobber:

  • Actual File Hash

    it is used to verify the checksum of downloaded file on clients. It is the hash of original file. It is reference_objects.actual_file_hash in database.

  • Content Hash

    it is used to verify the checksum of uploaded data blocks on blobber server. It is hash of data blocks that is sharded by ErasureEncoder on client and uploaded on a blobber. It is reference_objects.content_hash in database.

  • Challenge Hash

    it is used to verify challenge on validator based on challenge protocol. It is reference_objects.merkle_root in database.

Hash Method

  • Actual File Hash is computed by sha256

  • Content Hash is computed by MerkleTree.

A hash tree or Merkle tree is a tree in which every leaf node is labelled with the cryptographic hash of a data block, and every non-leaf node is labelled with the cryptographic hash of the labels of its child nodes.

For better performance on memory and persistence, we have a CompactMerkleTree here.

What is CompactMerkleTree?

It is a MerkleTree which nodes are labelled and removed as soon as possible.

In MerkleTree, all leaf nodes are added and keep in memory before computing hash. It computes hash from leaf nodes to root node level by level.

In CompactMerkleTree, they will be computed instantly once two child nodes are added, save the binary hash on their parent node. The two child nodes are removed from memory.

For example:

  • the size of data block is chunk_size.
  • a data block can be released from memory once it's hash is computed and added into MerkleTree as leaf node.
  • In Memory means it has to be kept in memory for computing hash.
  • Out Memory means it is safe to release from memory or is not added in memory yet.
  1. Trees in memory when 3 hash leaves are added MerkleTree vs CompactMerkleTree
  2. Trees in memory when 5 hash leaves are added MerkleTree vs CompactMerkleTree
  3. Trees in memory when 7 hash leaves are added MerkleTree vs CompactMerkleTree

CompactMerkleTree is better than MerkleTree to persist tree to disk or keep in memory.

Challenge Hash is also computed by MerkleTree. For 1.8 Outsurcing Attack, we have a FixedMerkleTree here.

1.8 Outsourcing Attack There is a need to make sure each storage server is doing their job and committing resources, rather than pretend to offer storage, instead of outsourcing it to another storage server. Our protocol avoids this outsourcing attack by ensuring that the content provided for verification is 64 kB and the content required to create this verified content is the full file fragment. This is done as follows. The file is divided into n 64 kB fragments based on n storage servers. Each of these 64 kB fragments is further divided into 64-byte chunks, so there are 1024 such chunks in each 64 kB block that can be addressed using an index of 1 to 1024. The data at each of these indexes across the blocks is treated as a continuous message and hashed. Then the 1024 hashes serve as the leaf hashes of the Merkle Tree. The root of this Merkle tree is used to roll up the file hashes further up to directory/allocation level. The Merkle proof provides the path from the leaf to the file root and from the file root to the allocation level. In this model, in order to pass a challenge for a file for a given index (between 1 and 1024), a dishonest storage server first needs to download all the content and do the chaining to construct the leaf hash. This discourages them to outsource the content and fake a challenge response.

What is FixedMerkleTree?

It is a MerkleTree in which every leaf node is labelled with the MerkleTree hash of a data block, and every non-leaf node is labelled with the cryptographic hash of the labels of its child nodes. It always has fixed 1024 leaf nodes. That is why we call it FixedMerkleTree.

See detail on exmaple: Challenge Hash

NOTE:

Challenge Hash was computed hash with MerkleTree in which every leaf node is labelled with sha3, and every non-leaf node is labelled with the cryptographic hash of the labels of its child nodes. sha3 asks fully load block..1-n,block..2-n...block..1024-n on memory before computing hash for leaf node L..n. I has the same issue as sha1 in Actual File Hash and Content Hash. That is why every leaf node is labelled with CompactMerkleTree instead of sha3 now.

Clone this wiki locally