-
Notifications
You must be signed in to change notification settings - Fork 23
Protocols
There are 3 hashes for a file on blobber:
- Actual File Hash
it is used to verify the checksum of downloaded file on clients. It is hash of original file. It is
reference_objects.actual_file_hash
in database. - Content Hash
it is used to verify the checksum of uploaded data blocks on blobber server. It is hash of data blocks that is sharded by
ErasureEncoder
on client and uploaded on a blobber. It isreference_objects.content_hash
in database. - Challenge Hash
it is used to verfiy challenge on validator based on challenge protocol. It is
reference_objects.merkle_root
in database.
Actual File Hash
and Content Hash
are computed by MerkleTree.
A hash tree or Merkle tree is a tree in which every leaf node is labelled with the cryptographic hash of a data block, and every non-leaf node is labelled with the cryptographic hash of the labels of its child nodes.
For better performance on memory and persistence, we have a CompactMerkleTree
here.
It is a MerkleTree
which nodes are labelled and removed as soon as possible.
In MerkleTree
, all leaf nodes are added and keep in memory before computing hash. It computes hash from leaf nodes to root node level by level.
In CompactMerkleTree
, they will be computed instantly once two child nodes are added, save the binary hash on their parent node. The two child nodes are removed from memory.
For example:
- the size of data block is chunk_size.
- a data block can be released from memory once it's hash is computed and added into MerkleTree as leaf node.
-
In Memory
means it has to be kept in memory for computing hash. -
Out Memory
means it is safe to release from memory or is not added in memory yet.
- Trees in memory when 3 hash leaves are added
- Trees in memory when 5 hash leaves are added
- Trees in memory when 7 hash leaves are added
CompactMerkleTree
is better than MerkleTree
to persist to disk or keep in memory.
NOTE:
Acutal File Hash and Content Hash were computed hash with
sha1
.sha1
asks fully load content on memory before computing hash. It asks higher meomory usage for large file. And it also blocks chunked upload feature. InMerkelTree
, a file can be read and computed chunk by chunk. That is why we update them with CompactMerkleTree now.
Challenge Hash
is also computed by MerkleTree
. For 1.8 Outsurcing Attack, we have a FixedMerkleTree
here.
1.8 Outsourcing Attack There is a need to make sure each storage server is doing their job and committing resources, rather than pretend to offer storage, instead of outsourcing it to another storage server. Our protocol avoids this outsourcing attack by ensuring that the content provided for verification is 64 kB and the content required to create this verified content is the full file fragment. This is done as follows. The file is divided into n 64 kB fragments based on n storage servers. Each of these 64 kB fragments is further divided into 64-byte chunks, so there are 1024 such chunks in each 64 kB block that can be addressed using an index of 1 to 1024. The data at each of these indexes across the blocks is treated as a continuous message and hashed. Then the 1024 hashes serve as the leaf hashes of the Merkle Tree. The root of this Merkle tree is used to roll up the file hashes further up to directory/allocation level. The Merkle proof provides the path from the leaf to the file root and from the file root to the allocation level. In this model, in order to pass a challenge for a file for a given index (between 1 and 1024), a dishonest storage server first needs to download all the content and do the chaining to construct the leaf hash. This discourages them to outsource the content and fake a challenge response.
It is a MerkleTree
in which every leaf node is labelled with the MerkleTree
hash of a data block, and every non-leaf node is labelled with the cryptographic hash of the labels of its child nodes. It always has fixed 1024 leaf nodes. That is why we call it FixedMerkleTree
.
See detail on exmaple:
NOTE:
Challenge Hash
was computed hash withMerkleTree
in which every leaf node is labelled withsha3
, and every non-leaf node is labelled with the cryptographic hash of the labels of its child nodes.sha3
asks fully loadblock..1-n
,block..2-n
...block..1024-n
on memory before computing hash for leaf nodeL..n
. I has the same issue assha1
inActual File Hash
andContent Hash
. That is why every leaf node is labelled withCompactMerkleTree
instead ofsha3
now.