Block placement (Layouts) discussion #119

Bulat-Ziganshin · 2022-08-14T13:49:18Z

Continuation of #85 and part of #70 efforts.

If we use ECC with K+M encoding (K data blocks and M parity blocks in each EC group), a dataset contains S*K data blocks and S*M parity blocks, where S is the number of EC groups. Here we discuss various ways to distribute them over N nodes; currently we support (and discuss here) only the N=K+M case.

Note that the order of the original data blocks has special meaning: it is the typical retrieval order (in particular, we currently support only sequential download of the entire dataset). For simplicity, we pad the dataset with empty blocks up to the nearest multiple of K, i.e. to S*K blocks. On the other hand, the parity block order isn't important and we can choose it arbitrarily.
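The block counts above can be sketched in a few lines. This is only an illustration of the arithmetic; the function name `padded_block_counts` is hypothetical and not part of the Codex code.

```python
def padded_block_counts(data_blocks: int, k: int, m: int) -> tuple[int, int, int]:
    """Return (S, padded data block count, parity block count) for K+M encoding."""
    if data_blocks <= 0 or k <= 0 or m < 0:
        raise ValueError("expected data_blocks > 0, k > 0, m >= 0")
    s = (data_blocks + k - 1) // k  # number of EC groups, rounded up
    return s, s * k, s * m


# 75 real data blocks with 8+2 encoding: 5 empty blocks are appended.
print(padded_block_counts(75, 8, 2))  # (10, 80, 20)
```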

Layout is defined by two functions providing mapping of:

node number to indexes of data and parity blocks it should store (used by contract to fill slots)
EC group number to indexes of data and parity blocks it includes (used by uploader to construct ECC, used by downloader to recover data)

We have the following requirements for Layouts, in order of decreasing importance:

each EC group is distributed over all nodes (in order to maximize reliability)
(almost) each EC group contains adjacent data blocks, i.e. blocks i+1, ..., i+K (in order to maximize ECC recovery performance)
the functions describing the Layout are simple to compute

Extra requirement that will allow us to develop extensible contracts:

Layout functions should still work when extra data blocks (and corresponding parity blocks) are added to the dataset

Bulat-Ziganshin · 2022-08-14T14:18:57Z

Layout 1 ("horizontal"):

The first S data blocks are placed on the first node, the next S blocks on the second node, and so on. Then the first S parity blocks are placed on node K+1, and so on, until the last S parity blocks are placed on node K+M.
The first EC group includes the first block placed on each node, the second EC group includes the second block placed on each node, and so on.

Layout 2 ("vertical"):

First K data blocks are distributed over the first K nodes, next K data blocks are distributed again over the same nodes, and so on. Then, the first M parity blocks are distributed over the last M nodes, the next M parity blocks are distributed again over the same nodes, and so on.
The first EC group includes the first K data blocks and the first M parity blocks, the second EC group contains the next K data blocks and the next M parity blocks, and so on. In fact, it can be described exactly like the first layout: the first EC group is made from the first block placed on each node, and so on.
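To see how the two layouts fare against the adjacency requirement, here is a small sketch. It uses 0-based data block indexes derived from the descriptions above; the helper names are made up for illustration.

```python
def is_adjacent(indexes: list[int]) -> bool:
    """True if the indexes form a run i, i+1, ..., i+len-1."""
    return all(b - a == 1 for a, b in zip(indexes, indexes[1:]))


def horizontal_group_data(group: int, k: int, s: int) -> list[int]:
    # Node `node` stores data blocks node*S .. node*S+S-1,
    # and group `group` takes the group-th block of each node.
    return [node * s + group for node in range(k)]


def vertical_group_data(group: int, k: int) -> list[int]:
    # Group `group` takes the next K consecutive data blocks.
    return [group * k + node for node in range(k)]


K, S = 4, 3
for group in range(S):
    h = horizontal_group_data(group, K, S)
    v = vertical_group_data(group, K)
    print(group, h, is_adjacent(h), v, is_adjacent(v))
```

With S > 1, the horizontal groups pick blocks that are S apart, while the vertical groups are always contiguous runs.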

Layout 2R ("vertical rotated"):

Like Layout 2, but the nodes in each EC group are rotated so that parity blocks are spread among all nodes (to distribute load evenly and improve download performance).
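The thread does not pin down the exact rotation rule. One possible reading, assumed here purely for illustration, is to shift the node assignment by one position per EC group. The function names are hypothetical.

```python
from collections import Counter


def rotated_node(group: int, position: int, k: int, m: int) -> int:
    """Node holding the block at `position` (0..K+M-1, data first) of `group`.

    Assumed rule: shift the assignment by one node per group.
    """
    return (position + group) % (k + m)


def parity_nodes(group: int, k: int, m: int) -> list[int]:
    """Nodes that hold the M parity blocks of `group`."""
    return [rotated_node(group, p, k, m) for p in range(k, k + m)]


K, M = 3, 2
# Count how many parity blocks each node receives over K+M consecutive groups.
load = Counter(node for group in range(K + M) for node in parity_nodes(group, K, M))
print(parity_nodes(0, K, M), parity_nodes(1, K, M))  # [3, 4] [4, 0]
print(sorted(load.items()))  # every node holds M = 2 parity blocks
```

Under this assumed rule, every window of K+M consecutive groups places exactly M parity blocks on each node.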

Layout 1R is also possible.

Bulat-Ziganshin · 2022-08-17T20:01:10Z

Finally, about the Diagonal layout. Its definition is up to Dmitry, but here is my understanding of his idea:

currently Codex numbers data and parity blocks in a single sequence. For example, if we have 80 data blocks and add 20 parity blocks, the parity blocks get numbers 81..100
Dmitry is obsessed with support for extending Datasets, and if we add 8 blocks to this dataset, they will get numbers 101..108 and the corresponding parity blocks get numbers 109..110 (let's consider 8+2 ECC)
so the goal is to invent a Layout that uses a single formula, for both original and added blocks, to compute which blocks are included in each EC group (and ideally a single formula for per-node block placement too)

I'm not sure it's possible at all, and in any case it will be less efficient than the Striped layout for data recovery. So my alternative proposal is to drop the requirement of a single numbering for both block types and use separate sequences, one for data blocks and another for parity blocks:

so, in the example above we have datablocks 1..80 and parity blocks 1..20
added datablocks get numbers 81..88 and their parity blocks get numbers 21..22
if somewhere in API we need single numbering for technical reasons, we can use negative numbers for parity blocks

This approach allows a simple definition of the formulas, supports adding new data to datasets, and allows us to use the Striped and Striped Rotated layouts, which are the most efficient ones (in terms of sequential download and recovery of a single lost block).
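A rough sketch of the separate numbering, assuming the Striped layout and the 1-based numbers used in the example above. The helper names and the signed encoding (data as +i, parity as -j) are illustrative only, not an existing API.

```python
def parity_for_data(data_index: int, k: int, m: int) -> list[int]:
    """1-based parity block numbers protecting 1-based data block `data_index`."""
    if data_index < 1:
        raise ValueError("data_index is 1-based")
    group = (data_index - 1) // k  # 0-based EC group
    return [group * m + j for j in range(1, m + 1)]


def to_signed(kind: str, index: int) -> int:
    """Single numbering: data block i -> +i, parity block j -> -j."""
    if kind not in ("data", "parity") or index < 1:
        raise ValueError("expected kind 'data' or 'parity' and index >= 1")
    return index if kind == "data" else -index


def from_signed(number: int) -> tuple[str, int]:
    """Inverse of to_signed."""
    if number == 0:
        raise ValueError("0 is not a valid block number")
    return ("data", number) if number > 0 else ("parity", -number)


print(parity_for_data(80, 8, 2))  # [19, 20]
print(parity_for_data(81, 8, 2))  # [21, 22], the first added block
print(to_signed("parity", 21), from_signed(-21))  # -21 ('parity', 21)
```

The same formula covers the original blocks 1..80 and the added blocks 81..88, which is the point of keeping two sequences.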

Bulat-Ziganshin · 2022-08-18T12:40:45Z

Detailed review of Layout 1 ("horizontal" aka Sequential).

Node number to indexes of data and parity blocks it should store (used by contract to fill slots):

for node in 0 .. K-1:
  data[node*S + group]   for group in 0 .. S-1
for node in K .. K+M-1:
  parity[(node-K)*S + group]   for group in 0 .. S-1

EC group number to indexes of data and parity blocks it includes (used by uploader to construct ECC, used by downloader to recover data):

for group in 0..S-1:
  data[node*S + group]   for node in 0 .. K-1
  parity[(node-K)*S + group]   for node in K .. K+M-1
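The two mappings for Layout 1 translate directly into Python. This is a sketch with hypothetical function names, using 0-based indexes and separate data/parity sequences as in the pseudocode above.

```python
def horizontal_node_blocks(node: int, k: int, m: int, s: int) -> tuple[str, list[int]]:
    """Block type and indexes stored on `node` in the horizontal layout."""
    if not 0 <= node < k + m:
        raise ValueError("node out of range")
    if node < k:
        return "data", [node * s + group for group in range(s)]
    return "parity", [(node - k) * s + group for group in range(s)]


def horizontal_group_blocks(group: int, k: int, m: int, s: int) -> tuple[list[int], list[int]]:
    """(data indexes, parity indexes) that make up EC group `group`."""
    if not 0 <= group < s:
        raise ValueError("group out of range")
    data = [node * s + group for node in range(k)]
    parity = [(node - k) * s + group for node in range(k, k + m)]
    return data, parity


K, M, S = 2, 1, 3
print(horizontal_node_blocks(0, K, M, S))   # ('data', [0, 1, 2])
print(horizontal_node_blocks(2, K, M, S))   # ('parity', [0, 1, 2])
print(horizontal_group_blocks(1, K, M, S))  # ([1, 4], [1])
```

Both functions take S as an input, so the indexes shift whenever S grows; that seems to be at odds with the extensibility requirement stated at the top of the thread.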

NOTE

Since the current code places both block types in the same array, the mapping is even simpler:

for node in 0 .. K+M-1:
  for group in 0 .. S-1:
    block[node*S + group]

and similarly for "vertical" layout:

for group in 0 .. S-1:
  for node in 0 .. K+M-1:
    block[group*(K+M) + node]
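As a quick sanity check of the two single-array formulas, the sketch below enumerates them for a tiny configuration. The function names are hypothetical. Note that the vertical formula appears to assume the array is ordered group by group (each group's K data blocks followed by its M parity blocks) rather than all data blocks first; that is my reading, not something stated in the thread.

```python
def horizontal_flat(node: int, group: int, s: int) -> int:
    """Index into the shared block array for the horizontal layout."""
    return node * s + group


def vertical_flat(node: int, group: int, k: int, m: int) -> int:
    """Index into the shared block array for the vertical layout."""
    return group * (k + m) + node


K, M, S = 2, 1, 3
N = K + M

# Blocks held by node 1 under each layout.
print([horizontal_flat(1, g, S) for g in range(S)])   # [3, 4, 5]
print([vertical_flat(1, g, K, M) for g in range(S)])  # [1, 4, 7]

# Each formula visits every array slot exactly once.
h = sorted(horizontal_flat(n, g, S) for n in range(N) for g in range(S))
v = sorted(vertical_flat(n, g, K, M) for n in range(N) for g in range(S))
print(h == v == list(range(N * S)))  # True
```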

Bulat-Ziganshin · 2022-08-18T12:52:54Z

Detailed review of Layout 2 ("vertical" aka Striped)

Node number to indexes of data and parity blocks it should store (used by contract to fill slots):

for node in 0 .. K-1:
  data[group*K + node]   for group in 0 .. S-1
for node in K .. K+M-1:
  parity[group*M + (node - K)]   for group in 0 .. S-1

EC group number to indexes of data and parity blocks it includes (used by uploader to construct ECC, used by downloader to recover data):

for group in 0..S-1:
  data[group*K + node]   for node in 0 .. K-1
  parity[group*M + (node - K)]   for node in K .. K+M-1
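The Layout 2 mappings can be sketched the same way. The function names are hypothetical; indexes are 0-based with separate data and parity sequences, as in the pseudocode above.

```python
def striped_node_blocks(node: int, k: int, m: int, s: int) -> tuple[str, list[int]]:
    """Block type and indexes stored on `node` in the vertical (Striped) layout."""
    if not 0 <= node < k + m:
        raise ValueError("node out of range")
    if node < k:
        return "data", [group * k + node for group in range(s)]
    return "parity", [group * m + (node - k) for group in range(s)]


def striped_group_blocks(group: int, k: int, m: int) -> tuple[list[int], list[int]]:
    """(data indexes, parity indexes) that make up EC group `group`."""
    if group < 0:
        raise ValueError("group must be non-negative")
    data = [group * k + node for node in range(k)]
    parity = [group * m + (node - k) for node in range(k, k + m)]
    return data, parity


K, M, S = 2, 1, 3
print(striped_node_blocks(0, K, M, S))  # ('data', [0, 2, 4])
print(striped_node_blocks(2, K, M, S))  # ('parity', [0, 1, 2])
print(striped_group_blocks(1, K, M))    # ([2, 3], [1])
```

Here the group mapping does not depend on S at all, and the node mapping uses S only as the loop bound, so appending groups leaves existing indexes unchanged.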
