feat: use binary diff to persist finalized states #7005

Open
wants to merge 19 commits into
base: feature/differential-archive

nazarhussain
Contributor

Motivation

Reduce storage requirements and improve state regeneration performance for archival nodes.

Description

  • Implement strategies for different slots
  • Use snapshot, binary diff and empty slots for state archive
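
To make the per-slot strategies above concrete, here is a minimal sketch of how a storage type could be picked from a slot number. Only `HistoricalStateStorageType.Snapshot` appears in the excerpts on this page; the `Diff` and `Empty` members, the interval names and values, and `getStorageType` itself are assumptions for illustration, not the PR's actual configuration.

```typescript
// Hypothetical names: only `Snapshot` is visible in the reviewed code.
const HistoricalStateStorageType = {
  Snapshot: "snapshot", // full state is stored
  Diff: "diff", // binary diff against an earlier stored state
  Empty: "empty", // nothing stored; the state is regenerated on demand
} as const;
type HistoricalStateStorageType =
  (typeof HistoricalStateStorageType)[keyof typeof HistoricalStateStorageType];

const SLOTS_PER_EPOCH = 32;

// Assumed configuration shape, chosen only for this sketch.
interface ArchiveIntervals {
  snapshotEveryNEpochs: number;
  diffEveryNEpochs: number;
}

function getStorageType(slot: number, intervals: ArchiveIntervals): HistoricalStateStorageType {
  // Slots that are not on an epoch boundary store nothing.
  if (slot % SLOTS_PER_EPOCH !== 0) return HistoricalStateStorageType.Empty;

  const epoch = slot / SLOTS_PER_EPOCH;
  // Snapshots take priority over diffs when both intervals match.
  if (epoch % intervals.snapshotEveryNEpochs === 0) return HistoricalStateStorageType.Snapshot;
  if (epoch % intervals.diffEveryNEpochs === 0) return HistoricalStateStorageType.Diff;
  return HistoricalStateStorageType.Empty;
}
```

The right interval values depend on the disk-space and timing measurements requested later in this thread.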

Steps to test or reproduce

  • Run all tests

@nazarhussain nazarhussain self-assigned this Aug 6, 2024

codecov bot commented Aug 6, 2024

Codecov Report

Attention: Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 50.90%. Comparing base (9f4bf50) to head (7eb7c17).

Additional details and impacted files
@@                       Coverage Diff                        @@
##           feature/differential-archive    #7005      +/-   ##
================================================================
- Coverage                         50.91%   50.90%   -0.01%     
================================================================
  Files                               594      594              
  Lines                             39609    39602       -7     
  Branches                           2245     2254       +9     
================================================================
- Hits                              20165    20160       -5     
+ Misses                            19444    19442       -2     

Contributor

github-actions bot commented Aug 6, 2024

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 28d7b6c Previous: 20c18ad Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 1.7984 ms/op 1.8176 ms/op 0.99
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 42.914 us/op 40.225 us/op 1.07
BLS verify - blst 878.64 us/op 882.42 us/op 1.00
BLS verifyMultipleSignatures 3 - blst 1.3895 ms/op 1.3030 ms/op 1.07
BLS verifyMultipleSignatures 8 - blst 2.0775 ms/op 2.0903 ms/op 0.99
BLS verifyMultipleSignatures 32 - blst 4.6599 ms/op 4.4013 ms/op 1.06
BLS verifyMultipleSignatures 64 - blst 8.5829 ms/op 8.2994 ms/op 1.03
BLS verifyMultipleSignatures 128 - blst 16.806 ms/op 15.802 ms/op 1.06
BLS deserializing 10000 signatures 618.64 ms/op 599.13 ms/op 1.03
BLS deserializing 100000 signatures 6.2025 s/op 6.1757 s/op 1.00
BLS verifyMultipleSignatures - same message - 3 - blst 926.62 us/op 998.52 us/op 0.93
BLS verifyMultipleSignatures - same message - 8 - blst 1.1016 ms/op 1.1268 ms/op 0.98
BLS verifyMultipleSignatures - same message - 32 - blst 1.6647 ms/op 1.6973 ms/op 0.98
BLS verifyMultipleSignatures - same message - 64 - blst 2.4793 ms/op 2.5039 ms/op 0.99
BLS verifyMultipleSignatures - same message - 128 - blst 4.0686 ms/op 4.2070 ms/op 0.97
BLS aggregatePubkeys 32 - blst 17.269 us/op 19.802 us/op 0.87
BLS aggregatePubkeys 128 - blst 61.390 us/op 64.850 us/op 0.95
compute 91.587 ms/op
apply 1.8089 ms/op
notSeenSlots=1 numMissedVotes=1 numBadVotes=10 68.368 ms/op 67.537 ms/op 1.01
notSeenSlots=1 numMissedVotes=0 numBadVotes=4 51.169 ms/op 56.161 ms/op 0.91
notSeenSlots=2 numMissedVotes=1 numBadVotes=10 41.555 ms/op 31.897 ms/op 1.30
getSlashingsAndExits - default max 70.032 us/op 72.419 us/op 0.97
getSlashingsAndExits - 2k 369.58 us/op 278.27 us/op 1.33
proposeBlockBody type=full, size=empty 6.4591 ms/op 5.0896 ms/op 1.27
isKnown best case - 1 super set check 930.00 ns/op 464.00 ns/op 2.00
isKnown normal case - 2 super set checks 903.00 ns/op 458.00 ns/op 1.97
isKnown worse case - 16 super set checks 1.0300 us/op 452.00 ns/op 2.28
InMemoryCheckpointStateCache - add get delete 4.4710 us/op 2.6270 us/op 1.70
validate api signedAggregateAndProof - struct 1.5540 ms/op 1.9772 ms/op 0.79
validate gossip signedAggregateAndProof - struct 1.6634 ms/op 1.9286 ms/op 0.86
validate gossip attestation - vc 640000 1.0165 ms/op 995.59 us/op 1.02
batch validate gossip attestation - vc 640000 - chunk 32 133.16 us/op 120.55 us/op 1.10
batch validate gossip attestation - vc 640000 - chunk 64 119.62 us/op 104.29 us/op 1.15
batch validate gossip attestation - vc 640000 - chunk 128 102.86 us/op 99.233 us/op 1.04
batch validate gossip attestation - vc 640000 - chunk 256 90.552 us/op 91.738 us/op 0.99
pickEth1Vote - no votes 877.34 us/op 856.03 us/op 1.02
pickEth1Vote - max votes 4.3734 ms/op 5.8577 ms/op 0.75
pickEth1Vote - Eth1Data hashTreeRoot value x2048 9.5574 ms/op 14.583 ms/op 0.66
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 12.938 ms/op 19.847 ms/op 0.65
pickEth1Vote - Eth1Data fastSerialize value x2048 366.25 us/op 379.25 us/op 0.97
pickEth1Vote - Eth1Data fastSerialize tree x2048 3.1212 ms/op 3.8007 ms/op 0.82
bytes32 toHexString 563.00 ns/op 623.00 ns/op 0.90
bytes32 Buffer.toString(hex) 420.00 ns/op 455.00 ns/op 0.92
bytes32 Buffer.toString(hex) from Uint8Array 506.00 ns/op 643.00 ns/op 0.79
bytes32 Buffer.toString(hex) + 0x 425.00 ns/op 469.00 ns/op 0.91
Object access 1 prop 0.31300 ns/op 0.32800 ns/op 0.95
Map access 1 prop 0.31200 ns/op 0.36200 ns/op 0.86
Object get x1000 4.7980 ns/op 5.5030 ns/op 0.87
Map get x1000 5.3170 ns/op 5.9860 ns/op 0.89
Object set x1000 27.267 ns/op 23.827 ns/op 1.14
Map set x1000 18.231 ns/op 19.903 ns/op 0.92
Return object 10000 times 0.28140 ns/op 0.30000 ns/op 0.94
Throw Error 10000 times 2.5704 us/op 2.8146 us/op 0.91
toHex 98.381 ns/op 112.51 ns/op 0.87
Buffer.from 92.549 ns/op 104.45 ns/op 0.89
shared Buffer 63.785 ns/op 71.059 ns/op 0.90
fastMsgIdFn sha256 / 200 bytes 1.9750 us/op 1.9710 us/op 1.00
fastMsgIdFn h32 xxhash / 200 bytes 413.00 ns/op 449.00 ns/op 0.92
fastMsgIdFn h64 xxhash / 200 bytes 421.00 ns/op 468.00 ns/op 0.90
fastMsgIdFn sha256 / 1000 bytes 6.0880 us/op 5.9190 us/op 1.03
fastMsgIdFn h32 xxhash / 1000 bytes 517.00 ns/op 573.00 ns/op 0.90
fastMsgIdFn h64 xxhash / 1000 bytes 502.00 ns/op 535.00 ns/op 0.94
fastMsgIdFn sha256 / 10000 bytes 50.008 us/op 50.664 us/op 0.99
fastMsgIdFn h32 xxhash / 10000 bytes 1.8610 us/op 1.9870 us/op 0.94
fastMsgIdFn h64 xxhash / 10000 bytes 1.3050 us/op 1.3820 us/op 0.94
send data - 1000 256B messages 9.8038 ms/op 9.6131 ms/op 1.02
send data - 1000 512B messages 12.866 ms/op 14.072 ms/op 0.91
send data - 1000 1024B messages 20.213 ms/op 22.236 ms/op 0.91
send data - 1000 1200B messages 23.946 ms/op 21.958 ms/op 1.09
send data - 1000 2048B messages 28.921 ms/op 30.813 ms/op 0.94
send data - 1000 4096B messages 26.429 ms/op 27.409 ms/op 0.96
send data - 1000 16384B messages 60.495 ms/op 72.365 ms/op 0.84
send data - 1000 65536B messages 229.05 ms/op 288.78 ms/op 0.79
enrSubnets - fastDeserialize 64 bits 1.2020 us/op 1.2920 us/op 0.93
enrSubnets - ssz BitVector 64 bits 505.00 ns/op 577.00 ns/op 0.88
enrSubnets - fastDeserialize 4 bits 330.00 ns/op 350.00 ns/op 0.94
enrSubnets - ssz BitVector 4 bits 509.00 ns/op 557.00 ns/op 0.91
prioritizePeers score -10:0 att 32-0.1 sync 2-0 131.68 us/op 134.93 us/op 0.98
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 167.37 us/op 132.57 us/op 1.26
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 344.77 us/op 207.04 us/op 1.67
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 387.95 us/op 407.99 us/op 0.95
prioritizePeers score 0:0 att 64-1 sync 4-1 528.13 us/op 523.57 us/op 1.01
array of 16000 items push then shift 1.2883 us/op 1.2955 us/op 0.99
LinkedList of 16000 items push then shift 6.3450 ns/op 6.4490 ns/op 0.98
array of 16000 items push then pop 76.887 ns/op 102.61 ns/op 0.75
LinkedList of 16000 items push then pop 6.1680 ns/op 6.3220 ns/op 0.98
array of 24000 items push then shift 1.8865 us/op 1.8808 us/op 1.00
LinkedList of 24000 items push then shift 6.3040 ns/op 6.4040 ns/op 0.98
array of 24000 items push then pop 107.68 ns/op 111.06 ns/op 0.97
LinkedList of 24000 items push then pop 6.2760 ns/op 6.1930 ns/op 1.01
intersect bitArray bitLen 8 5.3800 ns/op 5.3680 ns/op 1.00
intersect array and set length 8 40.415 ns/op 39.347 ns/op 1.03
intersect bitArray bitLen 128 26.171 ns/op 26.139 ns/op 1.00
intersect array and set length 128 590.91 ns/op 582.76 ns/op 1.01
bitArray.getTrueBitIndexes() bitLen 128 1.8850 us/op 1.9440 us/op 0.97
bitArray.getTrueBitIndexes() bitLen 248 3.6350 us/op 3.2020 us/op 1.14
bitArray.getTrueBitIndexes() bitLen 512 6.3050 us/op 5.5860 us/op 1.13
Buffer.concat 32 items 989.00 ns/op 1.1300 us/op 0.88
Uint8Array.set 32 items 1.9480 us/op 1.4380 us/op 1.35
Buffer.copy 1.8950 us/op 2.5710 us/op 0.74
Uint8Array.set - with subarray 2.3890 us/op 3.6650 us/op 0.65
Uint8Array.set - without subarray 1.6130 us/op 2.0500 us/op 0.79
getUint32 - dataview 408.00 ns/op 411.00 ns/op 0.99
getUint32 - manual 341.00 ns/op 342.00 ns/op 1.00
Set add up to 64 items then delete first 1.8369 us/op 1.8255 us/op 1.01
OrderedSet add up to 64 items then delete first 2.8508 us/op 2.8260 us/op 1.01
Set add up to 64 items then delete last 2.1154 us/op 2.0983 us/op 1.01
OrderedSet add up to 64 items then delete last 3.2520 us/op 3.0972 us/op 1.05
Set add up to 64 items then delete middle 2.1147 us/op 2.0547 us/op 1.03
OrderedSet add up to 64 items then delete middle 4.6509 us/op 4.4751 us/op 1.04
Set add up to 128 items then delete first 4.0601 us/op 4.0743 us/op 1.00
OrderedSet add up to 128 items then delete first 6.1183 us/op 6.3079 us/op 0.97
Set add up to 128 items then delete last 4.0578 us/op 3.9434 us/op 1.03
OrderedSet add up to 128 items then delete last 6.3456 us/op 5.9308 us/op 1.07
Set add up to 128 items then delete middle 4.0639 us/op 3.9992 us/op 1.02
OrderedSet add up to 128 items then delete middle 11.771 us/op 12.103 us/op 0.97
Set add up to 256 items then delete first 8.0011 us/op 8.0880 us/op 0.99
OrderedSet add up to 256 items then delete first 12.187 us/op 12.659 us/op 0.96
Set add up to 256 items then delete last 7.9716 us/op 7.7839 us/op 1.02
OrderedSet add up to 256 items then delete last 12.585 us/op 11.803 us/op 1.07
Set add up to 256 items then delete middle 7.9120 us/op 7.6456 us/op 1.03
OrderedSet add up to 256 items then delete middle 34.537 us/op 34.147 us/op 1.01
transfer serialized Status (84 B) 1.5030 us/op 1.5400 us/op 0.98
copy serialized Status (84 B) 1.2690 us/op 1.3270 us/op 0.96
transfer serialized SignedVoluntaryExit (112 B) 1.6910 us/op 1.7750 us/op 0.95
copy serialized SignedVoluntaryExit (112 B) 1.3430 us/op 1.6430 us/op 0.82
transfer serialized ProposerSlashing (416 B) 2.6350 us/op 2.8140 us/op 0.94
copy serialized ProposerSlashing (416 B) 1.6970 us/op 2.9380 us/op 0.58
transfer serialized Attestation (485 B) 2.0840 us/op 2.7590 us/op 0.76
copy serialized Attestation (485 B) 1.6430 us/op 2.4210 us/op 0.68
transfer serialized AttesterSlashing (33232 B) 2.2610 us/op 2.5260 us/op 0.90
copy serialized AttesterSlashing (33232 B) 4.9690 us/op 4.9890 us/op 1.00
transfer serialized Small SignedBeaconBlock (128000 B) 2.4650 us/op 3.2880 us/op 0.75
copy serialized Small SignedBeaconBlock (128000 B) 12.452 us/op 10.956 us/op 1.14
transfer serialized Avg SignedBeaconBlock (200000 B) 3.2700 us/op 3.0430 us/op 1.07
copy serialized Avg SignedBeaconBlock (200000 B) 18.503 us/op 15.776 us/op 1.17
transfer serialized BlobsSidecar (524380 B) 2.8820 us/op 3.7470 us/op 0.77
copy serialized BlobsSidecar (524380 B) 69.831 us/op 74.591 us/op 0.94
transfer serialized Big SignedBeaconBlock (1000000 B) 2.7290 us/op 3.5880 us/op 0.76
copy serialized Big SignedBeaconBlock (1000000 B) 232.64 us/op 186.03 us/op 1.25
pass gossip attestations to forkchoice per slot 2.3359 ms/op 2.4717 ms/op 0.95
forkChoice updateHead vc 100000 bc 64 eq 0 463.67 us/op 431.87 us/op 1.07
forkChoice updateHead vc 600000 bc 64 eq 0 2.5464 ms/op 2.6716 ms/op 0.95
forkChoice updateHead vc 1000000 bc 64 eq 0 4.1552 ms/op 4.2338 ms/op 0.98
forkChoice updateHead vc 600000 bc 320 eq 0 2.4982 ms/op 2.5821 ms/op 0.97
forkChoice updateHead vc 600000 bc 1200 eq 0 2.5598 ms/op 2.7021 ms/op 0.95
forkChoice updateHead vc 600000 bc 7200 eq 0 2.8200 ms/op 3.2161 ms/op 0.88
forkChoice updateHead vc 600000 bc 64 eq 1000 9.6268 ms/op 9.7068 ms/op 0.99
forkChoice updateHead vc 600000 bc 64 eq 10000 9.4140 ms/op 9.5604 ms/op 0.98
forkChoice updateHead vc 600000 bc 64 eq 300000 11.599 ms/op 11.835 ms/op 0.98
computeDeltas 500000 validators 300 proto nodes 2.9482 ms/op 3.0339 ms/op 0.97
computeDeltas 500000 validators 1200 proto nodes 3.1244 ms/op 3.2527 ms/op 0.96
computeDeltas 500000 validators 7200 proto nodes 3.0624 ms/op 3.2553 ms/op 0.94
computeDeltas 750000 validators 300 proto nodes 4.8483 ms/op 4.7079 ms/op 1.03
computeDeltas 750000 validators 1200 proto nodes 4.9530 ms/op 4.6277 ms/op 1.07
computeDeltas 750000 validators 7200 proto nodes 4.6816 ms/op 4.6213 ms/op 1.01
computeDeltas 1400000 validators 300 proto nodes 8.9782 ms/op 8.2808 ms/op 1.08
computeDeltas 1400000 validators 1200 proto nodes 8.6113 ms/op 8.6303 ms/op 1.00
computeDeltas 1400000 validators 7200 proto nodes 8.5267 ms/op 8.4612 ms/op 1.01
computeDeltas 2100000 validators 300 proto nodes 12.627 ms/op 12.383 ms/op 1.02
computeDeltas 2100000 validators 1200 proto nodes 12.368 ms/op 12.574 ms/op 0.98
computeDeltas 2100000 validators 7200 proto nodes 12.564 ms/op 12.701 ms/op 0.99
altair processAttestation - 250000 vs - 7PWei normalcase 1.4177 ms/op 1.4097 ms/op 1.01
altair processAttestation - 250000 vs - 7PWei worstcase 2.1474 ms/op 2.1595 ms/op 0.99
altair processAttestation - setStatus - 1/6 committees join 68.474 us/op 70.380 us/op 0.97
altair processAttestation - setStatus - 1/3 committees join 138.98 us/op 150.64 us/op 0.92
altair processAttestation - setStatus - 1/2 committees join 190.06 us/op 214.73 us/op 0.89
altair processAttestation - setStatus - 2/3 committees join 258.31 us/op 273.34 us/op 0.95
altair processAttestation - setStatus - 4/5 committees join 387.42 us/op 395.84 us/op 0.98
altair processAttestation - setStatus - 100% committees join 467.04 us/op 473.67 us/op 0.99
altair processBlock - 250000 vs - 7PWei normalcase 4.1454 ms/op 4.1178 ms/op 1.01
altair processBlock - 250000 vs - 7PWei normalcase hashState 23.671 ms/op 27.553 ms/op 0.86
altair processBlock - 250000 vs - 7PWei worstcase 29.219 ms/op 39.538 ms/op 0.74
altair processBlock - 250000 vs - 7PWei worstcase hashState 84.211 ms/op 74.982 ms/op 1.12
phase0 processBlock - 250000 vs - 7PWei normalcase 1.8451 ms/op 2.5842 ms/op 0.71
phase0 processBlock - 250000 vs - 7PWei worstcase 22.876 ms/op 25.861 ms/op 0.88
altair processEth1Data - 250000 vs - 7PWei normalcase 251.76 us/op 253.55 us/op 0.99
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 4.9130 us/op 4.2250 us/op 1.16
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 20.604 us/op 22.396 us/op 0.92
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 6.7370 us/op 9.3330 us/op 0.72
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 4.9050 us/op 6.9700 us/op 0.70
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 82.740 us/op 73.743 us/op 1.12
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 769.38 us/op 578.77 us/op 1.33
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 684.08 us/op 863.18 us/op 0.79
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 713.57 us/op 726.62 us/op 0.98
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 1.9826 ms/op 2.3107 ms/op 0.86
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 1.2683 ms/op 1.1897 ms/op 1.07
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 2.9752 ms/op 3.3296 ms/op 0.89
Tree 40 250000 create 180.76 ms/op 179.01 ms/op 1.01
Tree 40 250000 get(125000) 115.96 ns/op 115.12 ns/op 1.01
Tree 40 250000 set(125000) 551.95 ns/op 543.76 ns/op 1.02
Tree 40 250000 toArray() 9.9545 ms/op 11.279 ms/op 0.88
Tree 40 250000 iterate all - toArray() + loop 10.001 ms/op 16.708 ms/op 0.60
Tree 40 250000 iterate all - get(i) 38.004 ms/op 44.622 ms/op 0.85
MutableVector 250000 create 7.4233 ms/op 11.756 ms/op 0.63
MutableVector 250000 get(125000) 5.7430 ns/op 5.6290 ns/op 1.02
MutableVector 250000 set(125000) 171.14 ns/op 183.90 ns/op 0.93
MutableVector 250000 toArray() 2.7073 ms/op 3.4570 ms/op 0.78
MutableVector 250000 iterate all - toArray() + loop 2.8033 ms/op 3.6338 ms/op 0.77
MutableVector 250000 iterate all - get(i) 1.4094 ms/op 1.3878 ms/op 1.02
Array 250000 create 2.3202 ms/op 3.2475 ms/op 0.71
Array 250000 clone - spread 1.2153 ms/op 1.3461 ms/op 0.90
Array 250000 get(125000) 0.56800 ns/op 0.55600 ns/op 1.02
Array 250000 set(125000) 0.58600 ns/op 0.56700 ns/op 1.03
Array 250000 iterate all - loop 76.583 us/op 78.628 us/op 0.97
effectiveBalanceIncrements clone Uint8Array 300000 14.155 us/op 17.002 us/op 0.83
effectiveBalanceIncrements clone MutableVector 300000 311.00 ns/op 313.00 ns/op 0.99
effectiveBalanceIncrements rw all Uint8Array 300000 165.59 us/op 166.35 us/op 1.00
effectiveBalanceIncrements rw all MutableVector 300000 55.006 ms/op 61.039 ms/op 0.90
phase0 afterProcessEpoch - 250000 vs - 7PWei 72.362 ms/op 77.672 ms/op 0.93
Array.fill - length 1000000 2.5017 ms/op 2.7915 ms/op 0.90
Array push - length 1000000 14.755 ms/op 14.964 ms/op 0.99
Array.get 0.24925 ns/op 0.25873 ns/op 0.96
Uint8Array.get 0.33539 ns/op 0.33315 ns/op 1.01
phase0 beforeProcessEpoch - 250000 vs - 7PWei 17.966 ms/op 19.209 ms/op 0.94
altair processEpoch - mainnet_e81889 321.61 ms/op 303.99 ms/op 1.06
mainnet_e81889 - altair beforeProcessEpoch 16.645 ms/op 20.380 ms/op 0.82
mainnet_e81889 - altair processJustificationAndFinalization 14.478 us/op 13.417 us/op 1.08
mainnet_e81889 - altair processInactivityUpdates 5.4378 ms/op 5.7530 ms/op 0.95
mainnet_e81889 - altair processRewardsAndPenalties 55.763 ms/op 50.289 ms/op 1.11
mainnet_e81889 - altair processRegistryUpdates 2.6460 us/op 2.3950 us/op 1.10
mainnet_e81889 - altair processSlashings 757.00 ns/op 977.00 ns/op 0.77
mainnet_e81889 - altair processEth1DataReset 717.00 ns/op 765.00 ns/op 0.94
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.4945 ms/op 877.50 us/op 1.70
mainnet_e81889 - altair processSlashingsReset 3.5410 us/op 2.6000 us/op 1.36
mainnet_e81889 - altair processRandaoMixesReset 4.5990 us/op 6.6790 us/op 0.69
mainnet_e81889 - altair processHistoricalRootsUpdate 837.00 ns/op 984.00 ns/op 0.85
mainnet_e81889 - altair processParticipationFlagUpdates 2.4490 us/op 3.2970 us/op 0.74
mainnet_e81889 - altair processSyncCommitteeUpdates 1.0020 us/op 975.00 ns/op 1.03
mainnet_e81889 - altair afterProcessEpoch 79.859 ms/op 78.820 ms/op 1.01
capella processEpoch - mainnet_e217614 1.2807 s/op 1.2454 s/op 1.03
mainnet_e217614 - capella beforeProcessEpoch 70.895 ms/op 71.270 ms/op 0.99
mainnet_e217614 - capella processJustificationAndFinalization 21.224 us/op 20.158 us/op 1.05
mainnet_e217614 - capella processInactivityUpdates 16.417 ms/op 12.839 ms/op 1.28
mainnet_e217614 - capella processRewardsAndPenalties 274.88 ms/op 251.90 ms/op 1.09
mainnet_e217614 - capella processRegistryUpdates 12.968 us/op 16.558 us/op 0.78
mainnet_e217614 - capella processSlashings 990.00 ns/op 905.00 ns/op 1.09
mainnet_e217614 - capella processEth1DataReset 771.00 ns/op 983.00 ns/op 0.78
mainnet_e217614 - capella processEffectiveBalanceUpdates 13.550 ms/op 11.878 ms/op 1.14
mainnet_e217614 - capella processSlashingsReset 9.1310 us/op 6.0470 us/op 1.51
mainnet_e217614 - capella processRandaoMixesReset 9.2740 us/op 7.1890 us/op 1.29
mainnet_e217614 - capella processHistoricalRootsUpdate 1.2480 us/op 1.8630 us/op 0.67
mainnet_e217614 - capella processParticipationFlagUpdates 4.7130 us/op 5.4320 us/op 0.87
mainnet_e217614 - capella afterProcessEpoch 209.28 ms/op 314.19 ms/op 0.67
phase0 processEpoch - mainnet_e58758 467.56 ms/op 472.64 ms/op 0.99
mainnet_e58758 - phase0 beforeProcessEpoch 103.14 ms/op 127.19 ms/op 0.81
mainnet_e58758 - phase0 processJustificationAndFinalization 21.388 us/op 32.478 us/op 0.66
mainnet_e58758 - phase0 processRewardsAndPenalties 27.030 ms/op 42.717 ms/op 0.63
mainnet_e58758 - phase0 processRegistryUpdates 7.3110 us/op 14.369 us/op 0.51
mainnet_e58758 - phase0 processSlashings 966.00 ns/op 1.1070 us/op 0.87
mainnet_e58758 - phase0 processEth1DataReset 814.00 ns/op 1.2580 us/op 0.65
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 962.53 us/op 1.3288 ms/op 0.72
mainnet_e58758 - phase0 processSlashingsReset 4.2490 us/op 5.6440 us/op 0.75
mainnet_e58758 - phase0 processRandaoMixesReset 6.8840 us/op 9.6410 us/op 0.71
mainnet_e58758 - phase0 processHistoricalRootsUpdate 792.00 ns/op 1.0820 us/op 0.73
mainnet_e58758 - phase0 processParticipationRecordUpdates 4.3990 us/op 9.7610 us/op 0.45
mainnet_e58758 - phase0 afterProcessEpoch 70.713 ms/op 96.441 ms/op 0.73
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.0084 ms/op 1.6814 ms/op 0.60
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.3017 ms/op 2.3713 ms/op 0.55
altair processInactivityUpdates - 250000 normalcase 18.317 ms/op 23.520 ms/op 0.78
altair processInactivityUpdates - 250000 worstcase 15.925 ms/op 22.529 ms/op 0.71
phase0 processRegistryUpdates - 250000 normalcase 8.3020 us/op 13.760 us/op 0.60
phase0 processRegistryUpdates - 250000 badcase_full_deposits 299.30 us/op 368.14 us/op 0.81
phase0 processRegistryUpdates - 250000 worstcase 0.5 125.44 ms/op 192.50 ms/op 0.65
altair processRewardsAndPenalties - 250000 normalcase 35.710 ms/op 63.725 ms/op 0.56
altair processRewardsAndPenalties - 250000 worstcase 40.103 ms/op 56.079 ms/op 0.72
phase0 getAttestationDeltas - 250000 normalcase 10.382 ms/op 11.228 ms/op 0.92
phase0 getAttestationDeltas - 250000 worstcase 10.390 ms/op 7.7475 ms/op 1.34
phase0 processSlashings - 250000 worstcase 84.084 us/op 93.152 us/op 0.90
altair processSyncCommitteeUpdates - 250000 119.30 ms/op 118.24 ms/op 1.01
BeaconState.hashTreeRoot - No change 669.00 ns/op 705.00 ns/op 0.95
BeaconState.hashTreeRoot - 1 full validator 126.84 us/op 121.75 us/op 1.04
BeaconState.hashTreeRoot - 32 full validator 923.96 us/op 1.4721 ms/op 0.63
BeaconState.hashTreeRoot - 512 full validator 8.1877 ms/op 11.412 ms/op 0.72
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 107.23 us/op 158.17 us/op 0.68
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 2.4065 ms/op 1.8565 ms/op 1.30
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 32.646 ms/op 29.440 ms/op 1.11
BeaconState.hashTreeRoot - 1 balances 100.78 us/op 124.94 us/op 0.81
BeaconState.hashTreeRoot - 32 balances 1.1536 ms/op 1.4404 ms/op 0.80
BeaconState.hashTreeRoot - 512 balances 7.0100 ms/op 9.8619 ms/op 0.71
BeaconState.hashTreeRoot - 250000 balances 135.00 ms/op 248.71 ms/op 0.54
aggregationBits - 2048 els - zipIndexesInBitList 19.945 us/op 37.214 us/op 0.54
byteArrayEquals 32 47.119 ns/op 53.345 ns/op 0.88
Buffer.compare 32 16.233 ns/op 16.785 ns/op 0.97
byteArrayEquals 1024 1.2574 us/op 1.2965 us/op 0.97
Buffer.compare 1024 23.019 ns/op 25.098 ns/op 0.92
byteArrayEquals 16384 19.984 us/op 20.481 us/op 0.98
Buffer.compare 16384 210.97 ns/op 197.46 ns/op 1.07
byteArrayEquals 123687377 148.76 ms/op 160.15 ms/op 0.93
Buffer.compare 123687377 5.9337 ms/op 6.1955 ms/op 0.96
byteArrayEquals 32 - diff last byte 46.355 ns/op 47.800 ns/op 0.97
Buffer.compare 32 - diff last byte 15.389 ns/op 16.810 ns/op 0.92
byteArrayEquals 1024 - diff last byte 1.2405 us/op 1.2838 us/op 0.97
Buffer.compare 1024 - diff last byte 23.817 ns/op 26.563 ns/op 0.90
byteArrayEquals 16384 - diff last byte 19.394 us/op 20.241 us/op 0.96
Buffer.compare 16384 - diff last byte 171.19 ns/op 211.35 ns/op 0.81
byteArrayEquals 123687377 - diff last byte 151.44 ms/op 154.47 ms/op 0.98
Buffer.compare 123687377 - diff last byte 6.6229 ms/op 5.2619 ms/op 1.26
byteArrayEquals 32 - random bytes 4.9650 ns/op 4.9930 ns/op 0.99
Buffer.compare 32 - random bytes 16.199 ns/op 16.051 ns/op 1.01
byteArrayEquals 1024 - random bytes 4.9140 ns/op 4.9610 ns/op 0.99
Buffer.compare 1024 - random bytes 16.175 ns/op 16.107 ns/op 1.00
byteArrayEquals 16384 - random bytes 4.8770 ns/op 4.9560 ns/op 0.98
Buffer.compare 16384 - random bytes 16.476 ns/op 15.742 ns/op 1.05
byteArrayEquals 123687377 - random bytes 7.6100 ns/op 7.7200 ns/op 0.99
Buffer.compare 123687377 - random bytes 19.560 ns/op 19.770 ns/op 0.99
regular array get 100000 times 30.687 us/op 31.152 us/op 0.99
wrappedArray get 100000 times 30.647 us/op 31.045 us/op 0.99
arrayWithProxy get 100000 times 10.547 ms/op 9.7454 ms/op 1.08
ssz.Root.equals 45.218 ns/op 43.086 ns/op 1.05
byteArrayEquals 43.776 ns/op 42.806 ns/op 1.02
Buffer.compare 8.8500 ns/op 9.1160 ns/op 0.97
shuffle list - 16384 els 5.2503 ms/op 5.5635 ms/op 0.94
shuffle list - 250000 els 78.681 ms/op 82.024 ms/op 0.96
processSlot - 1 slots 10.392 us/op 11.763 us/op 0.88
processSlot - 32 slots 2.2877 ms/op 3.2720 ms/op 0.70
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 40.536 ms/op 37.073 ms/op 1.09
getCommitteeAssignments - req 1 vs - 250000 vc 1.7346 ms/op 1.7576 ms/op 0.99
getCommitteeAssignments - req 100 vs - 250000 vc 3.4208 ms/op 3.5043 ms/op 0.98
getCommitteeAssignments - req 1000 vs - 250000 vc 3.6997 ms/op 3.7736 ms/op 0.98
findModifiedValidators - 10000 modified validators 232.90 ms/op 229.70 ms/op 1.01
findModifiedValidators - 1000 modified validators 150.26 ms/op 139.75 ms/op 1.08
findModifiedValidators - 100 modified validators 137.47 ms/op 146.43 ms/op 0.94
findModifiedValidators - 10 modified validators 158.72 ms/op 134.33 ms/op 1.18
findModifiedValidators - 1 modified validators 115.70 ms/op 138.11 ms/op 0.84
findModifiedValidators - no difference 143.84 ms/op 153.75 ms/op 0.94
compare ViewDUs 2.7014 s/op 3.0420 s/op 0.89
compare each validator Uint8Array 1.0446 s/op 1.4135 s/op 0.74
compare ViewDU to Uint8Array 694.65 ms/op 769.88 ms/op 0.90
migrate state 1000000 validators, 24 modified, 0 new 569.87 ms/op 508.02 ms/op 1.12
migrate state 1000000 validators, 1700 modified, 1000 new 811.97 ms/op 797.79 ms/op 1.02
migrate state 1000000 validators, 3400 modified, 2000 new 944.36 ms/op 1.0984 s/op 0.86
migrate state 1500000 validators, 24 modified, 0 new 590.09 ms/op 640.35 ms/op 0.92
migrate state 1500000 validators, 1700 modified, 1000 new 792.18 ms/op 828.75 ms/op 0.96
migrate state 1500000 validators, 3400 modified, 2000 new 961.03 ms/op 1.1481 s/op 0.84
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 5.8800 ns/op 6.8800 ns/op 0.85
state getBlockRootAtSlot - 250000 vs - 7PWei 526.57 ns/op 908.60 ns/op 0.58
computeProposers - vc 250000 5.8563 ms/op 6.5108 ms/op 0.90
computeEpochShuffling - vc 250000 80.466 ms/op 84.922 ms/op 0.95
getNextSyncCommittee - vc 250000 95.051 ms/op 124.43 ms/op 0.76
computeSigningRoot for AttestationData 19.871 us/op 19.859 us/op 1.00
hash AttestationData serialized data then Buffer.toString(base64) 1.1964 us/op 1.2247 us/op 0.98
toHexString serialized data 787.77 ns/op 932.04 ns/op 0.85
Buffer.toString(base64) 145.50 ns/op 193.59 ns/op 0.75
block root to RootHex using toHex 118.37 ns/op 139.49 ns/op 0.85
block root to RootHex using toRootHex 77.567 ns/op 88.903 ns/op 0.87

by benchmarkbot/action

@nazarhussain nazarhussain marked this pull request as ready for review August 13, 2024 11:40
@nazarhussain nazarhussain requested a review from a team as a code owner August 13, 2024 11:40
Contributor

@twoeths twoeths left a comment

This looks great, but I need much more time to review it thoroughly. We need many more comments on how the strategy works, plus doc comments for classes/methods, to keep the code maintainable. I suggest this style of comment to make it easier for everyone to understand:

* Persist states every some epochs to

We also need more insight into this work:

  • how much disk space storing states needs with the current approach, compared to the default config of the new approach
  • benchmark results are needed to come up with the default config
    • the time to compute a state diff and apply it, and the disk space needed
    • mainnet/holesky states need to be downloaded for a good estimate

@nazarhussain
Contributor Author

how much disk space does it need for storing states with the current approach, compared to the default config of the new approach

There is still some work left to optimize the diff size:

  1. Split the state into the state and the balances array, since the balances array changes more frequently (higher entropy).
  2. Apply some kind of compression to the binary data, e.g. gzip or snappy.

Afterwards we can produce a more realistic disk space estimate.
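
For readers unfamiliar with the idea, the sketch below shows what computing and applying a binary diff can look like on raw bytes. It is a naive byte-range diff written for illustration only; the excerpts on this page do not show which diff algorithm the PR uses, and `computeDiff`, `applyDiff`, and the two interfaces are hypothetical names. Compression such as gzip or snappy would be applied to the stored ranges and is not shown.

```typescript
// One contiguous run of bytes in the target that differs from the base.
interface ByteRange {
  offset: number;
  bytes: Uint8Array;
}

interface BinaryDiff {
  targetLength: number;
  ranges: ByteRange[];
}

// Collect every run of bytes where `target` differs from `base`.
// Bytes past the end of `base` always count as different.
function computeDiff(base: Uint8Array, target: Uint8Array): BinaryDiff {
  const ranges: ByteRange[] = [];
  const differsAt = (i: number): boolean => i >= base.length || base[i] !== target[i];

  let i = 0;
  while (i < target.length) {
    if (!differsAt(i)) {
      i++;
      continue;
    }
    const start = i;
    while (i < target.length && differsAt(i)) i++;
    ranges.push({offset: start, bytes: target.slice(start, i)});
  }
  return {targetLength: target.length, ranges};
}

// Rebuild the target from the base plus the recorded ranges.
function applyDiff(base: Uint8Array, diff: BinaryDiff): Uint8Array {
  const out = new Uint8Array(diff.targetLength);
  out.set(base.subarray(0, Math.min(base.length, diff.targetLength)));
  for (const {offset, bytes} of diff.ranges) out.set(bytes, offset);
  return out;
}
```

Running such a diff separately on the balances bytes and on the rest of the state is one way to read item 1: a field that changes often then no longer scatters small ranges across an otherwise stable byte stream.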

@twoeths
Contributor

twoeths commented Aug 21, 2024

how much disk space does it need for storing states with the current approach, compared to the default config of the new approach

There is still some work left to optimize the diff size:

  1. Split the state into the state and the balances array, since the balances array changes more frequently (higher entropy).
  2. Apply some kind of compression to the binary data, e.g. gzip or snappy.

Afterwards we can produce a more realistic disk space estimate.

could you add TODOs in the description?

@twoeths
Contributor

twoeths commented Aug 21, 2024

Split the state into the state and the balances array, since the balances array changes more frequently (higher entropy).

@nazarhussain validators take up the most space in the state bytes and are mostly unchanged, so please try splitting them out too. We leveraged this when loading a state from a Uint8Array given a seed state.
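
As an illustration of why mostly unchanged validators are cheap to diff, the sketch below compares two serialized validator lists chunk by chunk and returns only the indices that changed or were added. A serialized `Validator` is a fixed 121 bytes (48 + 32 + 8 + 1 + 4 × 8); the function name `findModifiedChunks` and its signature are assumptions for this sketch, not code from the PR.

```typescript
// Serialized size of one Validator: pubkey (48) + withdrawal credentials (32)
// + effective balance (8) + slashed (1) + four epochs (4 * 8).
const VALIDATOR_BYTES_SIZE = 121;

// Return the indices of fixed-size chunks in `next` that differ from `seed`,
// including chunks that only exist in `next` (newly added validators).
function findModifiedChunks(
  seed: Uint8Array,
  next: Uint8Array,
  chunkSize: number = VALIDATOR_BYTES_SIZE
): number[] {
  if (seed.length % chunkSize !== 0 || next.length % chunkSize !== 0) {
    throw new Error(`Byte lengths must be multiples of ${chunkSize}`);
  }

  const modified: number[] = [];
  const sharedCount = Math.min(seed.length, next.length) / chunkSize;
  const nextCount = next.length / chunkSize;

  for (let i = 0; i < nextCount; i++) {
    if (i >= sharedCount) {
      modified.push(i); // chunk has no counterpart in the seed
      continue;
    }
    const start = i * chunkSize;
    for (let j = start; j < start + chunkSize; j++) {
      if (seed[j] !== next[j]) {
        modified.push(i);
        break;
      }
    }
  }
  return modified;
}
```

Only the chunks at the returned indices would then need to be stored alongside the seed state.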

@nazarhussain nazarhussain changed the base branch from feature/differential-archive to nh/state-diff October 23, 2024 12:05
@nazarhussain nazarhussain changed the base branch from nh/state-diff to feature/differential-archive October 23, 2024 12:06
}

async storeHistoricalState(slot: number, stateBytes: Uint8Array): Promise<void> {
  return this.api.storeHistoricalState(slot, stateBytes, this.stateArchiveMode);
Contributor


We should pass partialState, balances, and any other parts across the worker thread boundary instead of serializing on the main thread and deserializing again on the worker thread, because deserializing the state uses a lot of memory and is inefficient.
The state root and slot could be passed across the worker thread boundary too.
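
One possible shape for such a message is sketched below. The field names follow the comment above (`partialState`, `balances`, state root, slot), but `StoreStateMessage` and `buildStoreStateMessage` are hypothetical and not part of the PR. The function returns the message together with a transfer list, so the large byte arrays can be moved to the worker rather than copied.

```typescript
// Hypothetical message shape; the PR's real worker API may differ.
interface StoreStateMessage {
  slot: number;
  stateRoot: Uint8Array;
  partialStateBytes: Uint8Array;
  balancesBytes: Uint8Array;
}

function buildStoreStateMessage(
  slot: number,
  stateRoot: Uint8Array,
  partialStateBytes: Uint8Array,
  balancesBytes: Uint8Array
): {message: StoreStateMessage; transferList: ArrayBuffer[]} {
  const message: StoreStateMessage = {slot, stateRoot, partialStateBytes, balancesBytes};

  // Two views may share one underlying ArrayBuffer, so collect buffers in a Set
  // to avoid listing the same buffer twice. SharedArrayBuffers are skipped
  // because they are shared rather than transferred.
  // The 32-byte state root is small enough to be copied, so it is not listed.
  const buffers = new Set<ArrayBuffer>();
  for (const view of [partialStateBytes, balancesBytes]) {
    if (view.buffer instanceof ArrayBuffer) buffers.add(view.buffer);
  }
  return {message, transferList: [...buffers]};
}
```

With Node's `worker_threads`, the pair would typically be sent as `port.postMessage(message, transferList)`. Transferring detaches the whole underlying buffer on the sending side, so a view into a larger shared buffer should be copied first.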

activeSlot: intermediateStateArchive.slot,
activeStateSize: formatBytes(StateArchiveSSZType.serialize(activeStateArchive).byteLength),
diffSlot: intermediateStateArchive.slot,
diffStateSize: formatBytes(StateArchiveSSZType.serialize(intermediateStateArchive).byteLength),
Contributor


Do not serialize just to get the length (same as above).
There is a value_serializedSize API for that.
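
The difference can be shown with a toy type. `value_serializedSize` is the method name given in the comment above; the `SizedType` interface and `uint32ListType` below are simplified stand-ins written for this sketch and are not the `@chainsafe/ssz` implementation.

```typescript
// Minimal stand-in exposing the two methods being compared.
interface SizedType<T> {
  serialize(value: T): Uint8Array;
  value_serializedSize(value: T): number;
}

// Toy type: a list of uint32 values, 4 bytes each, little-endian.
const uint32ListType: SizedType<number[]> = {
  serialize(value: number[]): Uint8Array {
    // Allocates and fills a buffer as large as the serialized value.
    const out = new Uint8Array(value.length * 4);
    const view = new DataView(out.buffer);
    value.forEach((v, i) => view.setUint32(i * 4, v, true));
    return out;
  },
  value_serializedSize(value: number[]): number {
    // Computes the same number without allocating anything.
    return value.length * 4;
  },
};

const sample = [1, 2, 3];
const sizeViaSerialize = uint32ListType.serialize(sample).byteLength; // allocates 12 bytes
const sizeViaApi = uint32ListType.value_serializedSize(sample); // no allocation
```

For a full beacon state, the first form allocates a buffer the size of the whole state only to read its length.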


export function stateBytesToStateArchive(stateBytes: Uint8Array, forkConfig: ChainForkConfig): StateArchive {
  const slot = getStateSlotFromBytes(stateBytes);
  return stateToStateArchive(forkConfig.getForkTypes(slot).BeaconState.deserialize(stateBytes), forkConfig);
Contributor


It's not performant to deserialize to a state object here. If we have the state ViewDU on the main thread, we should be able to compute partialState and balances there before passing them across the thread boundary.

But if we only have the full state bytes on the main thread, we still need a way to extract the partial state bytes and the balances bytes from them. Maybe we can write a utility function that does this directly on the state bytes?
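
A utility of that kind could rely on the SSZ container layout: fixed-size fields are stored inline, each variable-size field is represented in the fixed part by a 4-byte little-endian offset, and the variable parts follow in field order. The sketch below slices one variable-size field out of container bytes without deserializing. `FieldLayout` and `sliceVariableField` are hypothetical names, and the three-field layout in the usage example is a toy, not the real `BeaconState` layout, which differs per fork.

```typescript
// fixedSize === null marks a variable-size field (stored as a 4-byte offset).
interface FieldLayout {
  name: string;
  fixedSize: number | null;
}

// Return a view (no copy) of the bytes of one variable-size field.
function sliceVariableField(bytes: Uint8Array, layout: FieldLayout[], fieldName: string): Uint8Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // Walk the fixed part and record where each variable-size field starts.
  const starts: {name: string; start: number}[] = [];
  let cursor = 0;
  for (const field of layout) {
    if (field.fixedSize === null) {
      starts.push({name: field.name, start: view.getUint32(cursor, true)});
      cursor += 4;
    } else {
      cursor += field.fixedSize;
    }
  }

  const index = starts.findIndex((s) => s.name === fieldName);
  if (index === -1) throw new Error(`No variable-size field named ${fieldName}`);

  // A field ends where the next variable-size field starts, or at the end of the bytes.
  const end = index + 1 < starts.length ? starts[index + 1].start : bytes.length;
  return bytes.subarray(starts[index].start, end);
}

// Toy container: slot (8 bytes), validators (variable), balances (variable).
const toyLayout: FieldLayout[] = [
  {name: "slot", fixedSize: 8},
  {name: "validators", fixedSize: null},
  {name: "balances", fixedSize: null},
];
```

Because the layout changes between forks, a real version would need per-fork layouts and unit tests for each of them.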

throw e;
}
blockCount++;
if (Buffer.compare(state.hashTreeRoot(), block.message.stateRoot) !== 0) {
Contributor


Do this check inside the for loop, for every block: if a mismatch occurs at an earlier block, execution will not reach this point.
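
A minimal sketch of the per-block check is shown below. The state transition and hashing are passed in as functions so the example stays self-contained; `ReplayBlock`, `replayBlocks`, and `bytesEqual` are hypothetical names, not the PR's code.

```typescript
interface ReplayBlock {
  slot: number;
  stateRoot: Uint8Array;
}

function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Apply blocks in order and verify the resulting state root after every block,
// so a mismatch is reported at the block that caused it.
function replayBlocks<S>(
  initialState: S,
  blocks: ReplayBlock[],
  applyBlock: (state: S, block: ReplayBlock) => S,
  hashState: (state: S) => Uint8Array
): S {
  let state = initialState;
  for (const block of blocks) {
    state = applyBlock(state, block);
    if (!bytesEqual(hashState(state), block.stateRoot)) {
      throw new Error(`State root mismatch at slot ${block.slot}`);
    }
  }
  return state;
}
```

Hashing after every block adds cost, so whether to check each block or only the last one is a trade-off the PR authors would need to measure.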


return measure(metrics?.stateTransitionTime, async () => {
  let state = config.getForkTypes(toSlot).BeaconState.deserializeToViewDU(lastFullStateBytes);
  syncPubkeyCache(state, pubkey2index);
Contributor


Syncing pubkeys is expensive; we should not do it unless it is really necessary.
Since we don't verify signatures, there's a chance we don't have to do it, so please double-check.

case HistoricalStateStorageType.Snapshot: {
  return measure(metrics?.loadSnapshotStateTime, async () => {
    const stateArchive = await db.hierarchicalStateArchiveRepository.get(slot);
    return stateArchive ? stateArchiveToStateBytes(stateArchive, config) : null;
Contributor


Converting a Uint8Array to a state archive object and then back to full state bytes seems inefficient.
We could write a util that works on the Uint8Array directly instead.
It's tricky, but with enough unit tests for the different forks I think it should be fine.

2 participants