SRE Testing: Investigate some aberrant behavior #4768
Slack discussion: https://casper-labs-team.slack.com/archives/C01ULKL8G8J/p1719463830035189
- Did further investigation into the SSE output and the custom SSE exporter, e.g. at the switch block and thereafter, until the node gets back into sync.
- @alsrdn (Alex S) will look into this.
- Re-ran the same investigation with the latest 2.0 commit. Network: 5 + 2.
- Ran a test with a 60 + 10 network; with this medium-sized network the issue is not seen at the switch block.
- 100 validators + 1 (large network): switch block.
Below is the test output.
Engineering: please look under the section "some different behaviours observed" to investigate the aberrant behavior, if it has any substance.
SRE @AJ: please look further at the SSE events not reporting data and update the comments on this ticket.
Network: 100 + 20
Name: ajith-condor-big (8s)
Branch: feat-2.0
Commit: ee9c6de
Config:
minimum_block_time = '8192 ms'
block_gas_limit = 3_300_000_000_000
native_mint_lane = [0, 1024, 1024, 65_000_000_000, 325]
native_auction_lane = [1, 2048, 2048, 362_500_000_000, 75]
wasm_lanes = [[2, 1_048_576, 2048, 1_000_000_000_000, 1], [3, 344_064, 1024, 100_000_000_000, 10], [4, 172_032, 1024, 50_000_000_000, 15], [5, 12_288, 512, 1_500_000_000, 25]]
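As a sanity check on the load figures against the config above, the lane arrays can be read programmatically. This is a sketch under an assumption: that the first element of each lane array is the lane id and the last is the per-block transaction cap (which would be consistent with the 325 native transfers, 75 auction transactions, and 10 Lane-3 deploys observed below); the meanings of the other fields are not verified here.

```python
# Sketch: extract per-block transaction caps from the chainspec lane arrays.
# ASSUMPTION (not verified against chainspec docs): in each lane array the
# first element is the lane id and the last is the max tx count per block.

native_mint_lane = [0, 1024, 1024, 65_000_000_000, 325]
native_auction_lane = [1, 2048, 2048, 362_500_000_000, 75]
wasm_lanes = [
    [2, 1_048_576, 2048, 1_000_000_000_000, 1],
    [3, 344_064, 1024, 100_000_000_000, 10],
    [4, 172_032, 1024, 50_000_000_000, 15],
    [5, 12_288, 512, 1_500_000_000, 25],
]

def per_block_caps(*lanes):
    """Map lane id -> assumed max transactions per block."""
    return {lane[0]: lane[-1] for lane in lanes}

caps = per_block_caps(native_mint_lane, native_auction_lane, *wasm_lanes)
print(caps)  # {0: 325, 1: 75, 2: 1, 3: 10, 4: 15, 5: 25}
```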
Tests: Mixed Load of V1 and V2 Transfers, WASM Transfers, Auction
Observations:
time spent at switch block now under 5 seconds
allocated memory increases as new deploys enter the buffer
allocated memory spikes by ~1 GB at the switch block
under load, when average memory was 1.1 GB, the switch block spiked it to 2+ GB
block timestamps were stable and under 8s through the loading and execution period
was able to hit 325 native transfers per block consistently
also observed Auction Lane saturated at 75 periodically (as per load and config)
also observed V1 wasm deploys allocated to Lane 3 at 10 (as per load and config)
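A quick back-of-the-envelope check on the 325-transfers-per-block figure against minimum_block_time = 8192 ms: sustained native transfer throughput works out to roughly 39.7 TPS. A rough upper-bound sketch, ignoring switch-block pauses and finalisation overhead:

```python
# Rough upper bound on sustained native transfer throughput,
# ignoring switch-block pauses and finalisation overhead.
minimum_block_time_ms = 8192      # from the chainspec above
native_transfers_per_block = 325  # observed, and the native mint lane cap

tps = native_transfers_per_block / (minimum_block_time_ms / 1000)
print(round(tps, 1))  # 39.7
```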
some different behaviours observed
at the switch block, the SSE events appear to stop reporting data, per Grafana
seeing ~2 minutes of missing data (investigating the SSE event outputs)
the casper-node log messages show block and deploy executions during this period
after all pending deploys in the network were executed, block timestamps periodically spiked to 10 to 20 s
1 validator node stalled and later synced to tip on the idle network
at later switch-block eras, finalisation times started spiking to 18+ seconds
some nodes were running in 'debug' mode, which could have an I/O impact on the node itself
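One way to narrow down the ~2-minute SSE gap (and the 10 to 20 s block-timestamp spikes) is to log event or block timestamps and scan for intervals above a threshold. A minimal sketch of such a gap scanner; the timestamps and threshold below are illustrative, not taken from this test run:

```python
def find_gaps(timestamps, threshold_s):
    """Return (start, end, length) for each consecutive pair of
    timestamps (seconds, ascending) separated by more than threshold_s."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > threshold_s:
            gaps.append((prev, cur, cur - prev))
    return gaps

# Illustrative event times: a healthy 8 s cadence with a ~2 min hole,
# similar in shape to the outage Grafana showed at the switch block.
event_times = [0, 8, 16, 24, 144, 152, 160]
print(find_gaps(event_times, threshold_s=30))  # [(24, 144, 120)]
```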
Grafana: link
Dumps: http://genesis.casperlabs.io/ajith-condor-big/casper-node-dumps/ajith-condor-big/27062024_0401/dump_download_list.html
DB Snapshot:
genesis.casperlabs.io/release-test-db-snapshots/ajith-condor-big/db.json
genesis.casperlabs.io/release-test-db-snapshots/ajith-condor-big/db.tar.zst
Nodes (debug) - 172.142.59.225, 172.142.71.238, 172.142.94.215