[Transaction Status Service] Batch status and memo writes to DB. #3026

Open · wants to merge 8 commits into master
Conversation

fkouteib

Problem

The transaction status service issues an individual write to the DB backend for each transaction memo and for each account/pubkey update within a transaction. This is inefficient, at least from a CPU execution-time perspective, even if the DB backend aggregates and batches IO updates to disk.

Summary of Changes

  • Batch transaction status writes before writing them to the DB backend.
  • Batch transaction memo writes before writing them to the DB backend.
  • Perform the batching at the transaction batch level (typically 64 transactions); a rough sketch follows.
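For illustration, here is a minimal sketch of the batching idea, assuming the `rocksdb` crate is used directly with placeholder key/value encodings. The actual change goes through the blockstore's column abstractions, and `write_transaction_status_batch` is a hypothetical name, not the PR's API:

```rust
// Minimal sketch, assuming the rocksdb crate; the key/value encodings are
// placeholders, not the blockstore's actual column formats.
use rocksdb::{WriteBatch, DB};

/// Hypothetical helper: stage one write per status/memo entry into a single
/// WriteBatch, then commit with one DB write instead of one write per entry.
fn write_transaction_status_batch(
    db: &DB,
    entries: &[(Vec<u8>, Vec<u8>)], // (key, serialized status or memo)
) -> Result<(), rocksdb::Error> {
    let mut batch = WriteBatch::default();
    for (key, value) in entries {
        batch.put(key, value); // staged in memory, no IO yet
    }
    db.write(batch) // single commit for the whole transaction batch
}
```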

fkouteib (Author) commented Sep 29, 2024

This data was collected using a synthetic workload from an internal test, on a 2-node local cluster running on a single physical machine (Linux). The test runs in about a minute. The transaction mix is mostly test transactions with 35 account pubkeys including the fee payer (these land in TSS in 64-transaction batches), plus single vote transactions (these land in batches of 1). The numbers below are measured execution times for the match arm that handles transaction status batches in the transaction status service.

| Stat | Baseline exec time (µs) | Batched exec time (µs) |
| --- | --- | --- |
| min | 22 | 16 |
| max | 226,770 | 49,679 |
| mean | 517.01 | 186.23 |
| median | 233 | 87 |
| std dev | 2,581.53 | 795.16 |
| datapoints | 112,066 | 124,027 |

I also ran the same synthetic workload and internal test on a small multi-node distributed private test cluster and compared high-level disk IO metrics: number of sectors written and number of writes completed. The ratios were close between the two runs, and averaged out they suggest the writes are generally 128 KiB commands (at 512 bytes per sector, a sectors-to-writes ratio of about 256 works out to 128 KiB per write).
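The comment doesn't say how these counters were collected; one common source is `/proc/diskstats`, where the sector unit is fixed at 512 bytes. A minimal sketch of the ratio computation, assuming that source:

```rust
// Hedged illustration: "num writes completed" and "num sectors written" are
// fields 8 and 10 of /proc/diskstats (1-indexed, per device). Sectors in
// diskstats are always 512 bytes, so a sectors/writes ratio of ~256
// corresponds to ~128 KiB per write command.
use std::fs;

fn main() -> std::io::Result<()> {
    for line in fs::read_to_string("/proc/diskstats")?.lines() {
        let f: Vec<&str> = line.split_whitespace().collect();
        let dev = f[2];
        let writes: f64 = f[7].parse().unwrap(); // writes completed
        let sectors: f64 = f[9].parse().unwrap(); // sectors written
        if writes > 0.0 {
            // Average write size in KiB: sectors * 512 bytes / writes / 1024.
            println!("{dev}: {:.1} KiB/write", sectors * 512.0 / writes / 1024.0);
        }
    }
    Ok(())
}
```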

@bw-solana

> …compared high-level disk IO metrics: number of sectors written and number of writes completed. The ratios were close between the two runs…

This is interesting, given that the high-level timings look significantly better after this change. Do we think this is just due to the reduction in CPU time from merging the writes internally?

@bw-solana left a comment

Left a couple of suggestions.

I'm also wondering whether we have an existing bencher for some of these operations. If not, it might be nice to add one and confirm it shows a benefit relative to the unbatched behavior.
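If no bencher exists, a sketch along these lines could compare the two paths. This is a hypothetical Criterion benchmark against a bare `rocksdb::DB`, not an existing bench in the repo:

```rust
// Hypothetical Criterion bench sketch; assumes the criterion, rocksdb, and
// tempfile crates. Compares unbatched puts vs. one WriteBatch per 64 entries.
use criterion::{criterion_group, criterion_main, Criterion};
use rocksdb::{WriteBatch, DB};

fn bench_status_writes(c: &mut Criterion) {
    let dir = tempfile::tempdir().unwrap();
    let db = DB::open_default(dir.path()).unwrap();
    let entries: Vec<(Vec<u8>, Vec<u8>)> = (0u64..64)
        .map(|i| (i.to_be_bytes().to_vec(), vec![0u8; 256]))
        .collect();

    c.bench_function("unbatched_64", |b| {
        b.iter(|| {
            for (k, v) in &entries {
                db.put(k, v).unwrap(); // one DB write per entry
            }
        })
    });

    c.bench_function("batched_64", |b| {
        b.iter(|| {
            let mut batch = WriteBatch::default();
            for (k, v) in &entries {
                batch.put(k, v); // staged in memory
            }
            db.write(batch).unwrap(); // single commit for all 64 entries
        })
    });
}

criterion_group!(benches, bench_status_writes);
criterion_main!(benches);
```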

ledger/src/blockstore.rs — two review threads (outdated, resolved)
```diff
@@ -9618,6 +9653,7 @@ pub mod tests {
         .map(|key| (key, true)),
     TransactionStatusMeta::default(),
     counter,
+    None,
 )
```


We should probably have a unit test that confirms the new batching behavior
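A loose sketch of what such a test could assert, using a bare `rocksdb::DB` for illustration rather than the blockstore's real write path (which this PR extends):

```rust
// Illustrative only: exercises a batched write against a bare RocksDB
// instance and checks every entry round-trips after the single commit. A
// real test would go through the blockstore's batched status-write path.
// Assumes the rocksdb and tempfile crates.
use rocksdb::{WriteBatch, DB};

#[test]
fn batched_writes_round_trip() {
    let dir = tempfile::tempdir().unwrap();
    let db = DB::open_default(dir.path()).unwrap();

    // Stage 64 entries, mirroring a typical transaction batch size.
    let mut batch = WriteBatch::default();
    for i in 0u64..64 {
        batch.put(i.to_be_bytes(), i.to_le_bytes());
    }
    db.write(batch).unwrap(); // single commit

    // Every staged entry must be visible after the one commit.
    for i in 0u64..64 {
        let got = db.get(i.to_be_bytes()).unwrap().unwrap();
        assert_eq!(got, i.to_le_bytes());
    }
}
```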

@bw-solana

@lijunwangs - I'm thinking you're the best person to review these changes while Tyera is out. Let me know if there is a better candidate.

@lijunwangs

> This data was collected using a synthetic workload from an internal test, on a 2-node local cluster running on a single physical machine (Linux). The test runs in about a minute. The transaction mix is mostly test transactions with 35 account pubkeys including the fee payer (these land in TSS in 64-transaction batches), plus single vote transactions (these land in batches of 1). The numbers below are measured execution times for the match arm that handles transaction status batches in the transaction status service.
>
> | Stat | Baseline exec time (µs) | Batched exec time (µs) |
> | --- | --- | --- |
> | min | 22 | 16 |
> | max | 226,770 | 49,679 |
> | mean | 517.01 | 186.23 |
> | median | 233 | 87 |
> | std dev | 2,581.53 | 795.16 |
> | datapoints | 112,066 | 124,027 |
>
> I also ran the same synthetic workload and internal test on a small multi-node distributed private test cluster and compared high-level disk IO metrics: number of sectors written and number of writes completed. The ratios were close between the two runs, and averaged out they suggest the writes are generally 128 KiB commands.

What are the exact metrics being used for this data?

fkouteib (Author) commented Oct 2, 2024

> What are the exact metrics being used for this data?

@lijunwangs It's not an official metric that uploads to the metrics db; I posted the debug code used for timing it here.
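For context, such timing debug code is typically a `std::time::Instant` wrapper around the section of interest. The linked snippet may differ; a minimal sketch:

```rust
// Hypothetical timing sketch, not necessarily the linked snippet: measure
// the elapsed wall-clock time of a code section in microseconds and log it.
use std::time::Instant;

fn main() {
    let start = Instant::now();

    // ... section under measurement, e.g. the match arm that writes a
    // transaction status batch to the blockstore ...
    std::thread::sleep(std::time::Duration::from_micros(250));

    let elapsed_us = start.elapsed().as_micros();
    eprintln!("tss batch write took {elapsed_us} us"); // collect from logs
}
```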
