
Investigate Max Throughput for Read Only Transactions #1634

Closed
ericpassmore opened this issue Sep 13, 2023 · 18 comments

@ericpassmore (Contributor)

Summary

Running the performance load test, I was only able to reach a max read-only TPS of 91. The test ran an API node with 32 read-only threads.

The load test tried the following TPS levels (a sketch of the search pattern follows the list):

  • "500": "FAIL",
  • "251": "FAIL",
  • "121": "FAIL",
  • "61": "PASS",
  • "91": "PASS",
  • "111": "FAIL",
  • "101": "FAIL"

Nodeos Version

Running on Ubuntu 22.04 with a nodeos build from tag API-early-v5.0-b73c28d51.

Read Only Transaction

{
    "initAccounts": ["payloadless"],
    "abiFile": "unittests/test-contracts/payloadless/payloadless.abi",
    "apiEndpoint": "/v1/chain/send_read_only_transaction",
    "actions": [
        {
            "actionName": "doit",
            "actionData": {
            },
            "actionAuthAcct": "payloadless",
            "authorization": {}
        }
    ]
}

Command

tests/PerformanceHarnessScenarioRunner.py findMax testApiOpMode 
--max-tps-to-test 500 
--test-iteration-min-step 10 
--test-iteration-duration-sec 10 
--final-iterations-duration-sec 10 
overrideBasicTestConfig -v 
--tps-limit-per-generator 250 
--api-nodes-read-only-threads 32 
--account-name "payloadless" 
--abi-file payloadless.abi --wasm-file payloadless.wasm 
--contract-dir unittests/test-contracts/payloadless 
--user-trx-data-file tests/PerformanceHarness/readOnlyTrxData.json 
--chain-state-db-size-mb 200

Load

Load was almost always less than one.
[Screenshot: system load, 2023-09-13 11:26 AM]

@ericpassmore (Contributor, Author)

Running get_info with Locust against a mainnet snapshot on the same hardware, I was able to get 2,500 TPS with read-only-threads = 4. Much bigger numbers.

@linh2931 (Member)

What was read-only-read-window-time-us configured to? The default of 60,000 us might be too small.

@oschwaldp-oci (Contributor)

What was read-only-read-window-time-us configured to? The default of 60,000 us might be too small.

@linh2931 -- Is there a value you'd recommend for read-only-read-window-time-us vs the default of 60,000?

@heifner (Member) commented Sep 13, 2023

32 threads * 60ms * 2 blocks_per_second = 3840ms.
3840ms / 91 TPS = 42ms per read-only trx.
Something seems wrong. @oschwaldp-oci do the logs contain average time for the trxs?

@oschwaldp-oci (Contributor)

Very strange; this almost seems like something is limiting it to 91, as I just ran this in CICD here and got the same results. Different machine, same results?

Ran: ./tests/PerformanceHarnessScenarioRunner.py findMax testApiOpMode --max-tps-to-test 500 --test-iteration-min-step 10 --test-iteration-duration-sec 10 --final-iterations-duration-sec 10 overrideBasicTestConfig -v --tps-limit-per-generator 250 --api-nodes-read-only-threads 4 --account-name payloadless --abi-file payloadless.abi --wasm-file payloadless.wasm --contract-dir unittests/test-contracts/payloadless --user-trx-data-file tests/PerformanceHarness/readOnlyTrxData.json

Full Performance Test Report: {
  "perfTestsBegin": "2023-09-13T21:37:59.591754",
  "perfTestsFinish": "2023-09-13T21:47:28.796087",
  "perfTestsDuration": "0:09:29.204333",
  "operationalMode": "API Node Operational Mode",
  "InitialMaxTpsAchieved": 91,
  "LongRunningMaxTpsAchieved": 91,
  "tpsTestStart": "2023-09-13T21:37:59.665678",
  "tpsTestFinish": "2023-09-13T21:47:28.796079",
  "tpsTestDuration": "0:09:29.130401",
  "InitialSearchScenariosSummary": {
    "500": "FAIL",
    "251": "FAIL",
    "121": "FAIL",
    "61": "PASS",
    "91": "PASS",
    "111": "FAIL",
    "101": "FAIL"
  },

@heifner (Member) commented Sep 13, 2023

If only using 1 thread then that is 60ms * 2 blocks = 120ms. 120ms / 91 = 1.3ms per read-only trx. Seems about right.
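
The arithmetic from the two comments above can be made explicit with a tiny sketch (illustrative only; it just restates the formula threads * read-window * 2 windows per second). The observed 91 TPS matches the single-thread budget, not the 32-thread one:

# Theoretical read-only execution budget per the formula above (illustrative sketch).

def read_only_budget_ms_per_sec(threads, read_window_ms=60, windows_per_sec=2):
    return threads * read_window_ms * windows_per_sec

def implied_ms_per_trx(threads, observed_tps):
    return read_only_budget_ms_per_sec(threads) / observed_tps

print(implied_ms_per_trx(32, 91))  # ~42 ms/trx if all 32 threads were busy -- implausibly slow
print(implied_ms_per_trx(1, 91))   # ~1.3 ms/trx if only one thread does the work -- "seems about right"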

@linh2931 (Member)

Is there a value you'd recommend for read-only-read-window-time-us vs the default of 60,000?

We recommended 165,000 for API operators. But as Kevin and Peter noted, something else must have gone wrong. I will take a look at this too.
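
For reference, the read-only options discussed in this thread are set in the nodeos config. A minimal config.ini fragment using the values mentioned here might look like this (illustrative only; thread counts are examples and all other options are omitted):

# config.ini fragment (illustrative; only the options discussed in this thread)
read-only-threads = 32                    # read-only execution threads used in the original test
read-only-read-window-time-us = 165000    # value recommended above for API operators
read-only-write-window-time-us = 10000    # example write-window value from the debug run below
http-threads = 32                         # hypothetical value; size to the expected request load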

@heifner (Member) commented Sep 13, 2023

Peter showed me an example trace from one of these and it was 27ms elapsed time. Sure seems like something is causing these to only run on one thread or preventing them from running in parallel.

@heifner (Member) commented Sep 13, 2023

Running get_info with Locust against a mainnet snapshot on the same hardware, I was able to get 2,500 TPS with read-only-threads = 4. Much bigger numbers.

get_info is completely different than read-only trxs. They take very different execution paths.

@linh2931 (Member)

I can only get to 91 TPS on the main branch too. I am going to run selected PR branches to pinpoint which one causes the problem.

@heifner (Member) commented Sep 14, 2023

Something odd is happening. If you run read_only_trx_test.py and look at the log you see this:

debug 2023-09-14T00:50:59.022 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 4180us, total all threads 0us
debug 2023-09-14T00:50:59.238 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 3505us, total all threads 443us
debug 2023-09-14T00:50:59.395 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 4460us, total all threads 218us
debug 2023-09-14T00:50:59.430 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 4502us, total all threads 182us
debug 2023-09-14T00:50:59.495 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 4272us, total all threads 257us
debug 2023-09-14T00:50:59.521 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 5870us, total all threads 358us
debug 2023-09-14T00:50:59.537 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 5586us, total all threads 315us
debug 2023-09-14T00:50:59.622 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 3824us, total all threads 0us
debug 2023-09-14T00:50:59.658 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 5199us, total all threads 320us
debug 2023-09-14T00:50:59.763 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 4050us, total all threads 280us
debug 2023-09-14T00:50:59.828 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 4199us, total all threads 309us
debug 2023-09-14T00:50:59.966 nodeos    producer_plugin.cpp:2808      switch_to_write_wind ] Read-only threads 128, read window 5670us, total all threads 0us

This is with: read-only-threads = 128, http-threads = 128, read-only-write-window-time-us = 10000, read-only-read-window-time-us = 490000

Also, for one sample, only 141 out of 9,400 read-only trxs ran off the main thread.

Clearly it is switching way too often. Likely it thinks there are no read-only trxs to execute.
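
A rough model of the suspected behavior (an assumption about what the log implies, not the actual producer_plugin code): the read window gives up as soon as the read-only queue looks empty, so only a few milliseconds of each configured 490 ms window get used and most trxs end up on the main thread.

import queue, time

# Illustrative model only: exit the read window as soon as the queue appears empty.
def run_read_window(ro_queue, window_us=490_000):
    start = time.monotonic()
    executed = 0
    while (time.monotonic() - start) * 1e6 < window_us:
        try:
            trx = ro_queue.get_nowait()
        except queue.Empty:
            break            # suspected issue: the rest of the window is given back immediately
        trx()                # execute one read-only transaction
        executed += 1
    used_us = (time.monotonic() - start) * 1e6
    return executed, used_us # the log above shows only ~4,000-6,000 us of each window being used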

@heifner (Member) commented Sep 14, 2023

On Peter's machine the average elapsed time for the read-only trx test case was 19.5us. That is a rather small amount of time. It could be that our queue contention and scheduling overhead just doesn't leave time to keep anything queued. Also, the main thread is set up to have a higher priority, so maybe it is simply picking up all the work. To test this possibility we need a much heavier (slower) read-only trx test. Can we modify the test read-only trx to loop for, say, something like 10ms to see if that provides the expected results?

@linh2931 (Member)

Thanks Kevin. I was worried my recent changes had broken read-only trx processing. I tested the PR branch of the first performance harness change that enabled read-only trxs (#1229) and my first major change for WASM optimization (#1414). Both could only get to 91 TPS.

@linh2931 (Member)

I hardcoded the number of read-only threads to 0 and could still only get 91 TPS. Could something be wrong with the performance harness reporting?

@ericpassmore (Contributor, Author)

Running get_info with Locust against a mainnet snapshot on the same hardware, I was able to get 2,500 TPS with read-only-threads = 4. Much bigger numbers.

get_info is completely different than read-only trxs. They take very different execution paths.

Ha! get_info and doit are apples and oranges. Not only that, my test was running different test data and a different config. I added the comment for my own notes. I was worried my localhost setup was bad, and this was an attempt to prove that my localhost did not have a bad ulimit or other bad setup.

@oschwaldp-oci (Contributor)

With the changes proposed in #1637 I was able to get the following results:

  "perfTestsBegin": "2023-09-14T15:24:31.207915",
  "perfTestsFinish": "2023-09-14T16:33:16.646302",
  "perfTestsDuration": "1:08:45.438387",
  "operationalMode": "API Node Operational Mode",
  "InitialMaxTpsAchieved": 11001,
  "LongRunningMaxTpsAchieved": 8501,
  "tpsTestStart": "2023-09-14T15:24:31.220469",
  "tpsTestFinish": "2023-09-14T16:33:16.646293",
  "tpsTestDuration": "1:08:45.425824",
  "InitialSearchScenariosSummary": {
    "25000": "FAIL",
    "12501": "FAIL",
    "6001": "PASS",
    "9501": "PASS",
    "11001": "PASS",
    "12001": "FAIL",
    "11501": "FAIL"
  },
  "LongRunningSearchScenariosSummary": {
    "11001": "FAIL",
    "10501": "FAIL",
    "10001": "FAIL",
    "9501": "FAIL",
    "9001": "FAIL",
    "8501": "PASS"
  },

running: ./tests/PerformanceHarnessScenarioRunner.py findMax testApiOpMode --max-tps-to-test 25000 overrideBasicTestConfig -v --api-nodes-read-only-threads 4 --account-name payloadless --abi-file payloadless.abi --wasm-file payloadless.wasm --contract-dir unittests/test-contracts/payloadless --user-trx-data-file tests/PerformanceHarness/readOnlyTrxData.json

@oschwaldp-oci (Contributor)

As a further data point, I ran a performance run with transfer transactions instead of read-only transactions and saw similar performance on my machine, which suggests something is keeping read-only transactions from truly taking advantage of the extra threads available.

  "perfTestsBegin": "2023-09-14T16:51:00.929474",
  "perfTestsFinish": "2023-09-14T17:36:02.976930",
  "perfTestsDuration": "0:45:02.047456",
  "operationalMode": "Block Producer Operational Mode",
  "InitialMaxTpsAchieved": 9001,
  "LongRunningMaxTpsAchieved": 8501,
  "tpsTestStart": "2023-09-14T16:51:00.941475",
  "tpsTestFinish": "2023-09-14T17:36:02.976921",
  "tpsTestDuration": "0:45:02.035446",
  "InitialSearchScenariosSummary": {
    "25000": "FAIL",
    "12501": "FAIL",
    "6001": "PASS",
    "9501": "FAIL",
    "8001": "PASS",
    "9001": "PASS"
  },
  "LongRunningSearchScenariosSummary": {
    "9001": "FAIL",
    "8501": "PASS"
  },

running: ./tests/PerformanceHarnessScenarioRunner.py findMax testBpOpMode --max-tps-to-test 25000

@ericpassmore (Contributor, Author)

We have a good perf environment set up with a GitHub runner. Closing. Issue #1662 tracks any improvements in the code base.
