Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

batcheval: 24.3 microbench regression in BenchmarkRefreshRange #134420

Open
Tracked by #133682
tbg opened this issue Nov 6, 2024 · 1 comment
Open
Tracked by #133682

batcheval: 24.3 microbench regression in BenchmarkRefreshRange #134420

tbg opened this issue Nov 6, 2024 · 1 comment
Assignees
Labels
branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-investigation Further steps needed to qualify. C-label will change. T-kv KV Team

Comments

@tbg
Copy link
Member

tbg commented Nov 6, 2024

profs.zip

./dev bench --ignore-cache --stream-output --bench-mem ./pkg/kv/kvserver/batcheval/ --filter 'RefreshRange/mixed-case/refresh_window=.0.00,0.00.' --test-args '-test.memprofile=mem.pb.gz -test.cpuprofile=cpu.pb.gz -test.benchtime=50000x -test.cpu=8'

Clearly the new one indeed is much slower (I think the spreadsheet says ~75%, and I verified that), but it's unclear why. The additional time seems to be spent in iterator.SeekGE.

old

image

new

image

There's something weird about the refresh_window=[N,N] case. Note how [0,0] is way slower than [0,anything-else]. Similar, how [50,50] is by far the slowest of the [50,*] family, etc.

$ ./dev bench --ignore-cache --stream-output --bench-mem ./pkg/kv/kvserver/batcheval/ --filter 'RefreshRange/mixed-case/' --test-args '-test.memprofile=mem.pb.gz -test.cpuprofile=cpu.pb.gz -test.benchtime=3s -test.cpu=8' 2>&1 | grep --line-buffered -E '^Benchmark' | tee bench.txt
BenchmarkRefreshRange/mixed-case/refresh_window=[0.00,0.00]-8              16135            222172 ns/op            1325 B/op         20 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[0.00,50.00]-8            209877             16824 ns/op            1398 B/op         23 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[0.00,75.00]-8            208724             16817 ns/op            1399 B/op         23 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[0.00,95.00]-8            170229             20697 ns/op            1398 B/op         23 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[0.00,99.00]-8            180392             19752 ns/op            1398 B/op         23 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[50.00,50.00]-8               25         132782851 ns/op            7538 B/op         38 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[50.00,75.00]-8           200251             17568 ns/op            1404 B/op         24 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[50.00,95.00]-8           168374             21018 ns/op            1404 B/op         24 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[50.00,99.00]-8           177073             20014 ns/op            1404 B/op         24 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[75.00,75.00]-8               26         128251366 ns/op            4628 B/op         38 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[75.00,95.00]-8           168642             20962 ns/op            1404 B/op         24 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[75.00,99.00]-8           177039             20013 ns/op            1404 B/op         24 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[95.00,95.00]-8           113584             31360 ns/op            1321 B/op         20 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[95.00,99.00]-8           211741             16561 ns/op            1404 B/op         24 allocs/op
BenchmarkRefreshRange/mixed-case/refresh_window=[99.00,99.00]-8             1128           3128315 ns/op            1359 B/op         20 allocs/op

The mixed-case benchmark family roughly writes 1 million (separate) keys in random order, key i at timestamp 5ns*(1+i). The engine is set up with an LBaseMaxBytes of 256 to create a more interesting LSM. Then, it benchmarks RefreshRange with a timestamp range that covers the given percentage of cases.

For example, [0,0] translates to refreshing from ts=5ns to ts=5ns+1 logical. It makes zero sense to me that that would take 222us per op, whereas refreshing over essentially all of the data would take around 20us.

But here's the kicker, the above table is from b20c617, i.e. the "old" commit in the microbench regression. There's little point in musing about why it's gotten almost twice as bad (and it has - ~388 micros for the [0,0] case) if we don't understand either result in the first place.

Jira issue: CRDB-44073

This comment was marked as resolved.

@blathers-crl blathers-crl bot added the X-blathers-untriaged blathers was unable to find an owner label Nov 6, 2024
@tbg tbg changed the title RefreshRange @tbg batcheval: 24.3 microbench regression in BenchmarkRefreshRange Nov 6, 2024
@tbg tbg added branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-investigation Further steps needed to qualify. C-label will change. T-kv KV Team labels Nov 6, 2024
@tbg tbg self-assigned this Nov 6, 2024
@yuzefovich yuzefovich removed the X-blathers-untriaged blathers was unable to find an owner label Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-investigation Further steps needed to qualify. C-label will change. T-kv KV Team
Projects
None yet
Development

No branches or pull requests

2 participants