
Add simple benchmarking #74

Open · carlosmn wants to merge 3 commits into main
Conversation

@carlosmn (Contributor) commented on Oct 5, 2015:

Functions whose names start with 'bench_' will now also be stored and run
by clar, but only when it is invoked with the '-b' flag.

In order to reduce how much clar overhead we measure, we put the timing
code around the function call, which means we end up timing every
function, but we only store the elapsed time in benchmark mode. The
overhead from calling the timer is minimal, so it shouldn't present an
issue to add these calls for every test.

This is not meant for micro-benchmarks, but for operations that we expect
to take at least tens or hundreds of milliseconds and that do not map
easily onto the "repeat N times" method of averaging out noise, which is
why we do not include that functionality. It is for "perform a very
expensive operation" cases, to track large-scale changes through time.

Thus we also provide cl_reset_timer() to allow the function under test
to perform specific setup steps that it does not want to count towards
the timing.
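
To make this concrete, here is a sketch of what a benchmark could look like under the proposal as originally described above. Only the bench_ prefix, the -b flag and cl_reset_timer() come from this pull request; the suite name, the busy-work loops and everything else in the example are invented for illustration.

    #include "clar.h"

    /* Runs as a benchmark only when clar is invoked with -b (per the proposal). */
    void bench_index__large_update(void)
    {
        volatile unsigned long sum = 0;
        unsigned long i;

        /* Setup work that should not count towards the measurement... */
        for (i = 0; i < 1000; i++)
            sum += i;

        /* ...so discard the time spent so far. */
        cl_reset_timer();

        /* The operation whose elapsed time clar would record and report. */
        for (i = 0; i < 100000000UL; i++)
            sum += i;

        cl_assert(sum > 0);
    }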


This is the simplest thing I could come up with that would be useful for testing libgit2 performance over time, without going into micro-benchmarking: think "merge thousand-entry trees" rather than tight loops. The output is machine-readable, so we can keep historical records in some system. /cc @ethomson

The thing with global_is_bench is ugly, but we are now in two different modes, so the number of tests in each case is different, and it seemed less terrible than looking into _clar from the outside or searching for clar flags in main.c.
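
As an aside, the timing approach described in the opening message (put the timer around the function call itself, but only keep the elapsed time in benchmark mode) boils down to something like the following self-contained sketch. None of these names are clar's actual internals; now_seconds, run_one and benchmark_mode are invented for the illustration, and the output line merely mimics the format shown later in this conversation.

    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <time.h>

    /* Monotonic wall-clock time in seconds, as a double. */
    static double now_seconds(void)
    {
        struct timespec tp;

        clock_gettime(CLOCK_MONOTONIC, &tp);
        return (double)tp.tv_sec + (double)tp.tv_nsec / 1.0E9;
    }

    /* Time every call, but only report the elapsed time in benchmark mode. */
    static void run_one(const char *name, void (*fn)(void), int benchmark_mode)
    {
        double start = now_seconds(), elapsed;

        fn();
        elapsed = now_seconds() - start;

        if (benchmark_mode)
            printf("Benchmark %s %f\n", name, elapsed);
    }

    static void sample_test(void)
    {
        /* Stand-in for a test or benchmark function. */
    }

    int main(void)
    {
        run_one("sample::suite::test", sample_test, 1);
        return 0;
    }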

@ethomson (Member) commented:

So my thought here is that this is generally useful - and clar shouldn't know the difference between a "benchmark" and a "test". It should just run things and time them. If some consumer wants to use clar to run both benchmarks and tests, then I think that there are a couple of options:

  1. The consumer could build two clars - one with tests and one with benchmarks.
  2. The consumer could build a single clar and use naming to sort this out itself.

@carlosmn (Contributor, Author) commented:

I think we can avoid building two different clars by passing in a benchmark flag; that shouldn't be too bad. There are a few reasons I chose bench_ over just timing whatever we would be running. One of them is the discoverability of the tests that we think of as benchmarks. As I've mentioned, I don't think most tests would tell us much if we saw a change in their runtime, so for detecting regressions in core areas you would have to know which specific tests are informative. If we let clar know beforehand, it can just tell us.

We can certainly solve this by naming all the benchmarks test_bench_ so we can run ./libgit2_clar -b -sbench, but that moves the bench code away from the test code for a particular suite, whereas with bench_ we can keep, say, the index benchmarks next to the similar index tests instead of in a parallel tree.

But it's not that big a deal at the end of the day. Having benchmarks be normal tests does mean they get tested the same as the rest even when we're not benchmarking.

@ethomson (Member) commented:

Sorry for the long delay - this looks like what I had in mind! This seems to be missing the clar/time.h though.

@carlosmn (Contributor, Author) commented:

I did notice it was missing earlier, but completely forgot to add it the second time around as well :/ It's there now.

clar/time.h Outdated
struct timespec tp;

if (clock_gettime(CLOCK_MONOTONIC, &tp) == 0) {
return (double tp.tv_sec + (double tp.tv_nsec / 1.0E9;
@ethomson (Member) commented:

My compiler is unhappy with this. :(
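
Presumably the casts are just missing their closing parentheses; the corrected expression would look something like the sketch below (the wrapper function name is hypothetical, and this is not necessarily the exact fix that was pushed):

    #define _POSIX_C_SOURCE 200809L
    #include <time.h>

    /* Hypothetical wrapper around the expression being reviewed. */
    static double time_in_seconds(void)
    {
        struct timespec tp;

        if (clock_gettime(CLOCK_MONOTONIC, &tp) == 0)
            return (double)tp.tv_sec + (double)tp.tv_nsec / 1.0E9;

        return 0.0; /* fallback if the monotonic clock is unavailable */
    }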

@ethomson (Member) commented:

I really like this. My only feedback is that I might like to see the benchmark reports printed at the end of the entire execution, instead of at the end of each suite's execution, e.g.:

Benchmark blame::buffer::delete_crosses_hunk_boundary 0.000000
Benchmark blame::buffer::replace_line 0.000000
Benchmark blame::buffer::add_lines_at_end 0.000000
..
Benchmark blame::getters::byindex 0.000000
Benchmark blame::getters::byline 0.000000
....
Benchmark blame::harder::m 0.000000
Benchmark blame::harder::c 0.000000
Benchmark blame::harder::cc 0.000000
Benchmark blame::harder::ccc 0.000000

Seems a little weird for output (weirder still with -v, which is usually the way I run).

@carlosmn (Contributor, Author) commented:

Yeah, that makes sense.

Commits added on this pull request:

1. (Commit message identical to the pull request description above.)
2. Instead of separating the benchmark tests from "normal" tests, store timings for all successful tests and let the user decide which tests they'd like to run via the -s option.
3. Instead of intermixing benchmark results with the suite outputs, keep them all until the end.
@carlosmn (Contributor, Author) commented:

@ethomson I finally got around to finishing this up; the timing results are now shown at the end of a multi-suite run.

@ethomson changed the base branch from master to main on December 23, 2020.