
Add simple benchmarking #74

Open · carlosmn wants to merge 3 commits into main
Conversation

@carlosmn (Contributor) commented on Oct 5, 2015:

Functions whose names start with 'bench_' will now also be stored and run
by clar, but only when it is invoked with the '-b' flag.

In order to reduce how much clar overhead we measure, we put the timing
code around the function call, which means we end up timing every
function, but we only store the elapsed time in benchmark mode. The
overhead from calling the timer is minimal, so it shouldn't present an
issue to add these calls for every test.

This is not meant for micro-benchmarks, but for operations that we expect
to take at least tens or hundreds of milliseconds and that do not map
easily onto the "repeat N times" method of averaging out noise, which is
why we do not include that functionality. It is for "perform a very
expensive operation" cases, to track large-scale changes through time.

Thus we also provide cl_reset_timer() to allow the function under test
to perform specific setup steps that it does not want to count towards
the timing.
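
To make this concrete, here is a sketch of what a benchmark could look like under the proposal as originally described above. Only the bench_ prefix, the -b flag and cl_reset_timer() come from this pull request; the suite name, the busy-work loops and everything else in the example are invented for illustration.

    #include "clar.h"

    /* Runs as a benchmark only when clar is invoked with -b (per the proposal). */
    void bench_index__large_update(void)
    {
        volatile unsigned long sum = 0;
        unsigned long i;

        /* Setup work that should not count towards the measurement... */
        for (i = 0; i < 1000; i++)
            sum += i;

        /* ...so discard the time spent so far. */
        cl_reset_timer();

        /* The operation whose elapsed time clar would record and report. */
        for (i = 0; i < 100000000UL; i++)
            sum += i;

        cl_assert(sum > 0);
    }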


This is the simplest thing I could come up with that would be useful for testing libgit2 performance over time, without going into micro-benchmarking: think "merge thousand-entry trees" rather than tight loops. The output is machine-readable, so we can keep historical records in some system. /cc @ethomson

The thing with global_is_bench is ugly, but we are now in two different modes, so the number of tests in each case is different, and it seemed less terrible than looking into _clar from the outside or searching for clar flags in main.c.
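
As an aside, the timing approach described in the opening message (put the timer around the function call itself, but only keep the elapsed time in benchmark mode) boils down to something like the following self-contained sketch. None of these names are clar's actual internals; now_seconds, run_one and benchmark_mode are invented for the illustration, and the output line merely mimics the format shown later in this conversation.

    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <time.h>

    /* Monotonic wall-clock time in seconds, as a double. */
    static double now_seconds(void)
    {
        struct timespec tp;

        clock_gettime(CLOCK_MONOTONIC, &tp);
        return (double)tp.tv_sec + (double)tp.tv_nsec / 1.0E9;
    }

    /* Time every call, but only report the elapsed time in benchmark mode. */
    static void run_one(const char *name, void (*fn)(void), int benchmark_mode)
    {
        double start = now_seconds(), elapsed;

        fn();
        elapsed = now_seconds() - start;

        if (benchmark_mode)
            printf("Benchmark %s %f\n", name, elapsed);
    }

    static void sample_test(void)
    {
        /* Stand-in for a test or benchmark function. */
    }

    int main(void)
    {
        run_one("sample::suite::test", sample_test, 1);
        return 0;
    }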

@ethomson (Member) commented:

So my thought here is that this is generally useful - and clar shouldn't know the difference between a "benchmark" and a "test". It should just run things and time them. If some consumer wants to use clar to run both benchmarks and tests, then I think that there are a couple of options:

  1. The consumer could build two clars - one with tests and one with benchmarks.
  2. The consumer could build a single clar and use naming to sort this out itself.

@carlosmn (Contributor, Author) commented:

I think we can avoid building two different clars by passing in a benchmark flag; that shouldn't be too bad. There are a few reasons I chose bench_ over just timing whatever we would be running. One of them is the discoverability of the tests that we think of as benchmarks. As I've mentioned, I don't think most tests would tell us much if we saw a change in their runtime, so for detecting regressions in core areas you would have to know which specific tests are informative. If we let clar know beforehand, it can just tell us.

We can certainly solve this by naming all the benchmarks test_bench_ so we can run ./libgit2_clar -b -sbench, but that moves the bench code away from the test code for a particular suite, whereas with bench_ we can keep, say, the index benchmarks next to the similar index tests instead of in a parallel tree.

But it's not that big a deal at the end of the day. Having benchmarks be normal tests does mean they get tested the same as the rest even when we're not benchmarking.

@ethomson (Member) commented:

Sorry for the long delay - this looks like what I had in mind! This seems to be missing the clar/time.h though.

@carlosmn (Contributor, Author) commented:

I did notice it was missing earlier, but completely forgot to add it the second time around as well :/ It's there now.

clar/time.h Outdated
struct timespec tp;

if (clock_gettime(CLOCK_MONOTONIC, &tp) == 0) {
return (double tp.tv_sec + (double tp.tv_nsec / 1.0E9;
@ethomson (Member) commented:

My compiler is unhappy with this. :(
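
Presumably the casts are just missing their closing parentheses; the corrected expression would look something like the sketch below (the wrapper function name is hypothetical, and this is not necessarily the exact fix that was pushed):

    #define _POSIX_C_SOURCE 200809L
    #include <time.h>

    /* Hypothetical wrapper around the expression being reviewed. */
    static double time_in_seconds(void)
    {
        struct timespec tp;

        if (clock_gettime(CLOCK_MONOTONIC, &tp) == 0)
            return (double)tp.tv_sec + (double)tp.tv_nsec / 1.0E9;

        return 0.0; /* fallback if the monotonic clock is unavailable */
    }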

@ethomson (Member) commented:

I really like this. My only feedback is that I might like to see the benchmark reports printed at the end of the entire execution, instead of at the end of each suite's execution, e.g.:

Benchmark blame::buffer::delete_crosses_hunk_boundary 0.000000
Benchmark blame::buffer::replace_line 0.000000
Benchmark blame::buffer::add_lines_at_end 0.000000
..
Benchmark blame::getters::byindex 0.000000
Benchmark blame::getters::byline 0.000000
....
Benchmark blame::harder::m 0.000000
Benchmark blame::harder::c 0.000000
Benchmark blame::harder::cc 0.000000
Benchmark blame::harder::ccc 0.000000

Seems a little weird for output (weirder still with -v, which is usually the way I run).

@carlosmn (Contributor, Author) commented:

Yeah, that makes sense.

Commits added on this pull request:

1. (Commit message identical to the pull request description above.)
2. Instead of separating the benchmark tests from "normal" tests, store timings for all successful tests and let the user decide which tests they'd like to run via the -s option.
3. Instead of intermixing benchmark results with the suite outputs, keep them all until the end.
@carlosmn (Contributor, Author) commented:

@ethomson I finally got around to finishing this up; the timing results are now shown at the end of a multi-suite run.

@ethomson changed the base branch from master to main on December 23, 2020.