
Introduce finite test runs, repeatability, reporting, and fine-tuned controls #4

Conversation

adesjardins-stripe
Contributor

Preamble

This PR introduces some substantial changes to McTester's core logic. The intention is to make test configurations more intuitive, repeatable, and informative. Some of the logic may have been changed without my fully appreciating its original intent, so please let me know if I have undermined an important feature and I'll work to make these changes more harmonious.

Overview

The most significant change surrounds request cadence. Instead of using a sleep-burst cycle, I made use of a rate limiter to govern request timings precisely. This rate is unbounded by default but can be configured using the rps option. When specified, each thread will send requests at the given rate. For instance, setting -conncount=8 and -rps=1000 will result in a total rate of 8000 requests per second.

I will briefly summarize each of the other changes.

Finite Test Runs

The -duration option tells the application to run for a precise period of time and then terminate on its own.

Value Comparisons

Values are still "random" but are now deterministic, seeded by their key before being generated. This allows the value returned from a Get to be validated against what should have been Set. If any invalid data is returned, implying a key collision, it is logged.
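As a sketch of the idea (not the PR's exact code, which seeds the RNG from the key bytes via big.Int; an FNV hash is used here just to keep the sketch short, and `valueForKey` is an illustrative name):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

// valueForKey deterministically derives a value from its key: the key
// hashes to an RNG seed, so the expected value can be regenerated at
// Get time and compared against what the server returns.
func valueForKey(key string, size int) []byte {
	h := fnv.New64a()
	h.Write([]byte(key))
	r := rand.New(rand.NewSource(int64(h.Sum64())))
	buf := make([]byte, size)
	for i := range buf {
		buf[i] = byte('a' + r.Intn(26))
	}
	return buf
}

func main() {
	set := valueForKey("mctester:abc123", 16) // generated at Set time
	got := valueForKey("mctester:abc123", 16) // regenerated at Get time
	fmt.Printf("set=%s got=%s match=%v\n", set, got, string(set) == string(got))
}
```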

Run Reports

For finite runs, a JSON report is emitted to STDOUT detailing the run configuration used, timing metrics, and some basic stats such as "Get Hits".

Repeatability

The RNG seed is now configurable using the -rngseed option. By default, it is set using the same method as before. When specified, a test run will send the exact same sequence of requests, with the same keys and values in each thread, as any previous run with that seed (synchronization between threads cannot be controlled, of course). The seed value is included in the config portion of the run report.
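A minimal sketch of what a fixed seed buys: the same seed reproduces the same per-thread key sequence across runs (`keysForRun` is an illustrative helper, not code from the PR).

```go
package main

import (
	"fmt"
	"math/rand"
)

// keysForRun generates the key indices a worker would draw for a run.
// With the same seed the sequence is identical on every run, which is
// what the -rngseed flag provides.
func keysForRun(seed int64, n, keySpace int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(keySpace)
	}
	return out
}

func main() {
	a := keysForRun(42, 5, 1000)
	b := keysForRun(42, 5, 1000)
	fmt.Println(a, b) // identical sequences
}
```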

Cache Warming

When the -warm option is provided, the given percent of keys will be set prior to beginning the core test loop.

Get-Set-Delete Ratio

Before, every failed 'Get' would result in a subsequent 'Set', and a 'Delete' percentage could be configured separately. Now, statistically, all requests will adhere to the specified Get-Set-Delete proportions.

Misc

I incorporated the changes made by @feihu-stripe (here), which include key prefix stripping, unix socket support, custom IP targets, and pipelined Gets (for now, just repeating the same query n times). I also reformatted some things, like reordering argument lists lexicographically as they were starting to get a bit long and removing some comments that no longer seemed applicable.

Limitations

These changes were, for the most part, only applied to the "basic" test protocol and were not ported to the "server" variant. Ideally, I think we would abstract the core logic so that both protocols would remain in sync without needing to copy anything.

I added two third party dependencies, but both are OSS with MIT licenses.

Let me know what you think. We can iterate on this work to settle on something that appeals to both of us. There is always room for further development, of course!

@dormando
Member

dormando commented Jun 3, 2022

Thanks! I'll have to take some time with this, I think most of the changes are easy acceptances.

For future reference, it's much easier to review sets of commits that build into something rather than blobs like this; probably 80% of the changes I could accept without argument, then we'd iterate on the rest. Instead I have to sort of mentally map what's what while looking through the bigger change.

I haven't looked at it closely yet, but offhand; an earlier version of this tested the value based on the key (or maybe the server binary still does that? I'm forgetting) like you have here, but the performance cratered. While this doesn't need to reach mc-crusher levels of performance I do need it to run in the low millions of rps, or high hundreds of thousands. Were you measuring peak performance of the tester while working on this?

edit: by "tested the value based on the key" - I mean used the key as the random seed then re-generated the value to test against.

@@ -17,6 +17,7 @@ import (
"io"
"net"
"strconv"
"strings"
Contributor Author

I believe all changes in protocol.go were pulled in from the PR made by @feihu-stripe.

@@ -135,7 +135,7 @@ func main() {
log.Fatal(err)
}

resp, err := http.Post(*delAddr + "/delete", "Content-Type: application/json", bytes.NewReader(data))
resp, err := http.Post(*delAddr+"/delete", "Content-Type: application/json", bytes.NewReader(data))
Contributor Author

These are all formatting changes enforced by the IDE. They seem reasonable but I can drop them.

Member

I just wish it were a separate commit; a "gofmt" cleanup that I could just merge straight away and make the rest easier to page through. I know you folks are busy but normally I would've asked for at minimum formatting changes to be split out on their own.

"math/rand"
"time"

"github.com/dgryski/go-pcgr"
mct "github.com/memcached/mctester"
Contributor Author

These changes are related to Fei's PR as well. I did not make further changes to the server application.

Comment on lines 29 to 33
clientFlags := flag.Uint("clientflags", 0, "(32bit unsigned) client flag bits to set on miss")
connCount := flag.Int("conncount", 1, "number of client connections to establish")
reqPerSleep := flag.Int("reqpersleep", 1, "number of requests to issue when client wakes up")
reqBundlePerConn := flag.Int("reqbundles", 1, "number of times to wake up and send requests before disconnecting (-1 for unlimited)")
sleepPerBundle := flag.Duration("sleepperbundle", time.Millisecond*1, "time to sleep between request bundles (accepts Ns, Nms, etc)")
deletePercent := flag.Int("deletepercent", 0, "percentage of queries to issue as deletes instead of gets (0-1000)")
duration := flag.Duration("duration", 0, "length of time that the test will run (0 for unlimited)")
keyLength := flag.Int("keylength", 10, "number of random characters to append to key")
keyPrefix := flag.String("keyprefix", "mctester:", "prefix to append to all generated keys")
keySpace := flag.Int("keyspace", 1000, "number of unique keys to generate")
keyLength := flag.Int("keylength", 10, "number of random characters to append to key")
pipelines := flag.Uint("pipelines", 1, "(32bit unsigned) number of GET requests to stack within the same syscall")
delRatio := flag.Int("ratiodel", 0, "proportion of requests that should be sent as `deletes`")
getRatio := flag.Int("ratioget", 90, "proportion of requests that should be sent as `gets`")
setRatio := flag.Int("ratioset", 10, "proportion of requests that should be sent as `sets`")
rngSeed := flag.Int64("rngseed", time.Now().UnixNano(), "seed value used when initializing RNG")
rps := flag.Int("rps", 0, "target number of requests per second (0 for unlimited)")
server := flag.String("server", "127.0.0.1:11211", "`ip:port` for Memcached instance under test")
socket := flag.String("socket", "", "domain socket used for connections")
stripKeyPrefix := flag.Bool("stripkeyprefix", false, "remove key prefix before comparing with response")
keyTTL := flag.Uint("ttl", 180, "TTL to set with new items")
valueSize := flag.Uint("valuesize", 1000, "size of value (in bytes) to store on miss")
warmPercent := flag.Int("warm", 90, "percent of keys to `set` in Memcached before testing begins")
useZipf := flag.Bool("zipf", false, "use Zipf instead of uniform randomness (slow)")
zipfS := flag.Float64("zipfS", 1.01, "zipf S value (general pull toward zero) must be > 1.0")
zipfV := flag.Float64("zipfV", float64(*keySpace/2), "zipf V value (pull below this number")
valueSize := flag.Uint("valuesize", 1000, "size of value (in bytes) to store on miss")
clientFlags := flag.Uint("clientflags", 0, "(32bit unsigned) client flag bits to set on miss")
Contributor Author

Added flags:

  • duration
  • pipelines
  • del-get-set ratios
  • rngSeed
  • rps
  • server
  • socket
  • stripKeyPrefix
  • warmPercent

Removed flags:

  • reqPerSleep
  • reqBundlePerConn
  • sleepPerBundle
  • deletePercent

Also, sorted all flags lexicographically based on the flag name (not the variable assigned). This should match what a user sees when running ./basic -h.

Comment on lines 90 to 112
// Basic persistent load test, using text protocol:
// - list of servers to connect to, pct of each.
// - zipf or uniform random
// - requests per connect (-1 for unlim)
// - gets per etc
// - multiget or not
// - set or add to replace
// - delete frequency
// - set size range
// - variances: peak/antipeak load
// - variances: how often to change item sizes
type BasicLoader struct {
servers []string
stopAfter time.Time
desiredConnCount int
requestsPerSleep int
requestBundlesPerConn int
sleepPerBundle time.Duration
setValueSizes []int
deletePercent int
keyLength int
keyPrefix string
keySpace int
keyTTL uint
useZipf bool
zipfS float64 // (> 1, generally 1.01-2) pulls the power curve toward 0)
zipfV float64 // v (< keySpace) puts the main part of the curve before this number
valueSize uint
clientFlags uint
type Config struct {
ClientFlags uint
ConnCount int
DelRatio int
Duration time.Duration
GetRatio int
KeyLength int
KeyPrefix string
KeySpace int
KeyTTL uint
Pipelines uint
RngSeed int64
RPS int
Servers []string
SetRatio int
Socket string
StripKeyPrefix bool
UseZipf bool
ValueSize uint
WarmPercent int
ZipfS float64 // (> 1, generally 1.01-2) pulls the power curve toward 0)
ZipfV float64 // v (< keySpace) puts the main part of the curve before this number
tachymeter *tachymeter.Tachymeter
Contributor Author

Removed the comment as it no longer reflected the code accurately and the features are documented in the flags.

Renamed BasicLoader to Config.

Fields are sorted and exported so that they appear in the JSON report.

iterStart := time.Now()
if iterStart.Sub(start) > conf.Duration {
break
}
Contributor Author

A single for loop now which terminates after the specified duration has elapsed.

if conf.UseZipf {
subRS.Seed(int64(zipRS.Uint64()))
} else {
subRS.Seed(conf.RngSeed + int64(randR.Intn(conf.KeySpace)))
Contributor Author

Added the global seed here so that every run no longer has the exact same keys (unless the same global seed is used).


key := mct.RandString(&subRS, conf.KeyLength, conf.KeyPrefix)
valSeed := new(big.Int).SetBytes([]byte(key)).Int64()
subRS.Seed(valSeed)
Contributor Author

A bit ugly, but this just seeds each value by its key.

valSeed := new(big.Int).SetBytes([]byte(key)).Int64()
subRS.Seed(valSeed)

switch rng := randR.Intn(conf.DelRatio + conf.SetRatio + conf.GetRatio); {
Contributor Author

Core worker logic simply branches probabilistically according to the delete-set-get ratios (respectively).

}
case rng < (conf.DelRatio + conf.SetRatio):
value := mct.RandBytes(&subRS, int(conf.ValueSize))
rl.Take()
Contributor Author

rl.Take() is where the rate limiter pauses execution for the prescribed amount of time.

@adesjardins-stripe
Contributor Author

For future reference, it's much easier to review sets of commits that build into something rather than blobs like this; probably 80% of the changes I could accept without argument, then we'd iterate on the rest.

Very understandable, and I apologize. This work was completed as a small project with this feature set in mind. I will try to add several comments to help clarify the different aspects of the PR. If it remains intractable, I might be able to break it up.

In terms of performance, I benchmarked the original code against my changes—with and without value comparisons. Value size has a clear effect on performance when comparisons are included. At a size of 5, I see a roughly 3% drop in performance which is due to my overall changes rather than the comparison itself. However, at a size of 1000, the performance impact approaches 20%, which is definitely substantial.

To mitigate this, I will add a flag to enable/disable the comparisons, so they won't have an impact if they're not desired. Something else to consider is using a partial comparison—only check the first n digits instead of the whole string. This should still catch any key collisions without the dramatic performance impact.
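A sketch of that partial-comparison idea (illustrative names, not code from the PR): only the first n bytes are compared, which is still enough to catch a key collision since a colliding key would regenerate an entirely different value.

```go
package main

import (
	"bytes"
	"fmt"
)

// prefixMatches checks only the first n bytes of the returned value
// against the expected value, avoiding a full comparison on large
// values. A length mismatch always fails.
func prefixMatches(expected, got []byte, n int) bool {
	if len(got) != len(expected) {
		return false
	}
	if n > len(expected) {
		n = len(expected)
	}
	return bytes.Equal(expected[:n], got[:n])
}

func main() {
	want := []byte("abcdefgh12345678")
	fmt.Println(prefixMatches(want, want, 8))                           // true
	fmt.Println(prefixMatches(want, []byte("zzzzzzzz12345678"), 8))     // false
}
```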

I didn't reach millions of RPS (100,000 on one machine and 300,000 on another), but I expect this is just due to hardware differences.

@adesjardins-stripe
Contributor Author

The flag has been implemented and performance is as expected when validation is disabled.

While testing it, I noticed that the value comparisons themselves aren't causing the performance impact; it's generating the expected value on every Get that slows things down.

For an overall performance uplift, it may be worth pre-generating all keys and values and storing them rather than generating them on each loop.
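A sketch of what pre-generation might look like (illustrative names, not the code in the linked branch): all key/value pairs are built once up front, turning the hot loop's RNG work into a slice index.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pair holds one pre-built key and its deterministic value.
type pair struct {
	key   string
	value []byte
}

// pregenerate builds every key/value pair once, trading memory for
// speed: per-request work becomes a lookup instead of RNG generation.
func pregenerate(seed int64, keySpace, keyLen, valSize int) []pair {
	r := rand.New(rand.NewSource(seed))
	const letters = "abcdefghijklmnopqrstuvwxyz"
	pairs := make([]pair, keySpace)
	for i := range pairs {
		k := make([]byte, keyLen)
		for j := range k {
			k[j] = letters[r.Intn(len(letters))]
		}
		v := make([]byte, valSize)
		r.Read(v)
		pairs[i] = pair{key: "mctester:" + string(k), value: v}
	}
	return pairs
}

func main() {
	pairs := pregenerate(42, 1000, 10, 1000)
	p := pairs[rand.Intn(len(pairs))] // hot loop: just an index
	fmt.Println(p.key, len(p.value))
}
```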

@dormando
Member

dormando commented Jun 7, 2022

For an overall performance uplift, it may be worth pre-generating all keys and values and storing them rather than generating them on each loop.

I swear I did exactly this in this repo already. I need to double check... I went through the same path of "oh long values take a while to generate", "a local HT lookup isn't too bad", etc. Hopefully I didn't leave that sitting in a branch somewhere.

@adesjardins-stripe
Contributor Author

I added key/value pre-generation in a branch as a test. There's a moderate performance improvement (~10%) when value comparisons are enabled and values are long (1000+). Otherwise, the difference isn't really meaningful.

https://github.com/adesjardins-stripe/mctester/pull/1/files

@dormando
Member

dormando commented Jun 9, 2022

I can't find the branch that I thought I had, and the server mode doesn't seem to have a value checker either.

I do have some notes about testing other methods, ie key is the checksum of the value, but it doesn't look like I tried any of it.

re: the rate limiter... while my variables were extremely poorly named (reqspersleep, bundles, etc), I was actually using the prescriptive pacing in specific test scenarios. Also the original get/set/delete pattern is deliberate since it doesn't exercise things like extstore very well if you're doing blind set/deletes as a ratio.

Given the structure of the repo has different test commands with a central library for code reuse/abstraction, what would you think about moving this whole shebang to a new command? Then the discussion can be around which parts to pull into the middle library (which wouldn't be much of a rush and can happen over time). The "server" structure is similar in that the pacing/etc are abstracted out of its core so if we want to port features over there that'd probably be the forcing function for moving code into the library.

@dormando
Member

dormando commented Jun 9, 2022

I'd still give the command another full readthrough/audit for issues but we could skip the whole discussion around functionality changes :)

@adesjardins-stripe
Contributor Author

Just to be clear, do you mean copying my new routine from cmd/basic into an adjacent directory (cmd/ratelimited, say) and then leaving "basic" exactly as it was (aside from some minor changes related to Fei's PR)?

That sounds perfectly reasonable to me—and probably what I should have done to begin with!

I can push that change here or I can break it up into 2 PRs:

  1. The changes made by Fei that pertain to the protocol (prefix stripping, pipelining, etc...) from his old PR, along with those minor formatting changes.
  2. Introduction of the new test command, which should just be the addition of the new main file and the report file.

@dormando
Member

dormando commented Jun 9, 2022

Yup, exactly. I went out of my way to make it easy to do this, so it'd make sense to do this :) I've been doing this so long the only consistent thing I see in people's test patterns is... they're not super consistent. There are also sometimes tests or optimizations that only make sense in some conditions or for specific features. Putting them all into the same high level tool just makes the tool unusably complicated; thus people write more tools.

So for mctester the goal is more about coverage while allowing shared code so I can test in more conditions that're important to people, yet still have bench test code that makes sense for me as a developer. IE; the request pacing specifics lets me test pipelining/batching inside of memcached purposefully, while it doesn't end up reflecting user workloads that well.

@dormando
Member

dormando commented Jun 9, 2022

Fwiw a new PR would be best, with the changes split up.

edit: though they wouldn't have to be two distinct PR's.

@adesjardins-stripe
Contributor Author

I think I will make it 2 PRs (as long as that's ok). The first will be the common changes introduced by Fei and the second will be the new test routine.
