
K6 seems to time decompression of Content-Encoding: gzip responses #2685

Open
jberryman opened this issue Sep 18, 2022 · 2 comments
Labels
breaking change (for PRs that need to be mentioned in the breaking changes section of the release notes), bug, evaluation needed (proposal needs to be validated or tested before fully implementing it in k6), new-http (issues that would require or benefit from a new HTTP API), performance, ux

Comments

@jberryman

Brief summary

It appears that k6 includes the time it takes to decompress a gzipped response body in http_req_duration when discardResponseBodies is not set.

This cost me a lot of debugging time: I was attempting to optimize our compressed code path, and the results from k6 didn't align with my microbenchmarks.

The time to decompress a response body is useful information (one might be optimizing for page load time when serving HTML, for example), but I think including it in http_req_duration is a surprising default. Can I suggest reporting it separately?

loosely related: #2586

k6 version

0.34.0

OS

linux

Docker version and image (if applicable)

No response

Steps to reproduce the problem

Make a request that returns a large response with Accept-Encoding: gzip and note http_req_duration, then repeat with discardResponseBodies: true and compare.

Expected behaviour

Don't include body decompression or validation in http_req_duration, or report it separately.

Actual behaviour

Decompression is included in reported latency.

@jberryman added the bug label on Sep 18, 2022
@na-- added the evaluation needed, breaking change, new-http, ux, and performance labels on Sep 19, 2022
@na-- (Member) commented Sep 19, 2022

🤔 Indeed, the time spent decompressing the body will be measured, simply because the body is decompressed on the fly while it's being read from the http.Response's Body (which is an io.ReadCloser and likely a stream) by this function.
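
Roughly, the effect looks like this. A minimal standalone sketch, assuming a made-up payload, and deliberately not k6's actual code: once a decoder wraps the body stream, the caller's clock can't separate wire reads from inflation.

package main

import (
    "bytes"
    "compress/gzip"
    "fmt"
    "io"
    "time"
)

func main() {
    // Stand-in for a compressed response body arriving off the wire.
    var wire bytes.Buffer
    zw := gzip.NewWriter(&wire)
    zw.Write(bytes.Repeat([]byte("hello, world! "), 1_000_000))
    zw.Close()

    start := time.Now()
    zr, err := gzip.NewReader(&wire) // the decoder wraps the stream...
    if err != nil {
        panic(err)
    }
    // ...so every Read both consumes wire bytes and inflates them.
    n, _ := io.Copy(io.Discard, zr)
    zr.Close()
    fmt.Printf("read %d decompressed bytes in %v\n", n, time.Since(start))
}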

I am wondering, though: this decompression should happen pretty quickly on modern machines 🤔 I see we are using a mixture of github.com/klauspost/compress, github.com/andybalholm/brotli, and the stdlib's compress/gzip and compress/zlib. So maybe we should switch everything we can to the highly optimized klauspost/compress library and run some benchmarks 🤔

switch compression {
case CompressionTypeDeflate:
    decoder, err = zlib.NewReader(rc)
case CompressionTypeGzip:
    decoder, err = gzip.NewReader(rc)
case CompressionTypeZstd:
    decoder, err = zstd.NewReader(rc)
case CompressionTypeBr:
    decoder = brotli.NewReader(rc)
}
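
To get a first read on that, the comparison could start from the sketch below. This assumes klauspost/compress's drop-in gzip package and a made-up repetitive payload, so take the shape rather than the numbers:

package gzipbench

import (
    "bytes"
    stdgzip "compress/gzip"
    "io"
    "testing"

    kpgzip "github.com/klauspost/compress/gzip"
)

// compressedBody gzips a repetitive payload once, so the benchmarks
// below measure decompression only.
func compressedBody() []byte {
    var buf bytes.Buffer
    zw := stdgzip.NewWriter(&buf)
    zw.Write(bytes.Repeat([]byte(`{"data":{"items":[1,2,3]}}`), 100_000))
    zw.Close()
    return buf.Bytes()
}

func BenchmarkStdlibGzip(b *testing.B) {
    body := compressedBody()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        zr, _ := stdgzip.NewReader(bytes.NewReader(body))
        io.Copy(io.Discard, zr)
        zr.Close()
    }
}

func BenchmarkKlauspostGzip(b *testing.B) {
    body := compressedBody()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        zr, _ := kpgzip.NewReader(bytes.NewReader(body))
        io.Copy(io.Discard, zr)
        zr.Close()
    }
}

Run it with go test -bench=. and the same pattern extends to the zlib and zstd paths.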

I also see you are using a pretty old k6 version (v0.34.0, released ~1y ago). I suggest updating to the latest one (v0.40.0, released a couple of weeks ago) and seeing if you notice any improvements. In general, on-the-fly decompression should be pretty unnoticeable if you haven't undersized the machine k6 will run on.

In any case, to get back on topic: changing how k6 does the http_req_duration measurement would be both a breaking change (a new k6 version would start measuring the same things a bit differently) and a performance regression. We'd need more RAM for every compressed response, since we'd first have to read the compressed body from the wire, store it, and then decompress it separately. Given both of these negative consequences, I don't think we'll ever merge something like this in the current k6/http API.
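
For illustration, the separate-measurement variant would look roughly like the hypothetical helper below (readThenDecompress is my name for it, not a k6 API). The io.ReadAll buffer of the raw bytes is exactly the extra RAM cost mentioned above:

package sketch

import (
    "bytes"
    "compress/gzip"
    "io"
    "time"
)

// readThenDecompress buffers the compressed bytes first (only this read
// would count towards http_req_duration) and then inflates them from
// memory as a separately timed step.
func readThenDecompress(body io.ReadCloser) (data []byte, readDur, inflateDur time.Duration, err error) {
    defer body.Close()

    start := time.Now()
    raw, err := io.ReadAll(body) // network read only
    readDur = time.Since(start)
    if err != nil {
        return nil, readDur, 0, err
    }

    start = time.Now()
    zr, err := gzip.NewReader(bytes.NewReader(raw)) // compressed copy held in memory
    if err != nil {
        return nil, readDur, 0, err
    }
    data, err = io.ReadAll(zr) // decompression, on its own clock
    inflateDur = time.Since(start)
    if cerr := zr.Close(); err == nil {
        err = cerr
    }
    return data, readDur, inflateDur, err
}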

That said, we plan to create a new k6 HTTP API (#2461) that addresses some of the many issues with the current one, hopefully soon... 🤞 That's why I won't close this issue outright - we might figure out some way to address it and satisfy your use case without making performance or UX compromises.

The big problem I can see here is that k6 automatically decompresses response bodies and there is no way to disable that behavior. If that part were configurable and could be disabled, then http_req_duration would measure exactly what you want. But I assume you still need to access the uncompressed bodies, otherwise you'd have just used discardResponseBodies? So, if we add a way to disable automatic body decompression in the new HTTP API, we'll also need to add an optimized API (i.e. Go code, not pure JS) for manually decompressing bodies afterwards. That might not be a bad idea for other reasons, but it's still one more API we'll need to maintain, so it's more complex than just adding an option somewhere... 🤔

Still, all of this is worth considering, so thank you for opening this issue! 🙇

@jberryman (Author) commented:

> on-the-fly decompression should be pretty unnoticeable

I'll just reiterate that this isn't the case for our API. Some test cases appeared about 15% slower with Accept-Encoding: gzip, and while microbenchmarks showed libdeflate ought to be twice as fast as zlib for all of our response bodies, the k6 results showed only a 2-3% improvement when we migrated to libdeflate.

hasura-bot pushed a commit to hasura/graphql-engine that referenced this issue Sep 21, 2022
…E IMPROVEMENT)

See: grafana/k6#2685

It might be interesting to take decompression time into consideration when thinking about performance, but in general I think doing so is surprising, and I wasted a lot of time trying to figure out why my optimizations to the compression codepath weren't improving things to the degree I expected.

The downside here is we lose error reporting, so you'll need to only set discardResponseBodies: true after the query has been tested.

PR-URL: hasura/graphql-engine-mono#5940
GitOrigin-RevId: 82a589a59b93f10ffb5391e4a3190459fb6e613b