
use gzip to compress self profile json #1966

Closed · wants to merge 1 commit

Conversation

@s7tya (Contributor) commented Aug 22, 2024:

We're currently not compressing the results from the /perf/processed-self-profile endpoint, which sometimes leads to downloads of over 800 MB. To reduce load times, this PR adds gzip compression, which Perfetto supports. Compressing an 850 MB JSON file takes about 4.4 seconds, but the compressed output is only about 30 MB, so this should noticeably improve performance.

Also, I'm using `headers::ContentType::from("application/gzip".parse::<mime::Mime>().unwrap())` because the mime crate doesn't provide a constant for application/gzip and appears to be unmaintained (see hyperium/mime#136).

@Kobzol (Contributor) left a comment:

Thank you! Left a few comments.

site/Cargo.toml (outdated; conversation resolved)
@@ -127,6 +127,8 @@ pub async fn handle_self_profile_processed_download(
ContentType::from("image/svg+xml".parse::<mime::Mime>().unwrap())
} else if output.filename.ends_with("html") {
ContentType::html()
} else if output.filename.ends_with("gz") {
    ContentType::from("application/gzip".parse::<mime::Mime>().unwrap())
@Kobzol (Contributor) commented:
I don't think this is the correct way to do it. If we return gzipped data from a web server, we should keep the normal file extension (in this case .json) and the MIME type corresponding to it, and then separately set Content-Encoding: gzip to indicate that the body is compressed. See https://stackoverflow.com/a/59999751/1107768.

@s7tya (Contributor, author) commented:

I thought there was a difference between the two approaches: Content-Encoding lets browsers decompress the content automatically, whereas with Content-Type: application/gzip the responsibility falls on the Perfetto side. By using Content-Encoding, we could even switch from gzip to Brotli, since almost all modern browsers support it. If we didn't have to worry about WebKit, we could even use zstd, though that might be a bit overkill.

@s7tya (Contributor, author) commented:

And currently we download the blob and then move to Perfetto's website, so the two approaches work like this:

1. compress (server) → fetch (client) → decompress (client) → PAGE TRANSITION → load (Perfetto)
2. compress (server) → fetch (client) → PAGE TRANSITION → decompress & load (Perfetto)

In this case, I thought it would be nice to reduce the time we wait on the RLO page after clicking the "query trace" button, since there's no animation or other indicator that something is happening.

@Kobzol (Contributor) commented Aug 23, 2024:

Yeah, there is indeed a difference! We load the data that we pass to Perfetto using "normal browser logic", i.e. the fetch call, which should do the decompression for us in the background (hopefully using optimized C/C++ code).

I get your idea about this:

compress (server) → fetch (client) → decompress (client) → PAGE TRANSITION → load (perfetto)
compress (server) → fetch (client) → PAGE TRANSITION → decompress & load (perfetto)

That seems interesting. So the question is whether it is better to go to Perfetto faster and let it un-gzip the data with its own decompression (presumably implemented in some C++ → JS way, hopefully in WebAssembly?), or to let the browser decompress the data using its native code (which I expect to be faster than Perfetto's) and only then go to Perfetto. With the second approach, we could even use e.g. Brotli, as you said (we already have support for it in the website).

Without measuring this, I think that the browser decompression approach could be faster, especially if we switch to Brotli. If that is the case, we could resolve the slightly longer wait time until Perfetto opens by adding some loading indicator, that shouldn't be that difficult.

Could you please do a small benchmark to evaluate the two approaches? Ideally with as large a file as possible 😆 You could download the 600 MiB cargo trace locally and simply load it from disk (then compress and return it) from a local endpoint, to avoid overloading our live server. The goal of the benchmark would be to evaluate:

  1. What time it takes to: return .gz from server, pass it to Perfetto, let it decompress it
  2. What time it takes to: return .json from server with Content-Encoding: gzip, let the browser decompress it, then pass it to Perfetto

Ideally, for both cases, it would be great to know both the duration of the actual file download (which will include also decompression time with approach 2), and also the end-to-end duration between clicking and the trace being fully loaded in Perfetto (you can use a stopwatch xD).

@s7tya (Contributor, author) commented:

Alright, I've created new PR #1968 because the approaches differ quite a bit, so I've marked the related conversations resolved.
Also, Brotli is much faster than the current gzip setup: compression takes only about 0.5 s.

site/src/self_profile.rs (conversation resolved)
@@ -37,9 +40,15 @@ pub fn generate(
ProcessorType::Crox => {
let opt = serde_json::from_str(&serde_json::to_string(&params).unwrap())
.context("crox opts")?;
let data = crox::generate(self_profile_data, opt).context("crox")?;

let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
@Kobzol (Contributor) commented:

Could you please experiment with the compression levels (there are also the constructors Compression::fast() and Compression::best()) to see the performance/output-size trade-off? Maybe we can find better options than the defaults.

site/src/self_profile.rs (conversation resolved)
@s7tya s7tya marked this pull request as draft August 23, 2024 10:20
@s7tya s7tya closed this Aug 23, 2024
@s7tya s7tya deleted the gzip-profile branch October 11, 2024 12:00