
feat: use compression streams to decompress responses #661

Merged

Conversation

@kettanaito (Member) commented Oct 16, 2024

Uses the Compression Streams API to decompress encoded fetch responses.

Why

Using the Compression Streams API has some advantages:

  • It's a standard API that we don't have to install/implement/keep up-to-date;
  • It's designed to work with web streams, unlike Undici's zlib-based approach;
  • It can be used consistently in Node.js and in the browser.

It does, however, have one significant downside:

  • No support for Brotli compression.

We may consider investing effort into web stream-based Brotli decompression. From what I've seen, it's usually a C library loaded via WASM. None of the packages I found are web stream-friendly, though, so we'd have to build an abstraction on top of them, which isn't ideal (Node.js still helps a lot with that via Readable.toWeb/fromWeb). Loading WASM can also impact performance.
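As a minimal sketch of the approach (the function name is illustrative, not the actual Interceptors API), decompressing an encoded fetch response with the standard API looks roughly like this:

```javascript
// Sketch only: pipe an encoded response body through the standard
// DecompressionStream (global in Node.js 18+ and modern browsers).
// Covers "gzip" and "deflate"; "br" is the unsupported case noted above.
async function decompressResponse(response) {
  const encoding = response.headers.get('content-encoding')
  if (!encoding || !response.body) {
    return response
  }
  const decompressedBody = response.body.pipeThrough(
    new DecompressionStream(encoding)
  )
  return new Response(decompressedBody, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  })
}
```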

Blockers

@pimterry

@kettanaito you might be interested in https://www.npmjs.com/package/http-encoding. I'm using this to do something very similar - using it both in Node and in a webpack-bundled web app (decoding, editing & re-encoding data within HTTP Toolkit).

In my case, for unrelated reasons, I'm not really worried about streaming here, but it would be a very reasonable addition.

That package supports Brotli (via brotli-wasm) and also zstd (rapidly growing in real-world usage and client support, and not supported by the Compression Streams API AFAIK) and base64 (non-standard and weird, but surprisingly common IME). It also does things like unwrapping multiple chained encodings automatically and dealing with various common-but-technically-incorrect encodings like content-encoding: text.
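The chained-encoding unwrapping can be sketched with plain web streams (this is not http-encoding's implementation, just an illustration of the idea, limited to the codecs DecompressionStream knows):

```javascript
// Sketch only: unwrap chained encodings such as
// "content-encoding: deflate, gzip" with the standard API.
// Encodings are applied left to right, so decoding runs right to left.
// (Brotli, zstd, and base64 would need extra handling, as noted above.)
function decodeChainedBody(body, contentEncoding) {
  const codecs = contentEncoding
    .split(',')
    .map((codec) => codec.trim().toLowerCase())
    .filter(Boolean)

  return codecs
    .reverse()
    .reduce(
      (stream, codec) => stream.pipeThrough(new DecompressionStream(codec)),
      body
    )
}
```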

Depends on which direction you're going and whether you need this inline here, but if you wanted to avoid duplicating effort and make this functionality more widely available to others, I'd very happily accept a PR extending that package to support streaming too, and to use the Compression Streams API where available.

@kettanaito (Member, Author)

Hi, @pimterry. Thank you for sharing your insights.

The http-encoding package sounds interesting. Does it ship as ESM? I'm looking for solutions that have ESM support and can run natively in the browser without any bundling. The MSW ecosystem is, sadly, still straddling CJS and ESM, and with the promised multi-environment support, the selection of tools we can rely on is rather slim.

The standard Compression Streams API gives us 3 out of the 4 encodings we need. I would love to utilize that! That's why I'm primarily looking for ways to use brotli-wasm in Interceptors to implement our own BrotliDecompressionStream. That is more than possible; the main challenge is getting the WASM to load in different environments.

I believe that if you find httptoolkit/brotli-wasm#38 beneficial, it should solve the blocker to adopting brotli-wasm in Interceptors. At least the known blockers for now.

@pimterry

The http-encoding package sounds interesting. Does it ship as ESM? I'm looking for solutions that have ESM support and can run natively in the browser without any bundling. The MSW ecosystem is, sadly, still straddling CJS and ESM, and with the promised multi-environment support, the selection of tools we can rely on is rather slim.

Not ESM yet, but PRs welcome. Personally, I'm mainly focused on the HTTP Toolkit use case, where my main audience is application users, not library users, so ESM hasn't been at the top of my todo list, but I do agree it's the right time to move over.

The http-encoding package itself isn't doing anything complicated at all, so ESM shouldn't be a problem for it in isolation. It's just parsing content-encoding headers and pulling together the relevant packages/Node APIs behind a single consistent cross-platform entrypoint. The main challenge would be ESM-ing brotli-wasm (thanks for your help there) and any changes required for ESM compatibility in zstd-codec (I don't know if anything is required there, and it's not my package, but the maintainer has been very helpful and responsive in the past).

Note that HTTP encodings are still an evolving space, so I wouldn't assume you'll be "done" once you have the small set you're looking at above. Zstandard is genuinely very neat, and I'd expect it to become the most popular compression for dynamic content in the near future (it's much faster for good results, so Brotli for static data and Zstd for dynamic content is a really effective model, and it's reaching widespread client support now). And just this week, Chrome shipped shared compression dictionaries (https://www.debugbear.com/blog/shared-compression-dictionaries) in Chrome stable, which define more new HTTP compression formats on top of Brotli and Zstandard (compression with pre-shared context, e.g. sending just a file diff from the previous version, among many other use cases) and are going to offer huge boosts for the web and APIs. That has public support from FF & Safari too, and will require quite a bit more work once those start being widely used.

@kettanaito (Member, Author)

I've split the decompression logic to be environment-dependent. The same fetch interceptor will load a different BrotliDecompressionStream in Node.js and in the browser. For now, I've implemented only the Node.js side, leaving the browser to print a warning and do nothing.

Now that we have the environment separation, we can use brotli-wasm just for the browser side of the decompression.

@kettanaito (Member, Author)

Brotli decompression in the browser

The best course of action is to skip the Brotli decompression in the browser. There are two reasons for it.

Reason 1: Low usage

The fetch interceptor is not intended to be used in the browser. MSW relies on the Service Worker, and the fetch interceptor is only used as a fallback mechanism if the worker API is not available. Thus, not providing full feature parity between Node.js and the browser is more than expected.

Reason 2: WASM

Loading a WASM module is complicated. For example, here's an error I get from webpack in our test suite:

        moduleIdentifier: '/Users/kettanaito/Projects/mswjs/interceptors/node_modules/.pnpm/[email protected]/node_modules/brotli-wasm/pkg.bundler/brotli_wasm_bg.wasm',
        moduleName: './node_modules/.pnpm/[email protected]/node_modules/brotli-wasm/pkg.bundler/brotli_wasm_bg.wasm',
        loc: '1:0',
        message: "Module parse failed: Unexpected character '\x00' (1:0)\n" +
          'The module seem to be a WebAssembly module, but module is not flagged as WebAssembly module for webpack.\n' +
          'BREAKING CHANGE: Since webpack 5 WebAssembly is not enabled by default and flagged as experimental feature.\n' +
          "You need to enable one of the WebAssembly experiments via 'experiments.asyncWebAssembly: true' (based on async modules) or 'experiments.syncWebAssembly: true' (like webpack 4, deprecated).\n" +
          `For files that transpile to WebAssembly, make sure to set the module type in the 'module.rules' section of the config (e. g. 'type: "webassembly/async"').\n` +
          '(Source code omitted for this binary file)',

This means that I need to configure my compiler to understand that a certain module is supposed to be WASM. I can certainly do that, but I won't ask my users to. They would face similar errors from their own compilers, and tweaking these settings is everyone's least favorite chore.
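For reference, the opt-in that webpack's error asks for looks roughly like this in a user's config; it's exactly the kind of tweak we don't want to force on users:

```javascript
// webpack.config.js -- sketch of the opt-in the error above asks for.
module.exports = {
  experiments: {
    // Enable WebAssembly as an async module type (webpack 5+).
    asyncWebAssembly: true,
  },
}
```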

Conclusion

  • We are shipping full decompression support in Node.js via DecompressionStream + a custom Brotli decompression stream on top of zlib.
  • We are shipping GZIP + Deflate decompression support in the browser.
  • We are not shipping Brotli decompression support in the browser. The user will see a warning, and the decompression stream will be a passthrough.

@kettanaito kettanaito merged commit a5447cd into Michael/support-fetch-content-encoding Oct 22, 2024
2 checks passed
@kettanaito kettanaito deleted the feat/use-compression-streams branch October 22, 2024 11:54