
feat: use compression streams to decompress responses #661

Merged

Conversation

@kettanaito (Member) commented Oct 16, 2024

Uses the Compression Streams API to decompress encoded fetch responses.

Why

Using the Compression Streams API has some advantages:

  • It's a standard API that we don't have to install/implement/keep up-to-date;
  • It's designed to work with web streams, unlike Undici's zlib-based approach;
  • It can be used consistently in Node.js and in the browser.

It does, however, have one significant downside:

  • No support for Brotli compression.

We may consider investing effort into web stream-based Brotli decompression. From what I've seen, it's usually a C library loaded via WASM. None of the packages I found are web stream-friendly, though, so we'd have to build an abstraction on top of them, which isn't ideal (Node.js still helps a lot with that via Readable.toWeb/fromWeb). Loading WASM can also impact performance.
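As a minimal sketch of the approach (the function name is illustrative, not the actual Interceptors API), decompressing an encoded fetch response with the standard API looks roughly like this:

```javascript
// Sketch only: pipe an encoded response body through the standard
// DecompressionStream (global in Node.js 18+ and modern browsers).
// Covers "gzip" and "deflate"; "br" is the unsupported case noted above.
async function decompressResponse(response) {
  const encoding = response.headers.get('content-encoding')
  if (!encoding || !response.body) {
    return response
  }
  const decompressedBody = response.body.pipeThrough(
    new DecompressionStream(encoding)
  )
  return new Response(decompressedBody, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  })
}
```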

Blockers

@pimterry

@kettanaito you might be interested in https://www.npmjs.com/package/http-encoding. I'm using this to do something very similar - using it both in Node and in a webpack-bundled web app (decoding, editing & re-encoding data within HTTP Toolkit).

In my case, for unrelated reasons, I'm not really worried about streaming here, but it would be a very reasonable addition.

That package supports Brotli (via brotli-wasm) and also zstd (rapidly growing in real-world usage and client support, and not supported by the Compression Streams API AFAIK) and base64 (non-standard and weird, but surprisingly common IME). It also does things like unwrapping multiple chained encodings automatically and dealing with various common-but-technically-incorrect encodings like content-encoding: text.
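The chained-encoding unwrapping can be sketched with plain web streams (this is not http-encoding's implementation, just an illustration of the idea, limited to the codecs DecompressionStream knows):

```javascript
// Sketch only: unwrap chained encodings such as
// "content-encoding: deflate, gzip" with the standard API.
// Encodings are applied left to right, so decoding runs right to left.
// (Brotli, zstd, and base64 would need extra handling, as noted above.)
function decodeChainedBody(body, contentEncoding) {
  const codecs = contentEncoding
    .split(',')
    .map((codec) => codec.trim().toLowerCase())
    .filter(Boolean)

  return codecs
    .reverse()
    .reduce(
      (stream, codec) => stream.pipeThrough(new DecompressionStream(codec)),
      body
    )
}
```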

Depends on which direction you're going and whether you need this inline here, but if you wanted to avoid duplicating effort and make this functionality more widely available to others, I'd very happily accept a PR extending that package to support streaming too, and to use the Compression Streams API where available.

@kettanaito (Member, Author)

Hi, @pimterry. Thank you for sharing your insights.

The http-encoding package sounds interesting. Does it ship as ESM? I'm looking for solutions that have ESM support and can run natively in the browser without any bundling. The MSW ecosystem is, sadly, still straddling CJS and ESM, and with the promised multi-environment support, the selection of tools we can rely on is rather slim.

The standard Compression Streams API gives us 3 out of the 4 encodings we need. I would love to utilize that! That's why I'm primarily looking for ways to use brotli-wasm in Interceptors to implement our own BrotliDecompressionStream. That is more than possible; the main challenge is getting the WASM to load in different environments.

I believe that if you find httptoolkit/brotli-wasm#38 beneficial, it should solve the blocker to adopting brotli-wasm in Interceptors. At least the known blockers for now.

@pimterry

The http-encoding package sounds interesting. Does it ship as ESM? I'm looking for solutions that have ESM support and can run natively in the browser without any bundling. The MSW ecosystem is, sadly, still straddling CJS and ESM, and with the promised multi-environment support, the selection of tools we can rely on is rather slim.

Not ESM yet, but PRs welcome. Personally, I'm mainly focused on the HTTP Toolkit use case, where my main audience is application users, not library users, so ESM hasn't been at the top of my todo list, but I do agree it's the right time to move over.

The http-encoding package itself isn't doing anything complicated at all, so ESM shouldn't be a problem for it in isolation. It's just parsing content-encoding headers and pulling together the relevant packages/Node APIs behind a single consistent cross-platform entrypoint. The main challenge would be ESM-ing brotli-wasm (thanks for your help there) and any changes required for ESM compatibility in zstd-codec (I don't know if anything is required there, and it's not my package, but the maintainer has been very helpful and responsive in the past).

Note that HTTP encodings are still an evolving space, so I wouldn't assume you'll be "done" once you have the small set you're looking at above. Zstandard is genuinely very neat, and I'd expect it to become the most popular compression for dynamic content in the near future (it's much faster for good results, so Brotli for static data and Zstd for dynamic content is a really effective model, and it's reaching widespread client support now). And just this week, Chrome shipped shared compression dictionaries (https://www.debugbear.com/blog/shared-compression-dictionaries) in Chrome stable, which define more new HTTP compression formats on top of Brotli and Zstandard (compression with pre-shared context, e.g. sending just a file diff from the previous version, among many other use cases) and are going to offer huge boosts for the web and APIs. That has public support from FF & Safari too, and will require quite a bit more work once those start being widely used.

@kettanaito (Member, Author)

I've split the decompression logic to be environment-dependent. The same fetch interceptor will load a different BrotliDecompressionStream in Node.js and in the browser. For now, I've implemented only the Node.js side, leaving the browser to print a warning and do nothing.

Now that we have the environment separation, we can use brotli-wasm just for the browser side of the decompression.

@kettanaito (Member, Author)

Brotli decompression in the browser

The best course of action is to skip the Brotli decompression in the browser. There are two reasons for it.

Reason 1: Low usage

The fetch interceptor is not intended to be used in the browser. MSW relies on the Service Worker, and the fetch interceptor is only used as a fallback mechanism if the worker API is not available. Thus, not providing full feature parity between Node.js and the browser is more than expected.

Reason 2: WASM

Loading a WASM module is complicated. For example, here's an error I get from webpack in our test suite:

        moduleIdentifier: '/Users/kettanaito/Projects/mswjs/interceptors/node_modules/.pnpm/[email protected]/node_modules/brotli-wasm/pkg.bundler/brotli_wasm_bg.wasm',
        moduleName: './node_modules/.pnpm/[email protected]/node_modules/brotli-wasm/pkg.bundler/brotli_wasm_bg.wasm',
        loc: '1:0',
        message: "Module parse failed: Unexpected character '\x00' (1:0)\n" +
          'The module seem to be a WebAssembly module, but module is not flagged as WebAssembly module for webpack.\n' +
          'BREAKING CHANGE: Since webpack 5 WebAssembly is not enabled by default and flagged as experimental feature.\n' +
          "You need to enable one of the WebAssembly experiments via 'experiments.asyncWebAssembly: true' (based on async modules) or 'experiments.syncWebAssembly: true' (like webpack 4, deprecated).\n" +
          `For files that transpile to WebAssembly, make sure to set the module type in the 'module.rules' section of the config (e. g. 'type: "webassembly/async"').\n` +
          '(Source code omitted for this binary file)',

This means that I need to configure my compiler to understand that a certain module is supposed to be WASM. I can certainly do that, but I won't ask my users to. They would face similar errors from their own compilers, and tweaking these settings is everyone's least favorite chore.
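For reference, the opt-in that webpack's error asks for looks roughly like this in a user's config; it's exactly the kind of tweak we don't want to force on users:

```javascript
// webpack.config.js -- sketch of the opt-in the error above asks for.
module.exports = {
  experiments: {
    // Enable WebAssembly as an async module type (webpack 5+).
    asyncWebAssembly: true,
  },
}
```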

Conclusion

  • We are shipping full decompression support in Node.js via DecompressionStream + a custom Brotli decompression stream on top of zlib.
  • We are shipping GZIP + Deflate decompression support in the browser.
  • We are not shipping Brotli decompression support in the browser. The user will see a warning, and the decompression stream will be a passthrough.

@kettanaito kettanaito merged commit a5447cd into Michael/support-fetch-content-encoding Oct 22, 2024
2 checks passed
@kettanaito kettanaito deleted the feat/use-compression-streams branch October 22, 2024 11:54