
Streaming encoder #178

Open · wants to merge 10 commits into main
Conversation


@Geal Geal commented Oct 17, 2022

Fix #154

⚠️ not ready yet, only the bare minimum was done to validate it in tokio and gzip, the rest will come later

This is a follow-up on the proposal for a poll_flush implementation. It is implemented on tokio::bufread::Encoder and in the macros. We then provide a StreamingEncoder that wraps an AsyncRead and AsyncFlush implementation and flushes it if it has produced nothing after a configurable time.

The test shows it appears to work, but I'd like to test it more using the proptest integration.
On the API side, I think it will look better as a .flush_timeout() method?

I looked into flushing once a configurable amount of data has been passed, but that requires hooking the data producer to count the bytes passing through, pass it to the Encoder, then have the StreamingEncoder monitor that count. That's a bit involved, maybe that could be a separate implementation.
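To make the shape of the idea concrete, here is a rough synchronous sketch of the timeout-driven wrapper (std-only; the `Flush` trait and field names are illustrative stand-ins for the PR's actual `AsyncFlush`/`StreamingEncoder` API, which is async and tokio-based):

```rust
use std::io::{self, Read};
use std::time::{Duration, Instant};

/// Illustrative stand-in for the flush capability the PR adds to the
/// encoder (the real trait in the PR is async).
trait Flush {
    /// Force out buffered data into `buf`; returns the number of bytes written.
    fn flush(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

/// Simplified synchronous analogue of the proposed StreamingEncoder:
/// if the inner encoder produced nothing within `flush_timeout`, flush it.
struct StreamingEncoder<E> {
    inner: E,
    flush_timeout: Duration,
    last_output: Instant,
}

impl<E> StreamingEncoder<E> {
    fn new(inner: E, flush_timeout: Duration) -> Self {
        StreamingEncoder { inner, flush_timeout, last_output: Instant::now() }
    }
}

impl<E: Read + Flush> Read for StreamingEncoder<E> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        if n > 0 {
            self.last_output = Instant::now();
            Ok(n)
        } else if self.last_output.elapsed() >= self.flush_timeout {
            // No output for a full timeout window: flush so the consumer
            // receives partial compressed data instead of stalling.
            self.last_output = Instant::now();
            self.inner.flush(buf)
        } else {
            Ok(0)
        }
    }
}
```

In the async version the timeout would be driven by the runtime's timer (hence the tokio dependency discussed below), not polled inline like this.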

@Geal
Author

Geal commented Oct 19, 2022

@Nemo157 I cleaned it up a bit, to make it testable with other algorithms than just gzip. Should I really add implementations for other targets than tokio? Considering that support for tokio 0.2 and 0.3 is removed in #152, the stream feature is deprecated, and the futures-io target makes no assumption on the runtime, I only see tokio as a possible target here?
(this PR needs adherence to the runtime because it uses a timeout)

Another approach I'd like to investigate is having the streaming encoder expose a handle of some kind (a channel, or similar) that could be used to trigger a flush externally. The user would then decide when to flush, possibly based on timeouts or other conditions (like the deferred responses in the GraphQL router). That would remove the link to the runtime and let higher level libraries decide on the flushing strategy

@Geal
Author

Geal commented Oct 19, 2022

with 87bf8f1 and cc86d79 there's an implementation driven by a channel. I think that's the cleaner approach here, since it makes no assumptions about the platform, and the channel can be carried in different places depending on the use case (e.g. in tower-http's CompressionLayer, the Encoder is hidden deep inside the ResponseFuture)
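A rough synchronous sketch of the channel-driven variant (std-only; `Flush` and the field names are illustrative, not the PR's actual API, and the real implementation uses an async channel polled inside `poll_read`):

```rust
use std::io::{self, Read};
use std::sync::mpsc::{Receiver, TryRecvError};

/// Illustrative stand-in for the PR's flush capability.
trait Flush {
    fn flush(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

/// Channel-driven analogue: the consumer keeps the Sender half and
/// sends `()` whenever it wants buffered data emitted.
struct FlushableEncoder<E> {
    inner: E,
    flush_signal: Receiver<()>,
}

impl<E: Read + Flush> Read for FlushableEncoder<E> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Check for a pending flush request before doing a normal read.
        match self.flush_signal.try_recv() {
            Ok(()) => self.inner.flush(buf),
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => {
                self.inner.read(buf)
            }
        }
    }
}
```

Because the flush trigger is just a channel endpoint, no timer and hence no runtime assumption is needed; the caller decides the flushing policy.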

@Geal
Author

Geal commented Oct 24, 2022

I removed the timeout-based solution, this should be enough to review now @Nemo157

@Geal Geal marked this pull request as ready for review October 24, 2022 10:10
The FlushableEncoder now accepts a Stream, which will be more flexible
than forcing a channel
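Sketching that generalization synchronously (std-only and illustrative: in the async code the flush source would be a `futures::Stream<Item = ()>`; an `Iterator` is the closest synchronous analogue):

```rust
use std::io::{self, Read};

/// Illustrative stand-in for the PR's flush capability.
trait Flush {
    fn flush(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

/// Generic over the source of flush events rather than a concrete channel:
/// any non-blocking iterator works (an mpsc `try_iter()`, a timer adapter, ...).
struct FlushableEncoder<E, S> {
    inner: E,
    flush_events: S,
}

impl<E: Read + Flush, S: Iterator<Item = ()>> Read for FlushableEncoder<E, S> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.flush_events.next().is_some() {
            // A flush event is pending: emit buffered data first.
            return self.inner.flush(buf);
        }
        self.inner.read(buf)
    }
}
```

The channel version above then becomes just one instantiation of this shape, which is why accepting a stream is the more flexible signature.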
@Geal
Author

Geal commented Nov 2, 2022

@Nemo157 could you activate the CI on this PR?

@Nemo157
Member

Nemo157 commented Nov 2, 2022

Hmm, this reminds me I need to find out why GitHub has changed the default to not running CI; afaik it's supposed to be secure against malicious PRs 🤔

I'll hopefully have some time to review this week.

@Geal
Author

Geal commented Nov 15, 2022

@Nemo157 friendly ping! 😄 Is there any way I can make this easier to review for you?

@Geal
Author

Geal commented Dec 7, 2022

@Nemo157 looks like the CI is passing. Do you think anything is missing from this PR?

@Geal
Author

Geal commented Apr 18, 2023

@Nemo157 FYI I'll vendor part of the compression code in the router so I can fix the bug on my end, so no pressure on this PR; review it when it's convenient :)

Geal added a commit to apollographql/router that referenced this pull request Apr 27, 2023
We replace tower-http's `CompressionLayer` with a custom stream transformation. This is necessary because tower-http uses async-compression, which buffers data until the end of the stream before writing it, to achieve better compression. This is incompatible with the multipart protocol for `@defer`, which requires chunks to be sent as soon as possible. So we need to compress them independently.

This extracts parts of the codec module of async-compression, which so far is not public, and makes a streaming wrapper above it that flushes the compressed data on every response in the stream.

This is expected to be temporary, as we have in flight PRs for async-compression:
- Nullus157/async-compression#155
- Nullus157/async-compression#178

With Nullus157/async-compression#150 we might be able to at least remove the vendored code
@robjtede
Member

@NobodyXu since this is additive, I propose we review this after the 0.4.0 release?

@robjtede robjtede added the A-semver-minor non-breaking change label May 10, 2023
@NobodyXu
Collaborator

Agree, I can't see how this could result in a breaking change.

@NobodyXu
Collaborator

Plus it's not ready yet, so IMO we should go ahead and release 0.4.0

@reu reu mentioned this pull request May 12, 2023
Successfully merging this pull request may close these issues.

Encoder does not support intermediate flushes before the reader has ended