Skip to content
This repository has been archived by the owner on Sep 21, 2024. It is now read-only.

feat!: Replace Bundle with CAR streams in push #624

Merged
merged 2 commits into from
Sep 12, 2023
Merged

Conversation

cdata
Copy link
Collaborator

@cdata cdata commented Sep 8, 2023

This change removes all usage of Bundle within our network code, replacing it with CARv1-formatted byte streams. These byte streams are produced by literal Streams, which enable us to send large amounts of data back and forth without necessarily ballooning memory (though still subject to buffering by the Stream consumer).

reqwest streaming complications

One complicating factor I encountered in this change is that our HTTP client of choice (reqwest) brings some caveats when considering streaming requests.

First, it's important to note: we prefer reqwest because it is adaptive to our Wasm targets, preferring a hyper-based implementation on native targets but substituting a fetch-based API on web.

Sync streams

The first challenge I hit is that all streams used in reqwest bodies must meet a Sync bound, and most of our streams do not rigorously enforce this bound at this time (it seemed plausible to me that we might be able to meet it with some refactoring). Although in theory it would be interesting to share a stream across many threads, where each thread polls the stream so that chunks could be processed in parallel, in practice we do not have this use case today. For details on why this bound is present in reqwest, please follow the links in this issue I filed against the project: seanmonstar/reqwest#1969

To resolve this issue, I have adapted a suggestion from a discussion in Rust internals. We now have a helper called Unshared that can wrap !Sync things and add an unsafe Sync implementation to them. The wrapper only allows mutable borrows of the wrapped value, which presumably are exclusive, justifying the Sync bound. Additionally I added an UnsharedStream wrapper that implements Stream and gives us a Stream + Sync for any Stream + Send that we wrap it with (via Unshared), enabling us to satisfy the bounds in reqwest streaming bodies.

Wasm streams

I encountered another complication with reqwest/Wasm: although streaming request bodies via fetch are possible thanks to ReadableStream, reqwest/Wasm does not implement support for them.

So, to work around this, I added a conditional compilation branch to our client for push requests on wasm32 targets. The upshot is that on web, we use a different request method when pushing that uses fetch via gloo-net with a ReadableStream body.

Crate refactor

This change ended up leading to a major refactor of our crates. Implementing streaming push means adding streaming support to our API client. However, we have (on main) an existing dependency edge from noosphere-sphere (where streaming is implemented) to noosphere-api (where the client is implemented). So, adding streaming support to the API client would have created a dependency cycle between our crates (which is not allowed). Originally, both noosphere-api and noosphere-sphere were inlined to the noosphere-core crate, so I decided to migrate us back to that state with this change as it resolves the cyclical dependency. The noosphere-sphere and noosphere-api crates will remain in-tree until we have the chance to publish a final, +deprecated version for both of them.

Breaking changes include:

  • Removes noosphere-sphere and noosphere-api crates, consolidating them with noosphere-core
  • Deprecates gateway route /api/v0alpha1/push in favor of /api/v0alpha2/push
  • Deprecates all usages of Bundle
  • Removes SphereDb::put_block_stream (replaced with more generalized helper)
  • All test helpers are re-organized, now all appearing within helpers modules inside of their domain-relevant crates, and require the helpers feature to be used across crates
  • test_kubo feature renamed to test-kubo

@cdata cdata force-pushed the feat/streaming-push-api branch 10 times, most recently from fe7a8e4 to 899d3fb Compare September 10, 2023 18:14
@cdata cdata marked this pull request as ready for review September 11, 2023 16:09
@cdata cdata requested a review from jsantell September 11, 2023 16:10
jsantell
jsantell previously approved these changes Sep 11, 2023
Copy link
Contributor

@jsantell jsantell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Mostly nit comments, main suggestions are to only use a single API export from noosphere-core (e.g. v0alpha2) throughout the codebase (via re-exporting v0alpha1 params and routes inside v0alpha2`, or through other means)

rust/noosphere-common/src/sync.rs Show resolved Hide resolved
rust/noosphere-common/src/task.rs Show resolved Hide resolved
rust/noosphere-core/src/authority/author.rs Show resolved Hide resolved
rust/noosphere-core/src/authority/authorization.rs Outdated Show resolved Hide resolved
rust/noosphere/Cargo.toml Outdated Show resolved Hide resolved
rust/noosphere/tests/cli.rs Show resolved Hide resolved
rust/noosphere/tests/distributed_stress.rs Show resolved Hide resolved
rust/noosphere/tests/gateway.rs Show resolved Hide resolved
rust/noosphere/tests/gateway.rs Outdated Show resolved Hide resolved
@cdata cdata merged commit 9390797 into main Sep 12, 2023
16 checks passed
@cdata cdata deleted the feat/streaming-push-api branch September 12, 2023 17:47
@github-actions github-actions bot mentioned this pull request Sep 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants