
feat(fetcher): first step at optimization #30

Merged · 12 commits · Jan 9, 2025

Conversation

@hannahhoward (Member) commented Dec 14, 2024

Goals

First pass to optimize TTFB and bandwidth on the batching fetcher

Implementation

see https://www.notion.so/storacha/Notes-on-Gateway-optimization-15c5305b55248022ad9be861df4899fb?pvs=4

The final algorithm that I settled on, which balances resource usage with optimal TTFB & bandwidth, is as follows:

  • during a batch fetch, return blocks to the caller as soon as they are available (previously, the entire batch had to be processed before any blocks were returned to the caller)
  • remove multipart range requests, opting for a single range request that covers the entire batch. This may fetch a few extra bytes, but multipart range request support varies considerably from server to server, and I'd rather not require it as we move into a decentralized world.
  • when multiple batch requests are present at the same time, queue them up as follows (see the sketch after this list):
    • make a request
    • as soon as the first byte is received, kick off the next request while processing the response body of the first
    • iterate through batches, but make sure at most two requests are processing blocks at any time (so memory usage never exceeds 2 × batch size). In other words, once you have received the first byte on two requests, do not kick off any more until at least one finishes processing.
    • this should produce a steady stream of blocks that can be consumed as fast as the client can handle
    • implementing this algorithm does require the use of async generators
  • this also adds some helpful tracing helpers. I'm not sure if these should live in this library though.
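
A minimal sketch of that pipelining (hypothetical names, not the PR's actual implementation; it assumes each batch is fetched with a single URL and that block parsing happens downstream of the yielded bytes):

```ts
/**
 * Pipelined batch fetching: at most two responses are in flight at once.
 * The next request starts as soon as the current response's headers
 * arrive, but no further requests start until a body has been fully
 * consumed, bounding memory to roughly 2 × batch size.
 */
async function* fetchBatches (
  batchUrls: URL[],
  fetchImpl: typeof fetch = fetch
): AsyncGenerator<Uint8Array> {
  let inflight: Promise<Response> | undefined
  for (let i = 0; i < batchUrls.length; i++) {
    // reuse the request kicked off while the previous body was streaming
    const res = await (inflight ?? fetchImpl(batchUrls[i]))
    // headers (first byte) received: start the next request now so its
    // TTFB overlaps with consuming this body
    inflight = i + 1 < batchUrls.length ? fetchImpl(batchUrls[i + 1]) : undefined
    if (!res.ok || !res.body) throw new Error(`batch fetch failed: ${res.status}`)
    const reader = res.body.getReader()
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      if (value) yield value // hand bytes to the caller as soon as they arrive
    }
  }
}
```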

For discussion

Future suggestion: freeway should be a monorepo, and this should be part of freeway.

Final trace: [screenshot of the final trace, captured 2025-01-03]

@hannahhoward hannahhoward changed the title WIP: Optimization Tracing and Optimization Dec 25, 2024
@hannahhoward hannahhoward requested a review from alanshaw January 4, 2025 04:31
@hannahhoward hannahhoward marked this pull request as ready for review January 4, 2025 04:31
@hannahhoward hannahhoward changed the title Tracing and Optimization feat(fetcher): first step at optimization Jan 4, 2025
allows simple & batching fetchers to use a custom fetch implementation. also exposes tracing library.
@alanshaw (Member) commented Jan 4, 2025

I’m hesitant to move away from multipart byte range requests. I really feel that the client should ask for exactly what it wants and the server should decide whether it’s optimal to “over share” ranges because they’re close together. Not using byte range requests forces the server side to over-egress with no real recourse, i.e. no one is going to be willing to pay for it. I’m not sure how we’d justify this in the decentralized network.

I’d like to see some evidence of multipart byte range support varying per server, and some reasoning for why it would cause us problems. It’s well spec’d and in my experience seems consistent between implementations. Note we don’t need to mandate support for multipart, but there would be no incentive to not support it because you’d end up being a slow server, requiring a request per block (assuming we’d not send requests to cover multiple blocks since the extra egress is not accountable).
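
For reference, the shape of the exchange under discussion (standard HTTP range semantics, not code from this PR):

```ts
// One request asking for two discontiguous byte ranges. A server that
// supports multipart replies 206 with one part per range; a server that
// doesn't may coalesce the ranges or ignore the header entirely.
const res = await fetch('https://example.com/data.car', {
  headers: { Range: 'bytes=0-99,200-299' }
})
console.log(res.status)                      // 206 Partial Content
console.log(res.headers.get('Content-Type')) // multipart/byteranges; boundary=...
```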

feat(fetcher): allow passing a custom fetch implementation
@alanshaw (Member) commented Jan 5, 2025

I have slept on this and I’m not totally against it. I think that the multipart header in a response is likely larger than a block header within a CAR file. So requesting multiple blocks in a single request might actually be less egress than the equivalent multipart request, assuming the blocks are next to each other (which is also likely) and we don’t allow the distance between blocks to be larger than, say, your average multipart header. I don’t have any evidence for this, but I reckon it’s probably true 😜.
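
For a rough sense of the sizes involved, a back-of-envelope comparison (assumed typical values, not measurements from this PR):

```ts
// Per-part overhead in a multipart/byteranges response: boundary line,
// Content-Type and Content-Range headers, and a blank line.
const multipartPartOverhead =
  '--3d6b6a416f9b5\r\n'.length +                        // boundary delimiter
  'Content-Type: application/octet-stream\r\n'.length +
  'Content-Range: bytes 1000-9999/123456\r\n'.length +
  '\r\n'.length                                         // ≈ 98 bytes total

// The "gap" between two adjacent blocks in a CAR file is the second
// block's frame header: a varint length prefix plus a CIDv1 (sha2-256).
const carFrameOverhead = 2 + 36                         // ≈ 38 bytes

// So when blocks are adjacent, one coalesced range over-fetches ~38 bytes
// per extra block, while multipart adds ~98 bytes per extra part.
console.log(multipartPartOverhead, carFrameOverhead)
```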

@alanshaw (Member) commented Jan 6, 2025

> Future suggestion: freeway should be a monorepo, and this should be part of freeway.

I'm fine with this going forwards. Gateway related libraries were originally separate because freeway was just one of a few different gateways we implemented, e.g. autobahn, so we needed related libraries in multiple implementations. Arguably this should have been part of gateway-lib originally.

@alanshaw (Member) commented Jan 6, 2025

I'm not clear on the performance benefits this change has had. Is there a comparable before/after trace that I can look at?

I'm just wondering if simply altering the existing code to resolve blocks as soon as they are processed would yield basically the same results?

```js
// get the last byte to fetch across all blobs (assumes resolvedBlobs is
// sorted by starting offset)
const aggregateRangeEnd = resolvedBlobs.reduce(
  (aggregateEnd, r) => r.range[1] > aggregateEnd ? r.range[1] : aggregateEnd,
  0
)
// fetch bytes from the first blob's starting byte through the last byte
const headers = { Range: `bytes=${resolvedBlobs[0].range[0]}-${aggregateRangeEnd}` }
```
Member commented:
As far as I can tell this doesn't account for the space between blocks, and it would be easy to upload a CAR file where the blocks are not ordered, which would cause the worker to download far more data than needed, e.g. two small blocks at opposite ends of a large CAR would force downloading nearly the whole file.

Member commented:
Note: I have seen this type of thing where DAGs have been generated and stored to disk using, for example, leveldb and then streamed back out in key order (effectively random) to be stored.

fix(blob-fetcher): remove unused package
@hannahhoward (Member, Author) commented:
@alanshaw I've reverted to using multipart ranges, let me know if you need any additional changes here.

@alanshaw (Member) left a comment:
LGTM

src/api.ts (outdated):
* [-100]
* ```
*/
export type Range = AbsoluteRange | SuffixRange
Member commented:
Since we're not removing multipart-byte-range can we just re-use or re-export the types from there?
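
A minimal sketch of that re-export (assuming the multipart-byte-range package exposes these type names):

```ts
// re-use the range types from multipart-byte-range instead of redefining them
export type { Range, AbsoluteRange, SuffixRange } from 'multipart-byte-range'
```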

Member (Author) commented:
oh right :)

@hannahhoward merged commit c2c609e into main on Jan 9, 2025. 1 check passed.