Content Addressable Bundles #638

Open
hayatoito opened this issue Mar 17, 2021 · 4 comments

Comments

@hayatoito
Collaborator

Let me file an issue to discuss a proposal: Resource Loading with Content Addressable Bundles.

This is a strawperson proposal at a very early stage; for now, please treat it as my personal proposal.

I've put it in the explainers directory. If that is inappropriate or confusing, I'll move it somewhere else. Maybe we could have a proposals directory, as WICG/webcomponents has. I am not sure what the best practice is in this repository.

I hope we can use this issue to discuss and get early feedback.
I've observed that PRs are not a good place to discuss overall design and ideas because we are likely to lose the discussion history. An issue might be a better place to discuss.

cc: @littledan @jyasskin @yoavweiss

hayatoito added a commit to hayatoito/webpackage that referenced this issue Mar 17, 2021
- Add a link to issue WICG#638
- Add "URL integrity" to Goal
- Add "Resource Batch Preloading" to references
@horo-t
Collaborator

horo-t commented Mar 17, 2021

Thank you for filing this issue. As I mentioned at today’s meeting, the goal of this proposal is still unclear to me.

  • This proposal aims to support Code Splitting, as webpack or other bundlers
    already support as a user-land solution. Smaller bundles, if used correctly,
    can have a major impact on load time.

As that page (https://webpack.js.org/guides/code-splitting/) puts it, Code Splitting can “have a major impact on load time”. It would be helpful if you could explain more precisely how this proposal supports that purpose, and why it is better than just using the existing proposal "Subresource loading with Web Bundles".

  • Non-opinionated about bundle granularity. There are trade-offs in how a site
    composes its resources into bundles in order to balance various factors like
    total bytes transferred, loading latency, or cache granularity. Instead of an
    all-or-nothing bundle, this proposal aims to provide a way to express a
    dependency graph of bundles. The use of bundlers is an established practice in
    Web development. Bundlers, such as webpack or skypack, would know much about
    which resources should be grouped as a bundle, and might want to express their
    intent as a dependency graph of bundles, considering various trade-offs. They
    wouldn't want to lose this information in building an all-or-nothing bundle,
    and a browser wants to know it to improve loading performance.
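
To make the quoted bullet concrete, here is a minimal sketch of what "a dependency graph of bundles" could look like as data, and how a consumer could compute everything needed for one entry bundle. This is an illustration only, not taken from the proposal: the bundle names, the `BUNDLE_DEPS` table, and the `bundles_to_fetch` helper are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical bundle names and edges, invented for illustration.
# Each key is a bundle; its value lists the bundles it depends on.
BUNDLE_DEPS: Dict[str, List[str]] = {
    "app.wbn": ["ui.wbn", "vendor.wbn"],
    "ui.wbn": ["vendor.wbn"],
    "vendor.wbn": [],
}


def bundles_to_fetch(entry: str, deps: Dict[str, List[str]]) -> List[str]:
    """Return `entry` plus its transitive dependencies, dependencies first."""
    order: List[str] = []
    visiting: Set[str] = set()  # bundles on the current DFS path
    done: Set[str] = set()      # bundles already emitted

    def visit(name: str) -> None:
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle at {name}")
        visiting.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)

    visit(entry)
    return order
```

With the whole graph known up front, `bundles_to_fetch("app.wbn", BUNDLE_DEPS)` lists every bundle to request at once, rather than discovering each dependency only after its parent has loaded. Whether a browser would use the graph this way is an assumption of this sketch.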

I understand the importance of carefully maintaining the dependency graph of modules while developing.
But I don't see why it is important to keep the dependency graph inside the bundle format when serving the code to browsers. Who benefits if the precise dependency graph is provided to the browser via the bundle format? Could you please explain the use cases for it?

  • The proposal aims to give a browser an opportunity to improve its cache
    efficiency by introducing immutability to a bundle. If a bundle's URL
    doesn't change, we assume the bundle's contents are exactly the same. This is
    not merely a convention. The proposal aims to enforce immutability by
    introducing a Content-Addressable Hash, which is conceptually similar to a
    Git commit ID, which you might be familiar with. Content-Addressability gives web
    developers reproducible builds as well as giving a browser an opportunity to
    improve its cache efficiency.

I don't know how other browsers work, but in Chromium the code cache generated by the V8 engine is stored in the HTTPCache, and the code cache is used as long as the cached HTTP response is still valid. So it is not clear to me how this forced immutability can improve cache efficiency, especially when the HTTPCache is split per Origin. Could you please write a more detailed explanation?

@hayatoito
Collaborator Author

hayatoito commented Mar 23, 2021

Note: I'm working on updating the proposal at #639, answering the questions. Thanks!

@jyasskin
Member

Some overall thoughts:

  1. I think it would help to create an explainers/proposals/ directory to distinguish things that we're pretty confident are a good direction from things that we're just exploring.

  2. The Introduction and Goals sections don't explain to me why this proposal exists. "there is no mechanism to fetch the partial content of the bundle." makes me think that it's going to be an alternative to the subsetting options in @littledan's https://github.com/WICG/resource-bundles/blob/main/subresource-loading.md, but I don't see anything about the request headers to let the server know which subset to request, or anything about when to request a byte range from a bundle.

  3. Issue #40, "Allow linking to sub-packages instead of just including them", discusses the goal of letting bundles declare that they depend on another bundle. Is that actually what this is about?

  4. "Content-addressable" makes me think these items will be identified purely by their hashes, but it looks like they pair a real URL with a hash of the content that's expected to be there, and reject the content if it doesn't match. That makes them more similar to subresource integrity.

  5. I thought through ways to refer to sub-packages in draft-yasskin-dispatch-web-packaging. Those were all embedded into the overall package, rather than external URLs, but some of the same considerations apply. In particular, do we always want to refer by exact hash, or does it make sense to also allow a minimum timestamp, or more complete semver-ish version compatibility?

  6. I'm not sure it makes sense to designate a "main resource" for subresource bundles. For example, if we want to bundle a bunch of images together, or a stylesheet with a script, they're all peers. It's only trees of scripts and stylesheets that have a notion of an entrypoint. This is part of @littledan's motivation for #617, "Separating the primary URL into a section".
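
To illustrate the behaviour described in point 4 (a real URL paired with a hash of the expected content, with mismatching content rejected), here is a minimal sketch. It assumes SHA-256 and borrows the `sha256-<base64>` string shape used by Subresource Integrity. The proposal's actual hash format is not stated in this thread, and the helper names and example URL are hypothetical.

```python
import base64
import hashlib


def integrity_value(content: bytes) -> str:
    """Return an SRI-style string: 'sha256-' + base64 of the SHA-256 digest."""
    digest = hashlib.sha256(content).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def accept_bundle(url: str, content: bytes, expected: str) -> bytes:
    """Return `content` only if its hash matches `expected`; otherwise reject it."""
    actual = integrity_value(content)
    if actual != expected:
        raise ValueError(
            f"integrity mismatch for {url}: expected {expected}, got {actual}"
        )
    return content


# Usage: the URL stays a normal URL; the hash only constrains what may be served there.
bundle_bytes = b"example bundle bytes"
expected_hash = integrity_value(bundle_bytes)
accept_bundle("https://example.com/app.wbn", bundle_bytes, expected_hash)
```

Note that the URL is never derived from the hash here, which is why this reads closer to subresource integrity than to pure content addressing.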

jyasskin referenced this issue Mar 24, 2021
- Add FAQ: WebBundles for Ad Serving use cases
- Add FAQ: Subresource Integrity
@hayatoito
Collaborator Author

Thanks @jyasskin! I really appreciate your feedback!

Let me reply to the points I can answer briefly. For the rest, I'll take the feedback into consideration and improve the proposal accordingly.

Re 1:
Sounds good! Let me put the proposal into explainers/proposals. I'm thinking of splitting the current proposal into two: 1) Declare dependencies on external bundles, and 2) Content Addressable Bundles, so that each proposal is easier to understand and its use cases and goals are clearer. I'll put them into explainers/proposals once I finish splitting. That might take some time.

Re 3, 5:
I wasn't aware of #40. Thanks! It seems to share a common goal. I'll mention #40 in the proposal and use #40 for further discussion.

Re 6:
That's one of the TODO items. I'm still exploring how entry points should work, and I'm now looking at how webpack handles them to learn from what they have been doing. Yes, #617 is related. Thanks!

I'll keep this issue up-to-date once I have more insights.
