-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Helpers design doc #94
base: main
Are you sure you want to change the base?
Conversation
Does this design account for composing two loaders that are both trying to use these helpers? Decorator pattern can work well for stuff like this. Decorator pattern could actually work for all hooks: these helpers, existing hooks, and the fs hooks. // my-loader.mjs
export function create(hooks: NodeLoaderHooks) {
return {
...hooks,
resolve(url: Url, context) { /* resolve hook goes here. Can call `hooks.resolve()` if appropriate */ },
fileExists(path: string) { /* augment fileExists */ },
findPackageJson(packageName: string, base: string, isScoped: boolean) { /* ... */ }
// All other hook behaviors are unmodified by this loader. Other loaders might modify them
}
} The snippet above has them all on a single namespace, but could also work splitting into multiple namespaces. export function create(hooks) {
...hooks,
fs: {
...hooks.fs,
fileExists() { /* augment fileExists */ }
}
} |
The idea is that each helper would be a “pure” function, operating only on its input; hence the need for passing in a lot more input, from references to other functions to also containers of state like the module cache. But once every helper function is pure and stateless, they should be able to be used by any loader in a chain without conflict. |
I'm thinking about a situation where one loader needs to augment the behavior of a helper function, and another loader needs to pass an implementation of that helper function into another helper function. The second loader needs a reference to the first loader's augmentations of the helper. |
Presumably, with ambiant loaders (I'm working on them), one loader could just override the |
This isn’t a use case I considered, like say if you want to override a helper like I’m still hoping to avoid needing lots of defined hooks, like separate hooks for every filesystem operation etc. For now my goal is only to reduce boilerplate among hooks that users are writing, not provide even more ways to hook into Node’s internals (at least, not as part of this effort). Getting back to my original question, does anyone have any feedback about this general design pattern? |
To override Something like: import * as wrappedHelpers from 'loader_utils_generated_alias_46873'; // avoid collisions
export function findPackageJson() { wrappedHelpers.findPackageJson(/* augmentations */) /* augmentations */ } If helper A calls helper B, calls helper C, then helper C's dependencies should be pass-able to helper A. So that any helper can accept a I dunno if there's any performance difference between a single |
I was thinking that each helper would be available on But please, let’s stop focusing on overriding helpers. The question is whether the API I’ve started to sketch out in this draft PR is good or could be improved. |
It might help to get a usage example for invoking |
I’ve added an example of the intended/expected usage: https://github.com/nodejs/loaders/pull/94/commits/0d263e3b7c9ff83812aa02a96e99cbbf47d7e444. I’ll add another example with a custom implementation of one of the child functions. |
The situation I'm thinking of is: what's intended implementation when |
At the moment I can’t think of a realistic example for why you’d want to customize const tryStatSync = () => true; // Or whatever
const pathToPackage = fileURLToPath(packageResolve(specifier, context.parentURL, context.conditions, {
tryStatSync,
})); |
I'm thinking about virtual filesystem situations. Not sure if that's intended to be done in this way, or via the virtual filesystem hooks. But I like that we got clarification on the ability to pass transitive dependencies into a helper. So a single |
Fwiw my thinking for virtual filesystems is that we would be able to override One note: perhaps it'd be simpler if the helpers too a import fs from 'node:fs';
packageResolve(..., {fs}); Rather than (which could get unwieldy if there was multiple FS IO slots): import {tryStatSync} from 'node:fs';
packageResolve(..., {tryStatSync}); |
* @param {Set<string>} conditions | ||
* @returns {resolved: URL, format? : string} | ||
*/ | ||
function packageResolve(specifier, base, conditions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Shouldn't conditions
be part of a parameter bag? I think it's reasonable to imagine we would want to add more settings affecting the resolution down the road; making each one a separate argument would make the function difficult to use (especially when you only want to mention some of them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the signature of the real function in https://github.com/nodejs/node/blob/3350d9610864af3219de7dd20e3ac18b3c214c52/lib/internal/modules/esm/resolve.js. We can change that, but I wasn’t thinking of that right now.
|
||
```js | ||
function packageResolve(specifier, base, conditions, { | ||
parsePackageName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are parsePackageName
& the other functions below other helpers that you just didn't list for brievety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I wanted to get feedback on this design before I wrote it out for all the other helpers. These are all functions in https://github.com/nodejs/node/blob/3350d9610864af3219de7dd20e3ac18b3c214c52/lib/internal/modules/esm/resolve.js
I think the question here is whether you want to override all filesystem calls, including app code that uses If you really want to override all file IO, you could just replace the |
It's still my understanding that the thought experiment from #89 (comment) requires composing helper functions from two different loaders into one. This design by itself doesn't seem to fill that need. So for this design to make sense, I think there needs to be some understanding that an additional feature will fill the cross-loader composition requirement. |
Can you explain a bit what this requirement is? There’s a lot of stuff in that comment you’re linking to. My goal with these helpers is just to reduce boilerplate. You could use them in a sorta-compositional way by having one loader override these helpers, so that when the next loader imports them they get the custom versions. So if the first loader uses These helpers aren’t meant to solve all outstanding feature requests for loaders. It’s just one part of the roadmap. If there’s anything about creating these helpers that hinders any future use cases, or any way that the design of these helpers can be improved to better enable those use cases, please make specific suggestions on what to change. |
The API needs more work, but overall these seem like positive directions. One approach might also be to just start simple - I don't think anyone would reject exposing With package boundary functions, there are some questions as to whether it should directly return configuration, or if you should use path-based APIs primarily - Note that Perhaps having |
The yarn team might have some good insight on the design, since they have similar functions in their yarn PnP API. For example, referring to #94 (comment):
The latter exists in yarn's PnP API, named This I think clearly matches the way yarn and ts-node need to interop.
So when pnp and ts-node hooks are both active, the resolver should use:
I acknowledge that I'm jumping ahead, because I'm talking about composing hooks from two loaders. But I think it's still relevant to the design of these helpers. |
In this case it would resolve to |
Regarding augmenting functionality, could these helpers be implemented in ESM? If so, then couldn't loaders be used to customise their dependencies? // exposed on node:module
import { whatever } from 'node:fs';
export default function packageResolve(…) { … } // custom-loader.mjs
import * as td from 'testdouble';
export async function load(url, { parentURL }, nextLoad) {
await td.replaceEsm('node:fs', { whatever: fakeWhatever });
const { packageResolve } = await import('node:module');
// …
}
function fakeWhatever() { … } |
At the moment, at least, all internal code must be CommonJS. As far as I understand it (and I’m sure this is a simplification), we’d have to rebuild how the internal code gets assembled to be passed into V8 at startup for it to become ESM, since it’s not like internal code uses either the CJS or ESM loaders to run. |
Is exposing the current logic as pieced helpers the right approach? I see two use cases that relate to this:
I can see how helpers would be a perfect fit for (2), but just exposing the logic without allowing for overriding it would still cause a lot of rewriting and maintenance for use case (1), no? A few examples:
Essentially, the deeper in the call stack the behavior trying to be customized is, he more replication there will be. A simple wrapper around the helpers would allow to change references to point to custom functions but it would be patching the internal loader and is hacky. As a compromise couldn't node expose the a To a limited extent, this is what
I'd image this is achievable through adapters and shims without much change to the current public spec. But it could be taken further later with decomposed prototypes for [1] nodejs/node#33460 |
No, because then the entire class becomes a public API surface and we could essentially never change any part of it without those changes being breaking. This is what happened with CommonJS monkey-patching and is why ESM is comparatively locked down. |
Well, all the internal CJS resolver functions are still private, so that's not the best example. |
Yeah, I don't think this is the same as require. CJS implementation allows for ad-hoc customization of certain behavior by using a Module prototype to get/set function references used in their private logic. I'm sure there were good reasons for it but with the benefit of hindsight, it doesn't look right. I'd separate the concern into two parts:
The suggestion is to expose the prototype and not the instance used by node, and either can be frozen so (2) is not a problem. Customization would happen through prototype extension into a new class/prototype that would then be used in custom loaders. Everything is immutable. As for (1), the API surface could be made to match whatever helpers were going to be created. Maintainers would choose what logic makes the "stable"/"public" cut, and the rest could be private. Granted, the more that makes to the prototype, the more useful it would be and if that's small enough there's no benefit. But it would still have the benefit of later be dialed up as see fit. My guess is the biggest risk of having more rather than less logic as part of the prototype is for use cases of using it as helpers. If used to create custom loaders, the authors of those libraries are fairly aware they are working at a lower level, documentation could state the stability of each method (much like it does today for some features), and the library users would essentially have to enable a feature flag to use them (i.e. The API surface for the helper use case and custom loader case doesn't even have to be the same. The custom loader prototype could be passed to a factory method that node calls to create custom loader, and only a subset of it - possibly even keeping with the named function exports idea, but still using the prototype instance behind the scenes - could be available for importing to use as helper. This would be quite a break tho. |
I started writing a design doc for the helper functions we’re considering adding, to help people reimplement only small portions of hooks (see README). The idea is that if someone wants to change a small part of what happens within Node’s internal
resolve
, for example, rather than copy/pasting all the code from Node’s internalresolve
and changing one thing (and then this loader getting outdated with the next version of Node where Node’s internalresolve
changes) we provide lots of functions that make up all the steps of Node’sresolve
, and the hook author can sting those together with whatever minor changes they want as part of the userresolve
hook.On a practical level this will mean making public many of the functions in https://github.com/nodejs/node/blob/3350d9610864af3219de7dd20e3ac18b3c214c52/lib/internal/modules/esm/resolve.js, and splitting some of the larger ones into smaller ones that become public. Because many functions reference other helper functions, I’m thinking of a pattern where most of these internal calls become references that get passed in. So for example within
packageResolve
there’s a call toparsePackageName
; the latter can be passed in as an argument topackageResolve
, if the user desires to overrideparsePackageName
withinpackageResolve
. Leaving it undefined would mean that Node’s default internalparsePackageName
would be used.I’m opening this PR as a draft to get initial feedback on this general approach. If you all think it’s good and the best way to address how to make these functions into public APIs, then I’ll keep going on this design doc and try to define all of them (at least for
resolve
, for now) and then we can implement this.