Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design doc for reuse of defaultResolve #33

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

jonaskello
Copy link

@GeoffreyBooth I hope it was something like this you were referring to as design doc :-). It's not really a ready-made design proposal at this point but more a description of requirements and ideas.

@jonaskello
Copy link
Author

@DerekNonGeneric
Copy link
Contributor

@jonaskello, this topic is also something I have known would require a design document, so I am glad that you made a good start with this seeing as how we still have not determined how we are going to expose the default resolution algorithms to a) the loader or b) anywhere else in userland. I think these default algorithms should be exposed as helper functions as part of a core module, but we have not really given this topic as much air-time as we probably should have (others have requested it already).

I would gladly endure a presentation about how folks propose we go about doing this and I'm sure the others would as well.

@jonaskello
Copy link
Author

I'm mostly interested in the utility functions approach, specifically for the resolve algorithm. I would suggest this way forward:

  • Extract some functions in resolve.js to a separate file resolve-utils.js. The functions in resolve-utils.js are functions that later should be exposed as a core module.
  • The original resolve.js should call the functions in the resolve-utils.js file so the default resolve becomes a consumer of the utility functions (and so they are tested when default resolve is tested).
  • In the first step be just move functions as-is to the utils file.
  • The next step would be to refactor the functions in resolve-utils.js so they do not do any fs calls (instead moving any fs calls to the resolve.js file so they are made on the "outside" of the utils file)

I could propose the changes above in a PR but I think there is some process that should be followed here? Could someone please suggest the next step I should take in order do move forward with the above?

@GeoffreyBooth
Copy link
Member

Could someone please suggest the next step I should take in order do move forward with the above?

I think maybe write down which functions would go in this new file and what their signatures would be; and note which ones would change (like removing FS calls).

@jonaskello
Copy link
Author

Yes, thanks, that seems like an obvious next step now that you mention it :-). I've been working towards a more complete resolve function for typescript in this repo. In that code I've copied the nodejs default loader code for the resolve part and tried to rework the API for it. I'll try to get some time to extract the signatures from there and put it in this design doc.

@JakobJingleheimer
Copy link
Member

Regarding exposing those utils via a core module (presumably module), that sounds to me fairly straight-forward and doable with low effort and high reward (and should be transparent to existing tests 🙏).

As for overriding them for a hook to use, if the modified bits need to exist/persist outside the hook/loader, that sounds much, much more complicated: That would need to somehow be mapped per loader so that one loader doesn't pull the rug out from under another (where loader A and loader B both replace utility foo with entirely different implementations, and loader A gets loader B's foo).

But if the override only needs to exist within the loader, I think the createRequire-like approach someone (maybe you?) proposed a few meetings ago could do the trick. Perhaps something like

import { createResolve } from 'module';

import { packageJsonReader, statSync } from './resolveOverrides';

const customResolve = createResolve({
	packageJsonReader,
	statSync,
	// … any other resolve util
});

export async function resolve(/* … */) {
	const something = customResolve(/* … */);
}

@arcanis
Copy link
Contributor

arcanis commented Oct 23, 2021

But if the override only needs to exist within the loader, I think the createRequire-like approach someone (maybe you?) proposed a few meetings ago could do the trick. Perhaps something like

This would be too limited for real-world use cases, where loaders frequently need to take "ownership" of a part of the resolution in a way that others don't have to be aware of what has been changed exactly.

@JakobJingleheimer
Copy link
Member

If only 1 loader does it, that would be fine. But I think as soon as more than 1 claims dominion over mutually exclusive behaviour, we have an unstoppable force meets an immovable object scenario 😣

@jonaskello
Copy link
Author

I've added an example API for the utility functions and example application code to call them. This is a rough first version that I arrived at when working on ts-resolve. I wanted to keep it fairly close to what the original code looks like. I think with further iteration we could completely remove the need for filesystem callbacks and also cleanup/merge some of the functions to have a smaller API surface.

@GeoffreyBooth
Copy link
Member

This is really good. I like all the function signatures, that’s what we need to go ahead and build this.

I think you can remove the passing mention of possible hooks to override low-level file system calls. I feel like such hooks probably wouldn’t be chainable, and you also probably wouldn’t have all the context you’d need to know what to do (like what the original specifier was). I think the other design with the utility functions is much more promising.

Could we add an example of a user loader that uses these utility functions? Like a user loader resolve hook that uses one or more of the utility functions to reduce boilerplate.

@arcanis
Copy link
Contributor

arcanis commented Oct 26, 2021

I think you can remove the passing mention of possible hooks to override low-level file system calls. I feel like such hooks probably wouldn’t be chainable, and you also probably wouldn’t have all the context you’d need to know what to do (like what the original specifier was). I think the other design with the utility functions is much more promising.

I'd really appreciate a practical example of how you envision loaders to interact when they need to override the filesystem. Even if only in pseudo-code, can you write a loader that would allow to load files from zip archives and another that would add resolution for the browser field? I don't see how the first loader would tell the second how to check the file existence without dedicated hooks, or the second having private knowledge of the first (which isn't scalable), so I'd appreciate some clarification (since it came up a few times already).

@jonaskello
Copy link
Author

@GeoffreyBooth I removed the file system hooks mention. Added an example resolve hook to resolve typescripts file. The example is not runnable at this point, it's more like a sketch on how it could be done. If I get some more time over I might be able to put together a runnable example.

@jonaskello
Copy link
Author

jonaskello commented Oct 26, 2021

@arcanis I cannot see how that would be done either. If I was tasked with doing something like that I would probably prefer to compose my own single loader from utility packages rather than chaining two pre-existing loaders, ie. there is a package that exposes utilities loading from zip files, and another package that exposes utilities for resolving the browser field. I write my own loader that imports both those packages and calls into those utility functions.

Although I could see how chaining existing loaders would be useful for non-programmers. To me the chaining loaders seems akin to middlewares in a http server. In that case there is often a mutable context passed between them and some middleware makes an assumption that a previous middleware put something on the context. So like you said second has knowledge of first, or at least has a requirement for the context to be filled in by some previous middleware. I'm sure this has already been discussed in length :-)

@arcanis
Copy link
Contributor

arcanis commented Oct 26, 2021

If I was tasked with doing something like that I would probably prefer to compose my own single loader from utility packages rather than chaining two pre-existing loaders, ie. there is a package that exposes utilities loading from zip files, and another package that exposes utilities for resolving the browser field. I write my own loader that imports both those packages and calls into those utility functions.

There are two reasons why this cannot work in my opinion:

  • The ecosystem is far too large and fast-moving to allow tooling maintainers to write "combined loaders". I mentioned that a solution needs to be scalable; this is precisely because our users will have too many use cases for us to anticipate them all - not only because of today's landscape, but because new ones will keep appearing.

  • If tooling maintainers don't write these "combined loaders", then it'll be up to our users to do it (which matches what you suggest). This wouldn't work either because 1/ the JS ecosystem has many users of various levels, and not all of them have the knowledge required to write loaders 2/ even levels put aside, developer experience is a significant concern, and the higher is the friction the more frustrating will be using tools in the Node ecosystem.

If the goal of the Loaders API is to enable tooling authors, then I have to insist that 3rd-party loader composition must be a first-class citizen, not something that's left to Node's users (and tooling maintainers) to solve 😕

In that case there is often a mutable context passed between them and some middleware makes an assumption that a previous middleware put something on the context.

A context would be fine but it would need to be standardised. If, say, Yarn has to specify that we provide specific hooks that 3rd-party loaders have to use if defined, this will require Yarn to "sell" them to each loader author - and that'd be a very difficult and thankless task to give us.

But if Node says, in its documentation, something like this (written on the top of my head), then it's fine by me:

The resolve hook is passed a context argument with the following fields:

interface LoaderContext {
  fs: PromisifiedFs;
}

Conformant loaders MUST use the fs property from the provided context rather than the fs builtin module. [And for good measure perhaps block access to the fs builtin, or at least warn on access, since it'd be unreliable.]

Does that make sense?

@jonaskello
Copy link
Author

Sure, that makes sense. I currently have this in ts-resolve which I think is similar to what you suggest:

export type FileSystem = {
  readonly cwd: () => string;
  readonly isFile: IsFile;
  readonly isDirectory: IsDirectory;
  readonly getRealpath: GetRealpath;
  readonly readFile: ReadFile;
};

export type IsFile = (path: string) => boolean;
export type IsDirectory = (path: string) => boolean;
export type GetRealpath = (path: string) => string | undefined;
export type ReadFile = (filename: string) => string | undefined;

As I said I can see the use-case of chaining for non-programmers. As a programmer though I do prefer code. Complex configuration issues can be fairly hard to debug. Like in some CI build systems that try to do everything with complex declarative config and plugins for everything where instead a few lines of custom script could have done exactly what was needed. But I think this is a different discussion :-).

Too keep this PR a bit focused, I think maybe we should have two separate design proposals, one for utility functions (this PR) and one for filesystem hooks/context (a new PR). If the utility functions can be made "filesystem-less" (no filesystem calls and no passing in callbacks for filesystem calls) then they are fairly independent from the filesystem hooks/context suggested above? So the two proposals would not be competing but instead be complementary?

@GeoffreyBooth
Copy link
Member

If the goal of the Loaders API is to enable tooling authors, then I have to insist that 3rd-party loader composition must be a first-class citizen

We’ve been calling this “chaining.” It was the topic of today’s meeting. There’s a design doc linked from the main README about it, and it’s the next item on the loaders roadmap.

The current load hook is meant to abstract reading files from disk, to enable cases like loading over HTTPS; that case is an example in the Node docs. I think we’ll want “higher level” hooks like this, rather than hooks at the fs level that could interfere with how user code interacts with the file system.

As for loading from a .zip file, that’s possible already with the load hook. The part that’s tricky is resolving when a package.json is inside a .zip file. That would involve reimplementing lots of Node’s resolution algorithm, which having utility functions would help abstract.

@jonaskello
Copy link
Author

jonaskello commented Oct 27, 2021

I can see how this would work in a single resolver hook, but I think what @arcanis is referring to is when you chain two resolver hooks together. Today the resolver hook has this signature:

resolve(specifier, context: {conditions, parentURL}, defaultResolve)

Now if I would have two of these functions, one that can read package.json from zip files and another that can resolve js imports back to typescript source files, how would they work together if you call them one after the other?

If I would implement a typescript resolver hook with that signature, I would have to make the decision inside that hook how to access the file system. This is why I decided to not publish a package with a ready-made resolve hook but instead a package with lower-level resolve function. So my resolve function in ts-resolve looks like this:

export function tsResolve(
  specifier: string,
  context: { parentURL: string | undefined; conditions: ReadonlyArray<string> },
  tsConfig: string,
  fileSystem?: FileSystem | undefined
): { fileUrl: string; tsConfigUrl: string } | undefined;

So this function has not decided how to use the filesystem, instead the caller can pass their own filesystem object that has the required functions. IMO it is better to publish these type of utility packages rather than ready-made resolve hooks. As an analogy I also believe it would be better to publish utility functions instead of ready-made middleware for http servers. Eg. the graphql team publish a middleware ready made for express, but it does not work in koa, fastify etc. If they instead published a library of pure functions to parse a graphql request it would be easy to write your own middleware in any of the http server frameworks. I'm thinking that if there was a clean solution to chaining stuff like this then middlewares in http servers would have it as this is a problem I guess have been studied for many years. But they don't, instead they pass a mutable context that can be anything which is not really clean IMO.

I have not read the chaining proposal so this is just my uninformed take on it at the moment :-).

@arcanis
Copy link
Contributor

arcanis commented Oct 27, 2021

Now if I would have two of these functions, one that can read package.json from zip files and another that can resolve js imports back to typescript source files, how would they work together if you call them one after the other?

Yep, exactly what I meant. I'm not too worried about chaining in general, or about the helpers having some way to configure how they access the filesystem, since those are both fairly straightforward APIs that won't need many iterations. I'm more concerned about the two put together (multiple resolvers supported by a virtual fs), because there are multiple different ways to solve the problem and I suspect it'll take some dedicated discussion to pick one.

I couldn't attend the last meeting for familial reasons, but I'll book some time to be there at the next one and put that on the agenda, if that's ok @GeoffreyBooth.

(Sorry to have derailed your thread, @jonaskello, I got worried about the specific mention of "no fs hook" 😅)

@GeoffreyBooth
Copy link
Member

Please see https://github.com/nodejs/loaders/blob/main/doc/design/proposal-chaining-middleware.md for a WIP design doc for chaining.

@JakobJingleheimer
Copy link
Member

Sorry, a bit behind and catching up:

If I was tasked with doing something like that I would probably prefer to compose my own single loader from utility packages rather than chaining two pre-existing loaders, ie. there is a package that exposes utilities loading from zip files, and another package that exposes utilities for resolving the browser field. I write my own loader that imports both those packages and calls into those utility functions.

This sounds like a good design in theory but untenable in practice (for the reasons Maël cited).

Although I could see how chaining existing loaders would be useful for non-programmers. To me the chaining loaders seems akin to middlewares in a http server.

Funny you mention that 😜

As a programmer though I do prfer code. Complex configuration issues can be fairly hard to debug.

💯 we cannot end up with a Karma config monstrosity (we're fortunately a long way away from that though).


RE next meeting: I'll be travelling back from Provence a few hours before, so if I randomly don't show up, I'm trapped in transit (if I'm more than 5 mins late, I'm probably not gonna make it).

@JakobJingleheimer JakobJingleheimer added the loaders-agenda Issues and PRs to discuss during the meetings of the Loaders team label Nov 17, 2021
@GeoffreyBooth GeoffreyBooth removed the loaders-agenda Issues and PRs to discuss during the meetings of the Loaders team label Dec 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants