Ability to resolve dependencies "in the past" #212

Open
AlexandreDecan opened this issue Jul 12, 2023 · 11 comments
@AlexandreDecan

AlexandreDecan commented Jul 12, 2023

Hello @andrew,

This endpoint is really useful to resolve dependencies based on the current situation of a package (i.e., based on the releases that are currently available). Would it be possible to add a "date" field to simulate dependency resolution for a specific date? To some extent, this means only considering releases of packages that were available up to the given date.

Consider the following example:

  • Root package is "A" in version 1.0.0 (version is not important here), and has a dependency towards "B";
  • Dependency constraint for "B" is >=1.0.0;
  • On 23-07-12 (today), B has two releases: 1.0.0 released on 23-06-01 (last month), 1.0.1 released on 23-07-01 (12 days ago).

Currently, the endpoint would list "B@1.0.1" as a resolved dependency (as 1.0.1 is the latest/highest release satisfying the constraint). What we would like to know is, for example, what the dependencies of A were on 23-06-01. In this specific example, the endpoint would return "B@1.0.0", since 1.0.0 is the highest/latest release of B complying with the given dependency constraint at that point in time (1.0.1 had not yet been released on that date). So, with a very naive approach, given a date field, resolving dependencies (and transitive dependencies) would consist of applying the current algorithm while only taking into account releases whose release date is <= date.
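The naive approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the resolver's actual code: the release table, `parse_version`, and `resolve_at` are hypothetical names, versions are assumed to be plain dotted integers, and the only constraint handled is ">= minimum".

```python
from datetime import date

# Hypothetical release data for package "B", taken from the example above.
RELEASES_B = {
    "1.0.0": date(2023, 6, 1),
    "1.0.1": date(2023, 7, 1),
}


def parse_version(version):
    """Turn '1.0.1' into (1, 0, 1) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve_at(releases, minimum, as_of):
    """Return the highest version >= minimum released on or before as_of.

    Returns None when no release satisfies both conditions.
    """
    candidates = [
        version
        for version, released in releases.items()
        if released <= as_of and parse_version(version) >= parse_version(minimum)
    ]
    return max(candidates, key=parse_version, default=None)


print(resolve_at(RELEASES_B, "1.0.0", date(2023, 7, 12)))  # 1.0.1
print(resolve_at(RELEASES_B, "1.0.0", date(2023, 6, 1)))   # 1.0.0
```

For transitive dependencies, the same date filter would have to be applied each time the versions of a dependency are listed.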

(Adding @HassanOnsori since he's the one working on this currently ;-))

@andrew andrew self-assigned this Jul 13, 2023
@andrew andrew added the enhancement New feature or request label Jul 13, 2023
@andrew
Member

andrew commented Jul 13, 2023

Yes I think that's quite doable, I'll need to add a parameter option to both the resolve service and the versions list endpoint on the packages service, will take a look at that either tomorrow or next week.

Also I think you're one of the first teams using this service, so good to know how you find it or any other feedback you have.

andrew added a commit to ecosyste-ms/packages that referenced this issue Jul 13, 2023
@andrew
Member

andrew commented Jul 13, 2023

I've added the published_before parameter to the packages service (date-time), example: https://packages.ecosyste.ms/api/v1/registries/conda-forge.org/packages/imaris-ims-file-reader/versions?published_before=2022-08-19T00:12:30.000Z
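For anyone building this request from a script, here is a minimal sketch of assembling the URL. The path layout and the `published_before` parameter come from the example above; the helper name `versions_url` is hypothetical, and percent-encoding the registry and package name in the path is an assumption about what the service accepts (it matters mostly for names containing "/" or "@").

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

BASE = "https://packages.ecosyste.ms/api/v1"


def versions_url(registry, package, published_before):
    """Build the versions-list URL, limited to versions published before a timestamp."""
    path = (
        f"{BASE}/registries/{quote(registry, safe='')}"
        f"/packages/{quote(package, safe='')}/versions"
    )
    # urlencode percent-encodes the colons in the timestamp.
    return f"{path}?{urlencode({'published_before': published_before})}"


url = versions_url(
    "conda-forge.org", "imaris-ims-file-reader", "2022-08-19T00:12:30.000Z"
)
print(url)
# Decoding the query string gives back the original timestamp.
print(parse_qs(urlsplit(url).query)["published_before"][0])
```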

Just need to do the same in the resolve service now.

@AlexandreDecan
Author

Thanks Andrew, you're awesome ;-)

> Also I think you're one of the first teams using this service, so good to know how you find it or any other feedback you have.

We are currently exploring the possibilities offered by ecosyste.ms and so far, we are happy with what you've done! :-) That said, the main difficulties we have relate to the lack of "documentation": it is not always easy to understand exactly what each endpoint is providing, and what's the content of each field returned by the API. I guess it's a matter of habit :-) So far, in case of doubt, I look at the code (but I'm not familiar with Ruby, and it's good that I've some experience with Django to understand how Rails is structured ^^)

It is also quite difficult to grasp the "implicit relational model" behind these endpoints, especially when working with GitHub Actions, since Actions are packages (i.e., they come from the "package" endpoints), but most of the data about these packages can be obtained from the "repo" endpoints (e.g., releases of Actions are not "package releases", but "repository tags"; dependencies for these Actions can be obtained from the "repo/manifest" endpoint, and not from the "package dependencies" endpoint, etc.). I guess it's quite specific to Actions since they do not really have explicit releases, nor explicit dependencies ;-) But so far, we manage to get the data we need :)

Some other minor points we noticed:

  • The name of an ecosystem or a registry is case sensitive, and we often get 404 because we misspelled them (e.g., "github" instead of "GitHub", or "npm" instead of "npmjs.org", etc.);
  • There are a few API endpoints that are not "fully" consistent with other ones, in the sense that the response headers do not always list the total number of items, nor the total number of pages. I don't remember exactly which ones, but @HassanOnsori should be able to provide some examples :-)
  • @HassanOnsori will confirm this, but if I remember correctly, some endpoints do not respect the "per_page" field (and the default value is not mentioned in the documentation, but it seems to be consistently 100). To be checked/confirmed. The allowed range for "per_page" could also be documented: I once requested a list of 10,000 packages from npm, and it worked. That's good, but I guess you expect users not to use such large numbers :-) Using a larger number resulted in a 500 error ;)
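Until names are matched case-insensitively on the server, a client-side workaround for the first point could look like the sketch below. Everything here is hypothetical: the `CANONICAL` list only contains names that appear in this thread (a real client would load the full list from the API), and the `ALIASES` table is an illustration built from the "npm" example.

```python
# Canonical names seen in this thread; not an exhaustive list.
CANONICAL = ["npmjs.org", "conda-forge.org", "GitHub"]

# Hypothetical shorthand table; only the "npm" entry comes from the comment above.
ALIASES = {"npm": "npmjs.org"}


def canonical_name(name, canonical=CANONICAL, aliases=ALIASES):
    """Map a loosely spelled registry or ecosystem name to its exact spelling."""
    lowered = name.strip().lower()
    lowered = aliases.get(lowered, lowered).lower()
    for candidate in canonical:
        if candidate.lower() == lowered:
            return candidate
    raise KeyError(f"unknown registry or ecosystem: {name!r}")


print(canonical_name("github"))  # GitHub
print(canonical_name("npm"))     # npmjs.org
```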

@andrew
Member

andrew commented Jul 26, 2023

I've deployed a first pass at this: ea6cbfd. The before parameter should be in a standard datetime format, e.g. 2022-08-19T00:12:30.000Z
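A value in that format can be produced from a timezone-aware Python datetime as sketched below. The helper name `to_before_param` is hypothetical, and whether the service also accepts other datetime spellings is not covered here.

```python
from datetime import datetime, timedelta, timezone


def to_before_param(moment):
    """Format an aware datetime as UTC with milliseconds and a trailing 'Z'."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


print(to_before_param(datetime(2022, 8, 19, 0, 12, 30, tzinfo=timezone.utc)))
# 2022-08-19T00:12:30.000Z

# A non-UTC input is converted to UTC first.
cest = timezone(timedelta(hours=2))
print(to_before_param(datetime(2022, 8, 19, 2, 12, 30, tzinfo=cest)))
# 2022-08-19T00:12:30.000Z
```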

@AlexandreDecan
Author

Thanks! When I try on https://resolve.ecosyste.ms/ with one of my packages (namely portion on PyPI), I always get the same result regardless of the value of before. It seems that the result was cached, since the page has "Generated on 25 Apr 2023 09:12" written on it ;-)
I tried to use the API, but when I click on "API", leading to https://resolve.ecosyste.ms/docs/index.html, I get an error ("Fetch error - Not Found /docs/api/v1/openapi.yaml").

Considering the "resolve" endpoint, in complement to the name of a package, would it be possible to specify the version of that package whose dependencies should be resolved?

@andrew
Member

andrew commented Jul 26, 2023

@AlexandreDecan pushed a fix for the caching issue and added the api doc file.

@andrew
Member

andrew commented Jul 26, 2023

@AlexandreDecan I've deployed an experimental, optional version parameter as well, give it a try.

@AlexandreDecan
Author

The documentation (in the API page) for version indicates a version string($date-time). Is this expected? I think version string is what you meant?

I'm waiting for the job to complete, but looking at the code, I'm wondering whether transitive dependencies comply with the before parameter. Note that I'm not at all familiar with Ruby, so I may have missed something ;-) I also noticed, while looking at the changes you made in ea6cbfd, that there's a TODO note indicating that versions should be sorted (see ea6cbfd#diff-7e307d8bab95a59317bbf5d2fe25e1862fbb6d884c0664df6750e6957fceaa93R36). I don't know exactly in which order versions are currently sorted, but sorting versions is definitely needed to resolve dependencies. Most package managers sort versions by date (except NuGet, which did this in reverse order, at least a few years ago, and I don't know why ^^). Some package managers (such as pip or npm) sort versions by "version number" (e.g., even if "2.0.1" is newer than "3.0.0", if "3.0.0" complies with the dependency constraint, it will be selected instead of "2.0.1"). Is this something you considered?
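To make the difference between the two orderings concrete, here is a small sketch. The release list is invented for illustration (it mirrors the "2.0.1 published after 3.0.0" case), and `version_key` is a hypothetical helper that only handles plain dotted integers, not pre-releases or ecosystem-specific version schemes.

```python
from datetime import date

# Invented releases: 2.0.1 is a maintenance release published after 3.0.0.
releases = [
    ("2.0.0", date(2022, 11, 1)),
    ("3.0.0", date(2023, 1, 10)),
    ("2.0.1", date(2023, 3, 5)),
]


def version_key(version):
    """Turn '2.0.1' into (2, 0, 1) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


latest_by_date = max(releases, key=lambda release: release[1])[0]
highest_by_number = max(releases, key=lambda release: version_key(release[0]))[0]

print(latest_by_date)     # 2.0.1
print(highest_by_number)  # 3.0.0
```

With a constraint such as ">=2.0.0", the two orderings select different versions, so the resolved dependency set depends on which one the resolver uses.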

@andrew
Member

andrew commented Jul 26, 2023

@AlexandreDecan sorry that's a copy-pasta typo, fixed now. I believe all transitive dependencies should honor before,

url = "https://packages.ecosyste.ms/api/v1/registries/#{@registry}/packages/#{package_name}/versions?per_page=1000"
is the only line that loads versions.

I'll check the sorting tomorrow, this project is a bit of a "best-case" right now as a generic resolver, it doesn't encode the exact resolution strategy of each package manager, but definitely can improve in places.

@AlexandreDecan
Author

Would it be easy for you to add support for dependency constraint in the "version" field (or through another field)?

@HassanOnsori

Hi @andrew,

I used the 'resolve' API to resolve the dependencies of an npm package, but I encountered an error in the result. The error message is as follows:

error@#<PubGrub::SolveFailure: "Could not find compatible versions\n\nBecause @actions/github >= 5.0.1 depends on @octokit/plugin-rest-endpoint-methods >= 5.13.0, < 6.0.0\n and no versions satisfy @octokit/plugin-rest-endpoint-methods >= 5.13.0, < 6.0.0,\n @actions/github >= 5.0.1 is forbidden.\nSo, because root depends on @actions/github = 5.0.3,\n version solving has failed.">'.

https://resolve.ecosyste.ms/resolve?registry=npmjs.org&package_name=@actions/github&version=5.0.3&before=2022-09-22%2008:04:01+00:00
Below are the parameters used for the above endpoint:
Registry: npmjs.org
Package Name: @actions/github
Version: 5.0.3
Date Before: 2022-09-22 08:04:01+00:00

More examples :
https://resolve.ecosyste.ms/resolve?registry=npmjs.org&package_name=@actions/core&version=^1.10.0&before=2023-07-06%2013:09:39+00:00
https://resolve.ecosyste.ms/resolve?registry=npmjs.org&package_name=@actions/github&version=^5.1.1&before=2023-07-06%2013:09:39+00:00
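One detail that may be worth ruling out when reproducing these requests: the "+" in "+00:00" is not percent-encoded in the URLs above, and form-style query decoding turns a bare "+" into a space. Whether the resolve service decodes it that way, and whether this has anything to do with the SolveFailure, is not established here. The sketch below only shows how to build the URL with every parameter encoded; the parameter names come from the URLs above, and `resolve_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def resolve_url(registry, package_name, version=None, before=None):
    """Build a resolve URL with all query parameters percent-encoded."""
    params = {"registry": registry, "package_name": package_name}
    if version is not None:
        params["version"] = version
    if before is not None:
        params["before"] = before
    return "https://resolve.ecosyste.ms/resolve?" + urlencode(params)


encoded = resolve_url(
    "npmjs.org", "@actions/github", "5.0.3", "2022-09-22 08:04:01+00:00"
)
print(encoded)

# The URL as posted above, with a bare "+" in the timestamp.
raw = (
    "https://resolve.ecosyste.ms/resolve?registry=npmjs.org"
    "&package_name=@actions/github&version=5.0.3"
    "&before=2022-09-22%2008:04:01+00:00"
)

# Python's form-style decoder shows the difference between the two URLs.
print(parse_qs(urlsplit(raw).query)["before"][0])      # 2022-09-22 08:04:01 00:00
print(parse_qs(urlsplit(encoded).query)["before"][0])  # 2022-09-22 08:04:01+00:00
```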
