Skip to content
This repository has been archived by the owner on May 4, 2024. It is now read-only.

[move-package] Support dependency overrides #1023

Open
wants to merge 10 commits into
base: sui-move
Choose a base branch
from
Open

Conversation

awelc
Copy link
Collaborator

@awelc awelc commented Mar 30, 2023

Motivation

When building a dependency graph, different versions of the same (transitively) dependent package can be encountered. If this is indeed the case, a single version must be chosen by the developer to be the override, and this override must be specified in a manifest file whose package dominates all the conflicting "uses" of the dependent package. These overrides are taken into consideration during the dependency graph construction.

When constructing the graph (top to bottom) for internal dependencies (external dependencies are batch-processed at the end of the graph construction), we maintain a set of the current overrides collected when processing dependencies (starting with an empty set), and do the following when processing a package (starting at root):

  • process overrides first to collect a list of overrides to be added to the set (if they are not yet in the set - outer overrides "win") and then process non-overridden dependencies using a freshly updated overrides set
  • when trying to insert a package into the graph, mark it as being on the "override path" if the set of overrides is non-empty:
    • if no package exists in the graph, insert it
    • if a conflicting package already exists in the graph, override it if an override can be found in the set (if it does not, report an error)
    • if the same package already exists in the graph, and this package is on the "override path" keep checking its dependencies to make sure that previously used overrides are correct (dominate all uses of the package)
  • after a package is fully processed, remove its own overrides from the set.

External dependencies are provided by external resolvers as fully formed dependency sub-graphs that need to be inserted into the "main" dependency graph being constructed. Whenever an external dependency is encountered, it's "recorded" for future batch-processing along with the set of overrides available at the point of sub-graph insertion. When processing an individual sub-graph insertion, we do a depth first search of the subgraph in order to:

  • detect which of the sub-graph's packages need to be overridden (in which case their dependencies in the sub-graph no longer should be inserted into the main graph)
  • avoid inserting sub-graph edges into the main dependency graph if they belong to overridden
    packages

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

Tests added

Copy link
Collaborator

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the initial pass, I think it's largely there, thanks @awelc, I especially appreciate the comprehensive test suite ! High-level notes:

  • Some of the new functions need some love vis-a-vis doc comments -- it's a bit of a struggle to see how they all fit together without it, especially the recursive ones.
  • There's a comment that suggests that we should not exit early if we are on an overridden path that seems to exit early anyway -- potential bug and missing test case there?
  • We're generally missing tests for overrides that themselves contain dependencies, which certainly has potential for some hairy edge cases.
  • The tests make heavy use of the version field to create different versions of the same package even though they all point to the same place. I'm not sure how legitimate that is, because we don't really pay attention to the version during resolution (if anything, we should probably get rid of the version as a legitimate field on dependencies). Consider using different locations to represent different versions of a package.
  • Copybara -- we should probably port this PR over to the sui repo land it there and let copybara port it back. I was able to do that before by generating a patch file from my commit, and then applying it in sui.

@@ -0,0 +1,3 @@
[package]
name = "C"
version = "0.0.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If both A and B agreed on the version of C and it didn't match the version that C reported, I would expect the build to fail, is that what you've found in practice? If not and the version is just ignored, then we should construct a test case that picks versions of C from distinct packages, rather than tweaking the version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These versions are not taken into consideration at all. The only versions taken into consideration are those mentioned in the dependency specification. In other words if A depends on C in the following way:

C = { local = "../C", version = "1.0.0" }

and B depends on C in the following way:

C = { local = "../C", version = "1.0.0" }

or even this way

C = { local = "../C" }

We'd have a diamond problem if some root package depended both on A and B.

Since we rely on versions mentioned in the dependency specification, it did not seem necessary to create different subdirectories for different versions of local dependencies.

I can certainly do that, though, if the current setup is not clear enough (though hopefully not wrong...)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recording a conversation we had offline: I think the test set-up here is fine, but it's highlighted a peculiarity with dependency specification (that we treat dependencies as different based on the version field, even though we don't use the version field during dependency resolution), which we should change in future (probably by removing the version field).


[dependencies]
A = { local = "../A" }
ADep = { local = "../ADep", version = "1.0.0", override = true }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto on the use of version here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous comment explains how I indeed rely on these version numbers to distinguish between "different package versions" (#1023 (comment)). These two deps are considered "different" from the point of view of the algorithm (and dependency graph structure) as the two following ones:

ADep = { local = "../ADep", version = "1.0.0" }
ADep = { local = "../ADep", version = "2.0.0" }
ADep = { local = "../ADepV1" }
ADep = { local = "../ADepV2" }

@@ -0,0 +1,3 @@
Failed to resolve dependencies for package 'Root': Resolving dependencies for package 'B': Conflicting dependencies found:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the error message in this case to be the same as the error message for diamond_problem_dep_incorrect_override_empty, which points out that the override should dominate all uses -- is that feasible? If so, I think it will be easier to identify the issue in all cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this would be ideal but it's currently impossible... It's due to the restriction that when we process an overridden dependency we do not add this dependency to the set of overrides:

        // It's also pretty important that we do not extend overrides with the override being
        // currently processed as we have no way of knowing if it's the correct one (dominating all
        // package "uses"). If we did, we could override an existing (non-overridden) entry in the
        // graph with an incorrect override without being able to detect it.

In other words, in this case, there isn't even a notion of override when processing this particular pair of conflicting packages...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the doc comment to explain the restriction leading to different error messages a bit better:

        // It's also pretty important that we do not extend overrides with the override being
        // currently processed. The reason for it is that in order to detect incorrect overrides
        // (such that do not dominate all package "uses") we rely on the package being reachable via
        // different paths:
        // - if it's reached via an overridden path again for the same override, it's OK
        // - if it's reached via an overridden path again for a different override, it's an error
        // - if it's reached via a non-overridden path, it's an error (insufficient override)
        //
        // While the first type of error could still be detected if we injected the currently
        // processed override into the overrides set, the second one would not.  Consider
        // diamond_problem_dep_incorrect_override_occupied example (in tests) to see a situation
        // when a non-overridden path is chosen first to insert the package and then insufficient
        // override could be considered correct if we injected it into the overrides set (as we will
        // not have another path to explore that would disqualify it).

Entry::Occupied(mut entry) => {
if let Some(overridden_pkg) = overrides.get(&ext_name) {
// override found - use it and return - no reason to process its dependencies
entry.insert(overridden_pkg.clone());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do the acyclicity check here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I assumed we might as I don't think we have any guards against actively creating cycles during graph construction nor does override_verify include such check when traversing the graph, which could lead to infinitely traversing the cycle.

I will try creating an example of such behavior to either confirm the use of acyclicity here or rule it out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added an example that without the check would lead to infinite recursion.

Ok(())
}

fn merge_pgk(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite a complicated recursive function, it deserves a pretty hefty doc comment I think! (Could be that part of the big doc comment that you have at the top could go here?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!

self.check_acyclic()?;
self.override_verify(&pkg, name, overrides)?;
}
return Ok(pkg);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be guarded by the else of the if above? The comment above seems to suggest so.

This also implies a missing test case.

Copy link
Collaborator Author

@awelc awelc Apr 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be guarded by the else of the if above? The comment above seems to suggest so.

This was not my intention. The idea was that if package already in the graph (the same as package we are trying to insert) is NOT on the upgrade path, we are OK to return early. Otherwise, the override_verify function kicks in to analyze the subgraph (we don't need to re-process manifest files so instead of falling through from here, a separate traversal is implemented in override_verify) - either override_verify will return an error or not, in which case it is also OK to return (early as there are no more packages to analyze and insert here). Added comments in the code to clarify this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, that makes sense, thanks for clarifying!

// check if any (should be 0 or 1) edges are the overrides of pkg
let pkg_overrides: Vec<_> = self
.package_graph
.neighbors_directed(name, Direction::Incoming)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use edges_directed to avoid the lookup on edge_weight and the unwrap.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to find a better fitting function, but I don't think that edges_directed is implemented for GraphMap (nor is the IntoEdgesDirected)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, just one final question from me, about how we handle overrides in the case where the package table is vacant at the entry for that overridden package, and then let's ship this thing!

dst_pkg.clone(),
dst,
ext_graph,
ext_table,
resolver,
overrides,
)?;
// unwrap is safe as all edges have a Dependency weight
let ext_dep = ext_graph.edge_weight(ext_name, dst).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is it possible to avoid the unwrap by iterating over edges_directed instead of neighbors_direceted?

// override found - use it
entry.insert(overridden_pkg.clone());
entry.into_mut()
// override found - use it and return - its dependencies have already been
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we only care to check if there's an override when there's a potentially conflicting set of dependencies, but if the package table is empty we just insert the package, and don't check for overrides. Is this because if there is an override, it will be in the package table already, or is this something that we need to change? (If the former, we should add a comment on the Vacant case as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! And your assessment is correct - the override would have already been inserted (and at the point of the override insertion, the package in question is not in the overrides set). I have put a comment on the Vacant case accordingly.

@@ -0,0 +1,3 @@
[package]
name = "C"
version = "0.0.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recording a conversation we had offline: I think the test set-up here is fine, but it's highlighted a peculiarity with dependency specification (that we treat dependencies as different based on the version field, even though we don't use the version field during dependency resolution), which we should change in future (probably by removing the version field).

// check if any (should be 0 or 1) edges are the overrides of pkg
let pkg_overrides: Vec<_> = self
.package_graph
.neighbors_directed(name, Direction::Incoming)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.check_acyclic()?;
self.override_verify(&pkg, name, overrides)?;
}
return Ok(pkg);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, that makes sense, thanks for clarifying!

@awelc
Copy link
Collaborator Author

awelc commented Apr 21, 2023

Okay, just one final question from me, about how we handle overrides in the case where the package table is vacant at the entry for that overridden package, and then let's ship this thing!

As we discussed offline, this situation cannot happen as a package is inserted into the overrides set after it's "processed" that is after its insertion is attempted in process_dependency . In fact, we return a package from the package_table entry to have it inserted into the overrides set.

@@ -0,0 +1 @@
[email protected]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Emacs temp files!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dang!!! Removed!

awelc added a commit to MystenLabs/sui that referenced this pull request Apr 21, 2023
## Description 

When building a dependency graph, different versions of the same
(transitively) dependent package can be encountered. If this is indeed
the case, a single version must be chosen by the developer to be the
override, and this override must be specified in a manifest file whose
package dominates all the conflicting "uses" of the dependent package.
These overrides must taken into consideration during the dependency
graph construction and this PR implements the relevant changes to
dependency graph construction algorithm. For additional details see the
doc-comments in the code as well as the PR this one is based on
(move-language/move#1023) which contains a
discussion of issues encountered during development of this algorithm.

## Test Plan 

A comprehensive test suite is attached.
---
If your changes are not user-facing and not a breaking change, you can
skip the following section. Otherwise, please indicate what changed, and
then add to the Release Notes section as highlighted during the release
process.

### Type of Change (Check all that apply)

- [x] user-visible impact

### Release notes

This change allows a developer to override transitive dependencies of a
package they are developing to avoid conflicts that could otherwise
arise.
bmwill pushed a commit to move-language/move-sui that referenced this pull request Apr 18, 2024
## Description 

When building a dependency graph, different versions of the same
(transitively) dependent package can be encountered. If this is indeed
the case, a single version must be chosen by the developer to be the
override, and this override must be specified in a manifest file whose
package dominates all the conflicting "uses" of the dependent package.
These overrides must taken into consideration during the dependency
graph construction and this PR implements the relevant changes to
dependency graph construction algorithm. For additional details see the
doc-comments in the code as well as the PR this one is based on
(move-language/move#1023) which contains a
discussion of issues encountered during development of this algorithm.

## Test Plan 

A comprehensive test suite is attached.
---
If your changes are not user-facing and not a breaking change, you can
skip the following section. Otherwise, please indicate what changed, and
then add to the Release Notes section as highlighted during the release
process.

### Type of Change (Check all that apply)

- [x] user-visible impact

### Release notes

This change allows a developer to override transitive dependencies of a
package they are developing to avoid conflicts that could otherwise
arise.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants