Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repro: make stages run in order of the graph #9675

Merged
merged 4 commits into from
Jul 3, 2023

Conversation

skshetry
Copy link
Member

@skshetry skshetry commented Jun 30, 2023

This PR closes #9367, and makes a few other changes:

  • --single-item now runs in the order the target is provided, instead of the position in it's dag. So, if you provide dvc repro stage1 stage2 --single-item, stage1 will run before stage2 even if stage2 comes before stage1. Because that's what you want most of the time.
    (And I did not want to complicate the code too much.)
  • --downstream stages will run in the order of the graph instead of running stages of each target with post-order traversal.

The order is also enforced in normal cases, not just in --downstream by traversing the graph in post order. Previously, we were removing duplicates manually.

dvc/dvc/repo/reproduce.py

Lines 222 to 228 in 36ec25f

for stage in all_pipelines:
if stage not in steps:
# NOTE: order of steps still matters for single_item
if single_item and stage not in stages:
continue
steps.append(stage)

In case of --single-item, this means it'll run in the order of arguments passed.
For --downstream, this fixes running of stage out of order when multiple arguments are passed.
For normal case, this enforces the same logic, so that they are never run out-of-order.
@codecov
Copy link

codecov bot commented Jun 30, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.06 ⚠️

Comparison is base (36ec25f) 90.63% compared to head (3834cb8) 90.57%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9675      +/-   ##
==========================================
- Coverage   90.63%   90.57%   -0.06%     
==========================================
  Files         472      480       +8     
  Lines       36280    36334      +54     
  Branches     5218     5219       +1     
==========================================
+ Hits        32882    32911      +29     
- Misses       2812     2839      +27     
+ Partials      586      584       -2     
Impacted Files Coverage Δ
dvc/repo/graph.py 92.59% <100.00%> (+1.54%) ⬆️
dvc/repo/reproduce.py 98.97% <100.00%> (+2.71%) ⬆️
tests/func/repro/test_repro.py 100.00% <100.00%> (ø)
tests/unit/repo/test_graph.py 100.00% <100.00%> (ø)

... and 9 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@skshetry skshetry marked this pull request as ready for review June 30, 2023 08:00
@skshetry skshetry requested a review from a team June 30, 2023 08:09
@shcheklein
Copy link
Member

Let's check and update the docs for this please. I think we had one or more descriptions of the execution order. Plus we need to check the option descriptions.

@skshetry
Copy link
Member Author

Nothing should change from the user’s perspective here, at least from what’s documented. This is mostly a bugfix.

@shcheklein
Copy link
Member

Asair we mention that we don't guarantee the order of execution, i might be wrong though.

@skshetry skshetry merged commit bfa5b1a into iterative:main Jul 3, 2023
18 of 20 checks passed
@skshetry skshetry deleted the fix-downstream branch July 3, 2023 05:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

repro --downstream seems to ignore dependencies in for loops
4 participants