update Rollups process #775

Open
wants to merge 1 commit into
base: master
86 changes: 59 additions & 27 deletions src/release/rollups.md
@@ -3,15 +3,15 @@
## Background

The Rust project has a policy that every pull request must be tested after merge
before it can be pushed to master. As PR volume increases this can scale poorly,
before it can be pushed to master. As PR volume increases, this can scale poorly,
especially given the long (~3.5hr) current CI duration for Rust.

Enter rollups! Changes that are small, not performance sensitive, or not platform
dependent are marked with the `rollup` command to bors (`@bors r+ rollup` to
approve a PR and mark as a rollup, `@bors rollup` to mark a previously approved
approve a PR and mark it as a rollup, `@bors rollup` to mark a previously approved
PR, `@bors rollup-` to un-mark as a rollup). 'Performing a Rollup' then means
collecting these changes into one PR and merging them all at once. The rollup
command accepts four values `always`, `maybe`, `iffy`, and `never`. See [the
command accepts four values: `always`, `maybe`, `iffy`, and `never`. See [the
Rollups section] of the review policies for guidance on what these different
statuses mean.

@@ -23,67 +23,99 @@ queue has been merged.
## Making a Rollup

1. Using the interface on [Homu queue], select pull requests and then
use "rollup" button to make a rollup pull request. (The text about
fairness can be ignored.)
use the "rollup" button to make a rollup pull request.

**Important note**: consider for addition PRs marked as
`rollup=always`, `rollup=maybe` and `rollup=iffy`, based on the
review policies of [the Rollups section]. Be extra careful when
deciding what to include, in particular on `rollup=maybe` and
`rollup=iffy` PRs. We should try as much as possible to avoid risking
and hit regressions (bugs or perf). Also consider that contributors
to hit regressions (bugs or perf). Also consider that contributors
often forget to tag things with rollup=never, when they should have
done so, so when PRs are not explicitly tagged with rollup, be extra
careful.
careful. Also carefully consider the area of the compiler the PRs touch.
Two diagnostic PRs may actually conflict with each other, as they both
change compiler output, which causes failures in each other's tests,
when both of them are merged together in a rollup without causing git merge-conflicts.
In this case the older PR should be given the privilege to merge first
and the newer PR should then be rebased as needed.

2. Run the following command in the pull request thread:

```
@bors r+ rollup=never p=5
```

where 5 is the number of PRs contained in your rollup.
Comment on lines 45 to +48

I have been thinking that what the existing documentation says (rollups always p=5) might be better than p=pr_count that is usually done. There are a lot of cases where 7 PRs get rolled up with p=7 and sit in the queue for a number of hours, only to be leapfrogged by a p=8 rollup that contains much newer PRs. Seems to have a tendency to break FIFO.

Member Author

@matthiaskrgr matthiaskrgr Nov 4, 2024

Yes that's why I added the "Rollups should not overlap" rule.

Member

@jieyouxu jieyouxu Nov 4, 2024

we can just ping the contributors who are assigning rollup p>5 (I've done this a couple of times myself) and just point them to p=5 being the good default and having that respect queue order. It helps to elaborate on the reasoning for choice of p=5 and not arbitrary p=N where N is the number of PRs rolled up.

If we update the docs here and then still notice contributors assigning p > 5 to rollups, we can just ping them and point them to the updated docs here.

Member

(Thanks for the edits)

Just wanted to say, we should make this rollup advice as accessible and useful as possible to onboard other contributors w/ r+ perms who may want to help to do rollups too.

Member

(Somewhat unrelated, but also I feel like the p=??? advice is at times too vague to be useful, at least when I read some of the forge docs related to p=??? advice)

Member Author

Maybe we should have something like
tool/subtree updates p=5 and rollups p=5 as default, so that we still have some headroom to 4 or 3 things if needed

Member

Yeah, I feel like the tool/subtree updates should receive same priority as rollups and all pinned at p=5 exactly, good point.

Member Author

mmh imo it makes sense to prioritise subtree/module updates over rollups since doing a tool sync is often more complicated than rebasing PRs

Member

Just, they should receive a consistent p between themselves, but otherwise it feels whatever to me.

3. If the rollup fails, use the logs rust-log-analyzer
provides to bisect the failure to a specific PR and do
`@bors r-`. If the PR is running, you need to do `@bors r- retry`. Otherwise,
provides to bisect the failure to a specific PR and
`@bors r-` the PR. If the PR is running, you need to do `@bors r- retry`. Otherwise,
your rollup succeeded. If it did, proceed to the next rollup (every now and
then let `rollup=never` and toolstate PRs progress).
4. Recreate the rollup without the offending PR starting again from **1.**. There's a link in the rollup PR's body to automatically prefill the rollup UI with the existing PRs (minus any PRs that have been `r-`d)
then let `rollup=never/iffy` and toolstate PRs progress).
4. Recreate the rollup without the offending PR starting again from **1.**.
There's a link in the rollup PR's body to automatically prefill the rollup UI
with the existing PRs (minus any PRs that have been `r-`d).
Avoid adding any additional PRs to the current "batch", as this
unnecessarily increases the chance of test failures.
Rollups should not overlap: if a PR is already contained in a rollup that is not closed,
it should not be added to another rollup at the same time.
If a rollup fails and you are not sure which PR caused the problem,
you may bisect the rollup by splitting it into two rollups until the offending PR becomes clear.
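
The review thread on step 2 argues that `p=<number of PRs>` can let a newer rollup leapfrog an older one, while a fixed `p=5` keeps queue order. A minimal sketch of that argument, assuming the queue tests the highest priority first and breaks ties by approval order; `QueueEntry` and `merge_order` are hypothetical names for illustration, not bors code:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueEntry:
    """Hypothetical model of an approved PR waiting in the merge queue."""
    name: str
    priority: int      # the `p=` value given to bors
    approved_at: int   # smaller means approved earlier


def merge_order(entries):
    """Assumed ordering: highest priority first, then oldest approval first."""
    ordered = sorted(entries, key=lambda e: (-e.priority, e.approved_at))
    return [e.name for e in ordered]


# Priority equal to the number of PRs: the newer, larger rollup goes first.
by_count = [
    QueueEntry("rollup of 7 PRs", priority=7, approved_at=1),
    QueueEntry("rollup of 8 PRs", priority=8, approved_at=2),
]

# Fixed p=5 for every rollup: ties fall back to approval order (FIFO).
fixed = [
    QueueEntry("rollup of 7 PRs", priority=5, approved_at=1),
    QueueEntry("rollup of 8 PRs", priority=5, approved_at=2),
]

print(merge_order(by_count))
print(merge_order(fixed))
```

Under these assumptions the first call lists the newer rollup first, and the second call keeps the older rollup at the front.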

## Selecting Pull Requests

The queue is sorted by rollup status. In general, a good rollup includes one or two `iffy` PRs (if available), a bunch of `maybe` (unmarked) PRs, and a large pile of `always` PRs. A rollup should never include `rollup=never` PRs.
The queue is sorted by rollup status. In general, a good rollup contains a bunch of `maybe` (unmarked) PRs, and a large pile of `always` PRs. You can include one or two `iffy` PRs if you are confident that they will pass.
Member

I'm not sure including "iffy" PRs is a good idea in rollups. I personally tend to never include them.

Member

That's kinda the whole reason we have "iffy". If you never include them, then they may as well be marked "never".

Member

Fair point.

Member

@jieyouxu jieyouxu Nov 4, 2024

I personally consider iffy PRs to be the whole point of rollups, tbh. Some PRs are clearly rollup=never if you are almost certain they will fail or if they have perf implications, but some of the iffy PRs are for the "it can fail, but we're not too sure" cases: it's great if it passes full CI, in which case we saved a bunch of time. It's also great if it fails full CI, in which case the failure time is amortized.

If you have a lot of time, you can also make a rollup just from `iffy` PRs (if there are enough of them) and weed out the failures one by one.
A rollup should never include `rollup=never` PRs.

The absolute size of a rollup depends on experience and the current length of the queue.
People new to making rollups might start with 1 `iffy`, 2 `maybe`s, and 4 `always`s. Usually 6-8 PRs per rollup is a good compromise.
There is rarely a need to roll up more than 10 PRs at once (unless there are >30 PRs waiting in the queue). Keep in mind that we also try to minimize regressions per merge.
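
The sizing guidance in this paragraph can be sketched as a small selection function. This is only an illustration under stated assumptions: `PullRequest`, `BEGINNER_CAPS`, and `pick_rollup` are hypothetical names, lower PR numbers are treated as older, and the caps are the 1 `iffy` / 2 `maybe` / 4 `always` starting point suggested for newcomers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical model of an approved PR in the queue."""
    number: int                   # lower numbers are treated as older
    rollup: str = "maybe"         # unmarked PRs count as `maybe`
    in_open_rollup: bool = False  # already part of a rollup that is not closed


# Starting point for people new to rollups; `never` deliberately has no entry.
BEGINNER_CAPS = {"iffy": 1, "maybe": 2, "always": 4}


def pick_rollup(queue, caps=BEGINNER_CAPS):
    """Pick the oldest PRs per rollup status, up to the given caps."""
    picked = []
    remaining = dict(caps)
    for pr in sorted(queue, key=lambda p: p.number):
        if pr.in_open_rollup:
            continue  # rollups should not overlap
        if remaining.get(pr.rollup, 0) > 0:  # `never` has no cap, so it is skipped
            picked.append(pr.number)
            remaining[pr.rollup] -= 1
    return picked


queue = [
    PullRequest(101, "always"),
    PullRequest(102, "never"),
    PullRequest(103),
    PullRequest(104, "iffy"),
    PullRequest(105, "iffy"),
    PullRequest(106, "maybe", in_open_rollup=True),
    PullRequest(107, "always"),
]
print(pick_rollup(queue))
```

Here the `never` PR, the second `iffy` PR, and the PR that already sits in an open rollup are left out. A real selection still needs human judgment about which areas of the compiler the PRs touch.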

The actual absolute size of the rollup can depend based on experience, people new to making rollups might start with including 1 `iffy`, 4 `maybe`s, and 5 `always`s, but more experienced people might even make a rollup of 1-2 `iffy`s, 8 `maybe`s, and 10 `always`s! Massive rollups are rarely needed, but as your intuition grows you'll get better at judging risk when including PRs in a rollup.
Don't try to make mega-rollups (15-20 PRs that merge half or more of the entire queue all at once) to keep the number of perf or bug regressions per merge as low as possible and keep potential regressions [bisectable].
Member

Suggested change
Don't try to make mega-rollups (15-20 PRs that merge half or more of the entire queue all at once) to keep the number of perf or bug regressions per merge as low as possible and keep potential regressions [bisectable].
Limit the size of rollups, even if the queue is backed up -- large rollups run the risk of failing or merge conflicts, and smaller rollups keep potential regressions [bisectable]. On average, rollups are <N> PRs large, often varying from <N - M> to <N + M> depending on the number of `rollup=always` PRs that can be included.

Member

choose some value for N and M.

Member

@Noratrieb Noratrieb Nov 3, 2024

note that thanks to unrolled builds, bisection can be done within a roll-up and cargo bisect rustc does that


Don't hesitate to downgrade the rollup status of a PR! If your intuition tells you that a `rollup=always` PR has some chances for failures, mark it `rollup=maybe` or `rollup=iffy`. A lot of the unmarked `maybe` PRs are categorized as such because the reviewer may not have considered rollupability, so it's always worth picking them with a critical eye. Similarly, if a PR causes your rollup to fail, it's worth considering changing its rollup status
Don't hesitate to downgrade the rollup status of a PR! If your intuition tells you that a `rollup=always` PR has some chances for failures, mark it `rollup=maybe` or better `rollup=iffy`. A lot of the unmarked `maybe` PRs are categorized as such because the reviewer may not have considered rollupability, so it's always worth picking them with a critical eye. Similarly, if a PR causes your rollup to fail, it's worth considering changing its rollup status.

Generally, PRs, that touch CI configuration or the bootstrapping process are probably `iffy` and should be handled with care. On the other hand, PRs that just edit docs are usually `rollup=always`.
Generally, PRs that touch CI configuration or the bootstrapping process are probably `iffy` or `never` and should be handled with care. On the other hand, PRs that just edit docs are usually `rollup=always`.

Avoid having too many PRs with large diffs or submodule changes in the same rollup. Also avoid having PRs you suspect will have large perf impacts, and mark them as `rollup=never`.
Avoid having too many PRs with large diffs or subtree changes in the same rollup. Self-contained submodule changes (such as `miri` updates) on the other hand may be fine to be rolled up with other unrelated PRs.
Also avoid having PRs you suspect will have large perf impacts and mark them as `rollup=never`.

It's tempting to avoid including `iffy` PRs at all since ideally you want your rollup to succeed. However, it's worth remembering that the job of the PR queue is to _test_ PRs, not to land them. As such, a rollup that fails because of an `iffy` PR is a good thing, since that PR would have to be tested at _some point_ anyway and it would have taken up the same amount of time to test if it never got included in a rollup. One way to look at rollups when it comes to `iffy` PRs is that a rollup is a way for a bunch of other PRs to piggyback on the CI cycle that the `iffy` PR needs anyway. If rollups avoid `iffy` PRs entirely what ends up happening is that these PRs tend to languish in the queue for a long time, which isn't good.
If an `iffy` PR keeps failing in rollups, it should be marked `never` to prevent it from causing further problems inside unrelated rollups. This will also move it ahead of all `maybe`s in the queue, so the author gets feedback more quickly in case of subsequent failures.
Note which runner the PR failed on, so that this runner can be run as a `try-job` to make sure it succeeds there before another merge is attempted (syntax example [here]).
In general, if a PR is likely to fail again, try to test it via a handful of carefully selected try-jobs instead of running the full battery of all 60 runners.

Similarly, make sure to leave some spare CI cycles so that `never` PRs also get a chance! If you're the only person making rollups it's worth letting them run during times you're not paying attention to the queue, but these days there are rollup authors in multiple time zones, so it's often best to just keep an eye on the relative size of the queue and put aside a couple CI cycles for `never` PRs, especially if they pile up.
To keep `never` or `iffy` PRs from being stuck in the queue indefinitely, it is recommended to alternate between rollups and non-rollup PRs: one `never`, one rollup, one `iffy`, one rollup, one `never`, and so on, until most of the `never`s are merged.
If you are the only person making rollups, you can also leave a couple of `never`/`iffy` PRs for a time when you know nobody will be actively doing rollups, or for weekends, which generally see a lower number of PR approvals.
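
The alternation described here can be sketched as a simple interleaving. The function name `alternate` and the string labels are hypothetical and only illustrate the ordering; they do not correspond to any bors feature.

```python
from itertools import zip_longest


def alternate(standalone, rollups):
    """Interleave standalone (`never`/`iffy`) PRs with rollups, standalone first."""
    schedule = []
    for single, rollup in zip_longest(standalone, rollups):
        if single is not None:
            schedule.append(single)
        if rollup is not None:
            schedule.append(rollup)
    return schedule


print(alternate(["never #1", "iffy #2", "never #3"], ["rollup A", "rollup B"]))
```

When one list runs out, the remaining entries of the other are simply appended in order.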

Try to be fair with rollups: Rollups are a way for things to jump the queue. For `rollup=maybe` PRs, try to include the oldest one (at the top of the section) so that newer PRs aren't jumping the queue over older PRs entirely. You don't have to include every PR older than PRs included in your rollup, but try to include the oldest. Similar to the perspective around `iffy`, it's useful to look at a rollup as a way for other PRs to piggyback on the CI cycle of the oldest PR in queue.
Very old (several months) or very large PRs that are extremely prone to merge conflicts may also be given a slight priority bump (`p=1`) to finally get them out of the queue without having to rebase them repeatedly.
Ultimately, we want to keep the number of regressions per merge at a minimum while also minimizing the amount of time between approval and the final merge of a PR, to avoid unnecessary merge conflicts and rebases.


## Failed rollups
If the rollup has failed, run the `@bors retry` command if the
failure was spurious (e.g. due to a network problem or a timeout). If it wasn't spurious,
find the offending PR and throw it out by copying a link to the rust-logs-analyzer comment,
and writing `Failed in <link_to_comment>, @bors r-`. Hopefully,
the author or reviewer will give feedback to get the PR fixed or confirm that it's not
at fault. The failed rollup PR can be closed.
failure was spurious (e.g. due to a network problem or a timeout).
There may be a matching `CI-spurious-fail-.*` label that you can use to tag the PR as such, to help discover common fail patterns.
If it wasn't spurious, find the offending PR and throw it out by copying a link to the rust-logs-analyzer comment,
Member

Suggested change
If it wasn't spurious, find the offending PR and throw it out by copying a link to the rust-logs-analyzer comment,
If it wasn't spurious, find the offending PR and return it to the author to be fixed by copying a link to the rust-log-analyzer comment,

and leaving a comment like `Failed in <link_to_comment>, @bors r-`.
In case the log-analyzer does not give any meaningful output, you can directly open the ci-logs (the `(web)` link), find the point where the error was thrown
and directly copy the URL to the respective line in the log output.
Hopefully, the author or reviewer will give feedback to get the PR fixed or confirm that it's not
at fault. The failed rollup PR should then be closed.

Once you've removed the offending PR, recreate your rollup without it (see 1.).
Work through one batch of PRs by throwing out the failures one by one instead of adding new PRs to it, as adding PRs may introduce additional points of failure.

Once you've removed the offending PR, re-create your rollup without it (see 1.).
Sometimes however, it is hard to find the offending PR. If so, use your intuition
to avoid the PRs that you suspect are the problem and recreate the rollup.
Another strategy is to raise the priority of the PRs you suspect,
mark them as `rollup=never` (or `iffy`) and let bors test them standalone to dismiss
or confirm your hypothesis.
or confirm your hypothesis, or split the rollup into 2 smaller ones until are certain of the failure cause. If a PR was found to be the cause and other PRs were "wrongfully" `iffy`'d, they can of course be reprioritised as `maybe` again.
Member

Suggested change
or confirm your hypothesis, or split the rollup into 2 smaller ones until are certain of the failure cause. If a PR was found to be the cause and other PRs were "wrongfully" `iffy`'d, they can of course be reprioritised as `maybe` again.
or confirm your hypothesis, or split the rollup into 2 smaller ones until you are certain of the failure cause. If a PR was found to be the cause and other PRs were "wrongfully" marked `iffy`, they can of course be reprioritised as `maybe` again with `@bors rollup=maybe` or `@bors rollup-`.
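
Splitting a failing rollup into two smaller ones amounts to a binary search over the PRs. A minimal sketch, assuming exactly one PR is at fault and failures are not spurious; `find_offender` is a hypothetical helper, and `passes` stands in for a full CI run of a rollup containing the given PRs:

```python
def find_offender(prs, passes):
    """Return the single PR that makes the rollup fail.

    `passes(subset)` stands in for a CI run of a rollup containing `subset`.
    Assumes exactly one offending PR and no spurious failures.
    """
    candidates = list(prs)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # If the first half passes on its own, the offender is in the second half.
        candidates = second if passes(first) else first
    return candidates[0]


# Toy stand-in for CI: any rollup containing PR 104 fails.
def fake_ci(subset):
    return 104 not in subset


print(find_offender([101, 102, 103, 104, 105, 106], fake_ci))
```

The single-offender assumption does not hold when two PRs only fail in combination, such as two diagnostic PRs that change each other's expected output; in that case both halves can pass on their own.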


If a rollup continues to fail you can run the `@bors rollup=never` command to
If a PR in a rollup continues to fail you can run the `@bors rollup=never` command to
never rollup the PR in question.
Comment on lines +115 to 116
Member

Suggested change
If a PR in a rollup continues to fail you can run the `@bors rollup=never` command to
never rollup the PR in question.
If a PR in a rollup continues to fail you can run the `@bors rollup=never` command to
ensure the PR gets tested independently, since it's likely it will fail again in the future.


[Homu queue]: https://bors.rust-lang.org/queue/rust
[the Rollups section]: ../compiler/reviews.md#rollups
[here]: https://github.com/rust-lang/rust/pull/132434#issue-2628063878
[bisectable]: https://rust-lang.github.io/cargo-bisect-rustc/