Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[7.4.0] Previous failing test masks correct result in remote cache #23259

Closed
bazel-io opened this issue Aug 9, 2024 · 4 comments
Closed

[7.4.0] Previous failing test masks correct result in remote cache #23259

bazel-io opened this issue Aug 9, 2024 · 4 comments

Comments

@bazel-io
Copy link
Member

bazel-io commented Aug 9, 2024

Forked from #11057

@bazel-io bazel-io added this to the 7.4.0 release blockers milestone Aug 9, 2024
@bazel-io
Copy link
Member Author

bazel-io commented Aug 9, 2024

The branch, release-7.4.0, may not exist. Please retry the cherry-pick after the branch is created.

@bazel-io
Copy link
Member Author

Cherry-pick was attempted but there were merge conflicts in the following file(s). Please resolve manually.

src/main/java/com/google/devtools/build/lib/analysis/test/TestRunnerAction.java

cc: @bazelbuild/triage

@iancha1992
Copy link
Member

Cherry-pick was attempted but there were merge conflicts in the following file(s). Please resolve manually.

src/main/java/com/google/devtools/build/lib/analysis/test/TestRunnerAction.java

cc: @bazelbuild/triage

cc: @meteorcloudy @fmeum @tjgq

@iancha1992
Copy link
Member

Cherry-picked in #23673

github-merge-queue bot pushed a commit that referenced this issue Sep 19, 2024
…llowing … (#23673)

…a failure of the same test.

A source edit might have occurred in the interim, putting the test back
into a passing state that has been previously stored in the cache. The
current logic disables disk/remote caching when rerunning a test
following a failure, which causes spurious reexecutions (see issue
#11057).

Although this code has been around for a very long time, it seems that
it was originally added to ensure that a flaky test failure doesn't
become sticky. However, for this purpose it suffices to bust the local
cache, as a disk/remote cache should never store failures (if it did,
this would not be a complete fix, and we'd have much bigger problems
elsewhere). Note that Bazel itself never uploads local failures to a
disk/remote cache, but can't control whether a remote executor does.

The fix is to split the `TestRunnerAction.canBeCached` logic into two
versions: one that takes into account a previous test result (for the
local cache check) and another that doesn't (to determine whether we
should attempt to hit the disk/remote cache).

Fixes #11057.

PiperOrigin-RevId: 661319835
Change-Id: I9248cbfa31dd135b2bda971e48bb17d7f828889c 
(cherry picked from commit e9709b7)

Fixes #23259

Co-authored-by: Googler <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants