Slow (2 hours) for 7000 targets #221

Open
jjh42 opened this issue Jul 27, 2024 · 7 comments

@jjh42

jjh42 commented Jul 27, 2024

Hi,

I'm not quite sure how to get a simple reproduction. In our repo (bazel 7.1.1, bazel-diff 7.0), when a large number of targets (7000) change (e.g. due to changing a build flag), the bazel-diff get-impacted-targets step takes 2 hours to complete.

We are running a setup very similar to the example script in this repo.

I'm just wondering if you have any pointers on how we could try to determine what's going on. From other comments, I don't think this is expected.

@tinder-maxwellelliott
Collaborator

Hello there @jjh42,

Thanks for reporting this. Is there any way you could give #193 a try? To get the speedup, you will need to set the --modified-filepaths flag, which is declared as:

names = ["-m", "--modified-filepaths"],

@jjh42
Author

jjh42 commented Jul 29, 2024

Thanks for your response. I will give that a try. I think I can just use bazel's git_override to get that particular commit.

Btw, is there a reason you haven't merged that branch in? Are there any downsides?

@tinder-maxwellelliott
Collaborator

I don't see any downsides to this approach; it should be just as precise with dramatically fewer file reads. I will work on using this internally to help with validation.

@tinder-maxwellelliott
Collaborator

Did this end up working for you?

@jjh42
Author

jjh42 commented Sep 12, 2024

Hi @tinder-maxwellelliott. Unfortunately it didn't seem to help us much. Still taking over 2 hours.

Just to confirm, we get a list of files that changed with

git diff --name-only "$previous_revision" "$final_revision" > "$modified_file_paths"

and then, to the two generate-hashes calls, we add

--modified-filepaths=$modified_file_paths
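
One cheap thing to rule out before the hashing steps is that the list file is empty or missing. The following is only a minimal sketch: `check_modified_list` is a hypothetical helper name, not part of bazel-diff, and it only checks the file produced by the `git diff` command above.

```shell
#!/bin/bash
# Hypothetical helper (not part of bazel-diff): verify that the file holding
# the modified paths exists and is non-empty, then print how many non-blank
# lines (paths) it contains.
check_modified_list() {
    local list_path=$1
    if [ ! -s "$list_path" ]; then
        echo "modified file list is missing or empty: $list_path" >&2
        return 1
    fi
    local count
    # grep -c exits non-zero when it counts 0 lines, so guard it for 'set -e'.
    count=$(grep -c . "$list_path" || true)
    echo "$count"
}
```

It could be called as `check_modified_list "$modified_file_paths"` right after the `git diff` line, so that a run with an empty list fails early rather than after the hashing has finished.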

@tinder-maxwellelliott
Collaborator

Are you using cquery for your execution? That can explain the slowdown

@jjh42
Author

jjh42 commented Dec 6, 2024

(sorry for the slow response, I didn't see a notification about this reply).

I don't think so? This is the script we were trying.

#!/bin/bash
set -e
echo "Starting bazel diff $(date)"
# Path to your Bazel WORKSPACE directory
workspace_path=$1
# Path to your Bazel executable
bazel_path=$2
# Starting Revision SHA
previous_revision=$3
# Final Revision SHA
final_revision=$4
impacted_targets_path=$5
starting_hashes_json="/tmp/starting_hashes.json"
final_hashes_json="/tmp/final_hashes.json"
bazel_diff="/tmp/bazel_diff"
modified_file_paths="/tmp/modified_file_paths.txt"

"$bazel_path" run //:bazel-diff --script_path="$bazel_diff"

echo git -C "$workspace_path" checkout "$previous_revision" --quiet
git -C "$workspace_path" checkout "$previous_revision" --quiet
git stash

# This should provide a list of files that were modified between the two revisions
git diff --name-only "$previous_revision" "$final_revision" > "$modified_file_paths"

echo "Generating Hashes for Revision '$previous_revision'"
"$bazel_diff" generate-hashes --modified-filepaths="$modified_file_paths" -w "$workspace_path" -b "$bazel_path" "$starting_hashes_json" --includeTargetType

git -C "$workspace_path" checkout -f "$final_revision" --quiet
git stash

echo "Generating Hashes for Revision '$final_revision'"
"$bazel_diff" generate-hashes --modified-filepaths="$modified_file_paths" -w "$workspace_path" -b "$bazel_path" \
   "$final_hashes_json" --includeTargetType

echo "Determining Impacted Targets $(date)"
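
To narrow down which stage accounts for the two hours, each command in the script above could be wrapped in a small timer. This is a rough sketch rather than anything bazel-diff provides: `time_step` is a hypothetical helper name, and it relies only on bash's built-in `SECONDS` counter.

```shell
#!/bin/bash
# Hypothetical helper: run a command and report its wall-clock duration on
# stderr, leaving the command's own stdout untouched.
time_step() {
    local label=$1
    shift
    local start=$SECONDS
    local status=0
    # '|| status=$?' keeps 'set -e' from aborting before the time is reported.
    "$@" || status=$?
    echo "[$label] took $((SECONDS - start))s (exit $status)" >&2
    return "$status"
}

# Example, reusing the variables and flags from the script above:
# time_step "starting hashes" "$bazel_diff" generate-hashes \
#     --modified-filepaths="$modified_file_paths" -w "$workspace_path" \
#     -b "$bazel_path" "$starting_hashes_json" --includeTargetType
```

Comparing the reported times for the two generate-hashes calls with the time for the final step should show whether the slowdown is in hashing or in determining the impacted targets.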
