Prechecks for asv #2107
base: master
Conversation
Force-pushed from b00e8e9 to f23a441
Force-pushed from ca077d4 to 3553540
Left some minor comments. Apart from that, I see the point in not running the asv checks on every build, since it takes 35 minutes.
However, couldn't we run it in parallel with the asv benchmarks and make it block merging? This shouldn't increase the time to run the tests, since the benchmarks take far longer. That said, I'm not sure it's worth the extra CPU (I'll leave that up to you, or maybe @G-D-Petrov, who knows more about our CI pipelines).
python/utils/asv_checks.py
Outdated
if not ok_errors_list is None:
    for ok_error in ok_errors_list:
        err_output.replace(ok_error, "")
        err_output = re.sub(r'\s+', '', err_output)
This removes all whitespace, correct? Wouldn't this make the case where we have some leftover errors harder to read?
E.g. if we have 2 errors:
Expected error which should be removed
Unexpected error which should persist and be displayed
In the error message below, the unexpected error will have its whitespace removed.
Also, this could be a problem if the ok_errors_list contains more than one error and one of them contains whitespace.
E.g. ok_errors_list = ["Expected 1", "Expected 2"]
And we have logs which contain just the two expected errors:
Expected 1
Expected 2
After the first iteration the error output would become:
Expected2
Which wouldn't match the second expected error, and we would end up with an "Unknown error" even though we only have 2 expected errors.
Probably doesn't matter too much, as you only ever use a single expected error. Still, instead of removing all whitespace after each error, it would be better to check err_output.strip() == "" at the end.
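A minimal sketch of that suggestion (illustrative only; the helper name and signature are assumptions, the variable names follow the snippet above):

```python
def only_expected_errors(err_output, ok_errors_list=None):
    """Return True if err_output contains nothing but the expected errors."""
    if ok_errors_list is not None:
        for ok_error in ok_errors_list:
            # str.replace returns a new string, so the result must be reassigned
            err_output = err_output.replace(ok_error, "")
    # Strip once at the very end instead of deleting all whitespace after each
    # replacement, so any leftover (unexpected) error text stays readable
    return err_output.strip() == ""
```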
python/utils/asv_checks.py
Outdated
def get_project_root():
    file_location = os.path.abspath(__file__)
    return file_location.split("/python/")[0]
This would theoretically fail if our working directory has a python dir in the path. E.g. imagine you install arcticdb in:
/home/grusev/code/python/ArcticDB/python/python_code
We could do something like this.
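One possibility (a sketch that assumes the file stays at python/utils/asv_checks.py, and not necessarily the snippet originally proposed here):

```python
import os

def get_project_root():
    # asv_checks.py lives at <root>/python/utils/asv_checks.py, so walk up
    # from the file itself rather than splitting the path on "/python/",
    # which breaks when that substring appears elsewhere in the path
    file_location = os.path.abspath(__file__)
    return os.path.dirname(os.path.dirname(os.path.dirname(file_location)))
```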
changed to increase confidence :-)
python/utils/asv_checks.py
Outdated
sys.path.insert(0,f"{path}/python")
sys.path.insert(0,f"{path}/python/tests")

bencmark_config = f"{path}/python/.asv/results/benchmarks.json"
Typo: bencHmark
python/utils/asv_checks.py
Outdated
they would need to be in order for completion of current PR""")
print("_" * 80)

print("\n\nCheck 1: Executing check for python cod of asv tests")
Typo: codE
.github/workflows/asv_checks.yml
Outdated
default: true

jobs:
  run-asv-check-script:
I agree with @IvoDD, we don't need this to be a separate flow, just a separate job in the analysis_flow.
This way it will be easier for people, as they will only have to check 1 flow.
Let's move this job to analysis_flow.yml, similar to the code_coverage job there.
Agree, makes lots of sense.
.github/workflows/asv_checks.yml
Outdated
@@ -0,0 +1,79 @@
name: Run ASV Tests Check Python Script
I think if you call this "ASV Linting" or something it will be more obvious to people that it doesn't actually run the benchmarks
OK, but note that this does something additional, namely a check of the versions of the benchmark tests ... see below.
.github/workflows/asv_checks.yml
Outdated
VCPKG_NUGET_USER: ${{secrets.VCPKG_NUGET_USER || github.repository_owner}}
VCPKG_NUGET_TOKEN: ${{secrets.VCPKG_NUGET_TOKEN || secrets.GITHUB_TOKEN}}
CMAKE_C_COMPILER_LAUNCHER: sccache
CMAKE_CXX_COMPILER_LAUNCHER: sccache
I don't get why we need these compiler settings, given that the whole point is that we don't need to build the wheel to run these linting checks.
To do the checks, the arcticdb library needs to be installed ... And installing a released arcticdb does not help either, as in the benchmarks module we use libs from the tests package (tested already, as initially we wanted this to be part of the asv main workflow action) ... thus I need to invoke "pip install -ve .", which does a build, hence I copied everything needed for that from another workflow.
If there is a way to achieve that without doing a full CPP build, I am OK to try it.
As discussed, after transitioning this as per GP's comment, this is not relevant anymore.
from typing import List


def error(mes):
Use logging, not print statements, in all PRs please.
will start using it primarily
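For reference, a minimal logging-based version of the error helper quoted above (the logger name and format are illustrative assumptions):

```python
import logging

# One module-level logger instead of bare print() calls
logger = logging.getLogger("asv_checks")
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

def error(mes):
    # Errors go through the logging framework, so verbosity and destination
    # can be configured centrally
    logger.error(mes)
```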
python/utils/asv_checks.py
Outdated
if error_code == 0:
    print("ABOVE ERRORS DOES NOT AFFECT FINAL ERROR CODE = 0")

if not output is None:
`if output is not None` is more idiomatic (same applies elsewhere).
python/utils/asv_checks.py
Outdated
if not err_output is None:
    error(err_output)
    if error_code == 0:
        print("ABOVE ERRORS DOES NOT AFFECT FINAL ERROR CODE = 0")
"DO NOT" not "DOES NOT"
python/utils/asv_checks.py
Outdated
orig_hash = compute_file_hash(benchmark_config)

print("_" * 80)
print("""IMPORTANT: The tool checks CURRENT versions of asv tests along with asv.conf.json")
I don't understand this
Changed to this:
print("""IMPORTANT: The tool checks CURRENT ACTUAL versions of asv benchmark tests along with the one in benchmarks.json file.
That means that if there are files that are not submitted yet (tests and benchmark.json),
they would need to be in order for completion of current PR.
benchmarks.json is updated with a version number calculated as a hash
of the python test method. Thus any change of this method triggers different
version. Hence you would need to update json file also.
It happens automatically if you run following commandline:
> asv run --bench just-discover --python=same """)
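A rough sketch of what that version check amounts to (illustrative; compute_file_hash, the benchmarks.json path and the asv command are taken from the snippets above, the rest is assumption, including that the script runs from the repo root):

```python
import hashlib
import subprocess

def compute_file_hash(path):
    # Hash the current contents of benchmarks.json
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

benchmark_config = "python/.asv/results/benchmarks.json"
orig_hash = compute_file_hash(benchmark_config)

# Re-discovering the benchmarks rewrites benchmarks.json with freshly
# computed per-test versions (hashes of the test method bodies)
subprocess.run(["asv", "run", "--bench", "just-discover", "--python=same"], check=True)

if compute_file_hash(benchmark_config) != orig_hash:
    print("benchmarks.json is out of date; commit the regenerated file")
```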
Reference Issues/PRs
What does this implement or fix?
There are a couple of checks that need to be done to make sure asv benchmark tests are OK to be merged:
Check that the code of the tests is OK and can run
Check that the versions of the benchmark tests are up to date in asv.conf.json
This PR prepares a script that anyone can run before submitting a PR for review, as well as a GitHub action that executes it to confirm all is OK.
A new python utility is added to do the required checks. Usage: python python/utils/asv_checks.py
The tool can also be used in a GitHub action to do the check automatically:
Successful check job: https://github.com/man-group/ArcticDB/actions/runs/12713349006/job/35441035657
A job that failed because benchmarks.json was not up to date: https://github.com/man-group/ArcticDB/actions/runs/12711972129/job/35436505808
NOTE:
The most efficient way to do the ASV check is on your own machine, either by executing both of the above-mentioned commands or by running the script python python/utils/asv_checks.py (better).
On GitHub the workflow is currently not efficient: it does an approx. 35 min build, while the actual check takes just 5 secs afterwards.