
Prechecks for asv #2107

Open · grusev wants to merge 29 commits into master from asv_slow
Conversation

@grusev commented Jan 8, 2025 (Collaborator)

Reference Issues/PRs

What does this implement or fix?

There are a couple of checks that need to be done to make sure asv benchmark tests are OK to be merged:

Check that code of tests is ok and can run

asv check --python=same

Check that the versions of the benchmark tests are up to date in asv.conf.json

asv run --bench just-discover --python=same

This PR prepares a script that anyone can run before submitting a PR for review, as well as a GitHub Action that executes it to confirm all is OK.

A new python utility is added to do the required checks. Usage:

python python/utils/asv_checks.py

This tool can now be used in a GitHub Action to perform the check automatically.
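A minimal sketch of what such a precheck utility might look like (hypothetical; the actual python/utils/asv_checks.py in this PR may differ):

```python
# Hypothetical sketch of the two prechecks described above; names and
# structure are illustrative, not the actual asv_checks.py from this PR.
import subprocess

# The two asv commands the PR description lists as prechecks.
PRECHECK_COMMANDS = [
    # Check that the benchmark code is valid and can run
    ["asv", "check", "--python=same"],
    # Check that benchmark versions are up to date (updates benchmarks.json)
    ["asv", "run", "--bench", "just-discover", "--python=same"],
]

def run_prechecks(runner=subprocess.call):
    """Run each precheck; return 0 only if all succeed.

    `runner` is injectable so the control flow can be tested
    without asv being installed.
    """
    final = 0
    for cmd in PRECHECK_COMMANDS:
        code = runner(cmd)
        if code != 0:
            final = code
    return final

if __name__ == "__main__":
    raise SystemExit(run_prechecks())
```

Run from the repository root so asv picks up the project's asv.conf.json.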

Successful check of a job: https://github.com/man-group/ArcticDB/actions/runs/12713349006/job/35441035657

A job that failed because benchmarks.json was not up to date: https://github.com/man-group/ArcticDB/actions/runs/12711972129/job/35436505808

NOTE:

  1. The most efficient way to do the ASV check is on your machine, either by executing both of the above-mentioned commands or by running the script python python/utils/asv_checks.py (better)

  2. On GitHub the workflow is currently not efficient, as it does an approx. 35 min build while the actual check takes just 5 seconds afterwards. Why is it not efficient?

  • it needs a build, and since there is no repository of build artifacts from previous jobs, the build has to be repeated here
  • it cannot be combined with the ASV tests. The asv tests do a specific build for their own purposes and then run the tests. The build AND the test are one atomic process, so we cannot plug in there, and even if we could, the time inefficiency would remain

Any other comments?

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

@grusev grusev force-pushed the asv_slow branch 2 times, most recently from b00e8e9 to f23a441 Compare January 9, 2025 16:01
@grusev grusev force-pushed the asv_slow branch 2 times, most recently from ca077d4 to 3553540 Compare January 10, 2025 09:21
@grusev grusev marked this pull request as ready for review January 10, 2025 18:10
@IvoDD left a comment (Collaborator)

Left some minor comments. Apart from that, I see the point in not running the asv checks on every build, since it takes 35 minutes.

However, couldn't we run it in parallel with the asv benchmarks and make it block merging? This shouldn't increase the time to run the tests, since the benchmarks take far longer. However, I'm not sure whether it's worth the extra CPU (I'll leave that up to you, or maybe @G-D-Petrov, who knows more about our CI pipelines)

if not ok_errors_list is None:
    for ok_error in ok_errors_list:
        err_output.replace(ok_error, "")
err_output = re.sub(r'\s+', '', err_output)
Collaborator:

This removes all whitespace, correct? Wouldn't this make the case where we have some leftover errors harder to read?

E.g. if we have 2 errors:

Expected error which should be removed
Unexpected error which should persist and be displayed

In the error message below, the unexpected error will have its whitespace removed.

Also, this could be a problem if ok_errors_list contains more than one error and one of them contains whitespace.
E.g. ok_errors_list = ["Expected 1", "Expected 2"]
And we have logs which contain just the two expected errors:

Expected 1
Expected 2

After the first iteration the error output would become:

Expected2

which wouldn't match the second expected error, and we would end up with an "Unknown error" even though we only have the 2 expected errors.

Probably doesn't matter too much, as you only ever use a single expected error. Still, it would be better, instead of removing all whitespace after each error, to check err_output.strip() == "" at the end.
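The suggestion above could be sketched like this (a hedged illustration; the helper names filter_expected_errors and has_unexpected_errors are hypothetical, not from the PR):

```python
# Sketch of the reviewer's suggestion: remove expected errors, then decide
# whether anything unexpected remains by stripping only at the very end.
# Names here are illustrative, not from the PR under review.

def filter_expected_errors(err_output, ok_errors_list=None):
    """Return the error output with all expected errors removed."""
    if ok_errors_list is not None:
        for ok_error in ok_errors_list:
            # str.replace returns a new string; reassignment is required
            err_output = err_output.replace(ok_error, "")
    return err_output

def has_unexpected_errors(err_output, ok_errors_list=None):
    # Whitespace is ignored only at the very end, so any surviving
    # (unexpected) error text keeps its original, readable formatting.
    remainder = filter_expected_errors(err_output, ok_errors_list)
    return remainder.strip() != ""
```

Because nothing is collapsed between iterations, multiple whitespace-containing expected errors are matched correctly.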


def get_project_root():
    file_location = os.path.abspath(__file__)
    return file_location.split("/python/")[0]
Collaborator:

This would theoretically fail if the path contains another "/python/" component. E.g. imagine you install arcticdb in:
/home/grusev/code/python/ArcticDB/python/python_code. We could do something like this

Collaborator Author:

changed to increase confidence :-)

sys.path.insert(0,f"{path}/python")
sys.path.insert(0,f"{path}/python/tests")

bencmark_config = f"{path}/python/.asv/results/benchmarks.json"
Collaborator:

Typo: bencHmark

they would need to be in order for completion of current PR""")
print("_" * 80)

print("\n\nCheck 1: Executing check for python cod of asv tests")
Collaborator:

Typo: codE

default: true

jobs:
run-asv-check-script:
Collaborator:

I agree with @IvoDD, we don't need this to be a separate flow, just a separate job in the analysis flow.
This way it will be easier for people, as they would have to check only one flow.
Let's move this job to analysis_flow.yml, similarly to the code_coverage job there

Collaborator Author:

agree, makes lots of sense

@@ -0,0 +1,79 @@
name: Run ASV Tests Check Python Script
Collaborator:

I think if you call this "ASV Linting" or something it will be more obvious to people that it doesn't actually run the benchmarks

Collaborator Author:

ok, but note that this does something additional, namely the check of the versions of the benchmark tests ... see below

VCPKG_NUGET_USER: ${{secrets.VCPKG_NUGET_USER || github.repository_owner}}
VCPKG_NUGET_TOKEN: ${{secrets.VCPKG_NUGET_TOKEN || secrets.GITHUB_TOKEN}}
CMAKE_C_COMPILER_LAUNCHER: sccache
CMAKE_CXX_COMPILER_LAUNCHER: sccache
Collaborator:

Don't get why we need these compiler settings given that the whole point is we don't need to build the wheel to run these linting checks

@grusev commented Jan 15, 2025 (Collaborator Author):

To do the checks, the arcticdb library needs to be installed ... And installing a released arcticdb does not help either, as in the benchmarks module we use libs from the tests package (tested already, as initially we wanted this to be part of the asv main workflow action) ... Thus I need to invoke "pip install -ve .", which does a build, hence I copied everything that would be needed for that from another workflow.

If there is a way to achieve that without doing a full C++ build, I am OK to try it

Collaborator Author:

as discussed, after transitioning this as per GP's comment this is no longer relevant

from typing import List


def error(mes):
Collaborator:

Use logging, not print statements, in all PRs please

Collaborator Author:

will start using it primarily

if error_code == 0:
    print("ABOVE ERRORS DOES NOT AFFECT FINAL ERROR CODE = 0")

if not output is None:
@poodlewars commented Jan 15, 2025 (Collaborator):

if output is not None is more idiomatic (same applies elsewhere)

if not err_output is None:
    error(err_output)
if error_code == 0:
    print("ABOVE ERRORS DOES NOT AFFECT FINAL ERROR CODE = 0")
Collaborator:

"DO NOT" not "DOES NOT"

orig_hash = compute_file_hash(benchmark_config)

print("_" * 80)
print("""IMPORTANT: The tool checks CURRENT versions of asv tests along with asv.conf.json")
Collaborator:

I don't understand this

Collaborator Author:

Changed to this:

print("""IMPORTANT: The tool checks CURRENT ACTUAL versions of asv benchmark tests along with the one in benchmarks.json file.
        That means that if there are files that are not submitted yet (tests and benchmark.json),
        they would need to be in order for completion of current PR.
        benchmarks.json is updated with a version number calculated as a hash
        of the python test method. Thus any change of this method triggers different
        version. Hence you would need to update json file also.
        It happens automatically if you run following commandline:
         > asv run --bench just-discover --python=same  """)
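The versioning behaviour described above (any change to a benchmark method yields a new version, because the version is derived from a hash of the benchmark's source) can be illustrated with a simplified sketch. This is NOT asv's exact algorithm, just the underlying idea:

```python
# Simplified illustration of why editing a benchmark method changes its
# recorded version: the version is a hash derived from the source text.
# asv's real algorithm differs in detail (e.g. it also considers setup code).
import hashlib

def benchmark_version(source_code: str) -> str:
    """Derive a version string from a benchmark's source text."""
    return hashlib.sha256(source_code.encode()).hexdigest()

# Two revisions of a hypothetical benchmark method:
V1 = "def time_read():\n    return lib.read('sym')\n"
V2 = "def time_read():\n    return lib.read('symbol')\n"
```

Since even a one-character edit changes the hash, the stored version in benchmarks.json goes stale and must be regenerated with `asv run --bench just-discover --python=same`.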

@poodlewars left a comment (Collaborator)

Labels: None yet
Projects: None yet
4 participants