Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanups and Speedup for python unittests #13439

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

dcbaker
Copy link
Member

@dcbaker dcbaker commented Jul 16, 2024

I initially set out to just do some long overdue maintenance of our unit tests, along way I cleaned up a number of issues that have been bugging me for a while:

  • use mock instead of try/finally blocks
  • use os.enivon.get so that some things don't have to be set to run
  • move a lot of constant setup to setUpClass instead of setUp, which means the fixtures get run less
  • use more of our helpers instead of open coding
  • make more use of UnitTest.addCleanup
  • actually clean up more often, and move the cleanup code closer to the producer.
  • Allow our cleanup of Meson calls to happen even if the call to Meson itself fails
  • cleanup some complicated logic
  • use better/faster types
  • use properties for rarely used dynamic attributes
  • Use tmpdir instead of source dir when we can reasonably assume that there will not be Anti-Virus, and allow Windows users to force the use of the tmpdir

For me the end result is a reduction in a full unit test run time by ~15%, and there are no stray files left in the source tree compared to the main branch.

I wont be surprised if there are a few regressions on Windows to clean up.

@dcbaker dcbaker requested a review from jpakkane as a code owner July 16, 2024 22:22
@dcbaker
Copy link
Member Author

dcbaker commented Jul 16, 2024

I've also reworded a few of the commit messages to fix typos, but I'll wait to re-push so I don't flood the CI

@dcbaker dcbaker force-pushed the submit/unittest-cleanups-and-speedups branch from a94fe1c to 2f66081 Compare July 16, 2024 22:51
Comment on lines -579 to -594
if is_cygwin():
self.new_builddir_in_tempdir()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit message doesn't justify why it's okay to allow the umask tests to now run in the source tree by default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it doesn't run on the source tree, and that has nothing to do with the source tree? The source tree is always, unconditionally copied into a new directory, the only thing that changes here is where the build directory is being placed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit message doesn't justify why it's okay to allow the umask tests to now run in the meson.git source tree by default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the reason I'm mentioning this is because the comment in the now deleted function says cygwin requires forcing a tempdir for this small subset of tests, it doesn't say anything about the relationship between the directory with meson.build and the one with build.ninja

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, your comment confused me, I thought you were suggesting that the test is now configuring from the source tree, not that there were issues with the build directory being in the source tree.

I'm curious as to what the actual original issue with installing in the source dir was, and why it doesn't seem to be a problem now, looking at the code there was a CI regression on cygwin (when we were running cygwin on azure) from this, but it doesn't seem to happen now. I'm really confused

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know either! :D

If we think it's no longer relevant then I'm not opposed to deleting it, it's just the commit message should indicate we are doing so (and why).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've added a paragraph about those changes to the commit message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting and interesting. This failure seems to be sporadic. I'm going to guess there's a race condition involved, but it sounds hard to solve, so I just more-or-less put the workaround back.

@@ -187,7 +187,7 @@ def test_validate_dirs(self):

# Using parent as builddir should fail
with self.subTest('parent directory'):
self.builddir = os.path.dirname(self.builddir)
self.builddir = self.common_test_dir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the solution to this is maybe that the _build_dir_root should be a single base directory in $TMPDIR, and all temporary builddirs should be created inside it?

Copy link
Member Author

@dcbaker dcbaker Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that wont work, this test as always pointed as a parent of the source directory, it just did it in a more round about way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that this actually made me think that we should do this:

/tmp/meson-tests-temp-XXXX
/tmp/meson-tests-temp-XXXX/tmpYYYYY
/tmp/meson-tests-temp-XXXX/tmpZZZZZ

If something goes catastrophically wrong in exiting the test harness, it's only one directory to rm -rf manually. And os.path.dirname(self.builddir) would be /tmp/meson-tests-temp-XXXX which is equally under our control, rather than being /tmp (my assumption is you didn't want the # Using parent as builddir should fail test to suddenly start trying to configure in /tmp itself).

Copy link
Member Author

@dcbaker dcbaker Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually tried pointing at /tmp initially and it works fine, the call to os.path.basename works because of the assumption that the build directory is inside the source tree.

It's really unfortunate we can't use pytest, because it has "session" level fixtures that would make this trivial. The best we can do with unittest (I think) is a per module directory, so you'd still end up with:

/tmp/meson-tests-allplatformtests-XXXX/
/tmp/meson-tests-windowstests-XXXX/
/tmp/meson-tests-machinefiletests-XXXX/

@@ -88,11 +88,11 @@ def setUpClass(cls) -> None:

# Misc stuff
if cls.backend is Backend.ninja:
cls.no_rebuild_stdout = ['ninja: no work to do.', 'samu: nothing to do']
cls.no_rebuild_stdout = frozenset({'ninja: no work to do.', 'samu: nothing to do'})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a genuine worry that this is going to be modified? Invoking frozenset() seems interesting in a commit claiming to be speeding things up. ;)

Copy link
Member Author

@dcbaker dcbaker Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my benchmarks using frozenset was slightly faster than a normal set for doing the lookups, and was only slightly slower for creating than a regular set. Since the frozenset only gets created once per class that seems like a pretty good tradeoff. If you're still concerned about that, this actually only gets used in allplatformtests.py, so we could move it to that class to speed things up even more. edit: it also gets used in WindowsTest

Copy link
Member

@eli-schwartz eli-schwartz Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I just wasn't sure why a set isn't good enough, that we need a frozenset. :P

You say a frozenset is slightly faster but I was under the impression that they both use the same CPython implementation and that frozensets are just sets with the mutability methods removed and __hash__ added so that you can use mydict[frozenset({'one'})]. So I would be surprised if there was any performance difference that wasn't actually environment noise e.g. system load at the time the benchmark was done.

The real difference between the two is that sets have syntax for construction, whereas frozensets always incur a name lookup and function call at the time of creation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I've just been doing too much C++ and Rust recently, and would rather have immutable than mutable types by default :D

@dcbaker dcbaker force-pushed the submit/unittest-cleanups-and-speedups branch from 2f66081 to a9626a6 Compare July 17, 2024 16:28
@dcbaker
Copy link
Member Author

dcbaker commented Jul 17, 2024

Squashed and re-ordered a bit to address comments from IRC/Matrix. Attempting to fix the one test that is failing on Windows

@dcbaker dcbaker force-pushed the submit/unittest-cleanups-and-speedups branch 4 times, most recently from 099f255 to 8ad08d7 Compare July 17, 2024 18:51
@dcbaker dcbaker force-pushed the submit/unittest-cleanups-and-speedups branch from 8ad08d7 to 880b885 Compare July 17, 2024 19:23
@eli-schwartz
Copy link
Member

Partial review and partial merge of commits 85e9233...4b76aab

@dcbaker dcbaker force-pushed the submit/unittest-cleanups-and-speedups branch 2 times, most recently from 9a803fa to 4c9587c Compare August 26, 2024 20:58
Copy link
Member

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, late...

unittests/linuxliketests.py Outdated Show resolved Hide resolved
Comment on lines 189 to 190
with self.subTest('parent directory'):
self.builddir = os.path.dirname(self.builddir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems nice. Just thought I'd note the commit message for this claims these are "uint tests"...

@dcbaker dcbaker force-pushed the submit/unittest-cleanups-and-speedups branch from 4c9587c to 950d96d Compare September 13, 2024 16:45
This moves the cleanup callback to the point the file is being created,
which is more accurate, and doesn't require adding `if os.path.exists()`
calls
Rather than relying on callers to cleanup after the helper.
Stop thrashing users SSDs because Windows AV is broken. This also helps
IDEs that try to keep up with the quickly created and destroyed files.

Users on Windows and Cygwin who don't have catastrophically bad AV
solutions can use the MESON_UNIT_TEST_FORCE_TMPDIR=1 environment
variable to get better behavior.
Set lookups are faster for membership checks.
This avoids having to set them up in cases we don't use them, or
override them in cases where we change the build directory
@dcbaker dcbaker force-pushed the submit/unittest-cleanups-and-speedups branch from 950d96d to 6dce000 Compare October 1, 2024 22:11
@@ -268,15 +268,13 @@ def test_subproject_variables(self):
'''
tdir = os.path.join(self.unit_test_dir, '20 subproj dep variables')
stray_file = os.path.join(tdir, 'subprojects/subsubproject.wrap')
if os.path.exists(stray_file):
windows_proof_rm(stray_file)
self.addCleanup(windows_proof_rm, stray_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes behaviour. The old code ensured that if the file was already there for whatever reason it would be deleted before the test was run.

Copy link
Member Author

@dcbaker dcbaker Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it's changing behavior. The current behavior is:

  1. the test creates the file
  2. the test runs, and makes no effort to delete the file
  3. on re-run the test deletes the file
    This is weird, and not something that we generally do.

This changes the behavior to match what basically all tests do, and what should happen:

  1. the test creates the file
  2. the test registers a callback to ensure that the file is cleaned up when the test exits, regardless of whether that is in error or in success
  3. on re-run the test doesn't need to look for the file, because it doesn't exist.

We have tons of tests that make the assumption "files I create as part of my test process do no pre-exist", and will die horribly, or worse, fail silently if they do.

@jpakkane
Copy link
Member

jpakkane commented Oct 2, 2024

Other than the one change LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants