
Parallelize tests #17872

Draft · wants to merge 16 commits into main from parallel-tests
Conversation


@majocha majocha commented Oct 11, 2024

Enable running xUnit tests in parallel.

Using xUnit here means customizing it. Two optional features were added:

  • Running collection and theory cases in parallel, based on https://www.meziantou.net/parallelize-test-cases-execution-in-xunit.htm
    By default, xUnit's unit of parallelization is the test collection; test cases within a collection run in sequence, and each class/module constitutes its own collection. We have a lot of test cases in large modules and large theories that were bottlenecked by this.
    This customization enables parallelism in such cases. It can be reverted to the default for a particular module with the [<RunInSequence>] attribute.
    The relevant issue is on the roadmap for xUnit v3, so this will probably become unnecessary in the future.

  • Console streams are captured universally and redirected to xUnit's output mechanism, which means you can just use printfn in a test case and it goes to the respective test's output.
    This output can be inspected in the IDE and, in case of failure, is printed out when testing from the command line.

The default way in xUnit is to use ITestOutputHelper. This is very unwieldy, because it requires placing test cases in a class with a constructor and then threading the injected output helper into every function that wants to output text. We have many tests in modules rather than classes, and many of them use a lot of utility functions. Adjusting all of that to use ITestOutputHelper is not feasible. On the other hand, just outputting with printfn is unobtrusive, natural, and works well with interactive prototyping of test cases.
This customization will probably become unnecessary with xUnit v3.
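To illustrate the general idea (this is just a rough sketch, with invented names, not the actual XunitHelpers.fs code): an AsyncLocal holds the current test's writer, and a single redirecting writer installed with Console.SetOut forwards printfn output to it.

```fsharp
open System
open System.IO
open System.Threading

// Sketch only: each test stores its own TextWriter in an AsyncLocal; a single
// redirecting writer installed via Console.SetOut forwards writes to it.
let currentWriter = AsyncLocal<TextWriter>()

type RedirectingWriter() =
    inherit TextWriter()
    override _.Encoding = Text.Encoding.UTF8
    override _.Write(c: char) =
        match currentWriter.Value with
        | null -> ()                // no test running on this async flow
        | writer -> writer.Write c

// Installed once per test assembly:
Console.SetOut(new RedirectingWriter())

// A test framework hook would wrap each test case roughly like this, handing
// the buffered text to xUnit's ITestOutputHelper when the case finishes:
let runWithCapturedConsole (body: unit -> unit) =
    use buffer = new StringWriter()
    currentWriter.Value <- (buffer :> TextWriter)
    body ()
    buffer.ToString()
```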

The above customizations are not required for the test suite to work correctly. They are contained in the XunitHelpers.fs file and enabled with conditional compilation via the XUNIT_EXTRAS constant defined in FSharp.Test.Utilities.fsproj.
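Roughly, the shape is something like this (a sketch only, not the actual file contents; only the RunInSequence and XUNIT_EXTRAS names come from the description above):

```fsharp
namespace FSharp.Test

open System

/// Restores xUnit's default behavior for a class/module: its test cases
/// run one after another instead of being parallelized per case.
[<AttributeUsage(AttributeTargets.Class)>]
type RunInSequenceAttribute() =
    inherit Attribute()

#if XUNIT_EXTRAS
// Custom test framework pieces (parallel theory/case execution, console
// capture, ...) live here and are compiled only when XUNIT_EXTRAS is
// defined in FSharp.Test.Utilities.fsproj.
#endif
```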

Some local run times:

dotnet test .\tests\FSharp.Compiler.ComponentTests\ -c Release -f net9.0

Test summary: total: 4489, failed: 0, succeeded: 4258, skipped: 231, duration: 199.0s

dotnet test .\tests\fsharp\ -c Release -f net9.0

Test summary: total: 579, failed: 0, succeeded: 579, skipped: 0, duration: 41.9s

dotnet test .\FSharp.sln -c Release  -f net9.0

Test summary: total: 12963, failed: 0, succeeded: 12694, skipped: 269, duration: 253.3s

Some considerations to make this work and keep it working
To run tests in parallel we must deal with global resources and global state accessed by the test cases.

Out of proc:
Tests running as separate processes share the file system. We must make sure they execute in their own temporary directories and don't overwrite any hardcoded paths. This is already done, mostly in a separate PR.

Hosted:
Many tests use the hosted compiler and FsiEvaluationSession, sharing global resources and global state within the runner process:

  • Console streams - this is swept under the rug for now by using a simple AsyncLocal stream splitter.
  • FileSystem - the global mutable file system shim; the few tests that mutate it must be excluded from parallelization.
  • Environment.CurrentDirectory - many tests executing in a hosted session were doing a variation of File.WriteAllText("test.ok", "ok"), all in the current directory (i.e. bin), leading to conflicts. This is replaced with a thread-safe mechanism (see the sketch after this list).
  • Environment variables, Path - mostly this applies to DependencyManager; excluded from parallelization for now.
  • Async default cancellation token - the few tests doing Async.CancelDefaultToken() must be excluded from parallelization.
  • Global state used in conjunction with the --times option - these tests are excluded from parallelization.
  • Global mutable state in the form of multiple caches implemented as ConcurrentDictionary - this no longer seems to be a problem; it is contained using some exclusions from parallelization.
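For the Environment.CurrentDirectory point, the replacement is, in spirit, something like this (the helper name and directory layout are invented for illustration):

```fsharp
open System
open System.IO

// Sketch: give each test case its own scratch directory instead of writing
// marker files into the shared current directory (bin).
let createTestScratchDir (testName: string) =
    let dir = Path.Combine(Path.GetTempPath(), "fsharp-tests", testName, Guid.NewGuid().ToString("N"))
    Directory.CreateDirectory dir |> ignore
    dir

// Before (conflict-prone when tests run in parallel):
//     File.WriteAllText("test.ok", "ok")
// After:
let writeOkFile (scratchDir: string) =
    File.WriteAllText(Path.Combine(scratchDir, "test.ok"), "ok")
```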

I'll add to the above list if I recall anything else.

Problems:
Tests depending on tight timing, orchestrating things through combinations of Thread.Sleep, Async.Sleep, and wait timeouts.
These are mostly excluded from parallelization; some attempts at fixing them were made.

Obscure compiler bugs revealed in this PR:

  • Internal error: value cannot be null - this mostly happens on CoreCLR, once or sometimes a few times during a test run.

  • An error creating the evaluation session because of an NRE somewhere in TcImports.BuildNonFrameworkTcImports. This is rarer but may be related to the above.

These were related to concurrency issues: modifying frameworkTcImportsCache without a lock, and a bug in the custom lazy implementation in il.fs. Hopefully both are fixed now.

Running in parallel:
xUnit runners are configured with mostly default parallelization settings.

dotnet test .\FSharp.sln -c Release -f net9.0 will run all discovered test assemblies in parallel as soon as they're built.
This can be limited with the -m switch. For example,
dotnet test -m:2 .\FSharp.Compiler.Service.sln
will limit the test run to at most 2 simultaneous processes. Still, each test host process runs its test collections in parallel.

Some test collections are excluded from parallelization with the [<Collection(nameof DoNotRunInParallel)>] attribute.
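For example (module and test names below are invented):

```fsharp
open Xunit
open FSharp.Test          // DoNotRunInParallel is defined here

// This module's tests run sequentially, and no other collections run
// while they execute.
[<Collection(nameof DoNotRunInParallel)>]
module FileSystemShimTests =

    [<Fact>]
    let ``swaps the global file system shim`` () =
        // ... mutate the global FileSystem shim here and restore it afterwards ...
        ()
```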

Running in the IDE with "Run tests in parallel" enabled will respect xunit.runner.json settings and the above exclusions.

TODO:

@majocha majocha mentioned this pull request Oct 11, 2024

github-actions bot commented Oct 11, 2024

⚠️ Release notes required, but author opted out

Warning

Author opted out of release notes, check is disabled for this pull request.
cc @dotnet/fsharp-team-msft

@majocha majocha changed the title from "Parallelize tests, continuation" to "Parallelize tests" on Oct 12, 2024
@majocha majocha force-pushed the parallel-tests branch 6 times, most recently from decc8cb to 278e2ec on October 13, 2024
@psfinaki (Member)

Thanks for your endurance, Jakub 💪


majocha commented Oct 18, 2024

@psfinaki I will need some help with this Source-Build error:
https://dev.azure.com/dnceng-public/public/_build/results?buildId=847523&view=logs&j=2f0d093c-1064-5c86-fc5b-b7b1eca8e66a&t=52d0a7a6-39c9-5fa2-86e8-78f84e98a3a2&l=45

At this moment this is very stable locally, but it will also probably need testing on machines other than mine :)

What's left to do is to tune this for stability in CI. I've been trying different things and timing runs. The most glaring problem is testDesktop. In CI, the desktop runs of both FSharpSuite and ComponentTests take around 40 minutes each. I guess slicing the test suite and running with a multi-agent parallel strategy would improve things here.
I added some simple provisions for easier slicing using traits: --filter ExecutionNode=n will now take a stable slice of the test suite (currently n is hardcoded to 1..4).
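For example, combining the command above with the filter:

dotnet test .\tests\FSharp.Compiler.ComponentTests\ -c Release -f net9.0 --filter ExecutionNode=1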

I also noticed the Linux run is constantly low on memory; this is unrelated, as it happens on main too. For this reason I set MaxParallelThreads=4 in build.sh to cool things down a bit.

@psfinaki (Member)

@majocha the error is weird, nothing comes to my mind right away. Let's rebase and rerun and see if it's still happening... Sorry, I know this is somewhat lame, it's just that SourceBuild is a Linux thing and it's not trivial to debug its issues locally.

As for cooling things down - I also noticed this today, thanks for addressing this.

What else do you think we can split from this PR into some separate ones?


majocha commented Oct 21, 2024

What else do you think we can split from this PR into some separate ones?

There are some small further fixes, maybe also the whole console handling does not really depend on parallel execution.

A somewhat related thing I've had on my mind recently is to implement a FileSystem shim for tests that would be as in-memory as possible and isolated per test case. It wouldn't handle the tests that start separate processes, though.


majocha commented Oct 21, 2024

@majocha the error is weird, nothing comes to my mind right away. Let's rebase and rerun and see if it's still happening... Sorry, I know this is somewhat lame, it's just that SourceBuild is a Linux thing and it's not trivial to debug its issues locally.

Thanks! Rebasing did help.

@psfinaki psfinaki added the NO_RELEASE_NOTES label on Oct 21, 2024
@psfinaki (Member)

There are some small further fixes, maybe also the whole console handling does not really depend on parallel execution.

Yeah, console handling would probably be good to isolate if possible.

A somewhat related thing I've had on my mind recently is to implement a FileSystem shim for tests that would be as in-memory as possible and isolated per test case. It wouldn't handle the tests that start separate processes, though.

Just for my understanding, what would this add on top of the current results the PR achieves?


majocha commented Oct 21, 2024

Just for my understanding, what would this add on top of the current results the PR achieves?

This would be an experiment for another PR, but basically, I don't like all the copying to temp dirs that I added in recent PRs.
A FileSystem shim just for testing, one that virtualizes all writes and keeps track of which test case wrote what so they can be correctly isolated, could possibly be more performant and a cleaner solution.

@psfinaki (Member)

Right, yeah, I see. It's worth playing with, although given that we don't touch these tests too much, it's probably worth seriously investing in only if it starts yielding reasonable performance fruits.

@@ -252,6 +252,7 @@ let processGraphAsync<'Item, 'Result when 'Item: equality and 'Item: comparison>
let rec queueNode node =
Async.Start(
async {
use! _catch = Async.OnCancel(completionSignal.TrySetCanceled >> ignore)
majocha (author):

Fix: this was starting with a cancellation token but without catching the OperationCanceledException.

@@ -137,7 +137,7 @@ module internal PervasiveAutoOpens =
type Async with

static member RunImmediate(computation: Async<'T>, ?cancellationToken) =
let cancellationToken = defaultArg cancellationToken Async.DefaultCancellationToken
let cancellationToken = defaultArg cancellationToken CancellationToken.None
majocha (author):

We probably never want the default token to cancel compiler jobs.

(pulse :> IDisposable).Dispose()
if isNotNull pulse then
(pulse :> IDisposable).Dispose()
pulse <- null)
majocha (author):

This is a provisional fix for #17849. It is a core change that should be extracted to a separate PR and carefully thought through.

@@ -4,17 +4,26 @@

<PropertyGroup>
<TargetFrameworks>net472;$(FSharpNetCoreProductTargetFramework)</TargetFrameworks>
<TargetFrameworks Condition="'$(OS)' == 'Unix'">$(FSharpNetCoreProductTargetFramework)</TargetFrameworks>
<TargetFrameworks Condition="'$(OS)' == 'Unix' or '$(BUILDING_USING_DOTNET)' == 'true'">$(FSharpNetCoreProductTargetFramework)</TargetFrameworks>
majocha (author):

To make the whole FSharp.sln work with BUILDING_USING_DOTNET

@@ -85,15 +85,15 @@ x
)
#endif

[<Fact>]
[<Fact(Skip="TBD")>]
majocha (author):

The logic of this test is unclear to me. The stream that the executed Console.ReadLine reads from is not necessarily the same as the one we passed to FsiEvaluationSession.Create. We're just testing Console.ReadLine here. We should probably be testing that FsiEvaluationSession reads from the given input stream.

psfinaki (Member):

Yeah, this looks like yet another weird test, thanks for noticing :)

// Build command line arguments & start FSI session
let argv = [| "C:\\fsi.exe" |]
let allArgs = Array.append argv [|"--noninteractive"; if useOneDynamicAssembly then "--multiemit-" else "--multiemit+" |]

let fsiConfig = FsiEvaluationSession.GetDefaultConfiguration()
FsiEvaluationSession.Create(fsiConfig, allArgs, inStream, new StreamWriter(outStream), new StreamWriter(errStream), collectible = true)
FsiEvaluationSession.Create(fsiConfig, allArgs, TextReader.Null, TextWriter.Null, TextWriter.Null, collectible = true)
majocha (author):

There are actually no tests using these streams.

@@ -15,8 +15,7 @@ module ILChecker =

let private exec exe args =
let arguments = args |> String.concat " "
let timeout = 30000
majocha (author):

Running in parallel in CI makes the timings less predictable. There is a --blame-hang-timeout option to catch actually hanging tests.
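For example (the timeout value here is just illustrative):

dotnet test .\FSharp.sln -c Release -f net9.0 --blame-hang-timeout 30m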

psfinaki (Member):

Yeah --blame-hang-timeout is a good thing and should have been there long ago.


type FSharpScript(?additionalArgs: string[], ?quiet: bool, ?langVersion: LangVersion, ?input: string) =

do ignore input
majocha (author):

This was only used in one test that is temporarily skipped awaiting a rewrite.

()
else failwith $"Script called function 'exit' with code={code} and collected in stderr: {errorStringWriter.ToString()}"
majocha (author):

Large outputs can now just be printed to the test-specific stdout instead of being passed around as exception messages.

@@ -0,0 +1,222 @@
#nowarn "0044"
majocha (author):

This is needed by xUnit, as it simultaneously deprecates and requires some of its own APIs. ¯\_(ツ)_/¯

@@ -0,0 +1,12 @@
namespace FSharp.Test
majocha (author):

This file must be linked into every test project to wire up the xUnit customizations.

psfinaki (Member):

It looks like all the xunit.runner.json files are the same now - that's awesome! I think we can keep only one and reference it from all the projects, maybe via some Directory.Props voodoo.

psfinaki (Member):

TBH I am a bit scared of that many customizations... I see a lot of effort here and I very much appreciate this - you can probably start blogging about xUnit yourself now :)

Among other things, it's a question of maintenance - we'd need to understand xUnit to the same depth as you do in order to change things here, and that might now be quite a long way.

Maybe you can order the customizations by how effective they are, so that we can probably apply the 80/20 principle here? I wonder if other dotnet repos do anything similar - that could also be an argument.

Also, do you have a gut feeling about how much this relies on the current xUnit hacks/shape? As in, which of those customizations will get broken or become unneeded with future xUnit updates?

majocha (author) commented Oct 25, 2024:

Yes, I've been thinking about how to make this failsafe, so to say.
Right now there are 3 tiers of customizations in this file:

There is also some custom trait generation - an unused experiment, totally throwaway.

I think it could be possible to encapsulate the changes here in some #ifdefs, to make sure there is a safe failure mode and the customizations can at least be switched off without bringing everything down.

Also, do you have a gut feeling about how much this relies on the current xUnit hacks/shape? As in, which of those customizations will get broken or become unneeded with future xUnit updates?

The API of xUnit v2 that this all relies on is fairly stable, in my view. I didn't really consider v3, but I'll take a look and maybe even test against it.

psfinaki (Member):

Alright, thanks for all your efforts once again!

Yeah, so the more we can achieve with standard xUnit means, the better - but we can definitely make justified exceptions here.

Looking into xUnit 3 could also be very interesting - now that we've consolidated the projects to use it and are also consolidating the execution configuration, it's at least easier to play with.

psfinaki (Member):

The idea with #ifdefs is also interesting. If it doesn't get spilled all over the code, it's something to consider.

majocha (author):

The idea with #ifdefs is also interesting. If it doesn't get spilled all over the code, it's something to consider.

I think just this file.

majocha (author):

xUnit v3 will hopefully make things easier by adding an AsyncLocal-backed TestContext.Current.TestOutputHelper, and even an [<assembly: CaptureConsole>] attribute.
xunit/xunit#1730 (comment)

This will make a lot of the helper code here unnecessary, but our test cases should already be compatible and not require any changes.

majocha (author):

I also put most of the customizations behind conditional compilation and did a test run with the customizations disabled, which passed OK.

psfinaki (Member):

Alright, yeah, that's great :)


/// Exclude from parallelization. Execute test cases in sequence and do not run any other collections at the same time.
[<CollectionDefinition(nameof DoNotRunInParallel, DisableParallelization = true)>]
type DoNotRunInParallel = class end
psfinaki (Member):

So help me understand - why do we end up needing both DoNotRunInParallel and RunInSequence attributes?

Even if we want both, the naming should be clarified :)

majocha (author) commented Oct 29, 2024:

Absolutely! This one is just xUnit's DisableParallelization. It's suitable for stuff that touches truly global state, like DefaultCancellationToken. RunInSequence is related to XUNIT_EXTRAS, as it restores xUnit's default behavior: e.g. it runs a module's tests one by one, in case that module has local shared state like a shared FSI session or something. If XUNIT_EXTRAS is not defined, it does nothing.

psfinaki (Member):

But then, can we try using DisableParallelization in the places where RunInSequence is used? What effect would it have?

majocha (author):

It would make things slower, because DisableParallelization runs the excluded tests one by one after all the parallel collections are done. RunInSequence turns the marked class/module into a normal xUnit implied collection, i.e. it does not prevent it from running in parallel with other collections; it just makes the test cases and theory cases it contains run one after another, while other test collections can execute at the same time.

It's a bit complicated, but it all works around the unfortunate design decision in xUnit that a test collection is the smallest unit of parallelization. The general idea with RunInSequence is to have as few of those as possible and to work toward eliminating them, since the attribute marks tests that are not fully isolated from each other.
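A made-up illustration of the difference (module and test names invented; RunInSequence only has an effect when XUNIT_EXTRAS is defined):

```fsharp
open Xunit
open FSharp.Test

// Runs its cases one by one, and only after all the parallel collections finish.
[<Collection(nameof DoNotRunInParallel)>]
module TouchesTrulyGlobalState =
    [<Fact>]
    let ``cancels the default token`` () = Async.CancelDefaultToken()

// Runs its cases one by one, while other collections keep running in parallel.
[<RunInSequence>]
module SharesAnFsiSessionAcrossCases =
    [<Fact>]
    let ``first case`` () = ()

    [<Fact>]
    let ``second case`` () = ()
```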

Maybe I should rename the first one to DisableParallelization or DisableXunitParallelization?

psfinaki (Member):

Also, I'd like us to have as few xUnit helpers as possible - so I decided to help you on that here. If that gets merged, this whole file can be removed, I guess.

Labels: NO_RELEASE_NOTES
Project status: New
2 participants