Running the test-suite terminates abruptly with fd:5: hGetLine: end of file in MSYS2 Windows #9571

Closed
jasagredo opened this issue Dec 27, 2023 · 5 comments · Fixed by #10114

Comments

@jasagredo
Collaborator

Describe the bug
The test-suite terminates non-deterministically with:

cabal-tests.exe: fd:5: hGetLine: end of file

To Reproduce
Steps to reproduce the behavior:

$ ./validate.sh -v -s build -s lib-suite
...
=== Cabal: cabal-testsuite =================================================== 00:10:15 ===
C:\Users\Javier\code\cabal\dist-newstyle-validate-ghc-9.6.3\build\x86_64-windows\ghc-9.6.3\cabal-testsuite-3\build\cabal-tests\cabal-tests.exe --builddir=/c/Users/Javier/code/cabal/dist-newstyle-validate-ghc-9.6.3/build/x86_64-windows/ghc-9.6.3/cabal-testsuite-3 -j4 --with-ghc=ghc
threads: 4
tests to run: 509
PackageTests\TestSuiteTests\ExeV10\cabal-with-hpc.multitest.hs                                                    SKIP no cabal-install (4.38s)
PackageTests\AllowOlder\cabal.test.hs                                                                             SKIP no cabal-install (4.45s)
PackageTests\AllowNewer\cabal.test.hs                                                                             SKIP no cabal-install (4.45s)
PackageTests\AutoconfBadPaths\cabal.test.hs                                                                       SKIP no cabal-install (0.53s)
PackageTests\AutogenModules\Package\setup.test.hs                                                                 OK (1.51s)
PackageTests\AutogenModules\SrcDist\setup.test.hs                                                                 OK (2.23s)
PackageTests\Backpack\bkpcabal01\cabal.test.hs                                                                    SKIP no cabal-install (0.54s)
...
SKIP no cabal-install
END Test.Cabal.Server
GHCi exited with ExitFailure (-1073741510) (use -v for more information)
cabal-tests.exe: fd:4: hGetLine: end of file

Annotating the hGetLine calls in readUntilEnd and readUntilSigil shows that one of them fails, non-deterministically. If I also print the script_path provided, it is different each time; sometimes it even refers to tests that are marked skipIfWindows.
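
For anyone who wants to repeat this kind of diagnosis, here is a minimal sketch of how an hGetLine call could be annotated. It is not the code in Test.Cabal.Server: the helper name hGetLineAnnotated and its label argument are hypothetical, and it only assumes the standard System.IO.Error API from base.

```haskell
import Control.Exception (IOException, catch)
import System.IO (Handle, hGetLine)
import System.IO.Error (annotateIOError)

-- | Hypothetical helper (not part of the cabal test-suite): behaves like
-- 'hGetLine', but if the read fails it re-throws the IOException with the
-- caller's label as its location and the script path as its file name.
hGetLineAnnotated :: String -> FilePath -> Handle -> IO String
hGetLineAnnotated label scriptPath h =
  hGetLine h `catch` \e ->
    ioError (annotateIOError (e :: IOException) label (Just h) (Just scriptPath))
```

Calling it as `hGetLineAnnotated "readUntilSigil" script_path h` in place of a bare `hGetLine h` makes the resulting error message name the call site and the script being processed, rather than only the file descriptor.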

I don't know how Test.Cabal.Server is set up; maybe we are leaving the GHCi session in a zombie state?

Expected behavior
The test-suite should just run normally, as it does on CI.

System information

  • MINGW64_NT-10.0-22621 3.4.10.x86_64 Msys
  • cabal 3.10.2.0, ghc 9.6.3
@jasagredo jasagredo changed the title Running the test-suite terminates with fd:5: hGetLine: end of file in MSYS2 Windows Running the test-suite terminates abruptly with fd:5: hGetLine: end of file in MSYS2 Windows Dec 27, 2023
@jasagredo
Collaborator Author

It was suggested to me to use the RTS flag --io-manager=native. This fixes the problem 🎉

I will leave this issue open to make sure that ./validate.sh takes this into account in some way.

@jasagredo
Collaborator Author

(Thanks to the GHC team at IOG for the suggestion, @hsyl20 😄)

@ulysses4ever
Collaborator

Still happening on master, unfortunately.

@jasagredo
Collaborator Author

This is starting to be beyond my knowledge. I re-enabled the Windows CI in #10282 with the following results:

  • First a cabal-tests.exe: fd:5: hGetLine: end of file on 9.10.1
    logs-failed-9.10.1.txt

  • In the next run, cabal-tests.exe: fd:5: hGetLine: end of file on 9.8.2
    logs-failed-9.8.2.txt and cabal-tests.exe: fd:8: hGetLine: end of file in 9.6.4
    logs-failed-9.6.4.txt

  • Then I enabled --io-manager=native, which had solved it for my local runs; in CI this resulted in a segfault on 9.0.2:

    2024-08-27T23:04:42.4099278Z 
    2024-08-27T23:04:42.4099909Z Access violation in generated code when writing 0x0
    2024-08-27T23:04:42.5100373Z 
    2024-08-27T23:04:42.5100863Z  Attempting to reconstruct a stack trace...
    2024-08-27T23:04:42.5101196Z 
    2024-08-27T23:04:42.5101349Z    Frame	Code address
    2024-08-27T23:04:42.5102665Z  * 0x9a215dd20	0x28ae318 D:\a\cabal\cabal\dist-newstyle-validate-ghc-9.0.2\build\x86_64-windows\ghc-9.0.2\cabal-testsuite-3\build\cabal-tests\cabal-tests.exe+0x24ae318
    2024-08-27T23:04:42.5108006Z  * 0x9a215dd28	0x2769608 D:\a\cabal\cabal\dist-newstyle-validate-ghc-9.0.2\build\x86_64-windows\ghc-9.0.2\cabal-testsuite-3\build\cabal-tests\cabal-tests.exe+0x2369608
    2024-08-27T23:04:42.5109223Z  * 0x9a215dd30	0x345e680
    2024-08-27T23:04:42.5109543Z  * 0x9a215dd38	0x7ef4fe1057b0
    2024-08-27T23:04:42.5109871Z  * 0x9a215dd40	0x7ef531e058d8
    

    logs-segfault-9.0.2.txt

  • Then I enabled --io-manager=native for all but 9.0.2, resulting in a cabal-tests.exe: fd:8: hGetLine: end of file exception in 9.0.2 logs-segfault-another-9.0.2.txt

Notice that it is not always the same test that throws this exception. I'm running out of ideas here, so I gently invoke @Mistuke, who might have some insights, at least on how we could hunt this down.

mergify bot pushed a commit that referenced this issue Sep 13, 2024
geekosaur pushed a commit that referenced this issue Sep 13, 2024
mergify bot added a commit that referenced this issue Sep 13, 2024
…10305)

* hackage-tests: Add --index-state argument to fix the cabal files

We need to fix the index-state we test against so a new bad cabal file
doesn't take down the CI for everyone.

Towards #10284

(cherry picked from commit 8e4d167)

* ci: Fix --index-state for hackage roundtrip tests

As a principle, tests which are required for CI to pass should be
reproducible and not depending on external resources changes or being
modified. The hackage tests currently violate this by depending on the
latest index state from hackage. This is problematic because until the
test is fixed all merges into master are blocked. Even though the
patches in question have nothing to do with the test.

It would be more suitable for a nightly job to run on the latest index
and for normal CI to run with a fixed index which is updated
periodically in a controlled manner.

Fixes #10284

(cherry picked from commit 31507b1)

* Re-enable Windows CI

(cherry picked from commit 4aade2d)

* CI: skip cli-suite on Windows due to #9571 (#10257)

(cherry picked from commit 30d2a38)

---------

Co-authored-by: Matthew Pickering <[email protected]>
Co-authored-by: Javier Sagredo <[email protected]>
Co-authored-by: Artem Pelenitsyn <[email protected]>
@jasagredo
Collaborator Author

Closing this one, as the io-manager flag also fixed it in CI.
