Get Mill's Graal native image support working on Windows (500USD Bounty) #4274

Closed
lihaoyi opened this issue Jan 9, 2025 · 17 comments

@lihaoyi
Member

lihaoyi commented Jan 9, 2025


From the maintainer Li Haoyi: I'm putting a 500USD bounty on this issue, payable by bank transfer on a merged PR implementing this.


Graal native images already work on Mac and Linux; for Windows we basically need to make #4272 pass.

The current problem is we get these errors:

[5497] X mill.testkit.UtestExampleTestSuite.exampleTest 63598ms 
[5497]   utest.AssertionError: else assert(evalResult.isSuccess)
[5497]   evalResult: mill.testkit.IntegrationTester.EvalResult = EvalResult(false,[info] compiling 1 Java source to D:\a\mill\mill\out\example\javalib\publishing\7-native-image\packaged\server\test.dest\sandbox\run-2\out\foo\compile.dest\classes ...

[5497]   [info] done compiling

[5497]   ========================================================================================================================

[5497]   GraalVM Native Image: Generating 'native-image' (executable)...

[5497]   ========================================================================================================================

[5497]   

[5497]   [1/8] Initializing...                                                                                    (0.0s @ 0.07GB)

[5497]   Error: Error compiling query code (in C:\Users\RUNNER~1\AppData\Local\Temp\SVM-14845014989428363566\AMD64LibCHelperDirectives.c). Compiler command ''C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.42.34433\bin\HostX64\x64\cl.exe' /WX /W4 /wd4201 /wd4244 /wd4245 /wd4800 /wd4804 /wd4214 '/FeC:\Users\RUNNER~1\AppData\Local\Temp\SVM-14845014989428363566\AMD64LibCHelperDirectives.exe' 'C:\Users\RUNNER~1\AppData\Local\Temp\SVM-14845014989428363566\AMD64LibCHelperDirectives.c'' output included error: [AMD64LibCHelperDirectives.c, C:\Users\RUNNER~1\AppData\Local\Temp\SVM-14845014989428363566\AMD64LibCHelperDirectives.c(11): fatal error C1083: Cannot open include file: 'C:\Users\runneradmin\AppData\Local\Coursier\Cache\arc\https\github.com\graalvm\graalvm-ce-builds\releases\download\jdk-17.0.7\graalvm-community-jdk-17.0.7_windows-x64_bin.zip\graalvm-community-openjdk-17.0.7+7.1\lib\svm\clibraries\windows-amd64\include\amd64cpufeatures.h': No such file or directory]

[5497]   Error: Use -H:+ReportExceptionStackTraces to print stacktrace of underlying exception

[5497]   ------------------------------------------------------------------------------------------------------------------------

[5497]                           0.3s (7.5% of total time) in 10 GCs | Peak RSS: 0.40GB | CPU load: 2.74

[5497]   ========================================================================================================================

[5497]   Finished generating 'native-image' in 3.8s.

[5497]   1 tasks failed
[5497]   foo.nativeImage os.SubprocessException: Result of C:\Users\runneradmin\AppData\Local\Coursier\cache\arc\https\github.com\graalvm\graalvm-ce-builds\releases\download\jdk-17.0.7\graalvm-community-jdk-17.0.7_windows-x64_bin.zip\graalvm-community-openjdk-17.0.7+7.1\bin\native-image.cmd?: 1
[5497]   
[5497]     os.proc.call(ProcessOps.scala:230)
[5497]       mill.scalalib.NativeImageModule.$anonfun$nativeImage$3(NativeImageModule.scala:43)

[5497]   ,)
[5497]     utest.asserts.Asserts$.assertImpl(Asserts.scala:30)
[5497]     mill.testkit.ExampleTester.validateEval(ExampleTester.scala:145)
[5497]     mill.testkit.ExampleTester.processCommand(ExampleTester.scala:134)
[5497]     mill.testkit.ExampleTester.processCommandBlock(ExampleTester.scala:95)
[5497]     mill.testkit.ExampleTester.$anonfun$run$2(ExampleTester.scala:194)
[5497]     mill.testkit.ExampleTester.$anonfun$run$2$adapted(ExampleTester.scala:194)
[5497]     scala.collection.ArrayOps$.foreach$extension(ArrayOps.scala:1323)
[5497]     mill.testkit.ExampleTester.run(ExampleTester.scala:194)
[5497]     mill.testkit.ExampleTester$.run(ExampleTester.scala:61)
[5497]     mill.testkit.UtestExampleTestSuite$.$anonfun$tests$3(UtestExampleTestSuite.scala:19)
[5497]     scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[5497]     mill.api.Retry.$anonfun$apply$1(Retry.scala:32)
[5497]     mill.api.Retry.$anonfun$apply$1$adapted(Retry.scala:32)
[5497]     mill.api.Retry.$anonfun$indexed$2(Retry.scala:42)
[5497]     scala.util.Try$.apply(Try.scala:217)
[5497]     mill.api.Retry.$anonfun$indexed$1(Retry.scala:42)
[5497]     java.lang.Thread.run(Thread.java:840)

This is very weird: although it says Cannot open include file: 'C:\Users\runneradmin\AppData\Local\Coursier\Cache\arc\https\github.com\graalvm\graalvm-ce-builds\releases\download\jdk-17.0.7\graalvm-community-jdk-17.0.7_windows-x64_bin.zip\graalvm-community-openjdk-17.0.7+7.1\lib\svm\clibraries\windows-amd64\include\amd64cpufeatures.h', when I add an ls ... or type ... of that file to the GitHub Actions job, the file does exist and I can read it.

Once we get the basic Graal unit tests passing on Windows, it should be easy to add a windows publishing job (below) using windows-latest to publish windows-compatible Mill executables:

- os: macos-latest

Lastly, we would need to update the Windows mill.bat launcher with changes similar to #4273 so that it can download the correct native images.

@lihaoyi lihaoyi changed the title Set up Mill Graal native images on Windows Get Mill's Graal native image support working on Windows Jan 9, 2025
@ayewo
Contributor

ayewo commented Jan 9, 2025

@lihaoyi is it fine if I work on this?

@lihaoyi lihaoyi changed the title Get Mill's Graal native image support working on Windows Get Mill's Graal native image support working on Windows (500USD Bounty) Jan 9, 2025
@lihaoyi lihaoyi added the bounty label Jan 9, 2025
@lihaoyi
Member Author

lihaoyi commented Jan 9, 2025

@ayewo go for it, I just put a bounty on it as well

@ayewo
Contributor

ayewo commented Jan 9, 2025

@lihaoyi

I've been able to narrow down the issue to the Windows path limit of a maximum of 260 characters.

Graal invokes the Visual Studio C compiler with the following command:

'C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.42.34433\bin\HostX64\x64\cl.exe' \
/WX /W4 /wd4201 /wd4244 /wd4245 /wd4800 /wd4804 /wd4214 \
'/FeC:\Users\RUNNER~1\AppData\Local\Temp\SVM-5703768390083211131\RISCV64LibCHelperDirectives.exe' \
'C:\Users\RUNNER~1\AppData\Local\Temp\SVM-5703768390083211131\RISCV64LibCHelperDirectives.c'

But the compilation depends on this include, whose path is 273 characters long and therefore exceeds the 260-character limit:

C:\Users\runneradmin\AppData\Local\Coursier\cache\arc\https\github.com\graalvm\graalvm-ce-builds\releases\download\jdk-17.0.7\graalvm-community-jdk-17.0.7_windows-x64_bin.zip\graalvm-community-openjdk-17.0.7+7.1\lib\svm\clibraries\windows-amd64\include\riscv64cpufeatures.h

The Windows runner (windows-latest) has long paths enabled in the registry, which leaves us with the Visual Studio compiler: cl.exe.

I ran some tests and it turns out that cl.exe is the culprit. It doesn't support paths that exceed the max path limitation of 260, so the only option is to shorten the path to Coursier's Cache on Windows.

On the Windows runner, Coursier's cache is at: C:\Users\runneradmin\AppData\Local\Coursier\cache.

When I grepped the env vars on the Windows runner, I saw that you already use super short paths for build outputs:

env | grep D:
GITHUB_WORKSPACE=D:\a\mill\mill
...
RUNNER_WORKSPACE=D:\a\mill

So I tried to use a shorter path for the COURSIER_CACHE by adding a new env var to .github/workflows/run-tests.yml:

COURSIER_CACHE: 'D:\a\Coursier-Cache'

I tried to inject the COURSIER_CACHE env var into CoursierModule.scala, but Coursier is still using the default cache location after my changes.
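The 273-character figure quoted in the comment above can be checked directly. This is a minimal sketch in plain Scala (standard library only); the object and method names are made up for illustration, and 260 is the limit cited in the comment.

```scala
object PathLengthCheck {
  // Windows path limit cited in the comment above.
  val MaxPath = 260

  // Header path copied verbatim from the comment above.
  val headerPath: String =
    """C:\Users\runneradmin\AppData\Local\Coursier\cache\arc\https\github.com\graalvm\graalvm-ce-builds\releases\download\jdk-17.0.7\graalvm-community-jdk-17.0.7_windows-x64_bin.zip\graalvm-community-openjdk-17.0.7+7.1\lib\svm\clibraries\windows-amd64\include\riscv64cpufeatures.h"""

  // True when a path is longer than the limit.
  def exceedsMaxPath(path: String): Boolean = path.length > MaxPath

  def main(args: Array[String]): Unit = {
    println(s"length = ${headerPath.length}, limit = $MaxPath")
    println(s"exceeds limit: ${exceedsMaxPath(headerPath)}")
  }
}
```

Running it should report a length of 273, matching the number given in the comment.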

@lihaoyi
Member Author

lihaoyi commented Jan 9, 2025

Thanks @ayewo! IIRC I also tried setting the COURSIER_CACHE environment variable when running Mill but it did not seem to take effect. Maybe @alexarchambault has some ideas on the proper way to do this?

@lihaoyi
Member Author

lihaoyi commented Jan 10, 2025

@ayewo I managed to get the test PR further by using COURSIER_ARTIFACT_CACHE instead of COURSIER_CACHE. Now it seems to be hitting a different failure, in class loading: https://github.com/com-lihaoyi/mill/actions/runs/12702881955/job/35409995382?pr=4272

@ayewo
Contributor

ayewo commented Jan 10, 2025

@lihaoyi yes, COURSIER_ARTIFACT_CACHE does work when I tested it now. Not sure why it is not publicly documented.

I'm working on the class loading issue. The native image had scala-library in its classpath when it was built, but that dependency is not bundled with the final image. I'm looking to add it to the example as an Ivy dependency, ivy"org.scala-lang:scala-library:${scalaVersion}", so the issue goes away.

@lolgab
Member

lolgab commented Jan 10, 2025

Scala Native creates a llvmLinkingInfo text file with all the options and passes it to clang with clang @llvmLinkingInfo to work around the character limit.
Maybe the tools you are calling also support such an @ operator.
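The argument-file technique described here can be sketched as follows. This is an illustration only: the helper name, the options, and the file name are hypothetical, and the sketch only builds the command without running it. Quoting of options that contain spaces or backslashes depends on the tool and is left out. The thread does not establish whether the tools involved in this issue accept such files.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object ArgFileSketch {
  // Writes one option per line to argFile and returns the short command
  // that refers to the file with the @file syntax.
  def commandWithArgFile(tool: String, options: Seq[String], argFile: Path): Seq[String] = {
    Files.write(argFile, options.mkString("\n").getBytes(StandardCharsets.UTF_8))
    Seq(tool, "@" + argFile.toString)
  }

  def main(args: Array[String]): Unit = {
    val argFile = Files.createTempFile("llvmLinkingInfo", ".txt")
    // Hypothetical options; a real build would pass many more.
    val options = Seq("-O2", "-o", "app", "main.o")
    println(commandWithArgFile("clang", options, argFile).mkString(" "))
  }
}
```

The command line stays two tokens long no matter how many options the file holds.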

@lihaoyi
Member Author

lihaoyi commented Jan 10, 2025

One issue is that we don't call clang ourselves AFAIK, but rather call it transitively through Graal's native-image binary. We should probably open an upstream issue, but I don't have an appropriate x64 Windows laptop to put together a minimal repro.

@alexarchambault
Collaborator

That's COURSIER_ARCHIVE_CACHE, yes, for the JVMs. That's the directory where coursier unpacks archives (rather than keeping files as is, under COURSIER_CACHE)
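To illustrate why moving the archive directory helps, here is a rough calculation in plain Scala. It assumes COURSIER_ARCHIVE_CACHE replaces the ...\Coursier\cache\arc prefix seen in the error and that the layout below that directory stays the same. The short root D:\a\Coursier-Cache is the hypothetical value tried earlier in the thread for COURSIER_CACHE, and the object and method names are made up.

```scala
object ArchiveCacheRelocation {
  // Windows path limit cited earlier in the thread.
  val MaxPath = 260

  // Hypothetical short archive cache root.
  val shortRoot = """D:\a\Coursier-Cache"""

  // Header path relative to the archive cache root, taken from the failing include path.
  val relativeHeader: String =
    """https\github.com\graalvm\graalvm-ce-builds\releases\download\jdk-17.0.7\graalvm-community-jdk-17.0.7_windows-x64_bin.zip\graalvm-community-openjdk-17.0.7+7.1\lib\svm\clibraries\windows-amd64\include\riscv64cpufeatures.h"""

  // Joins a root and the relative header path with a single backslash.
  def fullPath(root: String): String = root + "\\" + relativeHeader

  def main(args: Array[String]): Unit = {
    val p = fullPath(shortRoot)
    println(s"${p.length} chars, within limit: ${p.length <= MaxPath}")
  }
}
```

Under these assumptions the relocated header path comes out below the limit.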

@alexarchambault
Collaborator

alexarchambault commented Jan 10, 2025

@lihaoyi FYI, in my mill-native-image plugin, on Windows, we pass the class path to native-image as a manifest JAR (or "pathing JAR"). I can't recall why, but that could help with the error you're getting (which looks like an issue with the class path passed to native-image).
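For readers unfamiliar with pathing JARs: the idea is a JAR that holds nothing but a manifest whose Class-Path attribute lists the real entries. The sketch below uses only java.util.jar from the JDK. It is not the plugin's actual code, and the object name, method names, and class path entries are hypothetical.

```scala
import java.nio.file.{Files, Path, Paths}
import java.util.jar.{Attributes, JarFile, JarOutputStream, Manifest}

object PathingJarSketch {
  // Creates a JAR containing only a manifest whose Class-Path attribute
  // lists the given entries as space-separated URLs.
  def create(dest: Path, classPath: Seq[Path]): Path = {
    val manifest = new Manifest()
    val attrs = manifest.getMainAttributes
    // Manifest-Version must be set, otherwise the main attributes are not written.
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0")
    attrs.put(Attributes.Name.CLASS_PATH, classPath.map(_.toUri.toString).mkString(" "))
    val out = new JarOutputStream(Files.newOutputStream(dest), manifest)
    out.close()
    dest
  }

  // Reads the Class-Path attribute back, for checking.
  def readClassPath(jar: Path): String = {
    val jf = new JarFile(jar.toFile)
    try jf.getManifest.getMainAttributes.getValue(Attributes.Name.CLASS_PATH)
    finally jf.close()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical class path entries.
    val entries = Seq(Paths.get("lib", "scala-library-2.13.11.jar"), Paths.get("classes"))
    val jar = create(Files.createTempFile("pathing", ".jar"), entries)
    println(s"$jar -> Class-Path: ${readClassPath(jar)}")
  }
}
```

The resulting JAR would then be handed over as the single class path entry, which keeps the command line short regardless of how many dependencies there are.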

@ayewo
Contributor

ayewo commented Jan 11, 2025

@lihaoyi

I managed to get this test to pass on the Windows runner:

> ./out/foo/nativeImage.dest/native-image
Hello, World!

by copying scala-library-2.13.11.jar into the nativeImage.dest folder.

@ayewo
Contributor

ayewo commented Jan 11, 2025

So it looks like ./mill foo.nativeImage will need to call the .assembly task to build a fat jar?

@lihaoyi
Member Author

lihaoyi commented Jan 11, 2025

I managed to make it pass in CI by adding the --no-fallback flag to nativeImageOptions. Not sure what the flag does, but the test completed successfully.

@ayewo
Contributor

ayewo commented Jan 11, 2025

Interesting.

According to the Graal docs:

--no-fallback: build a standalone image or report a failure.

@lihaoyi
Member Author

lihaoyi commented Jan 11, 2025

Found the flag here https://stackoverflow.com/questions/78322168/java-lang-classnotfoundexception-in-graalvm-executable

@lihaoyi
Member Author

lihaoyi commented Jan 11, 2025

I merged the linked PR. Next steps for this ticket are to get publishing working and update the mill.bat script. @ayewo I can probably fix the publishing job but may need your help with mill.bat.

@lolgab
Member

lolgab commented Jan 11, 2025

--no-fallback disables the fallback to building a normal Java application when building a native image doesn't work. You usually want to add it, since you want a failure rather than a Java application running on the JVM.
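For reference, a build file fragment with the flag might look roughly like this. NativeImageModule, nativeImageOptions and the module name foo all appear earlier in this thread. The imports and the overall layout are assumptions about a typical Mill build, and the GraalVM setup is omitted, so treat this as a sketch rather than the merged change.

```scala
// build.mill (fragment, not a complete build; layout assumed)
package build
import mill._, scalalib._

object foo extends JavaModule with NativeImageModule {
  // Report a failure instead of falling back to a JVM-based application.
  def nativeImageOptions = Seq("--no-fallback")
}
```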

@lihaoyi lihaoyi closed this as completed Jan 15, 2025
@lefou lefou added this to the 0.12.6 milestone Jan 15, 2025