Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements for running on Windows with Snapdragon X #8531

Merged
merged 11 commits into from
Jul 25, 2024
Merged

Improvements for running on Windows with Snapdragon X #8531

merged 11 commits into from
Jul 25, 2024

Conversation

AndreasKunar
Copy link
Contributor

@AndreasKunar AndreasKunar commented Jul 17, 2024

Improvements/Issues addressed:

  1. Add documentation on how to build for Windows, especially for ARM (with MSVC or clang)
  • See - llama.cpp/docs/build.md
  1. Fix, that building with MSVC breaks for Snapdragn X / Windows because MSVC does not support any C in-line Assembly for ARM (_asm_ directive) - fixes Bug: ggml-aarch64.c does not compile on Windows ARM64 with MSVC #8446
  • append "&& ! ((defined(_MSC_VER)) && ! defined(clang))" before each _asm_. Clang on Windows masquerades as MSCV, therefore the strange conditional - that it must not be a MSVC which is not clang.

ToDo in future fixes for Snapdragon X on Windows and not addressed by this PR:

P.S: This is my first PR ever, and with the help of chatGPT as tutor. Please have mercy on me, if I misunderstood things.

@github-actions github-actions bot added documentation Improvements or additions to documentation build Compilation issues ggml changes relating to the ggml tensor library for machine learning labels Jul 17, 2024
@AndreasKunar AndreasKunar marked this pull request as draft July 17, 2024 07:36
@AndreasKunar AndreasKunar marked this pull request as ready for review July 17, 2024 14:02
@max-krasnyansky
Copy link
Collaborator

@AndreasKunar
We just need to update the README using instructions from my original PR that enabled optimized builds for the X-Elite.
#7191

You can also find the overall procedure in the Github Actions Workflow
Here is the summary:

  • Install Visual Studio 2022 (Community or another edition).
    Make sure to install complete ARM64 support (libraries, tools, etc)
  • Install Chocolatey (choco) package manager -- https://chocolatey.org/install
  • Use choco to install Ninja, CMake, LLVM
From Power Shell:  
        choco install ninja
        choco install cmake --installargs 'ADD_CMAKE_TO_PATH=System'
        choco install llvm

Once everything is installed build with

     cmake --preset arm64-windows-llvm-release
     cmake --build build-arm64-windows-llvm-release

@AndreasKunar
Copy link
Contributor Author

AndreasKunar commented Jul 17, 2024

@AndreasKunar We just need to update the README using instructions from my original PR that enabled optimized builds for the X-Elite. #7191

You can also find the overall procedure in the Github Actions Workflow Here is the summary:

  • Install Visual Studio 2022 (Community or another edition).
    Make sure to install complete ARM64 support (libraries, tools, etc)
  • Install Chocolatey (choco) package manager -- https://chocolatey.org/install
  • Use choco to install Ninja, CMake, LLVM
From Power Shell:  
        choco install ninja
        choco install cmake --installargs 'ADD_CMAKE_TO_PATH=System'

While I understand, some might use chocolately, winget is already shipping with Windows, and winget install cmake also does the job. I'd rather have both options available, than forcing chocoletely.

    choco install llvm

I'd rather also have the option to directly install llvm via github, without needing chocolately


Once everything is installed build with

 cmake --preset arm64-windows-llvm-release
 cmake --build build-arm64-windows-llvm-release

I agree and should have included this! THANKS!

Good input, I will update the doc tomorrow (Jul 18) morning my time (its late today, I'm tired/prone to errors, and I'd rather start fresh).

@max-krasnyansky
Copy link
Collaborator

I'd rather also have the option to directly install llvm via github, without needing chocolately

No objection to having options :)
We should have a recommended way though, that is also used in the CI for consistency.

So far, in the CI I used choco
https://github.com/ggerganov/llama.cpp/blob/master/.github/workflows/build.yml#L758

If winget is better option (ie included in windows by default, etc) I don't mind updating the CI to use winget.
We need to fix the CI anyway for the MSVC builds (see my comment in that discussion).

@AndreasKunar
Copy link
Contributor Author

I'd rather also have the option to directly install llvm via github, without needing chocolately

No objection to having options :) We should have a recommended way though, that is also used in the CI for consistency.

So far, in the CI I used choco https://github.com/ggerganov/llama.cpp/blob/master/.github/workflows/build.yml#L758

If winget is better option (ie included in windows by default, etc) I don't mind updating the CI to use winget. We need to fix the CI anyway for the MSVC builds (see my comment in that discussion).

@max-krasnyansky - I have tried to clarify/simplify the instructions. The current VS2022 automatically installs cmake and ninja, so no need for separate installs (I also tested this on my PC). Clang can be either installed via download or choco.

@AndreasKunar
Copy link
Contributor Author

@max-krasnyansky apologies, I found an even easier installation method and changed the documentation again. Installing all via the VS2022 install (git, cmake, clang, MSVC and ninja) and documented it accordingly. I also suggest to disable openmp (-D GGML_OPENMP=OFF) with clang for arm64 and documented this.

@mofosyne mofosyne added the Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix label Jul 19, 2024
@max-krasnyansky
Copy link
Collaborator

Looks good to me.

Re: OpenMP. @fmz and I are working on the next round of the Threadpool updates that we submitted earlier.
It had to be re-written a bit due to OpenMP getting merged and other changes. The new PR will enable Threadpool for Windows on ARM64 as well (using native Windows thread primitives).
Also technically OpenMP works with LLVM/Clang, libomp.dll and headers are included in the LLVM distribution, but CMake fails to find it, it's a CMake issue. Anyway, we're going to introduce better threading via new PR soon :)

@AndreasKunar
Copy link
Contributor Author

@ggerganov can you please point me to someone who can review+approve this? max-krasnyansky, the author of the initial Snapdragon X / Windows optimizations thinks its OK. Thanks in advance.

Copy link
Owner

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that:

  • Building with cmake --preset arm64-windows-llvm-release the new aarch64 assembly kernels will compile and work correctly?
  • Building with cmake --preset arm64-windows-MSVC will also succeed, but the aarch64 assembly kernels will be skipped by the new conditionals?

building with MSVC breaks for Snapdragn X / Windows because MSVC does not support any C in-line Assembly for ARM (asm directive)

I also need someone to confirm this statement and that there is nothing better that we can do than disabling this code for certain environments (pinging @Dibakar)

Did you try using __asm instead of __asm__?

https://learn.microsoft.com/en-us/cpp/assembler/inline/asm?view=msvc-170

docs/build.md Outdated Show resolved Hide resolved
@AndreasKunar
Copy link
Contributor Author

Do I understand correctly that:

  • Building with cmake --preset arm64-windows-llvm-release the new aarch64 assembly kernels will compile and work correctly?
  • Building with cmake --preset arm64-windows-MSVC will also succeed, but the aarch64 assembly kernels will be skipped by the new conditionals?

yes.

building with MSVC breaks for Snapdragn X / Windows because MSVC does not support any C in-line Assembly for ARM (asm directive)

I also need someone to confirm this statement and that there is nothing better that we can do than disabling this code for certain environments (pinging @Dibakar)

Did you try using __asm instead of __asm__?

https://learn.microsoft.com/en-us/cpp/assembler/inline/asm?view=msvc-170

In MSVC all asm only works with 32-bit x86, and neither with x64 nor with arm64. @max-krasnyansky verified this in his comments.

Co-authored-by: Georgi Gerganov <[email protected]>
AndreasKunar referenced this pull request in sykuang/llama.cpp Jul 23, 2024
@Dibakar
Copy link
Contributor

Dibakar commented Jul 24, 2024

The conditional seems to be fine for now, I will confirm it shortly.

@AndreasKunar
Copy link
Contributor Author

@ggerganov - sorry for my inexperience with pull requests, and for my asking because I don't want to do anything wrong. Is there something I still need to do for this, or is everything automatic, once it got approved? Should I close the request?

@ggerganov ggerganov merged commit bf5a81d into ggerganov:master Jul 25, 2024
53 checks passed
@AndreasKunar AndreasKunar deleted the snapdragonxwin-fix1 branch July 25, 2024 17:31
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 27, 2024
* Improvements for Windows with Snapdragon X

* Revert "Improvements for Windows with Snapdragon X"

This reverts commit bf21397.

* Improvements for Windows with Snapdragon X

* WOA build clarifications

* WIndows on ARM build clarifications

* cmake build for Windows clarifications

* Update docs/build.md

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: AndreasKunar <andreaskmsn.com>
Co-authored-by: Georgi Gerganov <[email protected]>
@hmartinez82
Copy link

hmartinez82 commented Aug 6, 2024

Question: Why is __clang__ being disabled too? Won't that render the Clang (regular clang, not clang-cl) builds also unable to take advantage of MATMUL?

@AndreasKunar
Copy link
Contributor Author

Question: Why is __clang__ being disabled too? Won't that render the Clang (regular clang, not clang-cl) builds also unable to take advantage of MATMUL?

Its not disabled at all. But clang on Windoes says its both MSVC and clang. My conditional disables the asm only for "real" MSVC (MSVC and not clang)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Compilation issues documentation Improvements or additions to documentation ggml changes relating to the ggml tensor library for machine learning Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: ggml-aarch64.c does not compile on Windows ARM64 with MSVC
6 participants