Cannot install on Windows. Please help. #5704
Replies: 2 comments
-
I have a similar error: "Makefile:904." The silence is deafening... Edit: cmake appears to be working.
-
I used these instructions and could not get cmake to output all the targets with Toolkit v12.6. Aside from that, something is wrong with the make setup: it always picks up Visual Studio's cl.exe as the host compiler, so "invalid numeric argument '/Wextra'" is thrown, because cl.exe only supports /W0 - /W4 or /Wall. In the end I went with Vulkan and everything works great.
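For anyone who wants to try the Vulkan route, the configure step is a single CMake flag. A minimal sketch for a Windows cmd prompt, assuming the Vulkan SDK is installed (the flag was LLAMA_VULKAN in builds from this era; newer trees renamed it GGML_VULKAN):

```shell
:: Sketch only: requires the Vulkan SDK; the flag name varies by llama.cpp version.
mkdir build
cd build
cmake .. -DLLAMA_VULKAN=ON
cmake --build . --config Release
```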
-
Dear Gurus,
Please help.
I did everything listed on the project page, and got this error:
Makefile:604: *** I ERROR: For CUDA versions < 11.7 a target CUDA architecture must be explicitly provided via CUDA_DOCKER_ARCH. Stop.
Windows 10
I was trying to build inside the Ooba conda environment (Ooba itself works fine for me).
The CUDA Toolkit is installed; nvcc --version reports:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:09:35_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
EDIT 01: I added SET CUDA_DOCKER_ARCH=all.
Now the build fails with the following error:
nvcc -std=c++11 -O3 -use_fast_math --forward-unknown-to-host-compiler -Wno-deprecated-gpu-targets -arch=all -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -I. -Icommon -D_XOPEN_SOURCE=600 -DNDEBUG -D_WIN32_WINNT=0x602 -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -IF:\TEXTGEN\text-generation-webui\installer_files\env/targets/x86_64-linux/include -I/usr/local/cuda/targets/aarch64-linux/include -Xcompiler "-std=c++11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Xassembler -muse-unaligned-vector-move -Wno-array-bounds -Wno-pedantic" -c ggml-cuda.cu -o ggml-cuda.o
nvcc fatal : Cannot find compiler 'cl.exe' in PATH
make: *** [Makefile:451: ggml-cuda.o] Error 1
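As an aside, CUDA_DOCKER_ARCH is passed straight to nvcc as -arch, so instead of "all" you can name your GPU's compute capability explicitly. A sketch, assuming cmd and an Ampere-class card (sm_86 is only an example; look up the value for your own GPU):

```shell
:: Sketch: replace sm_86 with your GPU's compute capability (see NVIDIA's CUDA GPU list).
set CUDA_DOCKER_ARCH=sm_86
make LLAMA_CUBLAS=1
```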
EDIT 02: After adding the Desktop development with C++ workload to Visual Studio Community 2022 17.9.1 (I already had Python installed there)
and adding C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\Hostx64\x64 to PATH so the C compiler (cl.exe) can be found, the build fails with the following error:
nvcc -std=c++11 -O3 -use_fast_math --forward-unknown-to-host-compiler -Wno-deprecated-gpu-targets -arch=all -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -I. -Icommon -D_XOPEN_SOURCE=600 -DNDEBUG -D_WIN32_WINNT=0x602 -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -IF:\TEXTGEN\text-generation-webui\installer_files\env/targets/x86_64-linux/include -I/usr/local/cuda/targets/aarch64-linux/include -Xcompiler "-std=c++11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Xassembler -muse-unaligned-vector-move -Wno-array-bounds -Wno-pedantic" -c ggml-cuda.cu -o ggml-cuda.o
nvcc warning : The -std=c++11 flag is not supported with the configured host compiler. Flag will be ignored.
ggml-cuda.cu
cl : Command line error D8021 : invalid numeric argument '/Wextra'
make: *** [Makefile:451: ggml-cuda.o] Error 2
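The D8021 error happens because GCC-style warning flags like -Wextra are forwarded to cl.exe, which only understands /W0 - /W4 and /Wall. Rather than hand-editing PATH, the usual way to get a working cl.exe environment is the vcvars64.bat script that ships with Visual Studio; a sketch (the path matches VS 2022 Community, adjust for your edition):

```shell
:: Sketch: initialise the MSVC x64 environment, then check that cl.exe is found.
call "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat"
where cl
```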
EDIT 03:
Installing GCC (C++) did not help: incompatible compiler OS.
Installing Clang did not help either.
After a lot of googling and experimenting, cmake .. -DLLAMA_CUBLAS=ON finally worked, with these added to PATH:
a) the path to the MS Visual C compiler (CL.EXE): C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\Hostx64\x64
b) the path to CMake: C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin
c) the path to the CUDA Toolkit: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin
and CUDA's Visual Studio integration files copied to: C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations
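For anyone following along, the complete out-of-tree sequence that cmake .. -DLLAMA_CUBLAS=ON belongs to looks roughly like this, run from the llama.cpp checkout in an x64 developer prompt (LLAMA_CUBLAS was the CUDA switch in builds of this era; newer trees use GGML_CUDA):

```shell
:: Sketch: configure and build the Release binaries with CUDA enabled.
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
```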
Why is this not in the build instructions? Is it too obvious?
There were a lot of warnings during compilation.
The binaries are now in X:\TEXTGEN\text-generation-webui\LLAMA.CPP\build\bin\Release.
Should I move them to another folder inside LLAMA.CPP?
(The quantization example suggests the binaries live under the root of LLAMA.CPP, doesn't it?)
What other settings should I change?
How do I verify the compiled software is working as intended?
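A minimal smoke test, assuming the era's main.exe binary; the model path, prompt, and layer count below are only examples:

```shell
:: Sketch: generate a few tokens; -ngl 99 offloads all layers to the GPU.
build\bin\Release\main.exe -m models\tinyllama-q4_k_m.gguf -p "Hello" -n 32 -ngl 99
```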
The biggest trouble:
built locally: quantize.exe is 11,532,800 bytes
pre-built: quantize.exe is 59,904 bytes
(from the llama-b2254-bin-win-cublas-cu12.2.0-x64 release)
THE DIFFERENCE IN SIZE IS TREMENDOUS
(though I quantized TinyLlama-1.1B-Chat-v1.0, downloaded from HF, with both instances of quantize.exe into Q4_K_M
just for testing, and the resulting GGUF files appeared to be identical)
Still, I have no explanation for this ~200x difference in size. Help!
I have no idea what I did wrong. Please help!
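One possible explanation, offered as a guess: a local Release build can statically link the runtime and embed CUDA device code for every architecture it targets, while published binaries link against DLLs that ship in the same zip. dumpbin, which comes with MSVC, can show the difference in DLL dependencies (the prebuilt path below is a placeholder):

```shell
:: Sketch: compare the DLL dependency lists of the two binaries (run in a VS developer prompt).
dumpbin /dependents build\bin\Release\quantize.exe
dumpbin /dependents C:\path\to\prebuilt\quantize.exe
```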
Any hints are helpful. I am not a programmer, and these models are definitely not plug-and-play for me.