ROCm support #55

mjsulik · 2022-09-23T04:49:03Z

mjsulik
Sep 23, 2022

It looks like CUDA can be much faster than --device cpu, especially as model size increases.
For example, https://www.assemblyai.com/blog/how-to-run-openais-whisper-speech-recognition-model/#whisper-advanced-usage

At least one person has appeared to have success using an AMD consumer GPU (RX 6800). https://news.ycombinator.com/item?id=32933236

Options for --device appear to include: cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, ort, mps, xla, lazy, vulkan, meta, hpu, privateuseone

I have the appropriate version of torch installed on Linux, but I'm not sure how to enable ROCm using --device

Answered by jongwook

Sep 23, 2022

We haven't tested ROCm, but from this documentation it seems that you can keep using cuda if the ROCm version is properly installed.

PyTorch for HIP intentionally reuses the existing torch.cuda interfaces. This helps to accelerate the porting of existing PyTorch code and models because very few code changes are necessary, if any.

View full answer

jongwook · 2022-09-23T06:29:31Z

jongwook
Sep 23, 2022
Maintainer

We haven't tested ROCm, but from this documentation it seems that you can keep using cuda if the ROCm version is properly installed.

PyTorch for HIP intentionally reuses the existing torch.cuda interfaces. This helps to accelerate the porting of existing PyTorch code and models because very few code changes are necessary, if any.

5 replies

mjsulik Sep 23, 2022
Author

Thank you, using "--device cuda" was successful after correctly configuring ROCm/HIP. In case it helps anyone else, I needed to install rocm-libs and set environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 to make this work.

DrMemoryFish Sep 28, 2022

how would i do this for gforce rtx 2060?

marcin-przywoski Sep 29, 2022

@mjsulik can you tell me how much of a performance gain it was? Were you using WSL2?

mjsulik Sep 29, 2022
Author

I was using Linux, not WSL2.

I actually didn't finish transcription for a file using the CPU because it was so slow on the medium.en model. I can't quantify the performance difference, but it was definitely large! If your only option is to use the CPU, then I think you'd want to start with a small model size (tiny or base) to find out how long it takes and then you could evaluate if the quality is acceptable. The main Github page for this project (https://github.com/openai/whisper) seems to assume you're using a GPU (e.g., it references the VRAM needed for different models), but I'd assume that the time scaling across models (e.g., x32 for tiny, x16 for base) would be consistent whether you're using CPU or GPU.

EchedelleLR Dec 27, 2022

Thank you, using "--device cuda" was successful after correctly configuring ROCm/HIP. In case it helps anyone else, I needed to install rocm-libs and set environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 to make this work.

Could you attach more instructions @mjsulik ? I am unsure if you imply all the libraries following the AMD documentation, the ROCm usecase (guessing is being installed as that) or the HIP usecase.

BranML · 2022-12-09T07:24:54Z

BranML
Dec 9, 2022

If it supports ROCm, it will be great.

0 replies

chowder3907 · 2022-12-09T08:53:17Z

chowder3907
Dec 9, 2022

You can already use pytorch-rocm to take advantage of AMD GPUs. Install it through pip before installing openai

1 reply

GrahamboJangles Feb 17, 2023

What's the secret to installing pytorch-rocm, because when I try it says it doesn't exist.

NielsMayer · 2022-12-27T22:21:27Z

NielsMayer
Dec 27, 2022

ROCm works fine with Whisper. I'm using it with "Radeon RX Vega (VEGA10, DRM 3.42.0, 5.15.0-48-generic, LLVM 12.0.0)".
With this GPU, I can process media using up to the 'medium' model. The large model is too big for this old 8Gb RxVega 56 Card.

A few important steps in getting it working (i'm on Kubuntu 20.04.5 w/ kernel 5.15.0-56-generic , YMMV):

(1) sudo apt install clinfo mesa-opencl-icd
(2) add your username to group 'render' in addition to 'video' to allow access to /dev/dri/* (and logout/login on desktop, or login to a new session if on server to propagate these important permissions)
(3) see if your card shows up in 'clinfo'
(4) sudo pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.2
(5) in python3 see if torch sees the GPU and CUDA:

>>> import torch
>>> torch.cuda.is_available() 
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'Radeon RX Vega'

(6) Now install whisper and check "whisper --help" to see if it outputs:
--device DEVICE device to use for PyTorch inference (default: cuda)

With my GPU, whisper outputs the following warning:

MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx900_56.kdb Performance may degrade. Please follow instructions to install:
https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package

However I've seen suggestions that the above warning is spurious, and in either case, applies to the AMD proprietary drivers whereas I'm using the open-source default 'amdgpu' drivers.

Anybody have further info on resolving this warning for open source 'amdgpu' users?? Will I see better performance using AMD's proprietary drivers??

........

On a different system, I had the hare-brained idea of trying to use my AMD 4750G "APU" (Ryzen with integrated AMD GPU) with pytorch. In this case I ended up using the "nightly" ROCm build versions of "torch 2.0.0.dev20221219+rocm5" and torchaudio and torchvision suggested by https://pytorch.org/ .

After much failure, trying a hint from https://stackoverflow.com/questions/73229163/amd-rocm-with-pytorch-on-navi10-rx-5700-rx-5700-xt ,
I set environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 . With this, at least at the pytorch level, it claims CUDA works on this APU:

$ HSA_OVERRIDE_GFX_VERSION=10.3.0 python3
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'AMD Radeon Graphics'

Note that without the env-var, whisper pukes:

$ whisper --help
"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
Aborted (core dumped)

With the env-var, whisper claims cuda works:

$ HSA_OVERRIDE_GFX_VERSION=10.3.0 whisper --help
...
--device DEVICE       device to use for PyTorch inference (default: cuda)
...

However running a transcription with most models runs out of memory and dumps core (bug that it dumps core??):

Memory access fault by GPU node-1 (Agent handle: 0x8a46980) on address (nil). Reason: Page not present or supervisor privilege.
Aborted (core dumped)

With the 'base' or 'tiny' models, it no longer dumps core, but instead hangs forever at 100% CPU, outputting nothing.

Furthermore, in this 'hung' state, whisper no longer can be ^C'd to kill the hanging process (bug?). Rather the process needs to be killed externally, or you need to background the process and then "kill %1" it.

Note that radeontop(1) (from https://github.com/clbr/radeontop not apt) indicates how little memory is available on AMD 4750G with a running KDE desktop:

472M / 391M VRAM 120.68% │
1837M / 3063M GTT  59.98%

(NB: i've also tried this AMD4750G APU system "headless" so that GPU is only used for whisper and not to run the KDE desktop, and it still hangs or crashes as above).

And with that, I've given up on the AMD 4750G APU and continue to use my old Power-hog RxVega 56 (165W) with the 'medium' model quite successfully: https://www.youtube.com/watch?v=AFk5g7NJ1Ko https://rumble.com/v1n7cx8-trainspodder-and-whisper-transcribes-radio-w-good-proper-noun-spelling-infe.html

In contrast, for the RxVega56 radeontop(1) reports (on a "headless" system where I'm using the GPU 100% for whisper).

49M / 8119M VRAM   0.60%
11M / 8164M GTT   0.13%

4 replies

colemannugent Mar 8, 2023

Got it working on a 5700XT using these instructions. The "medium" model uses ~97% of my 8 GB of VRAM.

I had to add --fp16 False to my whisper invocation otherwise I got nonsense results from the transcription.

I'm on Ubuntu 22.04.2 LTS with kernel 5.15.0-60-generic.

jp-berg Mar 30, 2023

Adding for less technical people like me: Order of installation is important. Install all components in a fresh virtual enviroment! And ROCM 5.45 is currently not working for me (with Vega56), you will need 5.2, e.g. pip install torch==1.13.0+rocm5.2 torchvision==0.14.0+rocm5.2 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/rocm5.2

Whitesttax Mar 30, 2023

Adding for less technical people like me: Order of installation is important. Install all components in a fresh virtual enviroment! And ROCM 5.45 is currently not working for me (with Vega56), you will need 5.2, e.g. pip install torch==1.13.0+rocm5.2 torchvision==0.14.0+rocm5.2 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/rocm5.2

I was trying to install on wsl2 so I figured that's why it's not working. I have very little experience with linux, if I decide to install it is that really all the lines I have to run to get it working?

gnevan Jun 25, 2023

Suppose you should be able to allocate more memory for your APU in BIOS, I think up to 2GB (depending on your motherboard).

albcunha · 2023-03-08T18:47:06Z

albcunha
Mar 8, 2023

Hijacking this thread, I had a hard time get things to work on docker.
We have a RX6700XT. On medium, it transcribes 8x normal speed, or more. We also implemented SILERO VAD, so that may cut some time too.
Here is a docker file and docker compose that I use for rocm on a AMD RX6700XT:
#docker-compose.yml

version: "3.8"
services:
  worker:
      restart: always
      container_name: transcricao-redis-worker
      image: gaeco/masterimage:1.0
      # stdin_open: true # docker run -i
      # tty: true        # docker run -t
      # entrypoint: /bin/sh
      depends_on:
        - redis
      command: python3 worker.py
      shm_size: 8gb
      group_add:
        - video
        - render
      cap_add:
         - SYS_PTRACE 
         
      security_opt:
        - seccomp=unconfined
      
      devices:
        - /dev/kfd
        - /dev/dri

  
#dockerfile
FROM rocm/dev-ubuntu-20.04:5.4-complete@sha256:fa56caaeb6b6a61a04e679cfb9ebb585cd2634585b03b01d867c5df7025840b3

# FROM rocm/pytorch-nightly:2023-01-22-rocm5.4
ENV HSA_OVERRIDE_GFX_VERSION=10.3.0
ENV LD_LIBRARY_PATH=/opt/rocm/lib
ARG DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1
ENV APP_HOME=/home/user
ENV PATH=$PATH:/home/user/.local/bin:/opt/rocm-5.4/lib:/opt/rocm-5.4/bin:/opt/rocm-5.4.2/opencl/bin
RUN apt-get update
RUN apt-get install -y --fix-missing ffmpeg libjpeg-dev python3-dev python3-pip git

RUN mkdir –m777 $APP_HOME
WORKDIR $APP_HOME


RUN mkdir -p –m777 $APP_HOME/logs 
RUN mkdir -p –m777 $APP_HOME/.cache/whisper
RUN mkdir -p –m777 $APP_HOME/.cache/torch/hub

RUN adduser --disabled-password --gecos '' --shell /bin/bash user
# RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user
RUN usermod -a -G render,video user
RUN chown -R user:user $APP_HOME/

USER user

WORKDIR $APP_HOME

COPY --chown=user:user . $APP_HOME/

RUN pip install wheel setuptools
RUN pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.2
RUN pip install -r requirements.txt 

USER root
RUN sudo ln -s /opt/rocm-5.4.0/lib/libroctx64.so /opt/rocm-5.4.0/lib/libroctx64.so.1
RUN chown -R user:user $APP_HOME/.cache/whisper
RUN chown -R user:user $APP_HOME/.cache/torch/hub
USER user

Please, consider that I know nothing, so there my be errors or unecessary steps on these files, but I took a long time to figure it out.

You also need to install rocm drivers on your host machine first. First I suggest to get: torch.is_cuda_available() to True and doing some testes locally before going to docker.

7 replies

Whitesttax Mar 14, 2023

Thank you for the explanation. I'm trying to follow it exactly as you wrote, this line gives me trouble:

wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
gpg: no valid OpenPGP data found.

Tried: Remove the Certification Check, Install certificates, Change the Path in the .bashrc File
"http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key" gives 404 error on browser, maybe that's why?

mjsulik Mar 14, 2023
Author

I'm not able to find that any longer in the documentation, so maybe it is no longer needed. Are you able to run the following commands successfully without the key?

That comes from this guide, and I'm not sure if that key is really necessary. https://sep5.readthedocs.io/en/latest/Installation_Guide/Installation-Guide.html

The AMD installation guide differs at this point:

Ubuntu v20.04

echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/5.4.3 focal main' | sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt-get update

Ubuntu v22.04

echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/5.4.3 jammy main' | sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt-get update

RaymondTheBrave0 Nov 9, 2023

Can anyone tell me what is the bests setup to try and get Roc support for AMD Radeon X570 GPU to work for Whisper.

I have Linux Mint 21.2 at the moment running on a Aorus Ultra X570 Mobo with 32GB Ram and with a Ryzen 5 Cpu. I have tried so many things to try and get Whisper to use the GPU (8GB GDDR5 Ram) and have now decided that using a VM on Virtualbox would be a good way to see if I can get it to work. I was looking at using Ubuntu as my base operating system. Is this a good choice and will Virtualbox allow pytorch and Roc functionality to see the GPU properly so that it will use it.

If someone could suggest the steps to take in order to get a successful run of whisper using my AMD GPU on my system.

I have started now using anaconda which seems to help keep things in some sort of order and knowing what has been installed.

If I issue the clinfo command I get the following:-

clinfo
Number of platforms 1
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 23.0.4-0ubuntu1~22.04.1
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA

Platform Name Clover
Number of devices 1
Device Name AMD Radeon RX 570 Series (polaris10, LLVM 15.0.7, DRM 3.42, 5.15.0-88-generic)
Device Vendor AMD
Device Vendor ID 0x1002
Device Version OpenCL 1.1 Mesa 23.0.4-0ubuntu122.04.1
Device Numeric Version 0x401000 (1.1.0)
Driver Version 23.0.4-0ubuntu122.04.1
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Max compute units 32
Max clock frequency 1284MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple (kernel) 64

Which looks promising and if gointo python I get the following responses:-

python3
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] on linux

import torch
torch.cuda.is_available()
True
torch.cuda.device_count()
1
torch.cuda.get_device_name(0)
'AMD Radeon RX 570 Series'

However when I issued this command I got a blank, echo $CUDA_VISIBLE_DEVICES
Should it not be set to at least 1.

I have been able to run whisper but it does not use my AMD Gpu based on Mission Center graph of GPU usage, which I hope I can believe. It was also runs very slowly and the CPU usage is up at 60%.

RaymondTheBrave0 Nov 9, 2023

I have realised that in trying to use a Virtualbox environment and I am going to have passthru issues as the Ubuntu cannot see the hardware directly so I would assume I cant test whisper using the AMD GPU. So I will not going ahead with a VM test. I will look at maybe setting up a USB Live bootable environment to see if I can get whisper using my AMD GPU.

webdevjosue Nov 10, 2023

I have realised that in trying to use a Virtualbox environment and I am going to have passthru issues as the Ubuntu cannot see the hardware directly so I would assume I cant test whisper using the AMD GPU. So I will not going ahead with a VM test. I will look at maybe setting up a USB Live bootable environment to see if I can get whisper using my AMD GPU. amd rocm gpu support rdna3
You may have better chance on 7900xt/xtx from what I gather it's cpu only without cuda cores.

viebrix · 2024-01-18T17:17:41Z

viebrix
Jan 18, 2024

Here is a short description how to use Whisper with older AMD Cards (GFX803) RX 580
https://github.com/viebrix/pytorch-gfx803-for-Whisper

Yesterday I managed to get Whisper (or Whisper-WEBUI) start and running with GPU (GFX803) RX 580 (8GB).

1 reply

BernardBurke Jul 11, 2024

viebrix - can i say a HUGE thank you to you.

I'm the latest (oldest) arriver to the world of GPUs (I'm a grandparent aged techo).

I want to start running more stuff locally, so I started down the path of buy affordable GPUs and play with openai-whisper etc on my local linux (mint 21.1, 5.15.0-113 generic). I bought a couple of cheap 8gb RX580s, with a specific requirement that they fit in my NUC style systems.

Then I learned about rocm and cuda and thought I'd gone down a silly road in my purchase.

I just followed all of your instructions at https://github.com/viebrix/pytorch-gfx803-for-Whisper and I can report success!!

Thank you again!! Instead of finding I'd wasted my money, I'm now keen to see what I can do for others (via your great work).

I'm thinking about a docker image that might save on all the torch and torchvision builds, that I would make public. Do you think this would be helpful to the community?

Cheers and virtual beers from Sydney Australia

viebrix · 2024-07-11T17:57:06Z

viebrix
Jul 11, 2024

@BernardBurke thanks for your feedback, makes me glad to here that my instructions helped!
Using docker is a good idea. I haven't used docker by myself, but it would make sense.
I have stable diffusion with comfyui installed on the same desktop and I had sometimes troubles with differend torch version between whisper and stablediffusion. There is indeed a virtual environment, but some nodes in comfyui and also some extensions in automatic1111 web-ui had some troubles. Probable I made some mistakes with not activation the virtual environment before installing some dependencies, but docker would solve all this problems.
Another point in favour is that some extensions and custom nodes in comfyui / automatic111 web-ui are a risk (potential malware). Docker would improve this risky situation. Same situation in weaker form (less risky) applies also to dependencies in whisper-webui.

here is a link to the instruction for stable diffusion:
https://github.com/viebrix/pytorch-gfx803
But this could break your whisper installation..

Greetings from the other side of the world (from vienna / austria)
viebrix

0 replies

BernardBurke · 2024-07-11T23:56:52Z

BernardBurke
Jul 11, 2024

Thanks viebrix, I've done a little bit of docker work before. I should take this one (build a docker for people who, like you and me, have an older AMD gpu that is going to waste for some self hosted ML workloads). And, good choice on stable diffusion as the next target. I'll take a look. One question.. doing the build of torch and torchvision, I received hundreds of compiler (and other warnings). I didn't even bother to track what was going on. If I'm to build a docker script based on headless linux mint, for example (I'll explain my choice below), I'm inclined to create a venv of a specific version of python etc and just install the prebuild wheels (that I built on the same version of mint). If I do a good job, regenerating the docker build should be mostly automated (but I'm wary of the many compiler errors on the build.. I should document it to potential users, but, I don't think anyone is going to 'fix' this kind of thing. What do you think? On my choice of Mint... I remain ever hopeful that more people will move their daily driver to linux (it's rare, but it still happens). I choose Mint as my daily driver so I can be as I think most Windows or OSX folks will be able to use it more or less immediately... and because I use it all day every day, I'm pretty experienced in answering 'user' questions. I'm part way through an attempted Daily Driver move to Nixos... are you familiar with the nix package manager? I really think it's worth learning. It has the potential to make OSX a decent opensource option (which it just isn't today IMO). Cheers.. I'll let you know where I'm at with the docker world. PS: on one final note.. I have a *really * old GM204 [GeForce GTX 970] that came from a daughter's ex (gamer) boyfriend, hooked up to a 12 year old 4Gigi i7 with 16 gb of ram. It is *much *faster at whisper transcribing (almost twice as fast from my first few tries) than the 580 we're discussing... and these old GPUs are dirt cheap on eBay.

…

-- Ben Burke Level 2, 11 York St Sydney 2000 Cell +61418674002 Want to see when I'm available to meet? <https://www.benburke.org/calendar> (this link takes you to a simple view of my Calendar, and show's you when I'm free...)

On Fri, 12 Jul 2024 at 03:57, viebrix ***@***.***> wrote: @BernardBurke <https://github.com/BernardBurke> thanks for your feedback, makes me glad to here that my instructions helped! Using docker is a good idea. I haven't used docker by myself, but it would make sense. I have stable diffusion with comfyui installed on the same desktop and I had sometimes troubles with differend torch version between whisper and stablediffusion. There is indeed a virtual environment, but some nodes in comfyui and also some extensions in automatic1111 web-ui had some troubles. Probable I made some mistakes with not activation the virtual environment before installing some dependencies, but docker would solve all this problems. Another point in favour is that some extensions and custom nodes in comfyui / automatic111 web-ui are a risk (potential malware). Docker would improve this risky situation. Same situation in weaker form (less risky) applies also to dependencies in whisper-webui. here is a link to the instruction for stable diffusion: https://github.com/viebrix/pytorch-gfx803 But this could break your whisper installation.. Greetings from the other side of the world (from vienna / austria) viebrix — Reply to this email directly, view it on GitHub <#55 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AE27KUB3EMNYV6ZKF7OR2H3ZL3BQTAVCNFSM6AAAAAAQTUPUV2VHI2DSMVQWIX3LMV43URDJONRXK43TNFXW4Q3PNVWWK3TUHMYTAMBSGM4TSNY> . You are receiving this because you were mentioned.Message ID: ***@***.***>

0 replies

viebrix · 2024-07-14T12:55:00Z

viebrix
Jul 14, 2024

@BernardBurke
I think NVIDIA's cuda runs much better than rocm. Whisper does not need much VRAM therefore it could be that your geforce is running faster than the newer AMD. But Stable Diffusion needs more VRAM. (Not using very simple examples, but trying more - with controlnet or with SDXL model needs more ram) Therefore a card with minimum 8GB VRAM (better 12GB) would be better. I switched two weeks ago to a RTX 4060ti (with 16GB VRAM). Whisper was no problem with my rx580 - but complex workflows in stable diffusion crashed constantly my linux mint. Because rocm is not error free. Not all cuda commands implemented straight forward in rocm and so there are memory leaks. After examining these problems - I waited month to buy this new NVIDIA gpu - because I hoped AMD would initiate more in compatibility with pytorch. But than I read, that they stopped a project, which was implemented - since one and a half year. This project completely implemented this compatibility independent from HIP. I do not remember the name of the project. They open sourced it - if I remember correctly. But they didn't support it any more.
I like AMD I have a cpu from AMD and I'm happy with it, but the KI and special the pytorch topic is a real problem. Alone the thing you have to compile the whole rocm thing with special flags for older cards is a harassment.
But I thing: already having amd cards why not using them for KI. They can successfully doing a lot of KI tasks. There are only some limits - which NVIDIA do not have. In comfy-ui (stable diffusion) there are a lot of optimizations for lower vram and amd, but however OS crashes happens sometimes.
I'm also using Mint on my private PC daily. At work I have to use Windows. There are a lot of other thoughts and answers I have to your words, but I don't want to "spam" this Issue Task here with topics they do not fit. (hope everything is readable, my english is not very good..)

Edit: the project I read about is ZLUDA:
https://www.phoronix.com/news/ZLUDA-New-Activity-AMD-CUDA

0 replies

BernardBurke · 2024-07-15T22:27:30Z

BernardBurke
Jul 15, 2024

viebrix, Thanks for that - I think your English is awesome (and I don't have any other languages - always feel somewhat dumb for this reason. Even when I've worked overseas, everyone in the IT field spoke English.. anyway). I see what you are saying, and as I learn more about the GPU (and NPU) world, I'll no doubt spend some more money. One part of my life now (after 41 years in tech) is to try and make IT accessible for young people, especially those who don't have family money (or much of their own). Getting some kind of GPU assisted machine learning locally sounds like a good plan. I just ran across this - seems someone has done much of the docker work already https://github.com/robertrosenbusch/gfx803_rocm61_pt24 Cheers -- Ben Burke Level 2, 11 York St Sydney 2000 Cell +61418674002 Want to see when I'm available to meet? <https://www.benburke.org/calendar> (this link takes you to a simple view of my Calendar, and show's you when I'm free...)

…

On Sun, 14 Jul 2024 at 22:55, viebrix ***@***.***> wrote: @BernardBurke <https://github.com/BernardBurke> I think NVIDIA's cuda runs much better than rocm. Whisper does not need much VRAM therefore it could be that your geforce is running faster than the newer AMD. But Stable Diffusion needs more VRAM. (Not using very simple examples, but trying more - with controlnet or with SDXL model needs more ram) Therefore a card with minimum 8GB VRAM (better 12GB) would be better. I switched two weeks ago to a RTX 4060ti (with 16GB VRAM). Whisper was no problem with my rx580 - but complex workflows in stable diffusion crashed constantly my linux mint. Because rocm is not error free. Not all cuda commands implemented straight forward in rocm and so there are memory leaks. After examining these problems - I waited month to buy this new NVIDIA gpu - because I hoped AMD would initiate more in compatibility with pytorch. But than I read, that they stopped a project, which was implemented - since one and a half year. This project completely implemented this compatibility independent from HIP. I do not remember the name of the project. They open sourced it - if I remember correctly. But they didn't support it any more. I like AMD I have a cpu from AMD and I'm happy with it, but the KI and special the pytorch topic is a real problem. Alone the thing you have to compile the whole rocm thing with special flags for older cards is a harassment. But I thing: already having amd cards why not using them for KI. They can successfully doing a lot of KI tasks. There are only some limits - which NVIDIA do not have. In comfy-ui (stable diffusion) there are a lot of optimizations for lower vram and amd, but however OS crashes happens sometimes. I'm also using Mint on my private PC daily. At work I have to use Windows. There are a lot of other thoughts and answers I have to your words, but I don't want to "spam" this Issue Task here with topics they do not fit. (hope everything is readable, my english is not very good..) — Reply to this email directly, view it on GitHub <#55 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AE27KUA3PMF5DKEIBXBEFM3ZMJYLZAVCNFSM6AAAAAAQTUPUV2VHI2DSMVQWIX3LMV43URDJONRXK43TNFXW4Q3PNVWWK3TUHMYTAMBUGM2TQMA> . You are receiving this because you were mentioned.Message ID: ***@***.***>

0 replies

shvedes · 2024-10-25T01:34:13Z

shvedes
Oct 25, 2024

A little update of NielsMayer instruction.

rocm5.2 isn't working for me (guess it outdated), so i tried to install rocm5.6. (Just copied this from here)

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6

After this, install whisper and set HSA_OVERRIDE_GFX_VERSION=10.3.0 and run whisper.

Tested on Arch Linux, 6700XT

0 replies

ROCm support #55

Replies: 11 comments · 18 replies

jongwook Sep 23, 2022 Maintainer

mjsulik Sep 23, 2022 Author

mjsulik Sep 29, 2022 Author

mjsulik Mar 14, 2023 Author

Replies: 11 comments 18 replies

jongwook
Sep 23, 2022
Maintainer

mjsulik Sep 23, 2022
Author

mjsulik Sep 29, 2022
Author

mjsulik Mar 14, 2023
Author