
add xformers to cirun #1011

Merged — 2 commits into conda-forge:main on Jun 18, 2024
Conversation

h-vetinari
Member

xformers compiles a lot of symbols per architecture, and times out with our default set of architectures, which should look like this:

    if [[ ${cuda_compiler_version} == 11.8 ]]; then
        export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0;7.5;8.0;8.6;8.9+PTX"
    elif [[ ${cuda_compiler_version} == 12.0 ]]; then
        export TORCH_CUDA_ARCH_LIST="5.0;6.0;6.1;7.0;7.5;8.0;8.6;8.9;9.0+PTX"
    fi

To avoid timing out with the 6h limit on azure, we're currently just able to build

    if [[ ${cuda_compiler_version} == 11.8 ]]; then
        export TORCH_CUDA_ARCH_LIST="5.0;8.9+PTX"
    elif [[ ${cuda_compiler_version} == 12.0 ]]; then
        export TORCH_CUDA_ARCH_LIST="5.0;7.0;8.0;9.0+PTX"
    fi

i.e. 2-4 instead of 9 GPU architectures. I think it would be good to build the full set of architectures on the cirun server. Shouldn't need a particularly beefy machine, just more time.
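To see why each extra entry costs so much build time: PyTorch-style build systems expand `TORCH_CUDA_ARCH_LIST` into one `nvcc -gencode` pass per architecture, so nine entries mean nine full device-code compilations of every kernel. A minimal sketch of that expansion (the helper name is hypothetical; the flag mapping follows nvcc's `arch=compute_XX,code=sm_XX` convention, with `+PTX` additionally embedding forward-compatible PTX):

```shell
# Hypothetical helper, not from the feedstock: expand a TORCH_CUDA_ARCH_LIST
# string into the -gencode flags nvcc would receive, one per architecture.
arch_list_to_gencode() {
    flags=""
    for arch in $(printf '%s' "$1" | tr ';' ' '); do
        # "8.9+PTX" -> "89": drop the +PTX suffix, then the dot
        num=$(printf '%s' "$arch" | sed 's/+PTX$//; s/\.//')
        flags="$flags -gencode=arch=compute_${num},code=sm_${num}"
        # "+PTX" also embeds PTX so newer, unlisted GPUs can JIT-compile it
        case "$arch" in
            *+PTX) flags="$flags -gencode=arch=compute_${num},code=compute_${num}" ;;
        esac
    done
    printf '%s\n' "${flags# }"
}

# The reduced CUDA 11.8 list from above: two real targets plus embedded PTX.
arch_list_to_gencode "5.0;8.9+PTX"
```

Each `-gencode` flag is an independent compilation of all of xformers' device code, which is why trimming the list from nine entries to two-to-four is what keeps the build under Azure's 6h limit.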

CC @jaimergp @isuruf @jakirkham @conda-forge/xformers

@h-vetinari h-vetinari requested a review from a team as a code owner June 7, 2024 04:54
@jaimergp
Member

jaimergp commented Jun 7, 2024

If you just need more time, shouldn't it suffice with cpu-large?

@h-vetinari
Member Author

If you just need more time, shouldn't it suffice with cpu-large?

It's a CUDA build, so I think it needs the GPU agents? Not sure if ptxas works on a CPU agent, but happy to try. 🤷

It's possible that things might also work with a large agent, but given the length of compilation (probably close to ~24h in AZP-equivalent runtime), I think using xlarge is justified.

@jakirkham
Member

CUDA builds can typically be done on CPU. This is actually how RAPIDS builds its packages.

I think ptxas should work on a CPU-only worker. Even if we find issues, it may be worth the effort to fix them, since CPU workers are simply more common. Plus, it would make it easier to debug builds offline when needed.

That said, if there is a testing need, it could make sense to use the GPU runners for that

On a different note, there are a few other feedstocks here that could be moved to CPU only. Noted them in a separate issue: #1012
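The point about ptxas holds because nvcc's entire device-code pipeline (including ptxas) runs as ordinary host binaries; a GPU is only needed to execute the result. A guarded sketch, assuming a CUDA toolkit on PATH and a placeholder source file `kernel.cu` (both hypothetical here), so it degrades gracefully on hosts without CUDA:

```shell
# Compiling device code needs only the CUDA toolkit, not a GPU: nvcc invokes
# ptxas as a plain host-CPU process. kernel.cu is a placeholder file name.
if command -v nvcc >/dev/null 2>&1 && [ -f kernel.cu ]; then
    nvcc -gencode=arch=compute_80,code=sm_80 -c kernel.cu -o kernel.o
else
    echo "skipping: nvcc or kernel.cu not available on this host"
fi
```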

(Resolved review comment on requests/xformers.yml; now outdated.)
@jakirkham jakirkham requested a review from jaimergp June 11, 2024 00:48
@h-vetinari
Member Author

Are we OK now with trying the CPU runners? Or should we add both in this PR (but prefer the CPU one on the feedstock if it works)?

@jaimergp
Member

Oops, almost forgot about this. Merging!

@jaimergp jaimergp merged commit b7c99fc into conda-forge:main Jun 18, 2024
1 check passed
@jaimergp
Member

A new PR will pop up in https://github.com/conda-forge/.cirun/pulls. Feel free to merge that if I don't see it before :P

@jaimergp
Member

Done.
