
Prune torch cuda arch list to match upstream #306

Draft · wants to merge 1 commit into base: main
Conversation

@isuruf (Member) commented Dec 26, 2024:

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-admin (Contributor) commented:

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • ℹ️ The recipe is not parsable by parser conda-souschef (grayskull). This parser is not currently used by conda-forge, but may be in the future. We are collecting information to see which recipes are compatible with grayskull.
  • ℹ️ The recipe is not parsable by parser conda-recipe-manager. The recipe can only be automatically migrated to the new v1 format if it is parseable by conda-recipe-manager.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12499338117. Examine the logs at this URL for more detail.

@hmaarrfk (Contributor) commented:

Removing 8.9 seems like a mistake to me; that architecture targets the RTX 4090 and the RTX 6000 Ada.

@isuruf (Member, Author) commented Dec 26, 2024:

Is there a difference between the code generated for 8.6 and the code generated for 8.9? Removing 8.9 here does not mean that those targets are dropped, just that 8.9 compute capability devices will use code compiled for 8.6.
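For readers unfamiliar with why dropping sm_89 doesn't drop 4090 support: cubins (SASS) are binary-compatible within a major compute capability, and PTX can be JIT-compiled for newer devices. The sketch below is a simplified, hypothetical illustration of that selection rule, not the actual CUDA driver logic; the function name and arch lists are invented for the example.

```python
# Simplified sketch of how code is picked from a CUDA fat binary.
# Rule illustrated: a device prefers the newest embedded cubin with the
# SAME major compute capability and minor <= its own; failing that, it
# JIT-compiles the newest embedded PTX whose arch is <= the device's.
# This is an illustration only, not the driver's implementation.

def pick_code(device_cc, sass_archs, ptx_archs):
    """All arguments use (major, minor) tuples, e.g. (8, 9) for sm_89."""
    # Cubins are binary-compatible within a major version, upward in minor.
    usable_sass = [a for a in sass_archs
                   if a[0] == device_cc[0] and a <= device_cc]
    if usable_sass:
        return ("sass", max(usable_sass))
    # Otherwise fall back to JIT-compiling the newest usable PTX.
    usable_ptx = [a for a in ptx_archs if a <= device_cc]
    if usable_ptx:
        return ("ptx", max(usable_ptx))
    return None  # no compatible code embedded

# An 8.9 device (e.g. RTX 4090) with a build that embeds sm_80/sm_86/sm_90
# cubins but no sm_89 still runs the sm_86 cubin rather than failing:
print(pick_code((8, 9), sass_archs=[(8, 0), (8, 6), (9, 0)],
                ptx_archs=[(9, 0)]))
# → ('sass', (8, 6))
```

This is exactly the "8.9 devices will use code compiled for 8.6" behavior described above; whether that binary-compatible path leaves performance on the table is the open question in this thread.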

@hmaarrfk (Contributor) commented:

> Is there a difference between the code generated for 8.6 vs code generated for 8.9? Removing 8.9 here does not mean that those targets are dropped. Just that 8.9 compute capability devices will use code compiled for 8.6.

Honestly, I'm not sure, and it is difficult to tell without further inspection. How can one tell in practice?

@isuruf (Member, Author) commented Dec 26, 2024:

You have to run benchmarks to see whether it makes a difference. Since this is what the upstream wheels do, and most people use those wheels, I think we would have seen reports upstream if there were a difference.

@hmaarrfk (Contributor) commented:

> I think we would have seen reports upstream if there was a difference.

I think very few people have the skills to compile PyTorch from source.

I can run a benchmark in a week if you can prepare one for me. I have a 4090.
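A comparison like the one offered here needs little more than a timing harness run against both builds (one with sm_89, one without). The sketch below is a hypothetical starting point, not a prepared benchmark: the harness itself is plain Python, and the torch workload in the comment is an assumption about what one might time on a 4090.

```python
# Minimal timing harness for comparing the same workload across two
# PyTorch builds. The harness is plain Python; the GPU workload you pass
# in is up to you (see the usage sketch below).
import time
import statistics

def benchmark(fn, warmup=3, iters=20):
    """Return the median wall-clock time in seconds of fn() over iters runs."""
    for _ in range(warmup):   # warm up caches / lazy initialization first
        fn()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.median(times)

# Hypothetical usage on a GPU box (assumes torch with CUDA; not run here).
# torch.cuda.synchronize() is needed so the timer sees the kernel finish:
#   import torch
#   a = torch.randn(4096, 4096, device="cuda")
#   t = benchmark(lambda: (torch.mm(a, a), torch.cuda.synchronize()))
#   print(f"median matmul time: {t:.4f}s")
```

Running the same script under the pruned and unpruned builds and comparing medians would be one concrete way to answer the 8.6-vs-8.9 question on real hardware.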

@hmaarrfk (Contributor) commented:

I think there are a few differences between the capabilities of 8.6 and 8.9, so we should keep both:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/#arithmetic-instructions

They seem hard to trigger, though, and designing benchmarks for them would take us a lot of time (I think).

@hmaarrfk (Contributor) commented:

It also seems that many lower-end 40-series cards would benefit, not just “those with the highest budgets”.

@hmaarrfk hmaarrfk marked this pull request as draft December 27, 2024 13:37
@rgommers left a comment:

Benchmarks are always great for informing decisions on what to build, so 👍🏼 from me for waiting on those. In principle this change seems correct, though; in the absence of evidence that making different choices provides significant value, it seems like the upstream TORCH_CUDA_ARCH_LIST should always be matched.

4 participants