-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cuda112 & add pytorch to ml-notebook & add cupy #345
Conversation
/condalock |
/condalock |
/condalock |
/condalock |
/condalock |
/condalock |
/condalock |
/condalock |
/condalock |
/condalock |
/condalock |
Thank you so much for working on this! 🙏 Can you help me understand--if we are adding pytorch to ml-notebook, is there still any point to maintaining the separate pytorch-notebook? |
No, there is no point. I simply didn't want to propose that big change here because I wasn't sure if it would be a welcome idea (but we can definitely do it!) More generally, I don't see a need to worry about separating pytorch and tensorflow going forward. We will likely face some issues from time to time, but the tensorflow and pytorch feedstocks are currently maintained by the same set of volunteers, so the migrations happen around the same time. In fact, the trickiest package here is In summary, if we can add a recent enough tensorflow on top of pangeo-notebook with all these geo packages, we should be able to add pytorch as well. As far as I am concerned, I would treat jax and tensorflow very similarly. I don't have that deep of an experience with the rapids-ai collection, but I can look into that next. My impression is that they're on the easier side. I am still investigating what's blocking us from getting to tensorflow 2.8.1, but I think it is some deep conflict caused by icu and/or absl stuff. Once we get tensorflow 2.8.1, we will have higher performance than the NGC containers (in my testing) and we get the cuda version of jaxlib as well (they have similar migrations). |
Could @weiji14 weigh on the pytorch discussion? Do you have any issues with merging the two notebooks? |
Update: I would abandon this PR if people are happy to wait a few days for #346 --- that will bring everything to the very cutting edge. We are just waiting on rasterio now. |
Nope, as long as pytorch is available somewhere! I do worry though about the larger docker file size (especially if the RAPIDS AI stuff is meant to be added later), but other than that, ok with simplifying maintenance down to just one ml-notebook image.
Yeah, for better or worse, the dev version of |
Image sizes:
Will likely grow by a little bit in #346 (up to 10.5GB, I would guess) update: build from other PR:
|
I am going to close this PR now. All issues are resolved with the merge in rasterio feedstock. A seamless update should be possible without tinkering (in a few hours). I am happy to help, but maybe having the bots do it makes more sense for continuity? Let me know if you face any issues. I recommend:
Happy to produce another PR if desired. |
fixes #320
fixes #322
@rabernat @scottyhq this
minimalPR takes tensorflow to cuda112, adds cupy, and adds pytorch in ml-notebookIdeally, we should add
jaxlib==*=*cuda*
(and maybe evenadded a proposal!) but that will have to wait a little longerpytorch==*=*cuda*
if you want to just have one "ML" notebookHelpful notes for review:
pytorch-gpu
is just a meta package and so don't worry about its disappearance.