Use python slim instead of Cuda base image #1205

Closed
technillogue wants to merge 4 commits from syl/slim-base-image

Conversation

technillogue
Contributor

One weird trick to save 20GB of user bandwidth and cut our cold start times in half.

I'm not sure how to set up the config to default to false though.
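
A minimal sketch of one way the config side could look, assuming the option ends up as the use_cuda_base_image key discussed below; the Go package, struct, and method names here are illustrative, not Cog's actual config code. Using a pointer keeps "unset" distinct from an explicit true/false, so the default can be flipped in one place later.

```go
package config

// Illustrative sketch only, not Cog's actual config schema. A *bool lets us
// tell "unset" apart from an explicit true/false in cog.yaml.
type Build struct {
	GPU              bool  `yaml:"gpu"`
	UseCudaBaseImage *bool `yaml:"use_cuda_base_image,omitempty"`
}

// UseCudaBaseImageOrDefault returns the configured value, falling back to a
// single default when the option is absent from cog.yaml (true here for
// backwards compatibility; which default is right is what this thread debates).
func (b *Build) UseCudaBaseImageOrDefault() bool {
	if b.UseCudaBaseImage == nil {
		return true
	}
	return *b.UseCudaBaseImage
}
```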

technillogue and others added 4 commits July 12, 2023 16:50
Signed-off-by: technillogue <[email protected]>
Signed-off-by: technillogue <[email protected]>
Signed-off-by: Mattt Zmuda <[email protected]>
@andreasjansson
Member

Defaulting use_cuda_base_image to true means that we're completely backwards compatible, but people might not discover that they can get a much smaller image.

Defaulting to false means that some high percentage of users will get smaller images, but for some low percentage they'll get a really hard-to-debug error message.

This is tricky... We could improve those percentages by magically setting use_cuda_base_image based on the presence of tensorflow or jax (see the sketch after this comment), but that makes the hard-to-debug error message even more opaque for people who actually need the cuda devel base image.

Unrelated, but we've also had people ask to use custom base images. I'm not sure if we should encourage that since it most likely creates even harder-to-debug errors.
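
A rough sketch of the auto-detection heuristic described above, assuming we have the parsed python_packages list from cog.yaml to inspect; the package, variable, and function names are illustrative, not Cog's actual code.

```go
package config

import "strings"

// Frameworks assumed (for this sketch) to need the CUDA devel base image
// rather than python slim.
var needsCudaDevel = []string{"tensorflow", "jax"}

// GuessUseCudaBaseImage defaults to the slim base image unless a framework
// known to need the devel image appears in the model's python_packages.
func GuessUseCudaBaseImage(pythonPackages []string) bool {
	for _, pkg := range pythonPackages {
		// Strip a pinned version like "tensorflow==2.12.0" and normalize case.
		name := strings.ToLower(strings.SplitN(pkg, "==", 2)[0])
		for _, framework := range needsCudaDevel {
			if strings.HasPrefix(name, framework) {
				return true
			}
		}
	}
	return false
}
```

The trade-off is the one spelled out above: anyone whose CUDA-devel-dependent package falls outside that list would silently get the slim image and the hard-to-debug failure.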

@mattt
Contributor

mattt commented Jul 13, 2023

@technillogue Thanks so much for putting this together. I'm encouraged by how far we can get by moving away from CUDA dev base images.

I originally imagined this as a build flag (cog build --use-cuda-base-image=false). Making that work is more involved, but it may be a better fit; see the mattt/use-cuda-base-image-flag branch.

Expanding on what @andreasjansson said, and following up on our conversation yesterday, I think we should start by implementing this as an opt-in feature. That solves an immediate need we have, and gives us a good opportunity to continue validating this functionality without the risk of breaking existing Cog models.

At the same time, this should be a big improvement for a majority of users, and it'd be nice to migrate everyone to this sooner rather than later. I wrote up an idea for how we could solve that here: #1207

@technillogue
Contributor Author

mattt/use-cuda-base-image-flag looks really good; I'd be really happy to have that while we figure out the right way to handle #1207. I hadn't thought about what to do with the cuda and cudnn options. We might still need cuda, separately from the base image, to pick the right torch for users (since we're relying on torch's CUDA libraries).

As we discussed, we'd also like to support custom Dockerfiles that we add labels to, but not custom base images that might not have apt.

@bfirsh
Member

bfirsh commented Jul 14, 2023

Could this be detected automatically? Andreas hints at a hard-to-debug error message. Could we automatically detect that error message and hint at the solution?

I like the idea of trying to do as much automatically as possible, then giving users a way to bail out if they need to.

@mattt
Contributor

mattt commented Jul 27, 2023

#1214 implements this as a --use-cuda-base-image build flag that defaults to "auto", which we currently take to be true in all cases.

@bfirsh, to your point, we can absolutely iterate towards making the automatic behavior better. We should eventually have it so that "auto" implies false in most cases, except the ones @andreasjansson alluded to where there are known issues; a rough sketch of that resolution follows this comment.

@technillogue has a PR to allow users to specify a custom base image in #1229.
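
For reference, a sketch of how the tri-state flag could resolve, matching the behavior described in this comment ("auto" currently means true in all cases); the function here is illustrative, not the code in #1214.

```go
package config

import "fmt"

// ResolveUseCudaBaseImage maps the --use-cuda-base-image flag value to a
// concrete choice. Per the comment above, "auto" currently resolves to true;
// a later iteration could swap in a heuristic (like the sketch earlier in
// this thread) so that "auto" means false unless a known-problematic
// dependency is present.
func ResolveUseCudaBaseImage(value string) (bool, error) {
	switch value {
	case "true":
		return true, nil
	case "false":
		return false, nil
	case "auto":
		return true, nil
	default:
		return false, fmt.Errorf("invalid value %q for --use-cuda-base-image", value)
	}
}
```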

@mattt closed this on Jul 27, 2023
@mattt deleted the syl/slim-base-image branch on July 31, 2023