Use GitHub Container Registry as default image location #782

Open
thesuperzapper opened this issue Jul 5, 2022 · 14 comments

Comments

@thesuperzapper
Member

thesuperzapper commented Jul 5, 2022

/kind feature

After PR kubeflow/kubeflow#6548, we are now using kubeflownotebookswg/{IMAGE_NAME} (DockerHub) as the default image location for our images.

I think we should continue to push to kubeflownotebookswg/{IMAGE_NAME} (and possibly public.ecr.aws/j1r0q0g6/notebooks, pending ECR getting fixed in kubeflow/testing#1008), but we should use ghcr.io/kubeflow/{IMAGE_NAME} as the "default" location of our images. This is because:

  1. People will trust images from ghcr.io/kubeflow/{IMAGE_NAME} more than kubeflownotebookswg/{IMAGE_NAME} (as it is clearly owned/managed by the Kubeflow GitHub org)
  2. We can have more fine-grained permissions on GHCR than on the free DockerHub plan, which allows only 1 user, with super-admin access
  3. I have mirrored all the old image tags from public.ecr.aws/j1r0q0g6/notebooks/{IMAGE_NAME} to ghcr.io/kubeflow/kubeflow/{IMAGE_NAME}, so people can still use older versions by simply changing the tag.
  4. DockerHub removes image tags that are not pulled regularly, but GHCR will not.
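As a rough illustration of point 3, moving an old image reference over to the mirror amounts to a prefix swap that leaves the image name and tag alone. The helper name `to_ghcr_mirror` is hypothetical, and the sketch assumes the mirror keeps the same `{IMAGE_NAME}:{TAG}` suffix as the ECR path, as described above:

```python
# Registry prefixes as quoted in this issue.
ECR_PREFIX = "public.ecr.aws/j1r0q0g6/notebooks/"
GHCR_PREFIX = "ghcr.io/kubeflow/kubeflow/"


def to_ghcr_mirror(image_ref: str) -> str:
    """Swap the old ECR prefix for the GHCR mirror prefix.

    Hypothetical helper: assumes the image name and tag that follow the
    prefix are identical in both registries. References that do not start
    with the ECR prefix are returned unchanged.
    """
    if image_ref.startswith(ECR_PREFIX):
        return GHCR_PREFIX + image_ref[len(ECR_PREFIX):]
    return image_ref
```

For example, `to_ghcr_mirror("public.ecr.aws/j1r0q0g6/notebooks/central-dashboard:<tag>")` would keep `central-dashboard:<tag>` and only change the registry part, where `<tag>` stands for whichever old tag is needed.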

Related Issues:

@thesuperzapper
Member Author

/cc @kimwnasptd @NickLoukas

@kimwnasptd
Member

I don't mind pushing images to as many registries as possible either.

Regarding the point of trusting ghcr.io more than docker.io/kubeflownotebookswg, I think it's fine. We've seen other projects in Kubeflow use a similar DockerHub account, like AutoML and Training.

Regarding the default image registry to use, doesn't ghcr.io impose some rate limiting as well? My question is, why should we go with either ghcr.io or docker.io as the default and not with ECR, which we've also included in the price estimation? kubeflow/testing#1006 (comment)

We also discussed this in the release team meeting today (11/7/2022), so I'll also cc @annajung @surajkota @DomFleischmann @mstopa

@surajkota

surajkota commented Jul 14, 2022

Also to consider: using the GitHub registry is not free, and its free tier is much smaller than public ECR's.

GH pricing (repositories are part of GitHub Packages): https://docs.github.com/en/billing/managing-billing-for-github-packages/about-billing-for-github-packages#about-billing-for-github-packages

ECR pricing: https://aws.amazon.com/ecr/pricing/

We got the credits approved for ECR. Who is expected to pay for GHCR?

People will trust images ...

ECR supports custom aliases: https://docs.aws.amazon.com/AmazonECR/latest/public/public-registry-settings.html

@thesuperzapper
Member Author

@surajkota @kimwnasptd

I know it's not very clear from this page, but there are actually NO usage limits for public packages on GHCR (outside of abuse prevention measures); the "GitHub Free" limits are for private packages only.

The official GHCR docs only briefly mention public packages, but when they do, they say that "GitHub Packages usage is free for public packages":

[Screenshot, 2022-07-15 08:58:58]

For example, there are no limits on our public ghcr.io/kubeflow/kubeflow/central-dashboard image:

  • there is no storage size limit
  • unauthenticated users can pull it as much as they like

NOTE: we have already MASSIVELY exceeded the "GitHub Free" storage limit with our public packages (they total over 1TB of storage), so clearly we are not subject to those limits!


Given this, I really think we should make the default image locations in our manifests the ghcr.io/kubeflow ones.

@kimwnasptd
Member

@thesuperzapper have you also set up the repo to use the credentials for the GH registry? Can we push images there right now with the GH actions?

@thesuperzapper
Member Author

@thesuperzapper have you also set up the repo to use the credentials for the GH registry? Can we push images there right now with the GH actions?

@kimwnasptd Yes, the GITHUB_TOKEN of actions run in the kubeflow/kubeflow repo is able to push to all the images.


Here is the "Package Settings" page for ghcr.io/kubeflow/kubeflow/jupyter-web-app:

(NOTE: only GitHub org-Admins and listed package-owners can see this page)

[Screenshot, 2022-07-20 19:53:35]

@kimwnasptd
Member

@thesuperzapper then let's switch to GHCR for this release, and if they change their pricing plans in the future, let's consider ECR.

What's the ETA for a PR for this? Is there any chance you can get it in within the week, so that it is included before the distribution testing?

@thesuperzapper
Member Author

@kimwnasptd I am not sure what you mean by a PR? We can just update the image push locations (and authenticate Docker with the GITHUB_TOKEN environment variable, or use the docker-login GitHub action step).
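A minimal sketch of the first option (authenticating Docker with the `GITHUB_TOKEN` environment variable), assuming the `docker` CLI is on the PATH. The function names are hypothetical and the username to pass is an assumption; `docker login` reads the token from stdin via `--password-stdin`, so the token never appears in the process arguments:

```python
import os
import subprocess


def ghcr_login_command(username: str) -> list[str]:
    """Build the `docker login` argv for ghcr.io; the token is supplied on stdin."""
    return ["docker", "login", "ghcr.io", "--username", username, "--password-stdin"]


def login_to_ghcr(username: str) -> None:
    """Log Docker in to GHCR with the token held in the GITHUB_TOKEN env var.

    Raises KeyError if GITHUB_TOKEN is unset, and CalledProcessError if
    `docker login` exits with a non-zero status.
    """
    token = os.environ["GITHUB_TOKEN"]
    subprocess.run(ghcr_login_command(username), input=token.encode(), check=True)
```

Inside a workflow step this would be called once before any `docker push`, with the username of whoever triggered the run.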

Also, aren't you and @NickLoukas already working on making the actions to build/push images?

@kimwnasptd
Member

@kimwnasptd I am not sure what you mean by a PR?

In order to push to a new registry we need to:

  1. Update the GH Actions to build and push images under the new name
  2. Update all the manifests in the repo to use the new images
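Step 2 could be approached roughly as follows. This is only a sketch: the function name `retarget_images` is hypothetical, the GHCR prefix is an assumed target path, and the regex only handles plain `image:` lines, so kustomize `images:` overrides or other fields would need separate handling:

```python
import re

# The DockerHub org is the current default discussed in this issue.
DOCKERHUB_PREFIX = "kubeflownotebookswg/"
# Assumed target prefix on GHCR; adjust to wherever the images are actually pushed.
GHCR_PREFIX = "ghcr.io/kubeflow/kubeflow/notebook-servers/"

# Matches `image: <ref>` lines (optionally a list item, optionally quoted,
# optionally with an explicit docker.io/ host) that point at the DockerHub org.
_IMAGE_LINE = re.compile(
    r"""^(?P<head>\s*(?:-\s*)?image:\s*["']?)(?:docker\.io/)?"""
    + re.escape(DOCKERHUB_PREFIX)
    + r"(?P<rest>\S+)",
    re.MULTILINE,
)


def retarget_images(manifest_text: str) -> str:
    """Point DockerHub image references in a manifest at the GHCR prefix.

    The image name and tag after the prefix are kept as-is; lines that
    reference other registries are left untouched.
    """
    return _IMAGE_LINE.sub(
        lambda m: m.group("head") + GHCR_PREFIX + m.group("rest"),
        manifest_text,
    )
```

Working on the text rather than parsed YAML keeps comments and formatting in the manifests intact, at the cost of missing any image reference that is not on an `image:` line.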

Also, aren't you and @NickLoukas already working on making the actions to build/push images?

I thought you would also do the work for your own proposal. We are swamped with the release work right now and won't have cycles for this one.

@thesuperzapper
Member Author

@apo-ger is there any chance you would have cycles to extend the docker pipelines to push tags for both kubeflownotebookswg/xxxx (DockerHub), and ghcr.io/kubeflow/kubeflow/notebook-servers/ (GHCR)?

(I only ask because of your great work in PR kubeflow/kubeflow#6555)

I would be very grateful if you would be willing to take a look, as I strongly think GHCR should be the "default" image home in our manifests for 1.6, but I don't have much free time to get this in!

@thesuperzapper
Member Author

@kimwnasptd @apo-ger we really need to set up the CI/CD to push to both kubeflownotebookswg/xxxx (DockerHub) and ghcr.io/kubeflow/kubeflow/notebook-servers/ (GHCR). Do you have time to help with this?
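To make the dual push concrete, here is a small sketch of how a pipeline could derive both references for one built image. The helper name `push_targets` is hypothetical, and it assumes the same image name and tag are used in both registries:

```python
# Registry locations as named in this thread.
DOCKERHUB_ORG = "kubeflownotebookswg"
GHCR_NAMESPACE = "ghcr.io/kubeflow/kubeflow/notebook-servers"


def push_targets(image_name: str, tag: str) -> list[str]:
    """Return the full references to tag and push for one built image.

    Hypothetical helper: the first entry is the DockerHub reference and
    the second is the GHCR reference, both with the same name and tag.
    """
    return [
        f"{DOCKERHUB_ORG}/{image_name}:{tag}",
        f"{GHCR_NAMESPACE}/{image_name}:{tag}",
    ]
```

The build step would then run `docker tag` and `docker push` once per returned reference, so adding or dropping a registry later is a one-line change.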

@sergeyshevch

Hi! Is there any progress? For now, we hit DockerHub limits every day. As another option, the Kubeflow team could contact DockerHub staff to get an open source project label, which would remove the image pull limit.

@midhun1998
Member

@thesuperzapper I'm willing to add a PR to push the image to GHCR along with DockerHub. Let me know if this is already being worked on. 🙂

@andreyvelich
Member

Let's discuss the default container registry for Kubeflow images in the community repo.
/transfer community

@google-oss-prow google-oss-prow bot transferred this issue from kubeflow/kubeflow Oct 17, 2024