Transparent Proxy Cache #21342

Open
ianseyer opened this issue Dec 19, 2024 · 5 comments
Labels
area/proxy-cache kind/requirement New feature or idea on top of harbor

Comments

@ianseyer

ianseyer commented Dec 19, 2024

We would like our proxy-cache to be transparent to use. Currently, pulling through it requires specifying both the proxy-cache project AND the upstream repository, e.g. harbor.com/proxyProject/upstreamRepo/image:tag. Let's assume we have a proxy-cache project called "prod" that we want to use to proxy-pull from upstream.com/prod. This means that, in order to pull images, we have to use harbor.com/prod/prod/image:tag.

Instead, we would like the upstream repository name to be inferred from the proxy-cache project name, so we can just pull harbor.com/prod/image:tag.

This would make it much easier to migrate from a non-harbor registry to a harbor registry, while pulling the same images.

I imagine this could be implemented by a config option on the project called "transparent" or something similar, where it would automatically use the proxy-cache project name on the upstream request.
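To make the request concrete, here is a minimal sketch of the name mapping in both modes. It is not Harbor code: the function name `upstream_repository` and the `transparent` flag are hypothetical and only illustrate the behaviour described above.

```python
def upstream_repository(project: str, requested: str, transparent: bool) -> str:
    """Map a repository path pulled from Harbor to the path asked of the upstream.

    `requested` is everything after the registry host, without tag or digest,
    e.g. "prod/prod/image" today or "prod/image" in the proposed mode.
    """
    prefix = project + "/"
    if not requested.startswith(prefix) or requested == prefix:
        raise ValueError(f"{requested!r} is not a repository in project {project!r}")
    remainder = requested[len(prefix):]
    if transparent:
        # Hypothetical "transparent" mode: the project name doubles as the
        # upstream namespace, so it is put back on the upstream request.
        return f"{project}/{remainder}"
    # Behaviour as described in this issue: the project name is stripped
    # and the rest is used as the upstream repository.
    return remainder
```

With this sketch, `harbor.com/prod/prod/image:tag` today and `harbor.com/prod/image:tag` in transparent mode both resolve to `prod/image` on upstream.com.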

@Vad1mo
Member

Vad1mo commented Dec 20, 2024

There are various solutions floating around, but it mostly depends on the container runtime, and this is where it currently stalls. There is no clear spec on that topic. I recently had a chat with @phillebaba or @sudo-bmitch. However, I don't recall the details on what is currently possible or where it is stuck.

From the Harbor perspective, I can say that if there is a spec, we will implement and support that.

@Vad1mo Vad1mo added the kind/requirement New feature or idea on top of harbor label Dec 20, 2024
@Vad1mo
Member

Vad1mo commented Dec 20, 2024

After rereading your request, I understand the problem a bit better.

To be precise, you want to be able to map docker.io/library/alpine to registry.goharbor.io/library/alpine, instead of registry.goharbor.io/dockerproxy/library/alpine as you would do currently.

This is basically what we already have at the replication level, where it is called "Flattening":

Flattening:

Flatten 1 Level
Reduce the nested repository structure when copying images. Assuming that the nested repository structure is 'a/b/c/d/img' and the destination namespace is 'ns', the corresponding results of each item are as below:
'Flatten All Levels'(Used prior v2.3): 'a/b/c/d/img' -> 'ns/img'
'No Flatting': 'a/b/c/d/img' -> 'ns/a/b/c/d/img'
'Flatten 1 Level'(Default): 'a/b/c/d/img' -> 'ns/b/c/d/img'
'Flatten 2 Levels': 'a/b/c/d/img' -> 'ns/c/d/img'
'Flatten 3 Levels': 'a/b/c/d/img' -> 'ns/d/img'
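The flattening rules quoted above can be sketched as a small function. This is an illustration rather than Harbor's implementation: the `flatten` name and the `FLATTEN_ALL` sentinel are hypothetical, and keeping at least the final path component when `levels` exceeds the nesting depth is an assumption.

```python
FLATTEN_ALL = -1  # hypothetical sentinel for 'Flatten All Levels'


def flatten(repository: str, namespace: str, levels: int) -> str:
    """Apply the quoted flattening rules to a nested repository path.

    levels == 0 corresponds to no flattening, levels == N to 'Flatten N Levels'.
    """
    if levels < FLATTEN_ALL:
        raise ValueError("levels must be FLATTEN_ALL or a non-negative integer")
    parts = repository.split("/")
    if levels == FLATTEN_ALL:
        kept = parts[-1:]  # keep only the image name
    else:
        # Assumption: never drop the final component, even for shallow paths.
        drop = min(levels, len(parts) - 1)
        kept = parts[drop:]
    return "/".join([namespace, *kept])
```

For the proxy case discussed here, `flatten("library/alpine", "library", 1)` gives `library/alpine`, which is the mapping requested in this issue.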

I see one additional advantage here: when you create a proxy and add Flattening, you can restrict what you want to proxy. For example, you can allow proxying only docker.io/library, which is what a lot of enterprises allow.

@ianseyer
Author

Ah! That is a great point. Yes, it would be identical to flattening. And that's a good advantage to point out.

If you have any tips/recommendations on where or how you would like this to be implemented, I am happy to open a PR. This is directly in the path of a successful Harbor migration for us.

@phillebaba

phillebaba commented Dec 20, 2024

The change in the distribution spec @Vad1mo is referring to is probably opencontainers/distribution-spec#66 which proposes to add a ns query parameter to all requests. This way mirror registries would be aware of the original registry.

Containerd already adds this to its requests when pulling from a configured mirror. It is a feature I have been using for a while in https://github.com/spegel-org/spegel. I guess the problem is that it would be difficult to add a feature which only a limited number of clients currently implement. If the PR were merged and added to the distribution spec, it would be easier to implement this and push other clients to support the updated spec.
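As a rough illustration of what a mirror could do with the `ns` parameter, the sketch below reads it from an incoming request URL. The helper name `original_registry` and the `default` fallback are hypothetical, and since the proposal is not part of the distribution spec, a mirror would have to treat the parameter as optional.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def original_registry(request_url: str, default: Optional[str] = None) -> Optional[str]:
    """Return the registry named by the `ns` query parameter, if any.

    Clients that do not send `ns` fall back to `default`, which a mirror
    could set from its own static configuration.
    """
    query = parse_qs(urlsplit(request_url).query)
    values = query.get("ns")
    return values[0] if values else default
```

A request such as `/v2/library/alpine/manifests/latest?ns=docker.io` would then tell the mirror that the client originally asked docker.io for `library/alpine`.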

Another possible solution is to look at what Dragonfly does. It requires you to add an additional header to the mirror configuration, X-Dragonfly-Registry = ["https://example.com"], which could be read by mirror registries. Personally, I feel it is a suboptimal solution, as it requires the user to configure this properly. Additionally, it needs to be configured for each registry, so having one configuration apply to all registries is not possible.
https://d7y.io/docs/operations/integrations/container-runtime/containerd/#multiple-registries
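On the mirror side, reading such a header could look like the sketch below. Only the header name comes from the Dragonfly documentation linked above; the function `registry_from_header` and the choice to return just the host part are assumptions made for illustration.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def registry_from_header(headers: Mapping[str, str]) -> Optional[str]:
    """Return the upstream registry host named by X-Dragonfly-Registry.

    HTTP header names are case-insensitive, so the lookup ignores case.
    Returns None when the header is absent or has no host part.
    """
    for name, value in headers.items():
        if name.lower() == "x-dragonfly-registry":
            return urlsplit(value.strip()).netloc or None
    return None
```

Because the value is set per mirror entry in the client configuration, the mirror only learns the upstream for registries the user has configured explicitly, which is the limitation noted above.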

@ianseyer
Author

ianseyer commented Dec 20, 2024

I see. That is good context.

Are we opposed to, in the meantime, adding flattening to proxy requests via a feature-flagged piece of middleware? For example, a "transparent?" checkbox shown when creating a proxy-cache project. That proposal is over 5 years old and has seen little (though valuable) activity this year. I also know that distribution is slow moving, especially with 3.0 work underway.
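To sketch what I mean by middleware: the idea would be to rewrite the request path into the form the existing proxy-cache code already understands, so nothing downstream has to change. Everything here is hypothetical (the `rewrite_path` name, the `transparent_projects` set standing in for a per-project flag, and the simplified path pattern); it is not how Harbor's middleware is actually structured.

```python
import re
from typing import Set

# Simplified pattern for registry API paths: /v2/<name>/<kind>/<rest>.
# The greedy <name> group means the last manifests|blobs|tags segment wins.
_V2_PATH = re.compile(r"^/v2/(?P<name>.+)/(?P<kind>manifests|blobs|tags)/(?P<rest>.+)$")


def rewrite_path(path: str, transparent_projects: Set[str]) -> str:
    """Re-insert the project name as upstream namespace for flagged projects."""
    match = _V2_PATH.match(path)
    if match is None:
        return path  # not a repository-scoped request, leave untouched
    name = match.group("name")
    project = name.split("/", 1)[0]
    if project not in transparent_projects:
        return path  # feature flag is off for this project
    # /v2/prod/image/... becomes /v2/prod/prod/image/...
    return f"/v2/{project}/{name}/{match.group('kind')}/{match.group('rest')}"
```

Projects without the flag keep today's behaviour, so the change would be opt-in per proxy-cache project.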

In full transparency: I plan on getting this to work (because of #21330), as it is quickly becoming a requirement for us. I would rather my fork be done in accordance with how the maintainers would like to see it architected, should it become a viable upstream release.

I am prone to agree with one of the earlier comments: why should a client care if the server-side image is being proxied?


4 participants