RFC: Private Registry Mirrors #285

Open
wants to merge 5 commits into
base: main
148 changes: 148 additions & 0 deletions text/0000-private-mirrors.md
@@ -0,0 +1,148 @@
# Meta
[meta]: #meta
- Name: Private Registry Mirrors
- Start Date: 2023-05-12
- Author(s): @jabrown85
- Status: Draft <!-- Acceptable values: Draft, Approved, On Hold, Superseded -->
- RFC Pull Request: (leave blank)
- CNB Pull Request: (leave blank)
- CNB Issue: (leave blank)
- Supersedes: (put "N/A" unless this replaces an existing RFC, then link to that RFC)

# Summary
[summary]: #summary

As a platform operator, we'd like to be able to configure a private registry mirror for the most popular public registries. This will help reduce dependency on the public registries during builds, reducing the overhead and complexity of dealing with external rate limits and other restrictions. This will also help reduce the risk of service interruptions and reduce the amount of public network traffic during builds.

# Definitions
[definitions]: #definitions

- Public Registry: A registry that is publicly accessible, such as Docker Hub, Quay.io, etc.
- Private Registry: A registry that is not publicly accessible, such as a registry hosted on a private network.
- Registry Mirror: A private registry that mirrors a public registry. This is typically used to reduce the amount of public network traffic and to reduce the risk of service interruptions. It is treated as a drop-in replacement for the public registry.

# Motivation
[motivation]: #motivation

- Why should we do this?
As a platform operator, we'd like to protect our Cloud Native Buildpack operations from rate limits and other service issues that may occur on public registries. While private mirrors are relatively easy to set up for workloads on k8s nodes, it is difficult to configure Cloud Native Buildpacks to use them, because images are requested from inside `lifecycle` container processes. We'd like to be able to configure CNB to use a private mirror for these registries without affecting the resulting image.

Importantly, the presence of a mirror should be invisible outside of the platform. The resulting image should not contain any references to the mirror, so that it can be used in any environment, regardless of whether the mirror is available. For example, the metadata set on the resulting image would reference the original registry URL, not the mirror URL. This is important for future actions against the resulting image, such as `pack rebase` or `pack inspect`.

- What use cases does it support?

If a public registry were experiencing service interruptions, an operator with a previously configured mirror would be able to continue to build images without interruption.

If a public registry introduced lower rate limits, an operator with a previously configured mirror would be able to continue to build images without interruption or fear of hitting the rate limit.

- What is the expected outcome?

Operators concerned with the reliability of their builds that use public images will be able to configure private registry mirrors for the most popular registries.

# What it is
[what-it-is]: #what-it-is

An operator may configure registry mirror(s) via `CNB_REGISTRY_MIRRORS`. This will allow `lifecycle` to use the mirror for all images that are requested during builds.

# How it Works
[how-it-works]: #how-it-works

Out of scope - setting up the mirror itself. This is a separate concern that is not specific to Cloud Native Buildpacks.

Once a mirror for one or more public registries has been set up, the platform operator can configure Cloud Native Buildpacks to use it through the `CNB_REGISTRY_MIRRORS` environment variable. `lifecycle` will then use the mirrors for all images requested during builds that would otherwise be requested from the public registry.

The `CNB_REGISTRY_MIRRORS` environment variable will be a list of mirror configurations. Each mirror configuration will be a key/value pair joined by `=`, where the key is the registry (for example `docker.io`) and the value is the mirror URL. The key/value pairs will be separated by a semicolon (`;`).

For example, if we wanted to configure a mirror for Docker Hub and Quay.io, we could set the `CNB_REGISTRY_MIRRORS` environment variable to the following:

```
docker.io=https://docker.mirror.example.com;quay.io=https://quay.mirror.example.com
```
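
To make the format concrete, here is a minimal parsing sketch in Python. It is only an illustration of the `registry=mirror-url;...` shape described above, not the `lifecycle` implementation. The function name `parse_registry_mirrors`, the tolerance for empty segments, and the choice to drop the URL scheme are all assumptions made for this example.

```python
from urllib.parse import urlsplit


def parse_registry_mirrors(value: str) -> dict[str, str]:
    """Parse 'registry=mirror-url;registry=mirror-url' into {registry: mirror}.

    Hypothetical helper: the RFC only defines the string format.
    """
    mirrors: dict[str, str] = {}
    for pair in value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # assumption: ignore empty segments such as a trailing ';'
        registry, sep, mirror_url = pair.partition("=")
        if not sep or not registry.strip() or not mirror_url.strip():
            raise ValueError(f"invalid mirror entry: {pair!r}")
        mirror_url = mirror_url.strip()
        # Assumption: the scheme is optional, and only host[:port] plus any
        # path prefix is kept so the result can be joined with a repository.
        parsed = urlsplit(mirror_url if "://" in mirror_url else f"//{mirror_url}")
        if not parsed.netloc:
            raise ValueError(f"invalid mirror URL: {mirror_url!r}")
        mirrors[registry.strip()] = parsed.netloc + parsed.path.rstrip("/")
    return mirrors
```

With the example value above, this sketch returns `{"docker.io": "docker.mirror.example.com", "quay.io": "quay.mirror.example.com"}`.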

When `lifecycle` requests an image during any phase (`analyze`, `restore`, `export`, `rebase`), it will first check the `CNB_REGISTRY_MIRRORS` environment variable. If the requested image's registry is configured in the `CNB_REGISTRY_MIRRORS` environment variable, it will use the mirror URL instead of the original registry URL.
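
The lookup described above could be sketched as follows. This is a simplified illustration in Python, not how `lifecycle` parses references. The helper names `split_registry` and `apply_mirror` are hypothetical, the `mirrors` argument is assumed to be an already-parsed mapping of registry to mirror host, and references without an explicit registry are assumed to belong to Docker Hub (`docker.io`), following the usual Docker convention.

```python
DEFAULT_REGISTRY = "docker.io"  # assumption: unqualified references mean Docker Hub


def split_registry(image: str) -> tuple[str, str]:
    """Split an image reference into (registry, repository[:tag])."""
    first, sep, rest = image.partition("/")
    # Docker convention: the first path component is a registry only if it
    # looks like a host (contains '.' or ':') or is exactly 'localhost'.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    # No explicit registry: single-name images live under 'library/' on Docker Hub.
    repository = image if sep else f"library/{image}"
    return DEFAULT_REGISTRY, repository


def apply_mirror(image: str, mirrors: dict[str, str]) -> str:
    """Return the reference to fetch from, using a mirror when one is configured."""
    registry, repository = split_registry(image)
    mirror = mirrors.get(registry)
    if mirror is None:
        return image  # no mirror configured for this registry: leave untouched
    return f"{mirror}/{repository}"
```

Note that this sketch does not normalize registry aliases (for example `index.docker.io`); whether `lifecycle` should do so is not covered by this RFC.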

If the private registry requires authentication, authentication to the registry will be handled by the existing `CNB_REGISTRY_AUTH` value. If the private registry does not require authentication, no additional configuration is required.
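
As a rough illustration of the authentication path, the sketch below looks up credentials for a mirror host in a `CNB_REGISTRY_AUTH`-style value, which is a JSON object mapping a registry to an `Authorization` header value. The helper name `auth_header_for` is hypothetical, and keying the lookup by the mirror host (rather than by the original registry) is an assumption; this RFC does not specify it.

```python
import json
from typing import Optional


def auth_header_for(host: str, registry_auth_json: str) -> Optional[str]:
    """Return the Authorization header value configured for host, if any.

    Assumption: credentials for a mirror are stored under the mirror's host.
    """
    auth = json.loads(registry_auth_json) if registry_auth_json else {}
    return auth.get(host)
```

For a mirror that does not require authentication, no entry exists and the sketch returns `None`, which matches the "no additional configuration" case above.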

If registry mirrors are configured for specific images in configuration (e.g. `stack.toml` or `run.toml`), `CNB_REGISTRY_MIRRORS` will be applied to whichever registry is attempted. For example, the following `stack.toml` configures mirrors for `public/stack:run-image`:

```toml
[run-image]
image = "public/stack:run-image"
mirrors = ["quay.io/public/stack:run-image"]
```

If `CNB_REGISTRY_MIRRORS` has the value of:

```
docker.io=https://docker.mirror.example.com;quay.io=https://quay.mirror.example.com
```

When `lifecycle` attempts to resolve the `public/stack:run-image` image, `lifecycle` will attempt to fetch the image from `docker.mirror.example.com/public/stack:run-image`. If the [Run Image Resolution](https://github.com/buildpacks/spec/blob/main/platform.md#run-image-resolution) resulted in `quay.io/public/stack:run-image` being chosen, `lifecycle` will attempt to fetch the image from `quay.mirror.example.com/public/stack:run-image` instead. This new processing happens AFTER the [Run Image Resolution](https://github.com/buildpacks/spec/blob/main/platform.md#run-image-resolution) has executed. The run image selection will NOT take `CNB_REGISTRY_MIRRORS` into account, but the final image resolution will. Think of `CNB_REGISTRY_MIRRORS` as a just-in-time override of the final image resolution.
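
The "just-in-time override" can be sketched as a step that runs after run image selection and keeps two values apart: the reference that was selected (the only one that would be recorded in image metadata) and the location the image is actually pulled from. This is a minimal Python illustration under stated assumptions, not the `lifecycle` design: `ResolvedImage` and `just_in_time_override` are hypothetical names, and `selected` is assumed to be fully qualified with its registry (e.g. `docker.io/public/stack:run-image`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResolvedImage:
    reference: str   # what run image resolution selected; recorded in metadata
    fetch_from: str  # where the image is actually pulled from


def just_in_time_override(selected: str, mirrors: dict[str, str]) -> ResolvedImage:
    """Apply a registry mirror to an already-selected run image reference."""
    registry, _, repository = selected.partition("/")
    mirror = mirrors.get(registry)
    # The mirror only changes the fetch location, never the selected reference.
    fetch_from = f"{mirror}/{repository}" if mirror else selected
    return ResolvedImage(reference=selected, fetch_from=fetch_from)
```

Because `reference` is never rewritten, later operations such as `pack rebase` would see only the original registry, which is the invisibility property described in the Motivation section.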

@natalieparellano (Member), Jul 24, 2023

Just thinking out loud here, I'm not sure if this is relevant or not...

There exists today a pack config run-image-mirrors that will

There is also a pack config registry-mirrors that will

  • opaquely translate the registry before fetching ANY image for ANY operation
  • transparently translate the registry for the run image reference AFTER run-image-mirrors have been applied

It's worth noting that while these translations update the -run-image provided to the lifecycle, they do NOT update stack.toml/run.toml i.e., these mirrors are "hidden" today in the sense that they are not persisted as metadata on the app image.

Member

Now I remember - it would be helpful to describe how CNB_REGISTRY_MIRRORS env var for the lifecycle would be exposed in pack given the above stuff that exists today. I would really hope that we could avoid adding yet another toggle here.

Member

@buildpacks/platform-maintainers @jjbustamante do you have any thoughts here?

Member

I tried to search past issues and pull requests to understand the reason for the pack config mirrors commands. I found the following:

The order of preference applied for the run image when pack build is executed:

  1. --run-image flag
  2. default run image mirror, as discussed in the --publish/daemon cases above
    2.1 when publish, default to using the run-image mirror in the registry you are publishing the image to
    2.2 when daemon, defaults to using the run-image mirror in the same registry (colocated) as the builder
  3. mirrors set using pack config run-image-mirrors
  4. mirrors defined on the builder

Based on this, the new CNB_REGISTRY_MIRRORS seems to be a new option to override point 4, right? Something like:

  1. expose some way to set CNB_REGISTRY_MIRRORS and pass it through the lifecycle
  2. mirrors defined on the builder


# Migration
[migration]: #migration

This is a new feature and will not affect older platforms.

# Drawbacks
[drawbacks]: #drawbacks

Why should we *not* do this?

Complexity. We'll have to teach `lifecycle` how to use the configured mirrors. This will add complexity to the codebase and will require additional testing.

Breaking expectations. If a platform operator were to fall behind or modify an image in the mirror, the resulting image would not match the image built by an end user against the public registry.

# Alternatives
[alternatives]: #alternatives

- What other designs have been considered?

Platforms could use OCI layout to try to fetch images from a local cache before going to the public registry. This would require the platform to have a local cache of all images that may be requested during builds. This would be difficult to maintain and would require a lot of disk space.

- Why is this proposal the best?

This may not be the best, but that is why we are proposing it. We'd like to hear from the community about other options.

- What is the impact of not doing this?

Operators will have to continue to deal with the reliability of public registries.

# Prior Art
[prior-art]: #prior-art

Discuss prior art, both the good and bad.

# Unresolved Questions
[unresolved-questions]: #unresolved-questions

- How will we teach kaniko-style extensions to use the mirrors?
- In what situations should `lifecycle` fall back to the original registry if the mirror is unavailable?
- Should the mirror be put into the metadata at all for SBOM-related reasons? We know it shouldn't be written as the base image, but it could be written to a new key in the metadata.

# Spec. Changes (OPTIONAL)
[spec-changes]: #spec-changes
Does this RFC entail any proposed changes to the core specifications or extensions? If so, please document changes here.
Examples of a spec. change might be new lifecycle flags, new `buildpack.toml` fields, new fields in the buildpackage label, etc.
This section is not intended to be binding, but as discussion of an RFC unfolds, if spec changes are necessary, they should be documented here.

# History
[history]: #history

<!--
## Amended
### Meta
[meta-1]: #meta-1
- Name: (fill in the amendment name: Variable Rename)
- Start Date: (fill in today's date: YYYY-MM-DD)
- Author(s): (Github usernames)
- Amendment Pull Request: (leave blank)

### Summary

A brief description of the changes.

### Motivation

Why was this amendment necessary?
--->