Add oci.manifest.digest, container.image.repo_digests and make container.image.tag array #159
Signed-off-by: ChrsMark <[email protected]>
I still think OCI terminology makes more sense here and suggest using oci.manifest.digest and oci.manifest.tag. This would cover images and artifacts and would also rely on well-known, standard terminology.
Hey @lmolkova, thanks for the feedback here! From my perspective, a more generic naming would give a better user experience. In a use case where a collector ships data to a data store, when the end user searches for them it would be easier to have all the related
In addition I don't see any
To my mind we should follow what runtimes/orchestrators provide, which in turn follow the OCI under the hood. Following the OCI looks more like an implementation detail and should not be exposed to the end user, since it slightly changes the scope of interest. You can also have a look at my comment at #48 (comment), where I analyse how the various runtimes/orchestrators define the specific fields. There I also mention that k8s is based on the Container Runtime Interface (CRI), which follows the OCI spec, but this detail is nowhere exposed to the end user. Indeed the CRI reports as
Let me know what you think :).
@ChrsMark I agree on the tag part - there is no formal tag definition in the OCI. However, I don't agree that the OCI manifest digest should be recorded with
It describes not just container images, but also non-runnable artifacts, VM images, helm charts, or anything else. As an owner of the Azure Container Registry SDK, when I report client calls, I would not even know what users are pulling or pushing, or how they intend to use the image/artifact, but I know how it's represented with manifests. Distribution platforms such as image/artifact registries don't know either - they just implement OCI/docker v2 APIs on blobs of arbitrary data.
The OCI manifest is a standard thing defined by the spec, while a container image is a vague thing. k8s docs refer to the OCI spec. TL;DR: the OCI manifest digest covers a wider set of use cases and allows consistent attributes across client libraries, artifact registries, and container environments. It provides common, unambiguous terminology. The only downside is that there is a tiny learning curve to discover OCI. From my point of view, the benefits of
Thanks for the detailed explanation @lmolkova! I see the point for your use case, however I would avoid using a generic naming like
Also, I did some extra research to try and pull things together and I think that for the
So the image ID is the equivalent of the
The current PR intends to add the digest information that can be used for pulling an image. Example:
➜ ~ docker pull prom/prometheus:v2.16.0@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a
docker.io/prom/prometheus@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a: Pulling from prom/prometheus
Digest: sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a
Status: Image is up to date for prom/prometheus@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a
docker.io/prom/prometheus:v2.16.0@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a
➜ ~ docker manifest inspect --verbose prom/prometheus:v2.16.0 | jq '.[0].Descriptor'
{
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "digest": "sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a",
  "size": 2824,
  "platform": {
    "architecture": "amd64",
    "os": "linux"
  }
}
➜ ~ docker inspect prom/prometheus:v2.16.0 --format 'Id: {{.Id}}
Repo Digests: {{index .RepoDigests }}'
Id: sha256:e935122ab143a64d92ed1fbb27d030cf6e2f0258207be1baf1b509c466aeeb42
Repo Digests: [prom/prometheus@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a prom/prometheus@sha256:e4ca62c0d62f3e886e684806dfe9d4e0cda60d54986898173c1083856cfda0f4]
➜ ~ docker manifest inspect --verbose prom/prometheus:v2.16.0 | jq '.[0].SchemaV2Manifest.config'
{
  "mediaType": "application/vnd.docker.container.image.v1+json",
  "size": 6669,
  "digest": "sha256:e935122ab143a64d92ed1fbb27d030cf6e2f0258207be1baf1b509c466aeeb42"
}
So to summarize it, we already have the
So here we talk about adding the
I would propose something like
@lmolkova out of curiosity, which part of the OCI spec would you be willing to use in your use case specifically? The one that is depicted as
Also, I think input from more people here would help a lot :).
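The relationship visible in the pasted Docker output can be checked mechanically. The following is a minimal Python sketch that uses only the values quoted above; the helper name `split_repo_digest` is hypothetical and is not part of any Docker or OpenTelemetry API.

```python
from typing import Tuple

# Values copied from the docker output quoted above.
IMAGE_ID = "sha256:e935122ab143a64d92ed1fbb27d030cf6e2f0258207be1baf1b509c466aeeb42"
MANIFEST_DIGEST = "sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a"
CONFIG_DIGEST = "sha256:e935122ab143a64d92ed1fbb27d030cf6e2f0258207be1baf1b509c466aeeb42"
REPO_DIGESTS = [
    "prom/prometheus@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a",
    "prom/prometheus@sha256:e4ca62c0d62f3e886e684806dfe9d4e0cda60d54986898173c1083856cfda0f4",
]


def split_repo_digest(repo_digest: str) -> Tuple[str, str]:
    """Hypothetical helper: split 'repo@algorithm:hex' into (repo, 'algorithm:hex')."""
    repo, sep, digest = repo_digest.rpartition("@")
    if not sep or not repo or ":" not in digest:
        raise ValueError(f"not a repo digest: {repo_digest!r}")
    return repo, digest


# The local image Id equals the digest of the image config blob ...
id_matches_config = IMAGE_ID == CONFIG_DIGEST
# ... while the manifest digest is what shows up among the repo digests.
digests_only = [split_repo_digest(entry)[1] for entry in REPO_DIGESTS]
manifest_in_repo_digests = MANIFEST_DIGEST in digests_only

print(id_matches_config, manifest_in_repo_digests)
```

In this example the image Id and the manifest digest are different values, which is why the thread treats them as separate attributes.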
@ChrsMark, this is where we disagree. Attribute-based correlation is one of the features OTel provides. If we give the same thing multiple names, we will not be able to correlate using attributes. E.g. when using an artifact, I want to:
Assuming a bright future where I can get access to all this telemetry, using
Querying gets more complicated with
Once you add the config digest and layers digests into the picture, the need for an unambiguous and externally defined image id (which is defined in the OCI manifest digest) becomes even bigger. I'm still struggling to understand the problem with
Thanks @lmolkova! Let me try to collect my concerns below:
Still open question: I'm not totally against the
Thank you for the update!
I.e. in your Prometheus example above:
I don't see how calling it
If I defined oci namespace for my SDK, I'd start with:
Thinking more about it, it'd be useful for me to record config and layers digests on telemetry, so I'd also consider
Since the SDK (or registry) has no knowledge of the container environment the image/artifact will be used in (or whether it will be used in a container environment at all), it would not know anything about containers or their ids; it will only be able to use the manifest digest
Hey @lmolkova and thank you for the feedback! I see how
Signed-off-by: ChrsMark <[email protected]>
Signed-off-by: ChrsMark <[email protected]>
LGTM, and thank you for the great discussion, @ChrsMark !
One small comment from me on the container.image.id - the note on it seems out of date - it says "OCI defines a digest of manifest." (I can't leave a comment on it)
I'd either remove this sentence completely or change it to something along the following lines:
"The container.image.id of the same image running in different environments does not always match. The oci.manifest.digest attribute, however, is the same for a given image in all container runtimes that follow the OCI specification."
Signed-off-by: ChrsMark <[email protected]>
Signed-off-by: ChrsMark <[email protected]>
Thanks for reviewing this, folks! In adacd83, I changed the fields to plural form since I think it's more accurate based on https://opentelemetry.io/docs/specs/otel/common/attribute-naming/#name-pluralization-guidelines. Runtimes also report those in plural form. I also tuned the descriptions accordingly.
Sounds reasonable to me, but we would have duplication of data, right? As container.image.digests would be a superset of
I think it depends on the perspective/method by which we collect these data:
@lmolkova what are your thoughts on this? It would be nice if we could move this one forward and reach a conclusion soon :). Thanks!
There is a 1:1 relationship between a container and an image: one container can run only one image. There is also a 1:1 relationship between an image and its manifest. So having multiple digests for one container does not make sense. The same image can be pushed to multiple repositories, but if it's the same image, it will have the same OCI digest anywhere (as it's a sha256 of the manifest JSON and uniquely identifies a specific version of the image). Check out the answers on this thread: https://stackoverflow.com/questions/45533005/why-digests-are-different-depend-on-registry
If we need to support docker v1 or something else where the same image can have multiple manifest digests, let's create
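To illustrate why the digest does not depend on the repository, here is a minimal Python sketch, assuming the digest is the SHA-256 of the exact manifest bytes as stored by the registry. The manifest content is a hypothetical, heavily abbreviated stand-in, and the repository names are made up for illustration.

```python
import hashlib
import json


def manifest_digest(manifest_bytes: bytes) -> str:
    """Return the content digest: 'sha256:' + hex SHA-256 of the exact bytes."""
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()


# Hypothetical, abbreviated manifest; the config entry reuses the values from
# the Prometheus example earlier in the thread, and "layers" is left empty.
manifest_bytes = json.dumps(
    {
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "config": {
            "mediaType": "application/vnd.docker.container.image.v1+json",
            "size": 6669,
            "digest": "sha256:e935122ab143a64d92ed1fbb27d030cf6e2f0258207be1baf1b509c466aeeb42",
        },
        "layers": [],
    }
).encode("utf-8")

# Hypothetical repositories: the same bytes pushed to both give the same digest.
repositories = [
    "registry-a.example/prom/prometheus",
    "registry-b.example/prom/prometheus",
]
repo_digests = [f"{repo}@{manifest_digest(manifest_bytes)}" for repo in repositories]

# Any change to the manifest bytes (here, one extra space) yields another digest.
changed_digest = manifest_digest(manifest_bytes + b" ")

print(repo_digests)
```

Note that the hash is taken over the stored bytes, not over a re-serialized JSON document, so a registry that rewrites the manifest (for example, a schema conversion) would produce a different digest.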
All right, based on the above discussions I have changed the
I hope that covers all that we have discussed so far.
49a0e82 to 47fb33f
Signed-off-by: ChrsMark <[email protected]>
47fb33f to c883557
A small consideration is that the PR now does three things: it changes the tag to an array, introduces OCI, and adds a new image attribute for the digests.
I'd probably split this, or at least move the tag array to a separate PR, but given that this has been open for a while and has had extensive discussions, I'm approving to avoid even more work on @ChrsMark's side.
@lmolkova I think the requested changes are now covered :). Are we good to go with this one?
Left one small comment, otherwise LGTM. Thank you!
Signed-off-by: ChrsMark <[email protected]>
83f35c4 to e5a3293
@open-telemetry/specs-semconv-maintainers this one should be ready for merge?
This PR adds the oci.manifest.digest (previously container.image.digest) and container.image.repo_digests fields and makes container.image.tag an array of strings (renamed to container.image.tags).
This is to cover #48.
Also related to #72.
More analysis can be found at #48 (comment).
cc: @AlexanderWert @kaiyan-sheng @mlunadia
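As a rough illustration of the resulting attribute shapes, the sketch below builds a plain Python dictionary with the three attribute names from this PR, filled with the values from the Prometheus example in the discussion. It does not use the OpenTelemetry SDK; the function name `image_attributes` is hypothetical.

```python
from typing import Dict, List, Sequence, Union

AttributeValue = Union[str, List[str]]


def image_attributes(
    manifest_digest: str,
    repo_digests: Sequence[str],
    tags: Sequence[str],
) -> Dict[str, AttributeValue]:
    """Hypothetical helper: one string attribute and two string-array attributes."""
    return {
        "oci.manifest.digest": manifest_digest,
        "container.image.repo_digests": list(repo_digests),
        "container.image.tags": list(tags),
    }


attrs = image_attributes(
    manifest_digest="sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a",
    repo_digests=[
        "prom/prometheus@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a",
        "prom/prometheus@sha256:e4ca62c0d62f3e886e684806dfe9d4e0cda60d54986898173c1083856cfda0f4",
    ],
    tags=["v2.16.0"],
)

for name, value in attrs.items():
    print(name, value)
```

The manifest digest stays a single string because one image has one manifest, while repo digests and tags are arrays because runtimes report several of them per image.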