Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add oci.manifest.digest, container.image.repo_digests and make container.image.tag array #159

Merged
merged 22 commits into from
Sep 11, 2023

Conversation

ChrsMark
Copy link
Member

@ChrsMark ChrsMark commented Jul 4, 2023

This PR adds container.image.digest oci.manifest.digest, container.image.repo_digests fields and make container.image.tag an array of strings (renamed to container.image.tags).
This is to cover #48.

Also related to #72.

More analysis can be found at #48 (comment).

cc: @AlexanderWert @kaiyan-sheng @mlunadia

Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think OCI terminology makes more sense here and suggest using oci.manifest.digest and oci.manifest.tag`.This would cover images and artifacts and would also rely on well-known and standard terminology.

@ChrsMark
Copy link
Member Author

Hey @lmolkova thanks for the feedback here!

From my perspective it would give a better user experience to provide a more generic naming. In a use case where a collector ships data to a data-store when the end user will go to search for them it would be easier to have all the related container fields under the container.* namespace. I suspect that the average user would struggle to understand what oci.manifest.digest stands for even if they know what they are looking for.

In addition I don't see any tags specific attribute in the OCI spec: https://github.com/opencontainers/image-spec/blob/main/manifest.md. Do you see sth there that totally reflects the container.image.tag? If only .digest is reflected in OCI directly then that's an extra reason for not splitting the fields in container.* and oci.* at the same time.

To my mind we should be following what runtimes/orchestrators provide which at the same time follow the OCI under the hood. Following the OCI looks more like an implementation detail and should not be exposed to the end user to my mind since it slightly changes the scope of interest. You can also have a look at my comment at #48 (comment) where I analyse how the various runtimes/orchestrators define the specific fields. There I also mention that k8s is based on Container Runtime Interface (CRI) which follow the OCI spec but nowhere this detail is exposed to the end user. Indeed the CRI reports as RepoDigests while in OCI it's just digest from what we can see.

Let me know what you think :).

@lmolkova
Copy link
Contributor

lmolkova commented Jul 18, 2023

@ChrsMark I agree on the tag part - there is no formal tag definition in the OCI.

I don't however agree that OCI manifest digest should be recorded with container.image.digest.
OCI manifest is wider and more general-purpose thing than container image digest.

It describes not just container images, but non-runnable artifacts, VM images, helm charts, or anything else.

As an owner of Azure Container Registry SDK, when I report client calls, I would not even know what users are pulling or pushing, or how they intend to use the image/artifact, but I know how it's represented with manifests. Distribution platforms such as image/artifact registries don't know either - they just implement OCI/docker v2 APIs on blobs of arbitrary data.

container.image.digest focuses on the containers and the execution side only. Introducing it would cause the following problem:

  • container/artifact registries and their client libraries will have to use container.image attributes for non-containerized and not runnable things.
  • OR we'll have to introduce vm.image, helm.chart, oci.manifest namespaces, etc

OCI manifest is a standard thing defined by the spec, while container image is a vague thing. k8s docs refer to the OCI spec.

TL;DR: OCI manifest digest covers a wider set of use cases, and allows to have consistent attributes in client libraries, artifact registries, and container environments. It provides common unambiguous terminology.

The only downside, is that there is a tiny learning curve to discover OCI. From my point of view, the benefits of oci.manifest.digest outweigh it.

@ChrsMark
Copy link
Member Author

ChrsMark commented Jul 19, 2023

Thanks for the detailed explanation @lmolkova !

To your use-case I see the point, however I would avoid using a generic naming like oci.image.digest to report a Container's Image Digest specifically. Based on your example the oci.image.digest might be populated for other entities not just containers so I think it would better to use a specific field for the container entity. I would prefer using container.image.digest and then registry.image.digest to refer to Registry Images in general. I would see value in a query like container.image.digest==registry.image.digest since the second field creates a super set that includes the first one.

Also, I made some extra research to try and get things together and I think that for the container.image.digest we need to clarify some things along with the container.image.id. Let me try to summarize my findings:

So the image ID is the equivalent of the .config.digest from
https://github.com/opencontainers/image-spec/blob/main/manifest.md
We already depict this at container.image.id since https://github.com/open-telemetry/semantic-conventions/pull/39/files#r1202644093.

The current PR intends to add the Digest information that is capable to be used for downloading an image.
In Docker and the CRI it is called RepoDigest and it is not part of the Image Manifest at https://github.com/opencontainers/image-spec/blob/main/manifest.md.
Indeed it seems to be a different field which is part of the Manifest List or the "fat manifest" as it is called (instead of the Image Manifest). But still the ID and the Digest are depicted in different parts of the spec.

Example:

~ docker pull prom/prometheus:v2.16.0@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a
docker.io/prom/prometheus@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a: Pulling from prom/prometheus
Digest: sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a
Status: Image is up to date for prom/prometheus@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a
docker.io/prom/prometheus:v2.16.0@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a~ docker manifest inspect --verbose prom/prometheus:v2.16.0 | jq '.[0].Descriptor'
{
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "digest": "sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a",
  "size": 2824,
  "platform": {
    "architecture": "amd64",
    "os": "linux"
  }
}~ docker inspect prom/prometheus:v2.16.0 --format 'Id: {{.Id}}                                                                            
Repo Digest: {{index .RepoDigests }}' 
Id: sha256:e935122ab143a64d92ed1fbb27d030cf6e2f0258207be1baf1b509c466aeeb42                                                                            
Repo Digests: [prom/prometheus@sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a prom/prometheus@sha256:e4ca62c0d62f3e886e684806dfe9d4e0cda60d54986898173c1083856cfda0f4]~ docker manifest inspect --verbose prom/prometheus:v2.16.0 | jq '.[0].SchemaV2Manifest.config'                                           
{
  "mediaType": "application/vnd.docker.container.image.v1+json",
  "size": 6669,
  "digest": "sha256:e935122ab143a64d92ed1fbb27d030cf6e2f0258207be1baf1b509c466aeeb42"
}

So to summarize it, we already have the container.image.id which is the equivalent of Image ID in Docker, Kubernetes and CRI as it is also mentioned at https://github.com/open-telemetry/semantic-conventions/blob/main/model/resource/container.yaml#L35-L47. In terms of OCI it is the digest of the .config section of the manifest.

So here we talk about adding the Repo Digest identifier. Thus, if my claims above are correct we already skipped using the oci. specific namespace and to my mind it would be better if we continue doing this. It's easier to just use a mapping more friendly to the Container runtimes' (like ImageID, RepoDigest) users rather than going deep into the OCI spec which took me significant time to analyse and correlate with the fields provided by the runtimes. Plus the fact that the intend to use oci would extend the scope.

I would propose sth like container.image.digest and registry.image.digest (for registry use cases) as I already mentioned.

@lmolkova out of curiosity which part of the OCI spec would you be willing to use in your use-case specifically? The one that is depicted as Repo Digest and is used for downloading an image or the Image ID one?

Also I think having input from more people here to get more opinions would help a lot :).

@lmolkova
Copy link
Contributor

lmolkova commented Jul 25, 2023

To your use-case I see the point, however I would avoid using a generic naming like oci.image.digest to report a Container's Image Digest specifically. Based on your example the oci.image.digest might be populated for other entities not just containers so I think it would better to use a specific field for the container entity.

@ChrsMark , this is where we disagree. Attribute-based correlation is one of the features OTel provides. If we give the same thing multiple names we would not be able to correlate using attributes.

E.g. when using an artifact, I want to:

  • record data how the artifact is obtained regardless of container environment
  • record how it's pushed to the registry on the client side
  • record how it's pushed on the registry side
  • record how it's pulled on both sides
  • record how it's used (e.g. as a resource attribute on an application)

Assuming a bright future where I can get access to all this telemetry, using oci.manifest.digest I can find everything that has ever happened to this specific artifact digest.

Querying gets more complicated with container.image.digest, and registry.artifact.digest, now you need to know they are both available and are the same thing.

Once you add config digest and layers digest into the picture, the bigger the need for unambiguous and externally defined image id becomes (which is defined in OCI manifest digest).

I'm still struggling to understand the problem with oci.manifest.digest except a small learning curve. Can you explain if there are other reasons to make attribute-based correlation more complicated and use ambiguous and vague definitions while we have a spec that defines id?

@ChrsMark
Copy link
Member Author

I'm still struggling to understand the problem with oci.manifest.digest except a small learning curve. Can you explain if there are other reasons to make attribute-based correlation more complicated and use ambiguous and vague definitions while we have a spec that defines id?

Thank's @lmolkova! Let me try to collect my concerns bellow:

  1. The learning curve is not quite small, see my explanation at Add oci.manifest.digest, container.image.repo_digests and make container.image.tag array #159 (comment) of how the OCI information is depicted in runtimes. Also there was a struggle to clearly define those at https://github.com/open-telemetry/semantic-conventions/pull/39/files#r1202644093. And still the schema needs to be validated again, see point 4 bellow. All these make me feel that the learning curve would not be negligible.
  2. Coming from an Infrastructure Observability background I would assume that developers crafting Observability tools/collectors would find it more straight forward to use a terminology more relative to container runtimes (ImageDigest, ImageID). Otherwise they also need to go through the learning curve.
  3. If we follow the oci.image.digest we might hit an issue with this field containing a super sets of objects/entities: A container image is an OCI image but an OCI image is not necessarily a container image. So if I plot in a pie chart the top 10 OCI images this might be misleading if I don't know that this dataset does not only contain container image information. This is an example illustrating the points 1 and 2. It's a clear issue of being generic VS being specific.
  4. Last but not least, what is the plan about the container.image.id field that we have already introduced with https://github.com/open-telemetry/semantic-conventions/pull/39/files#r1202644093? As I explained at Add oci.manifest.digest, container.image.repo_digests and make container.image.tag array #159 (comment) we would need to validate it again since this is also part of the OCI spec. @joaopgrassi @marcsanmi what are your thoughts on this?

Still open question:
A) which part of the OCI spec would you be willing to use in your use-case specifically? The one that is depicted as Repo Digest and is used for downloading an image or the Image ID one?

I'm not totally against the oci.* proposal but I would like to address the above first.

@lmolkova
Copy link
Contributor

lmolkova commented Jul 26, 2023

@ChrsMark

Thank you for the update!

  1. I'm not sure I understand the concern, please take a look at p4 below
  2. My take (coming from an observability perspective) is that using standard, unambiguous terminology and being able to correlate telemetry using attributes is quite important.
  3. It probably means that the pie chart needs to take other information into account - such as manifest media type. BTW, for the registry scenario, it does not matter what type manifest has since the registry pulls and pushes arbitrary data and does not care.
  4. The container.image.id, as we discussed before is a container-runtime-specific thing that represents the image id - in case of docker it's (as you figured out) it's, the config digest, in the case of k8s it's something else. The same image has different ids on docker and k8s.
    If there are other environments, container.image.id would potentially be different in each of them, oci.manifest.digest would be a cross-environment, unified digest, common across different environments.

I.e. in your Prometheus example above:

container.image.id=sha256:e935122ab143a64d92ed1fbb27d030cf6e2f0258207be1baf1b509c466aeeb42
oci.manifest.digest=sha256:efd99a6be65885c07c559679a0df4ec709604bcdd8cd83f0d00a1a683b28fb6a.

I don't see how calling it container.image.digest would make it any easier to understand or use (on the contrary, we'll have yet another attribute to correlate it to registry telemetry). Plus if I google for container.image.digest I get nothing definitive, oci.manifest.digest brings me right to the spec.

A) which part of the OCI spec would you be willing to use in your use-case specifically? The one that is depicted as Repo Digest and is used for downloading an image or the Image ID one?

If I defined oci namespace for my SDK, I'd start with:

  • oci.manifest.digest
  • oci.manifest.media_type
  • oci.schema_version (2 by default)

thinking more about it, it'd be useful for me to record config and layers digests on telemetry, so I'd also consider

  • oci.blob.digest (blob is coming from docker v2 API, but maybe descriptor would be a better term)
  • oci.blob.media_type - this can also be used to distinguish configs from layers

Since SDK (or registry) has no knowledge of the container environment the image/artifact will be used in (or if it will be used in the container environment at all), it would not know anything about containers or their ids, it will only be able to use the manifest digest

@ChrsMark
Copy link
Member Author

Hey @lmolkova and thank you for the feedback! I see how container.image.id is different and now things are clear to me.
I think we can decide on using the oci.manifest.digest, I will update this PR accordingly :).

@ChrsMark ChrsMark changed the title Add container.image.digest and make container.image.tag array Add oci.manifest.digest and make container.image.tag array Jul 31, 2023
docs/resource/oci.md Outdated Show resolved Hide resolved
Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, and thank you for the great discussion, @ChrsMark !

One small comment from me on the container.image.id - the note on it seems out of date - it says "OCI defines a digest of manifest ." (I can't leave a comment on it)

I'd either remove this sentence completely or change it to something along the following lines:

"The container.image.id of the same image running in different environments don't not always match. The oci.manifest.digest attribute, however, is the same for a given image in all container runtimes that follow OCI specification."

@ChrsMark
Copy link
Member Author

ChrsMark commented Aug 3, 2023

Thanks for reviewing this folks! In adacd83 , I changed the fields to plural form since I think it's more accurate based on https://opentelemetry.io/docs/specs/otel/common/attribute-naming/#name-pluralization-guidelines. Runtimes also report those in plural form.

I also tuned the descriptions accordingly.
@lmolkova let me know what you think about the oci.manifest.digests description specifically.

@joaopgrassi
Copy link
Member

Sounds reasonable to me, but we would have duplication of data right? As container.image.digests would be a superset of oci.manifest.digest. In cases where @lmolkova described where it deals with "non container" scenarios we would only populate oci.manifest.digest if I got it right?

@ChrsMark
Copy link
Member Author

ChrsMark commented Aug 23, 2023

I think it depends on the perspective/method we collect these data:

but we would have duplication of data right? As container.image.digests would be a superset of oci.manifest.digest.

  1. In cases we only have access to the already stored container images of our observed system we can retrieve the information from the CRI's API and hence we would populate container.image.digests accordingly. We don't have a straight forward way to retrieve the single manifest's digest so far (in that particular usecase). In some cases we might be able to populate both fields but that's a fair compromise (==data duplication) for now. Maybe in the long run the CRI definition will change (to singular) and we can deprecate the digests in order to only use the oci one.

In cases where @lmolkova described where it deals with "non container" scenarios we would only populate oci.manifest.digest

  1. In cases we have direct access to know the specific manifest's digest we will be using the oci.manifest.digest (for example in download time/execution).

@ChrsMark
Copy link
Member Author

@lmolkova what are your thoughts on this? It would be nice if we can move this one forward and conclude into sth soon :). Thanks!

@lmolkova
Copy link
Contributor

lmolkova commented Aug 29, 2023

There is 1:1 relationship between container and an image. One container can run only one image.

There is also 1:1 relationship between an image and it's manifest.

So having multiple digests for one container does not make sense.

The same image can be pushed to multiple repositories, but if it's the same image, it will have the same oci digest anywhere (as it's a sha256 of manifest json and uniquely identifies a specific version of image). Check out answers on this thread: https://stackoverflow.com/questions/45533005/why-digests-are-different-depend-on-registry

If we need to support docker v1 or something else where the same image can have multiple manifest digests, let's create container.docker.repo_digests or something similar.

@ChrsMark
Copy link
Member Author

All right, based on the above discussions I have changed the oci.manifest. field accordingly to singular and introduced a new one to reflect what CRI's api provides:

  1. oci.manifest.digest
  2. container.image.repo_digests to reflect what CRI provides (https://github.com/kubernetes/cri-api/blob/c75ef5b473bbe2d0a4fc92f82235efd665ea8e9f/pkg/apis/runtime/v1/api.proto#L1238) for infra observability and security use-cases.

I hope that covers all that we have discussed so far.

Copy link
Member

@joaopgrassi joaopgrassi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small consideration is that the PR does "3" things now: Change tag to array, introduce OCI and a new image attribute for the digests.

I'd probably split this, at least the tag array in a separate PR, but given this has been open for a while and had extensive discussions, I'm approving to avoid even more work on @ChrsMark side.

@ChrsMark ChrsMark changed the title Add oci.manifest.digest and make container.image.tag array Add oci.manifest.digest, container.image.repo_digests and make container.image.tag array Aug 30, 2023
@ChrsMark
Copy link
Member Author

@lmolkova I think the requested changes are now covered :) . Are we good to go with this one?

Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one small comment, otherwise LGTM. Thank you!

@ChrsMark
Copy link
Member Author

ChrsMark commented Sep 7, 2023

@open-telemetry/specs-semconv-maintainers this one should be ready for merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants