Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add geo fields from ECS into client and server attributes #351

Closed
wants to merge 6 commits into from

Conversation

kaiyan-sheng
Copy link
Contributor

@kaiyan-sheng kaiyan-sheng commented Sep 27, 2023

Fixes #1033

Changes

This PR is to add geo fields from Elastic Common Schema(ECS) into otel client and server attributes. These fields can be used to carry data about a specific location such as city name, country name, longitude, latitude, postal code, and ect.

Merge requirement checklist

@kaiyan-sheng kaiyan-sheng requested review from a team September 27, 2023 16:57
@trisch-me
Copy link
Contributor

I think it's better to change this PR to add geo to the registry

@kaiyan-sheng kaiyan-sheng changed the title Add geo fields from ECS Add geo fields from ECS into client and server attributes Nov 13, 2023
model/registry/client.yaml Outdated Show resolved Hide resolved
model/registry/client.yaml Outdated Show resolved Hide resolved
model/registry/server.yaml Outdated Show resolved Hide resolved
model/registry/server.yaml Outdated Show resolved Hide resolved
model/registry/server.yaml Outdated Show resolved Hide resolved
model/registry/server.yaml Outdated Show resolved Hide resolved
model/registry/client.yaml Outdated Show resolved Hide resolved
model/registry/client.yaml Outdated Show resolved Hide resolved
model/registry/client.yaml Outdated Show resolved Hide resolved
model/registry/server.yaml Outdated Show resolved Hide resolved
Copy link
Contributor

@trisch-me trisch-me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but we will need to think about re-usage of the fields, currently we are repeating them a lot

@trask
Copy link
Member

trask commented Nov 14, 2023

can we add geo.yaml instead?

@AlexanderWert
Copy link
Member

AlexanderWert commented Nov 15, 2023

can we add geo.yaml instead?

@trask In the registry we structure the files by the top-level namespace (in that case server and client) for better orientation. I'd like to keep it consistent.

But, as soon as we have a way of reusing / nesting attribute definitions under other namespaces (see comment above), we would move all geo.* fields to a dedicated geo.yaml and only reuse in other yaml files like client and server.

@kaiyan-sheng
Copy link
Contributor Author

can we add geo.yaml instead?

In addition to what Alex said, I just want to point out in ECS, geo fields are not expected to be used directly at the root of the events.

@trask
Copy link
Member

trask commented Nov 15, 2023

as soon as we have a way of reusing / nesting attribute definitions under other namespaces

can we use adding the geo fields as motivation to move this forward instead of adding the duplication? or is the proposal that we would still have duplication even after reuse / nesting is introduced?

@kaiyan-sheng
Copy link
Contributor Author

kaiyan-sheng commented Nov 15, 2023

can we use adding the geo fields as motivation to move this forward instead of adding the duplication? or is the proposal that we would still have duplication even after reuse / nesting is introduced?

@trask We would like to use the geo fields as motivation to move forward with reusing / nesting attribute definitions. That's why we want to add geo fields into 2 attributes (client and server) in this PR to present the need for reusing definitions.

Do you prefer to have geo fields added into for example only client here, and then enable the reusing attribute definition feature, and then add geo fields for server?

@trask
Copy link
Member

trask commented Nov 15, 2023

I was thinking that by introducing reuse / nesting first, we might avoid some churn, but i'm not sure what the plan is for reuse / nesting so definitely could be mistaken

@kaiyan-sheng
Copy link
Contributor Author

@trask We thought about this too but since we don't know when the nested fields will be implemented, we would like to add the geo fields first and then clean up later.

@trask
Copy link
Member

trask commented Nov 17, 2023

we don't know when the nested fields will be implemented

ya, this is a bit of my worry, so trying to leverage the motivation for adding geo fields into adding support for nested fields in general 😄

but really, I don't mean to hold this up, I'm happy to go along with whichever approach @open-telemetry/specs-semconv-maintainers want to go here

@joaopgrassi
Copy link
Member

joaopgrassi commented Nov 20, 2023

I wonder, could we add geo.yaml and then still have geoclient.yaml and geoserver.yaml and those two simply ref the attributes from geo.yaml but under the prefix server/client in each file?

@AlexanderWert
Copy link
Member

I wonder, could we add geo.yaml and then still have geoclient.yaml and geoserver.yaml and those two simply ref the attributes from geo.yaml but under the prefix server/client in each file?

That exactly would be the idea with reuse under new namespace. But, IIUC, the following is not possible with the syntax / tooling yet:

and those two simply ref the attributes from geo.yaml but under the prefix server/client in each file

@@ -15,9 +15,35 @@ This also covers UDP network interactions where one side initiates the interacti
| Attribute | Type | Description | Examples |
|---|---|---|---|
| `client.address` | string | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>Client address - domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [1] | `client.example.com`; `10.1.2.80`; `/tmp/my.sock` |
| `client.port` | int | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>Client port number. [2] | `65123` |
| `client.geo.city_name` | string | City name. | `Montreal`; `Berlin` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect any client.geo and server.geo attributes to be present simultaneously? I thought the purpose of a global attribute registry is to represent attributes that mean the same wherever they are used, and that suits well for the geo.* attributes. That is, geo.city_name means the same whether it's used in the context of client or server or any other that's described at the end at https://www.elastic.co/guide/en/ecs/current/ecs-geo.html

Even if the client.geo.x and server.geo.x attributes are required in the same object, I think this problem is unsolved currently - for eg., one cannot have two url.full attributes in the same object if I have to represent the referring url or the url being redirected to in the case of 302.

I think we should omit the client and server prefix.

Copy link
Member

@AlexanderWert AlexanderWert Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect any client.geo and server.geo attributes to be present simultaneously?

Why not? From a web-server access log you can easily derive both and can be of interest for observability users.

I thought the purpose of a global attribute registry is to represent attributes that mean the same wherever they are used, and that suits well for the geo.* attributes. That is, geo.city_name means the same whether it's used in the context of client or server or any other that's described at the end at https://www.elastic.co/guide/en/ecs/current/ecs-geo.html

That's right! But, the meaning doesn't change if you put an attribute into a more specific context, right? geo.city_name describes a city. client.geo.city_name still describes a city, but the city of the client, while server.geo.city_name describes the city of the server. And both can coexist on a single signal / event. I agree though that we should not define geo.city_name multiple times in different places, and that's what #339 aims at. I also created a concrete proposal on how to model it yaml: open-telemetry/build-tools#240

Even if the client.geo.x and server.geo.x attributes are required in the same object, I think this problem is unsolved currently - for eg., one cannot have two url.full attributes in the same object if I have to represent the referring url or the url being redirected to in the case of 302.

@scheler This is exactly what we are trying to solve with #339. For the geo example that means:

  • we will define geo.* fields only once in the registry
  • and then, geo.* fields will be (explicitly) reusable under different namespaces (while keeping their original definition just applied into the context of the target namespace). This will result in client.geo.* and server.geo.* anyways.
  • this will also solve the problem of having the same fields on a single signal with different context (as they would have different namespaces)

I think we should omit the client and server prefix.

  • We are just defining it here explicitly (for now) because we are still working on the reuse concept for semantic conventions and don't want to block Geo fields because of that.
  • If you mean that we should not have a prefix for geo at all (even with the reuse concept), then I don't fully get the reasoning, yet. As described above it solves a concrete problem we have today, that you also mentioned yourself with being able to have the same group of attributes twice on a signal with different context.
  • If we would omit the client, server prefixes for geo, we would break ECS with no reason (no actual conflict with semconv or attribute naming guidelines), which has been explicitly stated as something to avoid whenever possible in the original OTEP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexanderWert thanks for the clarification. One issue though is that one cannot use geo.city_name itself. Is that intended here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scheler Yes it's intended. In Elastic common schema, geo fields are not expected to be used directly at the root of the events.

Copy link

github-actions bot commented Feb 3, 2024

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Feb 3, 2024
@joaopgrassi
Copy link
Member

Hi @kaiyan-sheng !

We changed how the CHANGELOG.md is managed. Please take a look at https://github.com/open-telemetry/semantic-conventions/blob/main/CONTRIBUTING.md#adding-a-changelog-entry to see what needs to be done. Sorry for the disruption.

@github-actions github-actions bot removed the Stale label Feb 13, 2024
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Feb 29, 2024
Copy link

github-actions bot commented Mar 8, 2024

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Add Geo fields from Elastic Common Schema
8 participants