add remaining url fields to the registry (#496)
Co-authored-by: Joao Grassi <[email protected]>
1 parent 5d7e487, commit 1e7bb0e
Showing 3 changed files with 122 additions and 14 deletions.
@@ -0,0 +1,22 @@ | ||
# Use this changelog template to create an entry for release notes. | ||
# | ||
# If your change doesn't affect end users you should instead start | ||
# your pull request title with [chore] or use the "Skip Changelog" label. | ||
|
||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: enhancement | ||
|
||
# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db) | ||
component: url | ||
|
||
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). | ||
note: Add remaining ECS fields to the url namespace | ||
|
||
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. | ||
# The values here must be integers. | ||
issues: [496] | ||
|
||
# (Optional) One or more lines of additional information to render under the primary note. | ||
# These lines will be padded with 2 spaces and then inserted directly into the document. | ||
# Use pipe (|) for multiline entries. | ||
subtext: |
|
@@ -9,15 +9,35 @@ linkTitle: URL | |
<!-- semconv registry.url(omit_requirement_level) --> | ||
| Attribute | Type | Description | Examples | | ||
|---|---|---|---| | ||
| `url.domain` | string | Domain extracted from the `url.full`, such as "opentelemetry.io". [1] | `www.foo.bar`; `opentelemetry.io`; `3.12.167.2`; `[1080:0:0:0:8:800:200C:417A]` | | ||
| `url.extension` | string | The file extension extracted from the `url.full`, excluding the leading dot. [2] | `png`; `gz` | | ||
| `url.fragment` | string | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>The [URI fragment](https://www.rfc-editor.org/rfc/rfc3986#section-3.5) component | `SemConv` | | ||
| `url.full` | string | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [1] | `https://www.foo.bar/search?q=OpenTelemetry#SemConv`; `//localhost` | | ||
| `url.full` | string | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [3] | `https://www.foo.bar/search?q=OpenTelemetry#SemConv`; `//localhost` | | ||
| `url.original` | string | Unmodified original URL as seen in the event source. [4] | `https://www.foo.bar/search?q=OpenTelemetry#SemConv`; `search?q=OpenTelemetry` | | ||
| `url.path` | string | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component | `/search` | | ||
| `url.query` | string | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component [2] | `q=OpenTelemetry` | | ||
| `url.port` | int | Port extracted from the `url.full` | `443` | | ||
| `url.query` | string | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component [5] | `q=OpenTelemetry` | | ||
| `url.registered_domain` | string | The highest registered url domain, stripped of the subdomain. [6] | `example.com`; `foo.co.uk` | | ||
| `url.scheme` | string | ![Stable](https://img.shields.io/badge/-stable-lightgreen)<br>The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `https`; `ftp`; `telnet` | | ||
| `url.subdomain` | string | The subdomain portion of a fully qualified domain name includes all of the names except the host name under the registered_domain. In a partially qualified domain, or if the qualification level of the full name cannot be determined, subdomain contains all of the names below the registered domain. [7] | `east`; `sub2.sub1` | | ||
| `url.top_level_domain` | string | The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name. For example, the top level domain for example.com is `com`. [8] | `com`; `co.uk` | | ||
|
||
**[1]:** For network calls, URL usually has `scheme://host[:port][path][?query][#fragment]` format, where the fragment is not transmitted over HTTP, but if it is known, it SHOULD be included nevertheless. | ||
**[1]:** In some cases a URL may refer to an IP and/or port directly, without a domain name. In this case, the IP address would go to the domain field. If the URL contains a [literal IPv6 address](https://www.rfc-editor.org/rfc/rfc2732#section-2) enclosed by `[` and `]`, the `[` and `]` characters should also be captured in the domain field. | ||
|
||
**[2]:** The file extension is only set if it exists, as not every url has a file extension. When the file name has multiple extensions `example.tar.gz`, only the last one should be captured `gz`, not `tar.gz`. | ||
|
||
**[3]:** For network calls, URL usually has `scheme://host[:port][path][?query][#fragment]` format, where the fragment is not transmitted over HTTP, but if it is known, it SHOULD be included nevertheless. | ||
`url.full` MUST NOT contain credentials passed via URL in form of `https://username:password@www.example.com/`. In such case username and password SHOULD be redacted and attribute's value SHOULD be `https://REDACTED:REDACTED@www.example.com/`. | ||
`url.full` SHOULD capture the absolute URL when it is available (or can be reconstructed) and SHOULD NOT be validated or modified except for sanitizing purposes. | ||
|
||
**[2]:** Sensitive content provided in query string SHOULD be scrubbed when instrumentations can identify it. | ||
**[4]:** In network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often just represented as a path. This field is meant to represent the URL as it was observed, complete or not. | ||
`url.original` might contain credentials passed via URL in form of `https://username:password@www.example.com/`. In such case password and username SHOULD NOT be redacted and attribute's value SHOULD remain the same. | ||
|
||
**[5]:** Sensitive content provided in query string SHOULD be scrubbed when instrumentations can identify it. | ||
|
||
**[6]:** This value can be determined precisely with the [public suffix list](http://publicsuffix.org). For example, the registered domain for "foo.example.com" is "example.com". Trying to approximate this by simply taking the last two labels will not work well for TLDs such as "co.uk". | ||
|
||
**[7]:** The subdomain portion of "www.east.mydomain.co.uk" is "east". If the domain has multiple levels of subdomain, such as "sub2.sub1.example.com", the subdomain field should contain "sub2.sub1", with no trailing period. | ||
|
||
**[8]:** This value can be determined precisely with the [public suffix list](http://publicsuffix.org). | ||
<!-- endsemconv --> |
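
For reviewers who want to see how the notes above fit together, here is a minimal sketch of deriving the attributes from one observed URL. It uses only Python's standard `urllib.parse`; the helper name `url_attributes` is hypothetical and not part of this change. It assumes that `url.full` is only emitted for absolute URLs, that a user name without a password is replaced by a single `REDACTED`, and that empty components are simply omitted. None of these choices is mandated by the registry text.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def url_attributes(url: str) -> dict:
    """Derive url.* attributes from an observed URL (illustrative helper)."""
    parts = urlsplit(url)
    # url.original is kept exactly as observed, credentials included.
    attrs = {"url.original": url}

    # Separate the optional userinfo from host[:port].
    userinfo, _, hostport = parts.netloc.rpartition("@")

    if parts.scheme and hostport:
        netloc = hostport
        if userinfo:
            # Assumption: redact both parts when a password is present,
            # otherwise only the user name.
            redacted = "REDACTED:REDACTED" if ":" in userinfo else "REDACTED"
            netloc = f"{redacted}@{hostport}"
        attrs["url.full"] = urlunsplit(
            (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
        )

    if hostport.startswith("["):
        # Literal IPv6 address: keep the enclosing brackets in url.domain.
        domain = hostport[: hostport.index("]") + 1]
    else:
        domain = hostport.split(":", 1)[0]

    # Only the last extension is captured, without the leading dot.
    extension = posixpath.splitext(parts.path)[1].lstrip(".")

    candidates = {
        "url.scheme": parts.scheme,
        "url.domain": domain,
        "url.port": parts.port,
        "url.path": parts.path,
        "url.extension": extension,
        "url.query": parts.query,
        "url.fragment": parts.fragment,
    }
    # Omit components that are absent from the URL.
    attrs.update({k: v for k, v in candidates.items() if v not in ("", None)})
    return attrs
```

The sketch does not scrub sensitive query parameters (note [5]); an instrumentation would have to do that separately before setting `url.query` and `url.full`.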
|
@@ -4,11 +4,30 @@ groups: | |
type: attribute_group | ||
prefix: url | ||
attributes: | ||
- id: scheme | ||
- id: domain | ||
type: string | ||
brief: > | ||
Domain extracted from the `url.full`, such as "opentelemetry.io". | ||
note: > | ||
In some cases a URL may refer to an IP and/or port directly, | ||
without a domain name. In this case, the IP address would go to the domain field. | ||
If the URL contains a [literal IPv6 address](https://www.rfc-editor.org/rfc/rfc2732#section-2) | ||
enclosed by `[` and `]`, the `[` and `]` characters should also be captured in the domain field. | ||
examples: ["www.foo.bar", "opentelemetry.io", "3.12.167.2", "[1080:0:0:0:8:800:200C:417A]"] | ||
- id: extension | ||
type: string | ||
brief: > | ||
The file extension extracted from the `url.full`, excluding the leading dot. | ||
note: > | ||
The file extension is only set if it exists, as not every url has a file extension. | ||
When the file name has multiple extensions `example.tar.gz`, only the last one should be captured `gz`, not `tar.gz`. | ||
examples: [ "png", "gz" ] | ||
- id: fragment | ||
stability: stable | ||
type: string | ||
brief: 'The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol.' | ||
examples: ["https", "ftp", "telnet"] | ||
brief: > | ||
The [URI fragment](https://www.rfc-editor.org/rfc/rfc3986#section-3.5) component | ||
examples: ["SemConv"] | ||
- id: full | ||
stability: stable | ||
type: string | ||
|
@@ -23,19 +42,66 @@ groups: | |
`url.full` SHOULD capture the absolute URL when it is available (or can be reconstructed) | ||
and SHOULD NOT be validated or modified except for sanitizing purposes. | ||
examples: ['https://www.foo.bar/search?q=OpenTelemetry#SemConv', '//localhost'] | ||
- id: original | ||
type: string | ||
brief: > | ||
Unmodified original URL as seen in the event source. | ||
note: > | ||
In network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often | ||
just represented as a path. This field is meant to represent the URL as it was observed, complete or not. | ||
          `url.original` might contain credentials passed via URL in form of `https://username:password@www.example.com/`. | ||
In such case password and username SHOULD NOT be redacted and attribute's value SHOULD remain the same. | ||
examples: ["https://www.foo.bar/search?q=OpenTelemetry#SemConv", "search?q=OpenTelemetry"] | ||
- id: path | ||
stability: stable | ||
type: string | ||
brief: 'The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component' | ||
examples: ['/search'] | ||
brief: > | ||
The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component | ||
examples: ["/search"] | ||
- id: port | ||
type: int | ||
brief: > | ||
Port extracted from the `url.full` | ||
examples: [443] | ||
- id: query | ||
stability: stable | ||
type: string | ||
brief: 'The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component' | ||
brief: > | ||
The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component | ||
examples: ["q=OpenTelemetry"] | ||
note: Sensitive content provided in query string SHOULD be scrubbed when instrumentations can identify it. | ||
- id: fragment | ||
note: > | ||
Sensitive content provided in query string SHOULD be scrubbed when instrumentations can identify it. | ||
- id: registered_domain | ||
type: string | ||
brief: > | ||
The highest registered url domain, stripped of the subdomain. | ||
examples: ["example.com", "foo.co.uk"] | ||
note: > | ||
This value can be determined precisely with the [public suffix list](http://publicsuffix.org). | ||
For example, the registered domain for "foo.example.com" is "example.com". | ||
Trying to approximate this by simply taking the last two labels will not work well for TLDs such as "co.uk". | ||
- id: scheme | ||
stability: stable | ||
type: string | ||
brief: 'The [URI fragment](https://www.rfc-editor.org/rfc/rfc3986#section-3.5) component' | ||
examples: ["SemConv"] | ||
brief: > | ||
The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | ||
examples: ["https", "ftp", "telnet"] | ||
- id: subdomain | ||
type: string | ||
brief: > | ||
The subdomain portion of a fully qualified domain name includes all of the names except the host name | ||
          under the registered_domain. In a partially qualified domain, or if the qualification level of the | ||
full name cannot be determined, subdomain contains all of the names below the registered domain. | ||
examples: ["east", "sub2.sub1"] | ||
note: > | ||
The subdomain portion of "www.east.mydomain.co.uk" is "east". If the domain has multiple levels of subdomain, | ||
such as "sub2.sub1.example.com", the subdomain field should contain "sub2.sub1", with no trailing period. | ||
- id: top_level_domain | ||
type: string | ||
brief: > | ||
The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name. | ||
For example, the top level domain for example.com is `com`. | ||
examples: ["com", "co.uk"] | ||
note: > | ||
This value can be determined precisely with the [public suffix list](http://publicsuffix.org). |
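
The notes on `registered_domain`, `subdomain` and `top_level_domain` all rely on a suffix lookup rather than on counting labels. A rough sketch of that lookup follows, under loud assumptions: `SAMPLE_SUFFIXES` is a tiny hypothetical stand-in for the public suffix list, wildcard and exception rules are ignored, and `split_domain` is an illustrative name rather than anything defined in this change. It also returns every label below the registered domain as the subdomain and does not attempt to drop a host name label such as `www`.

```python
# Hypothetical, tiny stand-in for the public suffix list.
# Real code should load the full list instead.
SAMPLE_SUFFIXES = {"com", "io", "uk", "co.uk"}


def split_domain(domain: str, suffixes=SAMPLE_SUFFIXES) -> dict:
    """Split a domain name using the longest matching known suffix."""
    labels = domain.lower().rstrip(".").split(".")

    # Scan from the leftmost label so the longest suffix wins
    # ("co.uk" is tried before "uk").
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in suffixes:
            break
    else:
        # No known suffix, e.g. an IP address or "localhost".
        return {}

    attrs = {"url.top_level_domain": suffix}
    if i >= 1:
        # Registered domain is the suffix plus one more label.
        attrs["url.registered_domain"] = ".".join(labels[i - 1:])
    if i >= 2:
        # Everything below the registered domain, no trailing period.
        attrs["url.subdomain"] = ".".join(labels[: i - 1])
    return attrs
```

With this sample set, `foo.co.uk` yields the registered domain `foo.co.uk` and the top level domain `co.uk`, which taking the last two labels would get wrong.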