-
Notifications
You must be signed in to change notification settings - Fork 436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[System] Add dimensions to system package metrics data_streams only, except core data_streams #6118
[System] Add dimensions to system package metrics data_streams only, except core data_streams #6118
Conversation
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
🌐 Coverage report
|
Signed-off-by: Tetiana Kravchenko <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
system.process.cmdline
should be considered as a dimension field.
Please ignore the above review comment. |
CC: @ruflin current system package migration PR. |
@@ -202,6 +209,13 @@ | |||
description: Process metrics. | |||
type: group | |||
fields: | |||
- name: pid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: it implies that new time series will be created when restarting a process, the same for k8s environment since we have this setting in manifest: https://github.com/elastic/elastic-agent/blob/main/deploy/kubernetes/elastic-agent-managed-kubernetes.yaml#L29
@@ -202,6 +209,13 @@ | |||
description: Process metrics. | |||
type: group | |||
fields: | |||
- name: pid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a field, for which mapping was completely missing before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lalit-satapathy yes. this field is defined in ecs https://www.elastic.co/guide/en/ecs/current/ecs-process.html
note: there are other process.* fields in document, I've added here only the relevant for this PR field
document example with process.* fields:
packages/system/docs/README.md
Outdated
@@ -1357,7 +1358,7 @@ This data should be available without elevated permissions. | |||
| host.os.full.text | Multi-field of `host.os.full`. | match_only_text | | | | |||
| host.os.kernel | Operating system kernel version as a raw string. | keyword | | | | |||
| host.os.name | Operating system name, without the version. | keyword | | | | |||
| host.os.name.text | Multi-field of `host.os.name`. | match_only_text | | | | |||
| host.os.name.text | Multi-field of `host.os.name`. | text | | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this change coming from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't touch any ecs fields in this PR, I thought that this is coming from the ecs definition, when running elastic-package build
but, you are right - it should still be match_only_text
from the ecs fields definition - https://github.com/elastic/ecs/blob/8.0/experimental/generated/beats/fields.ecs.yml#L3857-L3866
it appeared that this field is defined in fields
folder twice - in agent.yml
integrations/packages/system/data_stream/diskio/fields/agent.yml
Lines 159 to 169 in 72dfc88
- name: os.name | |
level: extended | |
type: keyword | |
ignore_above: 1024 | |
multi_fields: | |
- name: text | |
type: text | |
norms: false | |
default_field: false | |
description: Operating system name, without the version. | |
example: Mac OS X |
ecs.yml
- integrations/packages/system/data_stream/diskio/fields/ecs.yml
Lines 19 to 20 in 72dfc88
- external: ecs | |
name: host.os.name |
I will remove the custom definition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done - b762ecc
there are lots of duplicated fields, to not mix up fields clean up and tsdb dimensions pr - I am not going to introduce more fields removal here. @elastic/elastic-agent-data-plane could you please take care of it?
I think that elastic-package check
could verify that there are no duplicated fields to avoid such situations, wdyt? should it be created as a feature request for elastic-package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: elastic/elastic-package#1287, there is already available a check for the duplicated fields, it is only needed to update the format_version
packages/system/docs/README.md
Outdated
@@ -1733,7 +1737,7 @@ This data should be available without elevated permissions. | |||
| host.mac | Host mac addresses. | keyword | | | | |||
| host.name | Name of the host. It can contain what `hostname` returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use. | keyword | | | | |||
| host.network.in.bytes | The number of bytes received on all network interfaces by the host in a given period of time. | scaled_float | byte | counter | | |||
| host.network.in.packets | The number of packets received on all network interfaces by the host in a given period of time. | long | | | | |||
| host.network.in.packets | The number of packets received on all network interfaces by the host in a given period of time. | scaled_float | | counter | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks off. Why is this suddenly a scaled float but at the same time a counter which I assume is always a long
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same here: the same field defined twice: once in fields.yml
:
integrations/packages/system/data_stream/network/fields/fields.yml
Lines 56 to 62 in 72dfc88
- name: network.in.bytes | |
type: scaled_float | |
format: bytes | |
unit: byte | |
metric_type: counter | |
description: | | |
The number of bytes received on all network interfaces by the host in a given period of time. |
another one in agent.yml
:
integrations/packages/system/data_stream/network/fields/agent.yml
Lines 206 to 210 in 72dfc88
- name: network.in.bytes | |
type: long | |
format: bytes | |
description: > | |
The number of bytes received on all network interfaces by the host in a given period of time. |
I am not sure what is a correct 😕 Could be that the files evaluation sequence changed that caused this
@elastic/elastic-agent-data-plane I need some help here, what is the correct value for this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm tempted to say the one in fields.yml
is wrong. Two questions:
- Can you check what is loaded today if you install the package? Hopefully long.
- Can you check what we have in metricbeat for the same?
If we load long today and it is also the one used in Metricbeat, lets fix the fields.yml
file. Otherwise lets continue the discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check what is loaded today if you install the package? Hopefully long.
Can you check what we have in metricbeat for the same?
also long
- https://github.com/elastic/beats/blob/main/metricbeat/module/system/network/_meta/fields.yml#L30-L33
I've reverted network changes for this PR, will create an another one
Signed-off-by: Tetiana Kravchenko <[email protected]>
Could we take out the process and network changes to get this in quickly and follow up with the other two separately? |
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
Signed-off-by: Tetiana Kravchenko <[email protected]>
@ruflin done, process and network data_stream changes were reverted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
As this PR does mostly add dimensions + some cleanups, this should be low risk.
@@ -1350,7 +1351,7 @@ This data should be available without elevated permissions. | |||
| host.hostname | Hostname of the host. It normally contains what the `hostname` command returns on the host machine. | keyword | | | | |||
| host.id | Unique host id. As hostname is not always unique, use values that are meaningful in your environment. Example: The current usage of `beat.name`. | keyword | | | | |||
| host.ip | Host ip addresses. | ip | | | | |||
| host.mac | Host MAC addresses. The notation format from RFC 7042 is suggested: Each octet (that is, 8-bit byte) is represented by two [uppercase] hexadecimal digits giving the value of the octet as an unsigned integer. Successive octets are separated by a hyphen. | keyword | | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure where this change is coming from but I assume pulled in from some ECS changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same issue as above - there are duplication one in ecs file, another in
integrations/packages/system/data_stream/diskio/fields/agent.yml
Lines 134 to 138 in 1314197
- name: mac | |
level: core | |
type: keyword | |
ignore_above: 1024 | |
description: Host mac addresses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets ignore it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
As this PR does mostly add dimensions + some cleanups, this should be low risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
As this PR does mostly add dimensions + some cleanups, this should be low risk.
Referring to #5193 (comment), please include agent.id as the dimension field. |
I see it is already handled. Please ignore this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@elastic/elastic-agent-data-plane could you please review this PR? I can't merge it since |
Can we make this beta and update the stack.version as in this PR: #6129 |
Please confirm that system dashboards loading fine (without any errors) in the TSDB mode? |
@lalit-satapathy before doing it - this PR also should be merged #6395, there are some missing metric_type metadata fields. after merging this and #6395, tsdb can be enabled for all except: |
@lalit-satapathy yes, dashboards and inventory page works (note: tested on linux system only, so the windows dashboard have some errors regarding missing metrics) |
Package system - 1.30.0 containing this change is available at https://epr.elastic.co/search?package=system |
…except core data_streams (elastic#6118) * add dimensions to system package, metrics data_streams only Signed-off-by: Tetiana Kravchenko <[email protected]> * revert k8s package Signed-off-by: Tetiana Kravchenko <[email protected]> * revert k8s changes; add extra dimensions for diskio and network Signed-off-by: Tetiana Kravchenko <[email protected]> * filesystem: add mount_point and device_name Signed-off-by: Tetiana Kravchenko <[email protected]> * move core and process datastream to different PR Signed-off-by: Tetiana Kravchenko <[email protected]> * update PR number Signed-off-by: Tetiana Kravchenko <[email protected]> * move agent.id field to ecs.yaml Signed-off-by: Tetiana Kravchenko <[email protected]> * run elastic-package build Signed-off-by: Tetiana Kravchenko <[email protected]> * run elastic-package build Signed-off-by: Tetiana Kravchenko <[email protected]> * elastic-package check Signed-off-by: Tetiana Kravchenko <[email protected]> * add dimensions to process data_stream Signed-off-by: Tetiana Kravchenko <[email protected]> * clean up some duplicated fields Signed-off-by: Tetiana Kravchenko <[email protected]> * revert network data_stream changes Signed-off-by: Tetiana Kravchenko <[email protected]> * rever process data_stream changes Signed-off-by: Tetiana Kravchenko <[email protected]> * adjust the changelog description Signed-off-by: Tetiana Kravchenko <[email protected]> * revert process data_stream changes Signed-off-by: Tetiana Kravchenko <[email protected]> --------- Signed-off-by: Tetiana Kravchenko <[email protected]>
…except core data_streams (#6118) * add dimensions to system package, metrics data_streams only Signed-off-by: Tetiana Kravchenko <[email protected]> * revert k8s package Signed-off-by: Tetiana Kravchenko <[email protected]> * revert k8s changes; add extra dimensions for diskio and network Signed-off-by: Tetiana Kravchenko <[email protected]> * filesystem: add mount_point and device_name Signed-off-by: Tetiana Kravchenko <[email protected]> * move core and process datastream to different PR Signed-off-by: Tetiana Kravchenko <[email protected]> * update PR number Signed-off-by: Tetiana Kravchenko <[email protected]> * move agent.id field to ecs.yaml Signed-off-by: Tetiana Kravchenko <[email protected]> * run elastic-package build Signed-off-by: Tetiana Kravchenko <[email protected]> * run elastic-package build Signed-off-by: Tetiana Kravchenko <[email protected]> * elastic-package check Signed-off-by: Tetiana Kravchenko <[email protected]> * add dimensions to process data_stream Signed-off-by: Tetiana Kravchenko <[email protected]> * clean up some duplicated fields Signed-off-by: Tetiana Kravchenko <[email protected]> * revert network data_stream changes Signed-off-by: Tetiana Kravchenko <[email protected]> * rever process data_stream changes Signed-off-by: Tetiana Kravchenko <[email protected]> * adjust the changelog description Signed-off-by: Tetiana Kravchenko <[email protected]> * revert process data_stream changes Signed-off-by: Tetiana Kravchenko <[email protected]> --------- Signed-off-by: Tetiana Kravchenko <[email protected]>
What does this PR do?
Add/adjust dimensions fields:
dimension:true
, there exists time_series_dimension: true annotation in the indexdimension:true
, there exists time_series_dimension: true annotation in the index ✅system.diskio.name
dimension:true
, there exists time_series_dimension: true annotation in the index ✅dimension:true
, there exists time_series_dimension: true annotation in the index ✅-dimension:true
, there exists time_series_dimension: true annotation in the index ✅dimension:true
, there exists time_series_dimension: true annotation in the index ✅dimension:true
, there exists time_series_dimension: true annotation in the index ✅system.network.name
- will be added in separate PRdimension:true
, there exists time_series_dimension: true annotation in the index ✅process.pid
- will be added in separate PRdimension:true
, there exists time_series_dimension: true annotation in the indexdimension:true
, there exists time_series_dimension: true annotation in the index ✅dimension:true
, there exists time_series_dimension: true annotation in the index ✅dimension:true
, there exists time_series_dimension: true annotation in the index ✅Checklist
changelog.yml
file.Author's Checklist
How to test this PR locally
Related issues
Screenshots