Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing kubernetes.state_namespace datastream in k8s integration #3100

Closed
5 of 7 tasks
gizas opened this issue Jul 19, 2023 · 11 comments
Closed
5 of 7 tasks

Introducing kubernetes.state_namespace datastream in k8s integration #3100

gizas opened this issue Jul 19, 2023 · 11 comments
Assignees
Labels
Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team

Comments

@gizas
Copy link
Contributor

gizas commented Jul 19, 2023

Describe the enhancement:

Currently in our kubernetes integration we dont collect the namespace metrics from Kube State metrics. For more info see https://github.com/kubernetes/kube-state-metrics/blob/main/docs/namespace-metrics.md

This enhancement request will introduce the new kubernetes.state_namespace datastream collection. The main goal is to collect the kube_namespace_status_condition that will help users identify the status of namespace creation.

Describe a specific use case for the enhancement or feature:
Users want easily to understand if their namespace creation is running or failing. Until now this was not feasible and users could not track the namespace status

What is the definition of done?

@gizas gizas changed the title Introudusing kubernetes.state_namespace datastream in k8s integration Introdusing kubernetes.state_namespace datastream in k8s integration Jul 19, 2023
@constanca-m constanca-m self-assigned this Aug 23, 2023
@constanca-m
Copy link
Contributor

constanca-m commented Aug 23, 2023

Adding kubernetes.namespace metricset problems:

  1. kubernetes.namespace already exists as a field. Because of this, we cannot create a new metricset with the same name. Otherwise, we will keep running into this error:

    {\"type\":\"document_parsing_exception\",\"reason\":\"[1:2039] object mapping for [kubernetes.namespace] tried to parse field [namespace] as object, but found a concrete value\"}, dropping event!","service.name":"metricbeat","ecs.version":"1.6.0"}
    

    Workarounds are:

    • Change the existent field kubernetes.namespace so we can create a new metricset with the same name. However, this would cause breaking changes: we would have to update all metricsets that use it and the dashboards.
    • Name this metricset differently. It breaks the logic of the names of all metricsets, but it does not cause any breaking change.
      • Question: What would be the best name? namespace_metrics?
  2. If we have the add_cloud_metadata: processor, the document will look like:

    {
       ...
       "kubernetes":{
          "labels":{
             "kubernetes_io/metadata_name":"kube-public"
          },
          "namespace":{
             "name":"kube-public",
             "uid":"87c48044-5618-4233-8ad5-73b7c5fc588b"
          },
          "namespace_metrics":{
             "created":{
                "sec":1692688692
             },
             "name":"kube-public",
             "status":{
                "active":true,
                "terminating":false
             }
          }
       },
       "metricset":{
          "name":"state_namespace_metrics",
          "period":10000
       },
       ...
    }

    We have again an error because kubernetes.namespace here is an object and in other metricsets it is a string:

    "reason\":\"[1:134] failed to parse field [kubernetes.namespace] of type [keyword] in document with id 'r1vVIYoB0cxQspDSlwP3'. Preview of field's value: '{uid=05308492-9399-43a8-b40e-958645650e2d, name=kube-system}'\",\"caused_by\":{\"type\":\"illegal_state_exception\",\"reason\":\"Can't get text on a START_OBJECT at 1:68\"}}
    

    So we need to drop these fields before they are sent to ES.

@ChrsMark
Copy link
Member

I see this is a tough one 😞 . In order to be precise we would need to either convert kubernetes.namespace from string to an object or come with sth like namespace_state. This would end up with sth like the following:

{
   ...
   "kubernetes":{
      "labels":{
         "kubernetes_io/metadata_name":"kube-public"
      },
      "namespace": "kube-public",
      "namespace_uuid": "asdf",
      "namespace_labels": {},
      "namespace_annotations": {},
      "namespace_state":{
         "created":{
            "sec":1692688692
         },
         "status":{
            "active":true,
            "terminating":false
         }
      }
   },
   "metricset":{
      "name":"state_namespace",
      "period":10000
   },
   ...
}

@constanca-m do you think that namespace_state (describes the state of the namespace) would be representative of the data we plan to collect?
Also metricset.name should be aligned with what we have in other metricsets: equals to state_namespace.
And we should store the name of the namespace within the kubernetes.namespace field.

Note that we already have the kubernetes.namespace_labels and others for the same reasons: https://github.com/elastic/elastic-agent-autodiscover/blob/8d7196695cfe60ff29a977e1f1f99759001a1f29/kubernetes/metadata/namespace_test.go#L78-L84

In any other case we would need to hit the breaking change's impact...

cc: @tommyers-elastic @ruflin @gizas

@constanca-m
Copy link
Contributor

Yes, for now I am using state_namespace. Although it is not ideal, it seems better than changing kubernetes.namespace in everything. It looks like this:

{
    "@timestamp": "2019-03-01T08:05:34.853Z",
    "event": {
        "dataset": "kubernetes.state_namespace",
        "duration": 115000,
        "module": "kubernetes"
    },
    "kubernetes": {
        "state_namespace": {
            "created": {
                "sec": 1691566342
            },
            "name": "local-path-storage",
            "status": {
                "active": true,
                "terminating": false
            }
        }
    },
    "metricset": {
        "name": "state_namespace",
        "period": 10000
    },
    "service": {
        "address": "127.0.0.1:55555",
        "type": "kubernetes"
    }
}

The problem now is to find a solution to avoid introducing kubernetes.namespace object when adding the kubernetes processor.

@ChrsMark
Copy link
Member

What is exactly the issue with the kubernetes processor (add_kubernetes_metadata?)?
In all cases (k8s provider, add_kubernetes_metadata processor, and k8s module) we use the k8s metadata generators which for the Namespace resource flatten the additional information like namespace_labels, namespace_annotations and namespace_uuid like in the example I shared previously.

The codebase that handles that can be found at https://github.com/elastic/elastic-agent-autodiscover/blob/8d7196695cfe60ff29a977e1f1f99759001a1f29/kubernetes/metadata/namespace.go#L98.

Based on this I don't see where the issue would be. Is there anything else that I miss here 🤔 ?

@constanca-m
Copy link
Contributor

(I think) add_cloud_metadata (not add_kubernetes_metadata) adds this object to the document:

"kubernetes":{
      "namespace":{
         "name":"kube-public",
         "uid":"87c48044-5618-4233-8ad5-73b7c5fc588b"
      }
}

And this causes an exception, because, once again, we have an object in kubernetes.namespace field instead of a string. @ChrsMark

@ChrsMark
Copy link
Member

That's really weird :). add_cloud_metadata should not deal with these fields at all.

Do you hit this in latest main or you are trying with a custom branch?

@constanca-m
Copy link
Contributor

I am running the metricbeat.yml inside the directory test/docs/01_playground. I know it does not come from the new metricset, because when running the tests, the data.json appears correct:

{
    "@timestamp": "2019-03-01T08:05:34.853Z",
    "event": {
        "dataset": "kubernetes.state_namespace",
        "duration": 115000,
        "module": "kubernetes"
    },
    "kubernetes": {
        "state_namespace": {
            "created": {
                "sec": 1691566342
            },
            "name": "default",
            "status": {
                "active": true,
                "terminating": false
            }
        }
    },
    "metricset": {
        "name": "state_namespace",
        "period": 10000
    },
    "service": {
        "address": "127.0.0.1:55555",
        "type": "kubernetes"
    }
}

But when I ran that manifest, then we have that new kubernetes.namespace object. I looked at it and I assume that what is adding is the processor add_cloud_metada. @ChrsMark

@constanca-m
Copy link
Contributor

I tried running metricbeat.yml now for 2 situations:

  1. Only state_namespace (the name of our new metricset) enabled, as well as add_cloud_metadata.
  2. Remove add_cloud_metadata.

Unfortunately, in both cases I ran into the same error:

 failed to parse field [kubernetes.namespace] of type [keyword] in document with id 'DlxJJooB0cxQspDS7jdw'. Preview of field's value: '{uid=3cecf383-9b5d-4195-85c0-3271616b002a, name=kube-node-lease}'\",\"caused_by\":{\"type\":\"illegal_state_exception\",\"reason\":\"Can't get text on a START_OBJECT at 1:423\"}}

I don't know what is causing this object to appear since running the tests on the metricset goes without a problem... 😕

@constanca-m
Copy link
Contributor

constanca-m commented Aug 24, 2023

And running it with every metricset disabled apart from state_namespace (I have system module enabled as well, but I don't think that makes a difference), and add_metadata and add_cloud_metadada disabled, the document still looks like this:

{
   "agent":{
      "ephemeral_id":"8538f6e5-3134-4fad-a55b-d55702897288",
      "id":"9da9b5b9-7f5a-42a0-85a8-e98ad350e8ae",
      "name":"kind-control-plane",
      "type":"metricbeat",
      "version":"8.11.0"
   },
   "ecs":{
      "version":"8.0.0"
   },
   "event":{
      "dataset":"kubernetes.state_namespace",
      "duration":306861293,
      "module":"kubernetes"
   },
   "host":{
      "name":"kind-control-plane"
   },
   "kubernetes":{
      "labels":{
         "kubernetes_io/metadata_name":"kube-system"
      },
      "namespace":{
         "name":"kube-system",
         "uid":"9db79c06-db65-4081-923f-b78deb19eda7"
      },
      "state_namespace":{
         "created":{
            "sec":1692858979
         },
         "name":"kube-system",
         "status":{
            "active":true,
            "terminating":false
         }
      }
   },
   "metricset":{
      "name":"state_namespace",
      "period":10000
   },
   "orchestrator":{
      "cluster":{
         "name":"kind",
         "url":"kind-control-plane:6443"
      }
   },
   "service":{
      "address":"http://kube-state-metrics:8080/metrics",
      "type":"kubernetes"
   }
}

What could be adding the kubernetes.namespace and kubernetes.labels?

Edit: I just realized commenting add_metadata: true was not enough, as it seems to be enabled by default. That was the source of the error.

@ChrsMark
Copy link
Member

Yeap, most probably it is sth like https://github.com/elastic/beats/blob/4ac4973250d39884be409efb593e1ad416515336/metricbeat/module/kubernetes/state_container/state_container.go#L138 that adds the metadata and treats the Namespace as a generic resource.

You will need to change https://github.com/elastic/beats/blob/8effe2400de59546c6243caa36f4ae6356ac6091/metricbeat/module/kubernetes/util/kubernetes.go#L241 and call directly a namespace specific MetaGen (which treats the metadata properly with underscores as mentioned at #3100 (comment)) like what we do for the Service, Node, Pod resources. There is one that you could re-use called namespaceMeta a few lines above.

Hope that helps :).

@constanca-m
Copy link
Contributor

You nailed it @ChrsMark , it was exactly that! :) It runs without errors now, I still have a few doubts but I will write them in the description of the PR once I open it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team
Projects
None yet
Development

No branches or pull requests

3 participants