
[Kubernetes] Update namespace filter on data streams #10213

Merged: 14 commits into elastic:main on Jul 16, 2024

Conversation

constanca-m
Contributor

@constanca-m constanca-m commented Jun 21, 2024

Proposed commit message

Some data streams already have the namespace filter present. I will update it to:

      - name: namespace
        type: text
        title: Namespace
        description: Enrich events with metadata from the resources in this namespace. Empty value means all namespaces.
        multi: false
        required: false
        show_user: false

The namespace filter should be moved to the group level.
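As a rough sketch (the group name and surrounding layout here are illustrative, not the exact manifest of this PR), moving the option to the group level would look roughly like this in the package manifest:

      # illustrative group entry; exact group name and field placement are assumed
      - name: state_metrics
        title: Kubernetes kube-state-metrics
        vars:
          - name: namespace
            type: text
            title: Namespace
            description: Enrich events with metadata from the resources in this namespace. Empty value means all namespaces.
            multi: false
            required: false
            show_user: false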

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

  1. Build the Kubernetes integration:
elastic-package build
  2. Update the registry:
elastic-package stack up --services package-registry
  3. Install Elastic Agent with image 8.15.0-SNAPSHOT; it should use a policy with the Kubernetes integration (a hypothetical manifest excerpt is sketched below).
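For step 3, a hypothetical excerpt of the Elastic Agent Kubernetes manifest with the snapshot image set (only a sketch; adapt it to the actual manifest you deploy):

      # illustrative DaemonSet container spec for the standalone agent
      containers:
        - name: elastic-agent
          image: docker.elastic.co/beats/elastic-agent:8.15.0-SNAPSHOT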

Related issues

Relates elastic/beats#38978

Results

See #10213 (comment).

@constanca-m constanca-m added Integration:kubernetes Kubernetes Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team [elastic/obs-cloudnative-monitoring] labels Jun 21, 2024
@constanca-m constanca-m self-assigned this Jun 21, 2024
@constanca-m constanca-m requested a review from a team as a code owner June 21, 2024 14:12
@constanca-m
Contributor Author

/test

@@ -41,12 +41,6 @@ streams:
multi: false
required: false
show_user: false
- name: namespace

@@ -43,7 +43,8 @@ streams:
show_user: false
- name: namespace
type: text
title: Namespace to watch resources from for metadata
title: Namespace
Contributor

So it is not needed here.

Contributor Author

PVC needs it, PV does not

@@ -10,7 +10,7 @@ categories:
- kubernetes
conditions:
kibana:
version: "^8.14.0"
version: "^8.15.0"
Contributor

Why? This will work for 8.14 as well, with the only exception being the namespace watcher. But it is not a feature of 8.15.

Contributor Author

You are right, but in that case 8.14 already has the namespace filters for most of the data streams. I wanted to push for 8.15 to stay on track with the other related PRs. Should I fix it in 8.14 instead - and maybe remove the namespace filter for state_namespace?

Contributor

That would then require another PR. I would suggest to keep 8.15 but only merge it after the feature freeze of 8.15.

Contributor Author

but only merge it after the feature freeze of 8.15

Yes, I agree. I have added a reminder for the day of the 8.15 release, so we won't have to backport it in integrations.

@@ -60,6 +60,13 @@ streams:
multi: true
required: false
show_user: false
- name: namespace
Contributor
@tetianakravchenko tetianakravchenko Jun 26, 2024

Should we also add namespace to the top level, similar to the condition option?

Contributor Author

I am not sure. Some data streams have the node option, which I considered to be at the same level as namespace, and node is also not at the top level. Should I change this?

Contributor

Do we need to put the namespace filter in each data stream? Can't we put it in the group level and check if it applies to all the data streams of the group?
It feels a bit cumbersome for a user to set the namespace value so many times. And a mistake in one of the data streams would break the others as well, because the watchers are shared.

I agree with @MichaelKatsoulis in this regard: we shouldn't provide the option to set it for each data stream, but at a higher level instead, to avoid possible issues, since the watchers are shared.

I would split the node and namespace work and not touch node in this PR.

Contributor Author

I replied in #10213 (comment).

packages/kubernetes/changelog.yml (outdated review comment, resolved)
@MichaelKatsoulis
Contributor

Do we need to put the namespace filter in each data stream? Can't we put it in the group level and check if it applies to all the data streams of the group?
It feels a bit cumbersome for a user to set the namespace value so many times. And a mistake in one of the data streams would break the others as well, because the watchers are shared.
Of course, we should check the update process and how a namespace value set in a previous version would or would not be carried over to the newer version. This could be a breaking change...

About the text Enrich events with metadata from the resources in this namespace. Empty value means all namespaces.
I would change it to Enrich events from resources of the selected namespace with kubernetes metadata. Empty value means all namespaces.

Also, for this PR to be ready for merge, it should be tested that setting the namespace field does what it is supposed to.
We should have screenshots of metadata added only for resources in a selected namespace.

@constanca-m
Contributor Author

Test for using namespace inside the cronjob data stream:

1st test

  1. Setting namespace: default:

[screenshot]

We see that the namespace_uid field is only added to cronjobs in the default namespace.

  2. Removing namespace: default:

The results are still the same as in step 1, so I don't think the watchers are updated to watch all namespaces.

2nd test

I am restarting the agent. I don't have the namespace set:

[screenshot]

The results are as expected.

I will now filter by namespace: default, without restarting the agent again:

[screenshot]

Conclusion: The watchers are only updated when the agent is restarted.

@constanca-m
Contributor Author

Do we need to put the namespace filter in each data stream? Can't we put in the group level and check if it applies for all the data streams of the group?

I have just tried this:

[screenshot]

With the same tests as #10213 (comment).

And the results:
[screenshot]

So it does not work @MichaelKatsoulis @tetianakravchenko. What do you think?

@constanca-m
Contributor Author

It seems that having the namespace option in the integration will bring quite a few problems @gizas @tetianakravchenko @MichaelKatsoulis. The watchers are shared, so I could start the Pod watcher in the pod data stream for namespace-A, and then for state_container for namespace-B. What then? It does not work; maybe this option should not be available here.

@MichaelKatsoulis
Contributor

The watchers are shared, so I could start the Pod watcher in the pod data stream for namespace-A, and then for state_container for namespace-B. What then?

@constanca-m Yes, this is tricky and error-prone. That is why I believe the way to do this is to have the namespace option at the group level, as you tried in #10213 (comment).

If you add the namespace at the group level in the manifest, then I believe you should go to the data streams where the namespace option is needed and configure the handlebars files like:

{{#if namespace}}
namespace: {{namespace}}
{{/if}}

Looking at the files changed in this PR, I don't see any update in the handlebars files. How will the namespace option be passed to the metricsets? For example, in state_pod it is already there.
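For illustration only, a sketch of how a data stream's hbs template could pick up the group-level variable (the other fields are placeholders, not the real contents of the state_pod template):

    metricsets:
      - state_pod
    hosts:
    {{#each hosts}}
      - {{this}}
    {{/each}}
    period: {{period}}
    {{#if namespace}}
    namespace: {{namespace}}
    {{/if}}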

@constanca-m
Contributor Author

@MichaelKatsoulis I tried setting the namespace per group. I reported the results in #10213 (comment). It does not work, so unfortunately that is not an option. The only way it works is to set the namespace per metricset, which is very prone to errors.

@constanca-m
Contributor Author

Looking at the files changed in this PR, I don't see any update in the handlebars files.

Thank you, I wasn't aware of those files. I have updated them in the latest commit.

Example of adding the namespace option to the pod data stream:
[screenshot]

And we can see the option is no longer there for state_node either:
[screenshot]

Same for other data streams.

@elastic elastic deleted a comment from elasticmachine Jul 2, 2024
@MichaelKatsoulis
Contributor

@MichaelKatsoulis I tried setting the namespace per group. I reported the results in #10213 (comment). It does not work, so unfortunately that is not an option. The only way it works is to set the namespace per metricset, which is very prone to errors.

It probably wasn't working because you hadn't set the namespace in all the hbs files. If the namespace var is per group, then the hbs files can read the variable as they already do with the condition var.
Also, the reason that an agent restart is needed when you change the namespace option is that in https://github.com/elastic/beats/blob/53bf67ce40fe6564efd15f25cd6da0249f360a12/metricbeat/module/kubernetes/util/kubernetes.go#L343 we only check for a change in the nodescope to update a watcher.
Changing a data stream only re-renders that part of the agent config. The watchers of the module have already started with the previous namespace option, so they don't get restarted.
I think that is fine.

@constanca-m
Contributor Author

It probably wasn't working because you hadn't set the namespace in all the hbs files.

I don't think that is it. I tried state_cronjob with the hbs file and it does not work.

@constanca-m
Contributor Author

It probably wasn't working because you hadn't set the namespace in all the hbs files.

Thank you @MichaelKatsoulis, this was the problem.

I deployed 2 cronjobs, in the default and kube-system namespaces. I set the integration for the state data streams like this:

[screenshot]

I checked the data from the state_cronjob data stream, and only the default cronjob had metadata as expected:

[screenshot]

@@ -31,6 +31,13 @@ streams:
required: true
show_user: true
default: true
- name: namespace
Contributor

In order to be more aligned with the rest of the data stream groups, we should also add the namespace option in the central manifest in the events group.

- name: events
    title: Kubernetes Event Metrics

This group only contains the events data stream, but as we do it for the rest, we should be uniform.
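A rough sketch of what that could look like for the events group (the var block mirrors the one proposed in the PR description; its exact placement under the group entry is assumed):

    - name: events
      title: Kubernetes Event Metrics
      vars:
        - name: namespace
          type: text
          title: Namespace
          description: Enrich events with metadata from the resources in this namespace. Empty value means all namespaces.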

Contributor Author

Done in the latest commit.

@MichaelKatsoulis
Contributor

@lucabelluccini I wanted to inform you that with this PR we introduce a small breaking change to the Kubernetes integration.

This is actually a fix, as the way a parameter was set in the data streams up to version 1.63.0 would not work as expected due to changes (for CPU and memory consumption improvements) we made in Beats 8.14.0.

In more detail, some of the data streams of the Kubernetes integration included a namespace option parameter which was supposed to configure the following:

title: Namespace
description: Enrich events with metadata from the resources in this namespace. Empty value means all namespaces.

The problem was that this option, even if set, would be overwritten by the other data streams (which did not even have this option), due to the underlying architecture of the metadata enrichment process with the Kubernetes API watchers mechanism.

To tackle this issue, we decided to move the namespace option to the group level of the data streams (wherever it makes sense).
Now the user can set the namespace option in a more central way, and it will not get overridden by other data streams.

The breaking change is that if a user has a previous version of the Kubernetes integration where the namespace option is set in some data streams (although not working properly; we really expect very few users to use this), and then decides to upgrade to this new version, the namespace option will not be inherited. They will need to set it properly at the group level.
We really think that this change is important because:

  • If a user does not have permissions to access any namespace other than default (for example), then with this option they can stop unwanted calls to the rest of the namespaces, which would otherwise cause error logs.

Should we include this information somewhere for support to keep in mind in case of an unlikely future SDH?

@MichaelKatsoulis
Contributor

@constanca-m we should create a follow-up issue to check the right place for the node option as well.
Currently it is in the state data streams, like it was with the namespace option before:
[screenshot]

@elasticmachine

🚀 Benchmarks report

To see the full report, comment with /test benchmark fullreport

@@ -1,4 +1,9 @@
# newer versions go on top
- version: 1.64.0
changes:
- description: Update namespace filter on data streams
Contributor

It should be reflected in the description that the namespace filter was moved to the group level.

Contributor Author

Updated!

packages/kubernetes/changelog.yml (outdated review comment, resolved)
@constanca-m constanca-m enabled auto-merge (squash) July 16, 2024 06:14
@constanca-m constanca-m merged commit 3f34e40 into elastic:main Jul 16, 2024
3 checks passed
@elasticmachine

💚 Build Succeeded

History

cc @constanca-m

@elasticmachine

Package kubernetes - 1.64.0 containing this change is available at https://epr.elastic.co/search?package=kubernetes

@constanca-m constanca-m deleted the namespace-filter branch July 16, 2024 07:14
tetianakravchenko added a commit to tetianakravchenko/integrations that referenced this pull request Jul 16, 2024
* Update namespace filter

Co-authored-by: Tetiana Kravchenko <[email protected]>

---------

Co-authored-by: Tetiana Kravchenko <[email protected]>