[receiver/k8scluster] add support for observing resources for a specific namespace #35727

Merged
Changes from 41 commits
Commits
50 commits
f259d1d
replace deprecated fake client initialisation
bacherfl Oct 9, 2024
981f1c2
add e2e test scenario for namespaced k8s cluster receiver
bacherfl Oct 10, 2024
bfc23d8
remove non-fitting resources from namespaced e2e test
bacherfl Oct 10, 2024
5b9c6ac
add namespace to role binding
bacherfl Oct 10, 2024
46d11fd
adapt test expectations
bacherfl Oct 10, 2024
dc3b2f3
revert accidental change
bacherfl Oct 10, 2024
1b53fa0
omit cluster scoped observers if namespace filter is set
bacherfl Oct 10, 2024
e18f4e5
[revert when done] - adapt test setup to run locally
bacherfl Oct 16, 2024
873b489
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Oct 17, 2024
fd62f43
Merge branch 'feat/9401/namespaced-cluster-receiver' of https://githu…
bacherfl Oct 17, 2024
a2990f3
fix informer setup
bacherfl Oct 17, 2024
c8b1f29
update expected metrics
bacherfl Oct 17, 2024
f94b075
do not try to observe cluster resource quotas when namespace limit ha…
bacherfl Oct 17, 2024
b0710fb
extend readme and add changelog entry
bacherfl Oct 17, 2024
3fa57f5
add resourcequota access to role
bacherfl Oct 17, 2024
6f751bc
add check for mutually exclusive options in config validation
bacherfl Oct 21, 2024
680aaad
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Oct 22, 2024
aebbbc4
remove namespace check from validation and log hint about non-observa…
bacherfl Oct 22, 2024
90ac34b
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Oct 22, 2024
7fecd06
remove sidecar deployment recommendation
bacherfl Oct 29, 2024
e2c43ac
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Nov 5, 2024
2070e2a
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Nov 6, 2024
4ef2cad
fix merge conflicts
bacherfl Nov 6, 2024
28862e6
fix path to testobjects directoy
bacherfl Nov 6, 2024
582f06c
fix test cleanup
bacherfl Nov 6, 2024
9916d6f
increase number of expected metrics before continuing with assertions
bacherfl Nov 6, 2024
4251eec
adapt expected.yaml to account for newly added k8s objects
bacherfl Nov 6, 2024
db9746f
adapt assertions
bacherfl Nov 6, 2024
1968099
use separate namespace for namespace-scoped test
bacherfl Nov 6, 2024
fcbebe3
fix path to testobjects directory
bacherfl Nov 6, 2024
a4f9130
ensure namespace is created before applying other tests
bacherfl Nov 6, 2024
19f947f
fix namespace in confmap
bacherfl Nov 6, 2024
dca10d1
adapt check for required number of metrics
bacherfl Nov 6, 2024
742c663
fix test expectations
bacherfl Nov 7, 2024
75c8b0e
trigger CI
bacherfl Nov 7, 2024
bc00b77
increase timeout and ignore k8s.hpa.current_replicas metric value
bacherfl Nov 7, 2024
1defed9
add debug logs
bacherfl Nov 7, 2024
4bb2346
reduce job sleep time
bacherfl Nov 7, 2024
6315edd
adapt expections
bacherfl Nov 7, 2024
14aa5c6
adapt expections
bacherfl Nov 7, 2024
10e9267
remove debug logs that are hopefully not needed anymore
bacherfl Nov 7, 2024
ca738c1
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Nov 8, 2024
c1810ec
revert changes to test logic
bacherfl Nov 8, 2024
4a673dc
extract common helper functions
bacherfl Nov 8, 2024
86eeb7e
re-add comment for generating golden file
bacherfl Nov 8, 2024
d5cb4d2
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Nov 8, 2024
f75b5c5
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Nov 11, 2024
e7ed86b
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
bacherfl Nov 18, 2024
3a9d281
Merge branch 'main' into feat/9401/namespaced-cluster-receiver
TylerHelmuth Nov 18, 2024
e2d8720
make generate
evan-bradley Nov 18, 2024
27 changes: 27 additions & 0 deletions .chloggen/k8sclusterreceiver-namespaced.yaml
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: k8sclusterreceiver

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add support for limiting observed resources to a specific namespace.

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [9401]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: This change allows the receiver to be used with `Roles`/`RoleBindings` instead of granting the collector cluster-wide read access.

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: []
92 changes: 92 additions & 0 deletions receiver/k8sclusterreceiver/README.md
@@ -62,6 +62,7 @@ The following allocatable resource types are available.
- storage
- `metrics`: Enables or disables individual metrics.
- `resource_attributes`: Enables or disables individual resource attributes.
- `namespace`: Restricts observation to a single namespace. If this option is set to a non-empty string, `Nodes`, `Namespaces` and `ClusterResourceQuotas` will not be observed.

Example:

@@ -273,6 +274,97 @@ subjects:
EOF
```

As an alternative to setting up a `ClusterRole`/`ClusterRoleBinding`, you can limit the observed resources to a
particular namespace by setting the receiver's `namespace` option. The collector then only needs a `Role`/`RoleBinding`
rather than cluster-wide read access to resources.
Note, however, that in this case the following resources will not be observed by the `k8sclusterreceiver`:

- `Nodes`
- `Namespaces`
- `ClusterResourceQuotas`

To use this approach, run the commands below to create the required `Role` and `RoleBinding`:

```bash
<<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: otelcontribcol
  labels:
    app: otelcontribcol
  namespace: default
rules:
  - apiGroups:
      - ""
    resources:
      - events
      - pods
      - pods/status
      - replicationcontrollers
      - replicationcontrollers/status
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - daemonsets
      - deployments
      - replicasets
      - statefulsets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - daemonsets
      - deployments
      - replicasets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - batch
    resources:
      - jobs
      - cronjobs
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - autoscaling
    resources:
      - horizontalpodautoscalers
    verbs:
      - get
      - list
      - watch
EOF
```

```bash
<<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: otelcontribcol
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: otelcontribcol
subjects:
  - kind: ServiceAccount
    name: otelcontribcol
    namespace: default
EOF
```

### Deployment

Create a [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) to deploy the collector.
6 changes: 6 additions & 0 deletions receiver/k8sclusterreceiver/config.go
@@ -39,6 +39,11 @@ type Config struct {

// MetricsBuilderConfig allows customizing scraped metrics/attributes representation.
metadata.MetricsBuilderConfig `mapstructure:",squash"`

// Namespace to fetch resources from. If this is set, certain cluster-wide resources such as Nodes or Namespaces
// cannot be observed. Setting this option is recommended in environments where, due to security restrictions,
// the collector cannot be granted cluster-wide permissions.
Namespace string `mapstructure:"namespace"`
}

func (cfg *Config) Validate() error {
@@ -48,5 +53,6 @@ func (cfg *Config) Validate() error {
default:
return fmt.Errorf("\"%s\" is not a supported distribution. Must be one of: \"openshift\", \"kubernetes\"", cfg.Distribution)
}

return nil
}
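
To illustrate the effect of the new `Namespace` field, here is a minimal Go sketch. It is not the receiver's implementation: `Config` is a trimmed-down stand-in for the real struct, and `SkippedKinds` and `clusterScopedKinds` are hypothetical names introduced only for this example. The list of skipped kinds is taken from the README text in this PR.

```go
package main

import "fmt"

// Config is a hypothetical, trimmed-down stand-in for the receiver config.
type Config struct {
	// Namespace limits observation to one namespace when non-empty.
	Namespace string
}

// clusterScopedKinds lists the kinds the README says are not observed
// once a namespace is configured.
var clusterScopedKinds = []string{"Nodes", "Namespaces", "ClusterResourceQuotas"}

// SkippedKinds returns the kinds that would not be observed for cfg.
// It returns nil when no namespace is set (cluster-wide observation).
func SkippedKinds(cfg Config) []string {
	if cfg.Namespace == "" {
		return nil
	}
	// Return a copy so callers cannot modify the package-level list.
	out := make([]string, len(clusterScopedKinds))
	copy(out, clusterScopedKinds)
	return out
}

func main() {
	fmt.Println(SkippedKinds(Config{}))                     // prints "[]"
	fmt.Println(SkippedKinds(Config{Namespace: "default"})) // prints "[Nodes Namespaces ClusterResourceQuotas]"
}
```

A helper like this could back a startup log message that tells the user which resources are not observable, rather than rejecting the configuration during validation.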
150 changes: 138 additions & 12 deletions receiver/k8sclusterreceiver/e2e_test.go
@@ -7,6 +7,7 @@ package k8sclusterreceiver

import (
"context"
"path/filepath"
"strings"
"testing"
"time"
@@ -25,28 +26,30 @@ import (
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatatest/pmetrictest"
)

-const expectedFile = "./testdata/e2e/expected.yaml"
+const expectedFileClusterScoped = "./testdata/e2e/cluster-scoped/expected.yaml"
+const expectedFileNamespaceScoped = "./testdata/e2e/namespace-scoped/expected.yaml"

+const testObjectsDirClusterScoped = "./testdata/e2e/cluster-scoped/testobjects"
+const testObjectsDirNamespaceScoped = "./testdata/e2e/namespace-scoped/testobjects"
const testKubeConfig = "/tmp/kube-config-otelcol-e2e-testing"
-const testObjectsDir = "./testdata/e2e/testobjects/"

-// TestE2E tests the k8s cluster receiver with a real k8s cluster.
+// TestE2EClusterScoped tests the k8s cluster receiver with a real k8s cluster.
// The test requires a prebuilt otelcontribcol image uploaded to a kind k8s cluster defined in
// `/tmp/kube-config-otelcol-e2e-testing`. Run the following command prior to running the test locally:
//
// kind create cluster --kubeconfig=/tmp/kube-config-otelcol-e2e-testing
// make docker-otelcontribcol
// KUBECONFIG=/tmp/kube-config-otelcol-e2e-testing kind load docker-image otelcontribcol:latest
-func TestE2E(t *testing.T) {

+func TestE2EClusterScoped(t *testing.T) {
var expected pmetric.Metrics
-expected, err := golden.ReadMetrics(expectedFile)
+expected, err := golden.ReadMetrics(expectedFileClusterScoped)
require.NoError(t, err)

k8sClient, err := k8stest.NewK8sClient(testKubeConfig)
require.NoError(t, err)

// k8s test objs
-testObjs, err := k8stest.CreateObjects(k8sClient, testObjectsDir)
+testObjs, err := k8stest.CreateObjects(k8sClient, testObjectsDirClusterScoped)
require.NoErrorf(t, err, "failed to create objects")

t.Cleanup(func() {
@@ -58,17 +61,16 @@ func TestE2E(t *testing.T) {
defer shutdownSink()

testID := uuid.NewString()[:8]
-collectorObjs := k8stest.CreateCollectorObjects(t, k8sClient, testID, "")
+collectorObjs := k8stest.CreateCollectorObjects(t, k8sClient, testID, filepath.Join(".", "testdata", "e2e", "cluster-scoped", "collector"))

t.Cleanup(func() {
for _, obj := range collectorObjs {
require.NoErrorf(t, k8stest.DeleteObject(k8sClient, obj), "failed to delete object %s", obj.GetName())
}
})

-wantEntries := 10 // Minimal number of metrics to wait for.
+wantEntries := expected.ResourceMetrics().Len() // Minimal number of metrics to wait for.
waitForData(t, wantEntries, metricsConsumer)
// golden.WriteMetrics(t, expectedFile, metricsConsumer.AllMetrics()[len(metricsConsumer.AllMetrics())-1])

replaceWithStar := func(string) string { return "*" }
shortenNames := func(value string) string {
@@ -125,6 +127,127 @@ func TestE2E(t *testing.T) {
"k8s.job.desired_successful_pods",
"k8s.job.failed_pods",
"k8s.job.max_parallel_pods",
"k8s.hpa.current_replicas",
"k8s.job.successful_pods"),
pmetrictest.ChangeResourceAttributeValue("container.id", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("container.image.name", containerImageShorten),
pmetrictest.ChangeResourceAttributeValue("container.image.tag", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.cronjob.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.daemonset.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.deployment.name", shortenNames),
pmetrictest.ChangeResourceAttributeValue("k8s.deployment.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.hpa.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.job.name", shortenNames),
pmetrictest.ChangeResourceAttributeValue("k8s.job.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.namespace.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.node.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.pod.name", shortenNames),
pmetrictest.ChangeResourceAttributeValue("k8s.pod.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.replicaset.name", shortenNames),
pmetrictest.ChangeResourceAttributeValue("k8s.replicaset.uid", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("k8s.statefulset.uid", replaceWithStar),
pmetrictest.IgnoreScopeVersion(),
pmetrictest.IgnoreResourceMetricsOrder(),
pmetrictest.IgnoreMetricsOrder(),
pmetrictest.IgnoreScopeMetricsOrder(),
pmetrictest.IgnoreMetricDataPointsOrder(),
),
)
}
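
The comparison options above rewrite attribute values that differ between runs (generated pod name suffixes, registry prefixes on image names) before comparing against the golden file. A minimal table-driven sketch of the same idea follows. The names `knownPrefixes`, `shortenName` and `shortenImage` are hypothetical, and the prefix list is shortened for illustration; the test's own closures use a longer chain of `strings.HasPrefix` checks.

```go
package main

import (
	"fmt"
	"strings"
)

// knownPrefixes is an illustrative subset of the workload name prefixes
// that the e2e test collapses to a stable value.
var knownPrefixes = []string{
	"coredns",
	"kube-proxy",
	"otelcol",
	"test-k8scluster-receiver-cronjob",
	"test-k8scluster-receiver-job",
}

// shortenName drops a generated suffix by returning the first known
// prefix that value starts with, or value unchanged if none matches.
func shortenName(value string) string {
	for _, p := range knownPrefixes {
		if strings.HasPrefix(value, p) {
			return p
		}
	}
	return value
}

// shortenImage strips everything up to and including the last "/".
// If value has no "/", LastIndex returns -1 and the whole string is kept.
func shortenImage(value string) string {
	return value[strings.LastIndex(value, "/")+1:]
}

func main() {
	fmt.Println(shortenName("coredns-5d78c9869d-abcde"))         // prints "coredns"
	fmt.Println(shortenName("my-app"))                           // prints "my-app"
	fmt.Println(shortenImage("registry.k8s.io/coredns/coredns")) // prints "coredns"
	fmt.Println(shortenImage("nginx"))                           // prints "nginx"
}
```

Keeping the prefixes in a slice would let both e2e tests share one list instead of repeating the `if` chain in each test function.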

// TestE2ENamespaceScoped tests the k8s cluster receiver with a real k8s cluster.
// The test requires a prebuilt otelcontribcol image uploaded to a kind k8s cluster defined in
// `/tmp/kube-config-otelcol-e2e-testing`. Run the following command prior to running the test locally:
//
// kind create cluster --kubeconfig=/tmp/kube-config-otelcol-e2e-testing
// make docker-otelcontribcol
// KUBECONFIG=/tmp/kube-config-otelcol-e2e-testing kind load docker-image otelcontribcol:latest
func TestE2ENamespaceScoped(t *testing.T) {

var expected pmetric.Metrics
expected, err := golden.ReadMetrics(expectedFileNamespaceScoped)
require.NoError(t, err)

k8sClient, err := k8stest.NewK8sClient(testKubeConfig)
require.NoError(t, err)

// k8s test objs
testObjs, err := k8stest.CreateObjects(k8sClient, testObjectsDirNamespaceScoped)
require.NoErrorf(t, err, "failed to create objects")

t.Cleanup(func() {
require.NoErrorf(t, k8stest.DeleteObjects(k8sClient, testObjs), "failed to delete objects")
})

metricsConsumer := new(consumertest.MetricsSink)
shutdownSink := startUpSink(t, metricsConsumer)
defer shutdownSink()

testID := uuid.NewString()[:8]
collectorObjs := k8stest.CreateCollectorObjects(t, k8sClient, testID, filepath.Join(".", "testdata", "e2e", "namespace-scoped", "collector"))

t.Cleanup(func() {
for _, obj := range collectorObjs {
require.NoErrorf(t, k8stest.DeleteObject(k8sClient, obj), "failed to delete object %s", obj.GetName())
}
})

wantEntries := expected.ResourceMetrics().Len() // Minimal number of metrics to wait for.
waitForData(t, wantEntries, metricsConsumer)

replaceWithStar := func(string) string { return "*" }
shortenNames := func(value string) string {
if strings.HasPrefix(value, "coredns") {
return "coredns"
}
if strings.HasPrefix(value, "kindnet") {
return "kindnet"
}
if strings.HasPrefix(value, "kube-apiserver") {
return "kube-apiserver"
}
if strings.HasPrefix(value, "kube-proxy") {
return "kube-proxy"
}
if strings.HasPrefix(value, "kube-scheduler") {
return "kube-scheduler"
}
if strings.HasPrefix(value, "kube-controller-manager") {
return "kube-controller-manager"
}
if strings.HasPrefix(value, "local-path-provisioner") {
return "local-path-provisioner"
}
if strings.HasPrefix(value, "otelcol") {
return "otelcol"
}
if strings.HasPrefix(value, "test-k8scluster-receiver-cronjob") {
return "test-k8scluster-receiver-cronjob"
}
if strings.HasPrefix(value, "test-k8scluster-receiver-job") {
return "test-k8scluster-receiver-job"
}
return value
}
containerImageShorten := func(value string) string {
return value[(strings.LastIndex(value, "/") + 1):]
}
require.NoError(t, pmetrictest.CompareMetrics(expected, metricsConsumer.AllMetrics()[len(metricsConsumer.AllMetrics())-1],
pmetrictest.IgnoreTimestamp(),
pmetrictest.IgnoreStartTimestamp(),
pmetrictest.IgnoreMetricValues(
"k8s.container.cpu_request",
"k8s.container.memory_limit",
"k8s.container.memory_request",
"k8s.container.restarts",
"k8s.cronjob.active_jobs",
"k8s.deployment.available",
"k8s.deployment.desired",
"k8s.job.active_pods",
"k8s.job.desired_successful_pods",
"k8s.job.failed_pods",
"k8s.job.max_parallel_pods",
"k8s.hpa.current_replicas",
"k8s.job.successful_pods"),
pmetrictest.ChangeResourceAttributeValue("container.id", replaceWithStar),
pmetrictest.ChangeResourceAttributeValue("container.image.name", containerImageShorten),
@@ -167,9 +290,12 @@ func startUpSink(t *testing.T, mc *consumertest.MetricsSink) func() {
}

func waitForData(t *testing.T, entriesNum int, mc *consumertest.MetricsSink) {
-timeoutMinutes := 3
+timeoutMinutes := 6
require.Eventuallyf(t, func() bool {
-return len(mc.AllMetrics()) > entriesNum
+if len(mc.AllMetrics()) == 0 {
+return false
+}
+return mc.AllMetrics()[len(mc.AllMetrics())-1].ResourceMetrics().Len() == entriesNum
}, time.Duration(timeoutMinutes)*time.Minute, 1*time.Second,
"failed to receive %d entries, received %d metrics in %d minutes", entriesNum,
len(mc.AllMetrics()), timeoutMinutes)
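
The change to `waitForData` replaces "more than N batches have arrived" with "the most recent batch contains exactly N resource entries". The following standard-library sketch shows that polling pattern in isolation. `eventually` and `latestHasLen` are hypothetical helpers written for this example (the test itself relies on testify's `require.Eventuallyf`), and plain string slices stand in for the metrics batches.

```go
package main

import (
	"fmt"
	"time"
)

// eventually polls cond every tick until it returns true or timeout elapses.
// It reports whether cond became true in time.
func eventually(cond func() bool, timeout, tick time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(tick)
	}
}

// latestHasLen reports whether the most recent batch has exactly want entries.
// An empty history never satisfies the condition.
func latestHasLen(batches [][]string, want int) bool {
	if len(batches) == 0 {
		return false
	}
	return len(batches[len(batches)-1]) == want
}

func main() {
	var batches [][]string
	calls := 0
	ok := eventually(func() bool {
		calls++
		if calls == 3 {
			// Simulate a batch with two entries arriving on the third poll.
			batches = append(batches, []string{"a", "b"})
		}
		return latestHasLen(batches, 2)
	}, time.Second, time.Millisecond)
	fmt.Println(ok, calls) // prints "true 3"
}
```

Checking the latest batch rather than the batch count means the assertions only start once the collector reports the full set of expected resources, which is presumably why the timeout was raised at the same time.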