Replies: 5 comments 1 reply
-
TL;DR: This is by design, and the suggested solution is what you mention as a workaround. The buffer-dropping flag could be implemented as an enhancement if it turns out to be feasible. By default, it's better to call operators' attention to the situation (by way of job failure) than to lose logs silently. There could be a configuration option to drop orphaned buffers, but I'm not sure how complex it would be to determine whether a buffer is orphaned. I'm also not sure how practical it would be to run the drain job on config update, because it might lengthen the config update or even make it fail, e.g. if the update was done to get rid of an obsolete or erroneous output that won't accept logs anyway.
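For illustration only, here is a minimal Go sketch of how such a "drop orphaned buffers" option might decide which buffer directories no longer belong to any configured output. It assumes (hypothetically) one buffer subdirectory per output under a common root; the real fluentd buffer layout may differ, which is exactly the complexity mentioned above.

```go
// Hypothetical sketch, not logging-operator code: flag buffer directories
// whose names do not match any output in the currently rendered config.
package main

import (
	"fmt"
	"os"
)

// orphanedBufferDirs returns subdirectories of bufferRoot that are not named
// after any currently configured output ID.
func orphanedBufferDirs(bufferRoot string, activeOutputIDs map[string]bool) ([]string, error) {
	entries, err := os.ReadDir(bufferRoot)
	if err != nil {
		return nil, err
	}
	var orphans []string
	for _, e := range entries {
		if e.IsDir() && !activeOutputIDs[e.Name()] {
			orphans = append(orphans, e.Name())
		}
	}
	return orphans, nil
}

func main() {
	// Example: output IDs still present in the rendered config (illustrative names).
	active := map[string]bool{"flow0:output-s3": true}
	orphans, err := orphanedBufferDirs("/buffers", active)
	if err != nil {
		fmt.Fprintln(os.Stderr, "scan failed:", err)
		return
	}
	fmt.Println("candidate orphaned buffer dirs:", orphans)
}
```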
-
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions!
-
We've been discussing a possible solution where we would create configuration snapshots, and drainer pods would be tied to a snapshot instead of a moving target. In my opinion this should be an opt-in mechanism that works in an immutable manner: configuration changes wouldn't automatically trigger a change in the existing deployment, but rather spin up a new cluster with the new config and redirect traffic there. Once the new cluster is up and running, the existing cluster can be scaled down with its original config, using the drainer as usual.
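As a rough, hypothetical sketch of the naming side of this idea: if the aggregator workload name were derived from a hash of the rendered config, a config change would naturally produce a new deployment while the old one keeps its name and can be drained and scaled down as usual. The names and hashing scheme below are assumptions, not the operator's actual behavior.

```go
// Sketch of hash-derived aggregator names for an opt-in "immutable" mode.
package main

import (
	"crypto/sha256"
	"fmt"
)

// aggregatorName derives a deterministic, config-specific workload name.
func aggregatorName(base, renderedConfig string) string {
	sum := sha256.Sum256([]byte(renderedConfig))
	return fmt.Sprintf("%s-%x", base, sum[:4]) // short hash suffix
}

func main() {
	oldName := aggregatorName("logging-fluentd", "old rendered config")
	newName := aggregatorName("logging-fluentd", "new rendered config")
	fmt.Println("keep draining and scale down:", oldName)
	fmt.Println("spin up and redirect traffic to:", newName)
}
```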
-
After thinking a little bit more about this I came up with these ideas:

The problem
Live configuration changes are useful in most situations but can be painful in certain environments where scaling down aggregator pods is a regular activity. In terms of configuration I primarily mean output configurations, where buffers are persisted on disk and are thus a moving target for the drainer pods if the config changes.

Proposed solution
I would like to see a solution where I can say that certain changes should trigger a completely different aggregator (fluentd/syslog-ng) deployment with the new config, while the existing deployment continues using the same config and eventually gets scaled down properly. I can think of two different approaches.

Using a webhook with statefulsets
Similar to how we create a hash based on the configuration, we could use that hash to create isolated configs. A webhook would understand the config hash and would mutate the pod to use a separate, hash-specific directory for its buffers. When there is a new configuration and the pods are redeployed by the statefulset controller, the webhook would watch pod deletions and would create a job to drain the buffers with the previous configuration. This would require the PV to be mounted as ReadWriteMany. If an ordering guarantee is needed, the mutating webhook should actually block deletion until the drainer completes; otherwise the two can happen simultaneously. I believe this would actually make the drainer job logic in the operator obsolete.

Using our own workload controller
We could implement our own workload controller specifically tuned for controlling log aggregation workloads. This would be a much heavier lift, but it would avoid the issues of mutating webhooks and would allow much greater flexibility and freedom.
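A minimal, hypothetical sketch of the webhook mutation described above: given the config hash, point the buffer volume mount at a hash-specific subPath so each config generation writes to its own directory. The volume and mount names here are assumptions for the example, not the operator's real manifest.

```go
// Illustrative sketch (not an actual admission webhook): rewrite the buffer
// volume mount of a pod so chunks land in a per-config-hash subPath.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// isolateBufferMount points the named buffer mount at a config-hash subPath
// so different config generations never share chunk directories.
func isolateBufferMount(pod *corev1.Pod, mountName, configHash string) {
	for ci := range pod.Spec.Containers {
		for mi := range pod.Spec.Containers[ci].VolumeMounts {
			m := &pod.Spec.Containers[ci].VolumeMounts[mi]
			if m.Name == mountName {
				m.SubPath = configHash
			}
		}
	}
}

func main() {
	pod := &corev1.Pod{
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name: "fluentd",
				VolumeMounts: []corev1.VolumeMount{{
					Name:      "buffer", // hypothetical volume name
					MountPath: "/buffers",
				}},
			}},
		},
	}
	isolateBufferMount(pod, "buffer", "cfg-4f9a12")
	fmt.Println(pod.Spec.Containers[0].VolumeMounts[0].SubPath)
}
```

With ReadWriteMany storage, a drainer job started for the previous hash could mount the same volume and only touch that hash's subPath, which is the separation the comment above relies on.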
-
We have a use case where, after removing outputs and flows in our Rancher projects, the associated Fluent Bit buffers still remain. We do not have administrative privileges for the logging operator and can only maintain the output and flow CRDs. Perhaps the mentioned feature would help us; currently there seems to be no option other than opening tickets or directly contacting the admin team to clear the buffers manually.
-
Describe the bug:
Leftover buffers with no associated config (the flow / output does not exist anymore) are never drained, so drain-watch does not kill fluentd and the buffer-volume-sidecar, and the drainer job finishes with an error because of the timeout.
Expected behaviour:
The drainer job skips old chunks with no associated config.
Better: the drainer job is executed on config update so that no orphaned chunks stay in the buffers.
Steps to reproduce the bug:
Workaround:
for each errored drainer pod (with its associated logging-operator-logging-fluentd-XX pod):
Environment details:
/kind bug