Improve journald support #411

tallclair · 2024-07-20T18:47:53Z

/kind feature

Problem

Running Kubelet as a systemd service logging to journald is not ideal today:

Jun 17 12:46:23 gke-test-default-r55g kubelet[118528]: I0617 12:46:23.668491  118528 kubelet.go:2447] "SyncLoop (probe)" probe="liveness" status="unhealthy" pod="default/test"
Jun 17 12:46:24 gke-test-default-r55g kubelet[118528]: E0617 12:46:24.369353  118528 kubelet.go:1875] "Unable to attach or mount volumes for pod; skipping pod" err="unmounted volumes=[data], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition" pod="default/test"

Logs redundant information with the journal log:
- date / time (0617 12:46:23.668491)
- PID (118528)
Severity level is not parsed by journald

The redundant information can be suppressed with the skip_headers flag, but that also suppresses the severity & code location.

Proposal

The simplest option would be to add a journald option that suppresses the redundant headers, and sets the severity prefix in the journald format (https://www.freedesktop.org/software/systemd/man/latest/sd-daemon.html).

Alternatively, klog could write to journald directly, e.g. something like https://git.frostfs.info/TrueCloudLab/zapjournald, but I don't know enough about the tradeoffs here. I suggest starting with the first option.

The text was updated successfully, but these errors were encountered:

pohly · 2024-07-21T09:30:02Z

With structured, contextual logging, Kubernetes follows the logr model which only has "info" and "error" severity. But I suppose that difference would be nice to represent.

Do you care about writing JSON output to the journald log?

tallclair · 2024-07-23T21:28:00Z

Do you care about writing JSON output to the journald log?

Good question. I think I'd still expect the redundant information to be suppressed in this case, but I think some follow-up investigation is needed to see how the logs are typically ingested.

dgrisonnet · 2024-07-25T16:50:50Z

/triage accepted

huww98 · 2024-08-15T12:53:45Z

Journald itself also stores structured logs (please see journalctl -o json), I think we can map logr fields into journald fields, logr severity to journald priority (to get colorful output with journalctl) if we write to journald socket directly. I can help with this, if you agree. I've merged journald support to ceph before.

But different from logr, journalctl does not show extra fields after message with its default output format. So maybe we should still format the message ourself. Maybe we can also replicate some common fields (or just all the fields) to journald fields for efficient filtering? e.g. how about enabling journalctl -u kubelet POD=default/test or journalctl CODE_FILE=kubelet.go

pohly · 2024-08-19T07:25:57Z

if we write to journald socket directly

This could be useful, in particular when it ensures that multi-line log entries are treated as one entry. I'm a bit worried that this might happen in text mode with multi-line text values. The text logger does a single os.Stderr.Write, but that does not necessarily mean that journald can read the entire entry with a single read. Does it assume "one line = one entry" or use SOCK_DGRAM?

Maybe we can also replicate some common fields (or just all the fields) to journald fields for efficient filtering?

On caveat is that we only have recommendations for common fields, but no schema and no enforcement of those recommendations other than code reviews. Having said that, some fields like "pod" are used fairly consistently, so exposing that in journald could be useful.

Either way, this changes externally visible behavior of kubelet and potentially other components. We also need to consider how to roll it out. In other words, do we need a KEP? @tallclair, what do you think? Is it worth asking in a SIG Node meeting as this is most useful for kubelet?

huww98 · 2024-08-19T07:33:05Z

The text logger does a single os.Stderr.Write, but that does not necessarily mean that journald can read the entire entry with a single read. Does it assume "one line = one entry" or use SOCK_DGRAM?

journald has its own protocol. we can write the total length of the message before the message text, so that multi-line log can be recognized as one entry.

pohly · 2024-08-19T12:45:02Z

Suppose we don't use that protocol. How does systemd capture the stderr output from a process, i.e. what are the message boundaries in that case?

huww98 · 2024-08-19T12:47:48Z

I think it will just use \n as the separator.

pohly · 2024-08-20T09:22:01Z

Then we shouldn't remove the klog header while using stderr because the message boundaries could get lost.

tallclair · 2024-09-19T21:41:14Z

Yes, I can follow up with with sig-node.

k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 20, 2024

k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve journald support #411

Improve journald support #411

tallclair commented Jul 20, 2024

pohly commented Jul 21, 2024

tallclair commented Jul 23, 2024

dgrisonnet commented Jul 25, 2024

huww98 commented Aug 15, 2024

pohly commented Aug 19, 2024

huww98 commented Aug 19, 2024

pohly commented Aug 19, 2024

huww98 commented Aug 19, 2024

pohly commented Aug 20, 2024

tallclair commented Sep 19, 2024

Improve journald support #411

Improve journald support #411

Comments

tallclair commented Jul 20, 2024

Problem

Proposal

pohly commented Jul 21, 2024

tallclair commented Jul 23, 2024

dgrisonnet commented Jul 25, 2024

huww98 commented Aug 15, 2024

pohly commented Aug 19, 2024

huww98 commented Aug 19, 2024

pohly commented Aug 19, 2024

huww98 commented Aug 19, 2024

pohly commented Aug 20, 2024

tallclair commented Sep 19, 2024