Replies: 3 comments
-
We are now at 8000Mi memory request/limit for syslog-ng and it is still failing. Couple more questions/observations:
-
At 12Gi it was still failing, so we decided, more or less at random, to disable the disk buffer. After 4 hours there has been no OOM or any other crash so far.
-
In the logging operator, logiwsize is calculated from maxconnections (maxconnections * 100), which is itself a calculated value. This means that with 50 nodes altogether the logiwsize is set to 100 000, and based on those docs you can use that many batch-lines. However, I would expect you to hit a limit on Loki's side with that, and it could also lead to higher memory consumption and much bigger latency (no more, of course, than what batch-timeout allows).

Regarding sizing the PVC: I think it is always a good idea to leave a little more room for the disk buffers than strictly required.

Regarding load balancing: it depends on the fluentbit networking settings and on the Kubernetes service load balancing implementation. You can tune the TCP keepalive settings (keepalive max recycle, more specifically) if you need a better distribution of connections from fluentbit to syslog-ng.

It's not a bad idea to use syslog-ng without disk buffers as long as data durability is not critical. Under normal circumstances, syslog-ng will try its best to flush all data to the destination before it shuts down.

Can you give us a more specific output configuration so that we can better understand what could have gone wrong there? The number of nodes and the rate of messages would also be useful to know.

Feel free to ping us on Discord as well: https://discord.gg/6FnMxKJC
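To make the arithmetic in the reply above concrete, here is a minimal Python sketch of the stated relationship (logiwsize = maxconnections * 100). The function names are hypothetical and exist only for illustration; they are not part of the logging operator or syslog-ng. The per-node figure at the end is obtained purely by dividing the numbers quoted in the thread and should be treated as an assumption, not as documented behavior.

```python
# Multiplier taken from the reply above: logiwsize = maxconnections * 100.
IW_SIZE_PER_CONNECTION = 100


def log_iw_size(max_connections: int) -> int:
    """Return the logiwsize implied by a given maxconnections value."""
    if max_connections < 0:
        raise ValueError("max_connections must be non-negative")
    return max_connections * IW_SIZE_PER_CONNECTION


def implied_max_connections(iw_size: int) -> int:
    """Invert the formula: which maxconnections yields this logiwsize?"""
    return iw_size // IW_SIZE_PER_CONNECTION


if __name__ == "__main__":
    # 50-node example from the reply: logiwsize ends up at 100 000.
    max_conn = implied_max_connections(100_000)
    print(max_conn)               # 1000
    print(log_iw_size(max_conn))  # 100000
    # ASSUMPTION: inferred by division only, not stated in the thread.
    print(max_conn // 50)         # 20 connections per node
```

Read this way, the 100 000 figure is an upper bound on the input window, not a recommended batch-lines value, which is consistent with the caution about Loki-side limits and memory use.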
-
I understand that asking for resource recommendations is highly subjective and a simple one-size-fits-all answer is not possible. Yet here we are and this is what happened in my world recently:
E0000 00:00:1722961821.975117 32 wire_format_lite.cc:626] String field 'logproto.EntryAdapter.line' contains invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
I'm not sure if the last item is related or whether I just never saw it before, but I would really like some community feedback on what works for YOU. We are running 2 replicas of syslog-ng, each now with a 20Gi buffer and 4Gi of memory. Are there any parameters I can use to tune this system besides these compute resources? Would decreasing the batch size reduce memory consumption? Would it help to increase the number of workers? Are buffers a bad idea?
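For readers unfamiliar with the protobuf error quoted above: it says that a log line carried bytes that are not valid UTF-8. The following Python sketch only illustrates what that condition means and one generic way to neutralize it. It is not how syslog-ng or Loki handle the problem, and the helper names are hypothetical.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def sanitize_line(raw: bytes) -> str:
    """Decode a log line, replacing invalid bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    good = "plain log line".encode("utf-8")
    bad = b"payload \xff with a stray byte"
    print(is_valid_utf8(good))  # True
    print(is_valid_utf8(bad))   # False
    print(sanitize_line(bad))   # stray byte shown as the replacement character
```

Whether this error is connected to the memory growth is not established anywhere in the thread.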