[sink OTLP] cannot ingest log batches #22188
Comments
Do you see any errors on the OTLP/Loki side? The message rate is quite low, so I doubt this is due to backpressure.
In the picture above, are you missing "Hello world - 29" and so on?
Exactly.
Also, can you add a console sink and verify that all logs are received and printed?
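For reference, a minimal sketch of such a debug console sink, assuming it is fed by the same input as the OTLP sink (the `manage_labels` input name is borrowed from the config later in this thread and is only illustrative):

```yaml
sinks:
  debug_console:
    type: console
    inputs:
      - manage_labels   # replace with whatever component feeds your OTLP sink
    encoding:
      codec: json       # print each event as one JSON object on stdout
```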
It's what I used to get the source format and also the sink format, after the
Currently experiencing the same 🤔
Might be a coincidence, but none of my log messages containing emojis are going through. Thought it would be worth mentioning. I've confirmed the endpoint accepts them. I'm also losing way more than 10%; more than half, I'd say.
I found out what the problem was for me. The protocol-level batching breaks ingestion into Loki: only the first message was registered, and the rest magically disappeared. See more in #22232. As I mentioned there, I reckon this is because of how the requests are merged together. Perhaps it would require a custom sink?
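To illustrate that suspicion: per the OTLP/HTTP spec, the endpoint expects a single JSON document per POST, with every log record nested inside one resourceLogs array. A rough sketch of the shapes involved (shown as YAML for readability; field names follow the OTLP spec, the values are invented):

```yaml
# Roughly what a single OTLP/HTTP JSON request body should look like:
resourceLogs:
  - resource:
      attributes: []
    scopeLogs:
      - logRecords:
          - timeUnixNano: "1738000000000000000"
            body:
              stringValue: "Hello world - 28"
          - timeUnixNano: "1738000001000000000"
            body:
              stringValue: "Hello world - 29"
# With newline-delimited framing, Vector instead writes several such top-level
# documents into one request body, separated by newlines. A receiver that parses
# a single JSON object per request would only see the first one, which matches
# the "only the first message was registered" behaviour described above.
```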
Thanks @FredrikAugust. I also modified my opentelemetry sink:

```yaml
sinks:
  otelhttp:
    type: opentelemetry
    inputs:
      - manage_labels
    protocol:
      type: http
      uri: http://loki:3100/otlp/v1/logs
      method: post
      batch:
        max_events: 1
      encoding:
        codec: json
        framing:
          method: newline_delimited
      request:
        headers:
          content-type: application/json
```

and now I receive all the messages in Loki. But this workaround should have an impact on Vector's performance if it sends messages one by one, shouldn't it?
Yeah @blezoray, I would assume this bumps up the resource usage quite a bit, since it has to send each log line as an individual HTTP request. Might it be that framing just merges the log messages into several lines in the HTTP request? As mentioned earlier, it would need to join them together in the JSON structure. I might be able to take a look at this tomorrow and get a PR up, if that's possible?
Looks like this issue and #22232 are the same, but keeping them both open for now until we are certain. We definitely want to support batching. There are several framing options here to try. It does sound like the first log is decoded correctly and the rest are dropped, but I would be surprised if this happens silently; do you see any errors on the receiver side?
In this previous post, you have the Loki logs at debug level.
@pront I'm running this on Grafana Cloud, so the debugging capabilities are limited, but I can't see anything indicating problems. Also, regarding the framing: looking at the expected format in the OTEL spec, I don't think any of them would work. A closer solution, which I was planning to try out if this remains an issue, is to use the reduce transform.
Good thought, https://vector.dev/docs/reference/configuration/transforms/reduce/ should be able to aggregate multiple logs into a single encoded payload in an acceptable format. We eventually want to introduce some sort of native grouping in the OTEL sink, but resources are limited this month. OTEL interoperability is high on the priority list, and the community's participation in threads like this is very much appreciated.
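For anyone who wants to experiment with that route, a rough sketch of the idea, assuming a reduce transform placed between the existing input and the OTLP sink; the transform name, timing, and merge strategy here are hypothetical, and a remap step would still be needed afterwards to reshape the merged event into a valid OTLP payload:

```yaml
transforms:
  merge_logs:
    type: reduce
    inputs:
      - manage_labels
    expire_after_ms: 1000     # flush a merged event roughly once per second
    merge_strategies:
      message: array          # collect the individual messages into an array
```

Whether the merged event can then be encoded into something the /otlp/v1/logs endpoint accepts is exactly the open question in this thread.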
Thanks for the response @pront! Glad to hear OTEL interop is prioritized, as it's a very nice feature.
A note for the community
Problem
Hello,
I'm using an OpenShift cluster 4.15 with the logging stack configured with Vector to collect logs on each node and forward them to a namespace Vector through an HTTP source/sink.
My namespace Vector sends logs to a local Grafana Loki using the Loki sink, and it works fine.
But when I try to use the OTLP sink targeting the Loki OTLP API, I regularly lose some messages, about 10%.
I have an application that generates 1 msg/s to a Kafka topic, and each message generates 1 log.
Vector and Loki don't report any errors.
Is there something in my configuration that could explain this behavior?
Regards.
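(For readers of this thread: a minimal sketch of the kind of sink described above, assuming Vector's defaults, i.e. batching left enabled. The input name and endpoint are borrowed from the workaround config in the comments above and are illustrative, not the reporter's actual configuration.)

```yaml
# Illustrative only: an opentelemetry sink pointing at Loki's OTLP endpoint with
# default batching, i.e. the setup that exhibits the dropped-message behaviour.
sinks:
  otelhttp:
    type: opentelemetry
    inputs:
      - manage_labels          # hypothetical upstream component
    protocol:
      type: http
      uri: http://loki:3100/otlp/v1/logs
      method: post
      encoding:
        codec: json
      request:
        headers:
          content-type: application/json
```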
Configuration
Version
0.43.1-distroless-libc
Debug Output
Example Data
Sink
Additional Context
No response
References
No response