Logs do not pass through probabilistic sampling processor #36119
Comments
Logs pass through fine when the processor is not in the pipeline. The Docker networking is very unlikely to be the issue, and Servbay does not cover our needs, but thank you for the ideas.
We have a test that checks this very scenario. I am a bit at a loss to help here. Someone needs to spend some time reproducing.
Here is what is happening: none of the logs you send have a traceID, so they get discarded. It works a bit differently if you try this:
And you run with this config:
Then it works as advertised.
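To make the default behavior described above concrete, here is a minimal config sketch. It is not a verbatim config from this thread; it just annotates the default `attribute_source` behavior that causes the discard:

```yaml
processors:
  probabilistic_sampler:
    sampling_percentage: 100
    # attribute_source defaults to "traceID", so log records whose
    # trace ID is empty are dropped before any sampling decision is
    # made -- even at sampling_percentage: 100.
```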
@atoulme, thank you.
This makes sense, and I was also trying to use another attribute. I'm sure I was doing something wrong there and I'll figure it out, but I was confused because I couldn't confirm that the processor was working at all. I expected logs to come through when set at 100%, but I suppose that could be my misunderstanding of how an empty traceID is handled. I assumed it was still handled by the processor, hashing an empty string or something like that. It sounds like it is actually discarded at the very beginning.
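The hash-then-bucket idea being discussed can be illustrated with a small sketch. This is the general shape of probabilistic sampling, not the collector's actual implementation (which uses FNV with a configurable `hash_seed` and, as explained above, discards records with an empty trace ID before hashing):

```python
import hashlib

def should_sample(value: str, sampling_percentage: float) -> bool:
    """Map the sampling attribute to a bucket in [0, 100) and keep the
    record if the bucket falls below the configured percentage."""
    digest = hashlib.sha256(value.encode()).digest()
    bucket = (int.from_bytes(digest[:8], "big") % 10000) / 100.0
    return bucket < sampling_percentage

# An empty string still hashes to some bucket, so at 100% it would be
# kept -- which is why dropping empty trace IDs before hashing is the
# surprising part.
print(should_sample("", 100.0))  # True: every bucket is below 100
```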
I think there's a current UX problem with the processor. With a similar configuration, I see the following metric in the Collector's internal metrics:
You can verify this is the case by using a hash of the body as the `from_attribute`:

```yaml
processors:
  probabilistic_sampler:
    hash_seed: 22
    sampling_percentage: 10
    attribute_source: record
    from_attribute: body.hash
  transform:
    log_statements:
      - context: log
        statements:
          - set(attributes["body.hash"], FNV(body))
```

Note, however, that this particular example is bad practice: the idea of the probabilistic sampler is to keep samples of every event related to a specific business transaction, so that you get a good representation of what your systems are doing in production. The example above will semi-randomly discard logs based on the whole record.
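For intuition about what a call like `FNV(body)` produces, here is a sketch of the classic 64-bit FNV-1a algorithm in Python. The exact FNV variant and bit width used by OTTL's `FNV` converter is an assumption here; treat this as illustrative only:

```python
# Standard 64-bit FNV-1a constants.
FNV64_OFFSET = 0xcbf29ce484222325
FNV64_PRIME = 0x100000001b3

def fnv1a_64(data: bytes) -> int:
    """Classic 64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV64_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF  # truncate to 64 bits
    return h
```

Because the hash covers the entire record body, two log lines that differ in any byte land in different sampling buckets, which is why the comment above calls this bad practice for transaction-oriented sampling.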
Component(s)
processor/probabilisticsampler
What happened?
Description
No matter how much I simplify the configuration, I cannot get any logs to pass through the probabilistic sampler, even when `sampling_percentage` is set to 100. Even if I'm making a mistake configuring the attributes, I would expect a percentage of 100 to pass every log.
Steps to Reproduce
Run the otel docker image (I tried with `latest`, `0.103.1`, and `0.102.1`) using this script with the simplified config shown below, and send logs or traces using `telemetrygen`, e.g.:

Our real config is obviously more complex, but I kept cutting it down until it was a minimal config in order to test this. I tried setting different values for `from_attribute` with no change in behavior.
Expected Result
Some logs should be displayed by the `debug` exporter, depending on the `sampling_percentage` and `attribute_source`/`from_attribute`. When `sampling_percentage` is set to 100, I expect all logs to pass through, even if the sampling attribute is constant across all logs.
Actual Result
Sampling works as expected when sending traces with `telemetrygen`, and logs are displayed when the sampler is not in the pipeline. When the sampler is in the logs pipeline, zero logs are displayed, even when the percentage is set to 100.
Collector version
0.103.1, latest (0.112.0 I think), and 0.102.1
Environment information
Environment
OS: macOS, running Docker images via Rancher.
OpenTelemetry Collector configuration
Log output
Additional context
No response