You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We have a distributed system where messages get sent into some cloud processor.
Before sending the message I add an attribute that persists back to my handler ,that is now running 3 to 4 hops away.
The attribute I add is
if (Activity.Current is not null) {
message.AddAttribute(nameof(Activity.Current.ParentId), Activity.Current.Id.ToString()));
}
The sampler is set the same for all services I control. In dev its 1 and as I roll it out to prod I decrease it to 0.25 eventually.
I tested this locally and some of our QA environments but once the sample got to pre prod testing and set at 0.5 I started to notice broken traces.
For example, a method that polls a long running method else where nicely showed me my API call, then the spans of the other services nicely showed up under the API call. That does not happen in pre production testing.. I just see the API call, because the distributed system did NOT emit traces (as I cant find them when manually looking) so its not stitching them up!
I thought the W3C trace ID contains a flag that is set to tell the other systems to sample the trace.. but I don't know.. maybe I am not starting the activities properly.
The remote service is a dotnet worker so that just runs 24/7 and waits for some messages to process. While its doing that its doing activities that are unrelated to messages. So there are activities before a message arrives.
Once the message arrives, at the first possible point in time I use this code
var parentId = string.Empty;
msgRequest.Attributes.TryGetValue(nameof(Activity.Current.ParentId), out parentId);
using (var activity = _activitySource.StartActivity($"HandleMessage", ActivityKind.Internal, parentId))
{
I assumed that THIS activity will now be sampled, even though maybe the workers parent activities are unsampled?
Or have I completely mis understood or written crap code?
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
We have a distributed system where messages get sent into some cloud processor.
Before sending the message I add an attribute that persists back to my handler ,that is now running 3 to 4 hops away.
The attribute I add is
The sampler is set the same for all services I control. In dev its 1 and as I roll it out to prod I decrease it to 0.25 eventually.
I tested this locally and some of our QA environments but once the sample got to pre prod testing and set at 0.5 I started to notice broken traces.
For example, a method that polls a long running method else where nicely showed me my API call, then the spans of the other services nicely showed up under the API call. That does not happen in pre production testing.. I just see the API call, because the distributed system did NOT emit traces (as I cant find them when manually looking) so its not stitching them up!
I thought the W3C trace ID contains a flag that is set to tell the other systems to sample the trace.. but I don't know.. maybe I am not starting the activities properly.
The remote service is a dotnet worker so that just runs 24/7 and waits for some messages to process. While its doing that its doing activities that are unrelated to messages. So there are activities before a message arrives.
Once the message arrives, at the first possible point in time I use this code
I assumed that THIS activity will now be sampled, even though maybe the workers parent activities are unsampled?
Or have I completely mis understood or written crap code?
Beta Was this translation helpful? Give feedback.
All reactions