Partial "Orphaned" Stacktraces Emitted By Agent #178
A likely cause is the limit on stack depth; see if you can avoid this issue by increasing it to 1024 or something.
I may try that. However, I would guess (with fairly high confidence) from the nature of what I'm profiling that these particular stacks are exact duplicates of some other stacks in the log, which would then be exactly as deep. I have to verify this (maybe tomorrow), but if I'm right, it's more likely to be a timer issue.
By the way, wouldn't it make more sense if I tested the converse? Set maxFrames to a low number (e.g. 5) and see what happens? If I end up with the bottom 5 lines of all stacks, that would be the best proof that you're right and that in a couple of instances the depth probably got exceeded. Or am I missing something?
I tried setting maxFrames to 5. Now I indeed get a lot of "stack bottoms" - that is, every "stack" is 5 deep. And they all seem to be "real bottoms": the bottom frame is practically always a "final" frame which doesn't have anything left to call anyway. This indicates that the behaviour of the agent really is to limit the number of frames dumped per stack to maxFrames. If in my original example I was running up against the default value of maxFrames (I don't know what it is), the expected result of such behaviour would be that the "orphaned" partial stacks in those logs have depth == maxFrames. That is not the case: their depth is a lot less, typically between 2 and 6 frames in the example I investigated. If the default maxFrames were something like 50, these orphans would have contained 50 frames. And @nitsanw - thanks for the quick reaction and the interesting suggestion!
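A minimal sketch of this kind of check, assuming the log has already been decoded into one list of method names per dumped trace (the input shape and the set of known root frames here are illustrative assumptions, not honest-profiler API). If maxFrames truncation were the cause, every bucket in the resulting histogram would sit at exactly maxFrames:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative check only: counts how many "orphaned" traces exist per depth.
// A trace is treated as orphaned when its bottom frame is not a known thread
// root frame (e.g. java.lang.Thread.run).
final class OrphanDepthCheck {

    static Map<Integer, Long> orphanDepthHistogram(List<List<String>> traces,
                                                   Set<String> knownRootFrames) {
        Map<Integer, Long> histogram = new TreeMap<>();
        for (List<String> frames : traces) {
            if (frames.isEmpty()) {
                continue;
            }
            // The bottom frame is the last one dumped for a trace.
            String bottom = frames.get(frames.size() - 1);
            if (!knownRootFrames.contains(bottom)) {
                // Orphaned (partial) trace: record its depth.
                histogram.merge(frames.size(), 1L, Long::sum);
            }
        }
        return histogram;
    }
}
```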
@PhRX good reasoning, and indeed from your evidence I would say you are correct to assume max depth is not the issue.
At the bottom you'll find an extract from a real HP log, or at least the output I got by modifying the LogCollector to print each THREADSTART followed by the thread id, and each FRAME followed by the method name and id. (As an aside, the log is a fixed file run through sequentially by the LogCollector, so there are no concurrency issues in play here.)
I did this because I systematically saw some unexplained "top-level" methods in the tree view just under the thread level (i.e. siblings of, for example, the java.lang.Thread.run method), which clearly didn't belong there.
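A rough sketch of that kind of debug printout, using a hypothetical event model that only mirrors the THREADSTART/FRAME lines shown in the extract below (these are not the actual LogCollector types):

```java
// Hypothetical event model; illustrative only, not honest-profiler API.
sealed interface LogEvent permits ThreadStart, Frame {}
record ThreadStart(long threadId) implements LogEvent {}
record Frame(String methodName, long methodId) implements LogEvent {}

final class DebugDump {
    // Prints each event in the order it appears in the log file, producing
    // output in the same shape as the extract at the end of this report.
    static void dumpEvents(Iterable<LogEvent> events) {
        for (LogEvent event : events) {
            if (event instanceof ThreadStart start) {
                System.out.println("THREADSTART " + start.threadId());
            } else if (event instanceof Frame frame) {
                System.out.println("FRAME " + frame.methodName() + " " + frame.methodId());
            }
        }
    }
}
```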
The 2 lines in the middle, between 2 THREADSTART events, look like the bottom of a stack (in other cases I saw up to 5-6 lines like this, but always in fairly "deep" stacks). Inspection of 2 different cases (comparing what's in the log with the original source code being profiled) shows that these lines are "orphaned": they are not the "bottom" of the trace following the next THREADSTART, which rules out the possibility that spurious THREADSTARTs are somehow emitted in the middle of a stack being dumped.
My theory is that the agent starts emitting a stacktrace and gets interrupted by the timer asking for the next stacktrace, so the rest of the current stacktrace gets devnulled. I didn't look at the C++ code, however; this is just a guess.
I can probably work around this in the LogCollector with a heuristic (peek at the last frame before a new THREADSTART; if it isn't one of the very few top-level frames for the thread it belongs to, throw the stack away), but IMHO it should get fixed properly in the agent.
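A sketch of what that heuristic could look like, reusing the hypothetical ThreadStart/Frame types from the sketch above; the per-thread sets of acceptable top-level frames would have to be supplied by hand, and none of this is existing LogCollector code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Buffer the frames of the current trace; when the next THREADSTART (or the
// end of the log) arrives, keep the trace only if its bottom frame is a known
// top-level frame for the thread it belongs to. Otherwise it is treated as an
// orphaned partial trace and discarded.
final class OrphanFilter {

    static List<List<Frame>> keepCompleteTraces(Iterable<LogEvent> events,
                                                Map<Long, Set<String>> rootFramesByThread) {
        List<List<Frame>> complete = new ArrayList<>();
        List<Frame> current = new ArrayList<>();
        long currentThread = -1;
        for (LogEvent event : events) {
            if (event instanceof ThreadStart start) {
                flush(currentThread, current, rootFramesByThread, complete);
                current = new ArrayList<>();
                currentThread = start.threadId();
            } else if (event instanceof Frame frame) {
                current.add(frame);
            }
        }
        flush(currentThread, current, rootFramesByThread, complete);
        return complete;
    }

    private static void flush(long threadId, List<Frame> trace,
                              Map<Long, Set<String>> rootFramesByThread,
                              List<List<Frame>> complete) {
        if (trace.isEmpty()) {
            return;
        }
        // Frames are dumped top-down, so the last frame is the stack bottom.
        Frame bottom = trace.get(trace.size() - 1);
        Set<String> roots = rootFramesByThread.getOrDefault(threadId, Set.of());
        if (roots.contains(bottom.methodName())) {
            complete.add(trace);
        }
    }
}
```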
[Start of dump extract, anonymized a tad ;-) ]
[...]
FRAME com.[...].doExecute 139658604373232
FRAME com.[...].execute 139658589859688
FRAME com.[...].execute 139658590061520
FRAME com.[...].run 139658590030528
FRAME java.lang.Thread.run 139658581930536
THREADSTART 31885
FRAME com.[...].getNameBaseInternalPrivate 139656096479528
FRAME com.[...].getNextSiblingRefPrivate 139656096658576
THREADSTART 31885
FRAME com.[...].getContentPrivate 139656099318112
FRAME com.[...].getContent 139656099318384
FRAME com.[...].getNodeValue 139656099314848
FRAME com.[...].serializeTextNode 139656099112320
FRAME com.[...].serializeNode 139656099112520
[...]