Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log HTTP/RPC message processing stats #10141

Merged
merged 2 commits into from
Nov 27, 2024
Merged

Conversation

yhabteab
Copy link
Member

@yhabteab yhabteab commented Aug 29, 2024

This PR provides an alternative implementation of #10083, but without having to introduce a new log class, and with much more contextual log details.

  • 3749756 - In addition to the existing HTTP request status logs, this commit adds the elapsed time of waiting for a CPU semaphore.
  • ec79ae6 - This commit basically does the same thing as the commit above, but for JSON-RPC connections.

closes #10083

@cla-bot cla-bot bot added the cla/signed label Aug 29, 2024
@yhabteab yhabteab requested a review from julianbrost August 29, 2024 12:50
@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from b1c480b to f166915 Compare August 29, 2024 13:18
lib/base/logger.hpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/httpserverconnection.cpp Outdated Show resolved Hide resolved
@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from f166915 to 8e7a3d1 Compare August 29, 2024 16:01
@yhabteab yhabteab requested a review from julianbrost August 29, 2024 16:03
@yhabteab yhabteab added this to the 2.15.0 milestone Aug 29, 2024
@yhabteab yhabteab added the consider backporting Should be considered for inclusion in a bugfix release label Aug 29, 2024
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/httpserverconnection.cpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from 8e7a3d1 to c8832b5 Compare August 30, 2024 12:45
@yhabteab yhabteab requested a review from julianbrost August 30, 2024 12:47
@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from c8832b5 to c63f842 Compare September 2, 2024 07:25
@yhabteab yhabteab requested review from julianbrost and removed request for julianbrost September 2, 2024 07:26
@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from c63f842 to ec79ae6 Compare September 4, 2024 09:46
@yhabteab
Copy link
Member Author

yhabteab commented Sep 4, 2024

Didn't change anything, just rebased!

@julianbrost
Copy link
Contributor

For completeness, as there is no actual cross-reference between these two PRs yet: #10140 is a soft dependency of this PR. The purpose of this PR is to provide logging for all places where CpuBoundWork could introduce unexpected delays but doesn't add logging for all instances currently in the code where the remaining ones are to be removed in #10140.

@yhabteab yhabteab requested a review from Al2Klimov September 18, 2024 12:53
lib/remote/jsonrpcconnection.cpp Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
@julianbrost
Copy link
Contributor

Are you planning (@yhabteab) / expecting (@Al2Klimov) further changes to this PR regarding the unresolved conversations? I saw that there was a bit of interaction in #10167 but does that mean you want to use it here?

@yhabteab
Copy link
Member Author

I saw that there was a bit of interaction in #10167 but does that mean you want to use it here?

Honestly, I have concerns about this approach; it appears rather random. Once #9990 is merged, it could become even more chaotic. If you, @julianbrost, believe we should use #10167, we can proceed with it, but I think it's unnecessary since we don't require that level of precision to determine whether it has blocked for a long time.

Are you planning (@yhabteab)

I've just rephrased the log message with the diagnostic error messages.

@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from ec79ae6 to 444323d Compare September 27, 2024 07:41
@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from 444323d to 2bd789e Compare September 27, 2024 09:13
@yhabteab yhabteab requested a review from Al2Klimov September 27, 2024 09:13
Al2Klimov
Al2Klimov previously approved these changes Sep 30, 2024
Copy link
Member

@Al2Klimov Al2Klimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@julianbrost To answer your question:

If you merge while Yonas is still absent, I'd just rebase my PR(s).

@julianbrost
Copy link
Contributor

There were a few things I didn't like regarding the changes in JsonRpcConnection::HandleIncomingMessages() which made me play around with the code a bit.

  • Control flow: in the error case (catch (const std::exception& ex)) the overall control flow is a bit convoluted in that it leaves the exception handler, does common logging code, and then resumes the error handling based on the value of a string variable. That probably works fine1, but I'd say it's easier to follow if all the error happens in the exception handler. Therefor, I've tried to keep the control flow as is with the logging now happening inside the exception handler.
  • Log messages: while avoiding duplication of the logging code sounds like a good idea, this seems to have resulted in a rewording of the existing "Error while processing [...]" message. With the new wording of that message, you have to read farther into the message until the reason why this is now logged as a warning becomes clear. I've tried to improve this by having separate and more clear wording in the beginning of the log message stating the reason for this message.

The current state of my playing around can be seen in commit 5de11a4. The main change is that there are now three distinct places where log messages can be generated. While this duplicates some code2, it doesn't even make the code longer overall. Apart from that, this commit also contains a few other minor tweak. Please have a look at this and let me know what you think.

Footnotes

  1. However, it's based on the assumption that DiagnosticInformation(ex) must never return an empty string. So if that assumption isn't needed, avoiding it doesn't sound too bad to me.

  2. I actually have an intermediate version where I tried to keep the common logging code with case distinctions for error/slow/other logging and place it in a Defer so that the break; can be kept inside catch. But that turned out to be quite messy.

@yhabteab
Copy link
Member Author

  • With the new wording of that message, you have to read farther into the message until the reason why this is now logged as a warning becomes clear.

How does it make any difference when the Error keyword appears in the log entry? you have to read further into the message ... sounds to me as if you would otherwise not read the rest of the log messages.

  • I've tried to improve this by having separate and more clear wording in the beginning of the log message stating the reason for this message.

And why exactly do we need this separation? Especially, why do you need to produce a different log entry based on the used log severity?

@julianbrost
Copy link
Contributor

Example of a log message from the existing code:

warning/JsonRpcConnection: Error while processing JSON-RPC message for identity 'some-endpoint': Something, something, problem

Example of a log message that your PR would generate:

warning/JsonRpcConnection: Processing JSON-RPC 'event::CheckResult' message for identity 'some-endpoint' took total 42ms: Error: Something, something, problem

Did you change the log message because you think it's an improvement to the log message or because it allowed you to combine the different log messages into basically the same with just minor differences for the different situations?

  • With the new wording of that message, you have to read farther into the message until the reason why this is now logged as a warning becomes clear.

How does it make any difference when the Error keyword appears in the log entry? you have to read further into the message ... sounds to me as if you would otherwise not read the rest of the log messages.

I think it's just nicer if the relevant part gets mentioned first. And in case of an error happening, that's that, not how long the processing took.

  • I've tried to improve this by having separate and more clear wording in the beginning of the log message stating the reason for this message.

And why exactly do we need this separation? Especially, why do you need to produce a different log entry based on the used log severity?

They are logged in the same context, but the situation that's warned about is different, so different messages for "an error happened" and "this was surprisingly slow" don't sound like a bad idea to me. We don't need this but we might want better log messages and I tried to make a suggestion in that direction.

@yhabteab
Copy link
Member Author

They are logged in the same context, but the situation that's warned about is different, so different messages for "an error happened" and "this was surprisingly slow" don't sound like a bad idea to me.

You are going to see either one of those at a time, the debug or the warning message, but never both entries at the same time, so separating them makes little sense to me. Apart from that, it shouldn't make any difference in the end result whether you prefix the message with Slow or not.

@yhabteab
Copy link
Member Author

Did you change the log message because you think it's an improvement to the log message or because it allowed you to combine the different log messages into basically the same with just minor differences for the different situations?

No, not as an improvement, but I just wanted to perform one such thing rather than x times that are actually not different from one another.

lib/remote/jsonrpcconnection.hpp Outdated Show resolved Hide resolved
@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from fa1bff2 to ef2277f Compare November 15, 2024 15:57
lib/remote/jsonrpcconnection.cpp Show resolved Hide resolved
@yhabteab yhabteab force-pushed the log-slow-http-rpc-processing branch from ef2277f to 4564c06 Compare November 27, 2024 08:58
@julianbrost julianbrost merged commit 57df92f into master Nov 27, 2024
24 checks passed
@julianbrost julianbrost deleted the log-slow-http-rpc-processing branch November 27, 2024 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla/signed consider backporting Should be considered for inclusion in a bugfix release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants