-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
migrate to zap logging #135
migrate to zap logging #135
Conversation
The function was writing HTTP responses and logging, both of which are better done in the actual HTTP handler methods. When these helper methods know about HTTP and can (but don't always) write responses, it makes the handlers awkward to write, because it's never clear whether they should just return afterwards, or if they can do any error recovery or better logging, etc. Without the logging and sometimes writing of HTTP responses, the helper wasn't doing anything, and so it's easily removed. Also, the error handling of FindConfirmation is made consistent as an internal server error. BACK-2500
The function was writing HTTP responses and logging, both of which are better done in the actual HTTP handler methods. When these helper methods know about HTTP and can (but don't always) write responses, it makes the handlers awkward to write, because it's never clear whether they should just return afterwards, or if they can do any error recovery or better logging, etc. As one example, findResetConfirmation would write 404 responses if the confirmation wasn't found. The handler ignored that one particular case, and it wasn't obvious why it was ignoring it. Even as the handler ignored it, it still had to have a nil check/guard statement, which indented the code awkwardly and left the reader wondering just what was returned the user if conf was nil. BACK-2500
During code review, it was requested that the zap logger be used. I might have gone a bit overboard, but I've moved everything over to zap. I think it's still TBD if zap is what we really want, but at least now we can see it in action, and better evaluate if it brings advantages over other options. Zap produces a stack trace at the Error log level, so I used that as a guide. In general, any time a function returned an error, and that error caused the HTTP handler to abort its processing with a 5XX HTTP status, I used the Error level for the log, so we'll log the stack trace that led up to the error. In the event that an error was returned and the handler returned a 4XX HTTP status, I used the Info log level. My thinking was if the handler was telling the client that the error was on their side, then whatever the Go error was, we didn't need a stack trace. This happens a lot for example when a request body contains invalid JSON. The Unmarshal call will fail with an error and the handler returns a HTTP 400 or similar. A stack trace in that event wouldn't be useful. In general, I didn't use the Warn level much, as it behaves differently in prod versus non-prod. In the former, it doesn't include a stack trace, while in the latter it does. I used it during an optional permissions check. In development, we'll get better debug info, but in production we'll skip the stack trace, as the error isn't going to cause an abort of the handler method. I also refactored many API responses to use the existing sendError and other related methods. This should help maintain consistency in the future, and should we decide to remove zap for example, we'll have fewer places to modify, as the move to using sendError and friends has greatly reduced the number of log calls that were scattered around the code base. (Last I checked, the number of logging sites was reduced from 153 to 49, according to rough grep checks.) Another request from code review, was that logging should include the userId where one existed. This was accomplished using a middleware function, such that the userId is added to the logger's context and will automatically appear in the log output. There's no need to manually specify it at the logging call site. The zap dependency version was updated to 1.22 to get access to the WithOptions method of the Sugared logger, which makes some of our logging cleaner. BACK-2500
That's the InvocationParams found in from go.uber.org/fx BACK-2500
When migrating to zap logging, the Api.logger was modified by middleware that was adding the request's userID to the logger's context in a way that wasn't thread-safe. The result was that many many non-sensical userIDs were logged in the context, producing log output that wasn't at all useful. The best way to fix this was to properly generate a child logger for each request with the appropriate userID in its logging context, then pass that logger into the handlers via their request's context. As a result, the fix touches a lot of code—anywhere that logs, now needs to retrieve the request-specific logger from that request's context. Tests were written that attempt to exacerbate the problem, but as its a caused by race conditions, it's very hard to reliably reproduce. Flappiness in the test is a red flag though, and comments were added to indicate that. BACK-2500
From a previous code review it was pointed out that some amount of context was lost from some log messages. I have reviewed all the logging changes, and found these places, and restored them. I may have missed a few, but I made a best-effort attempt. BACK-2500
TL;DR If you see a logging site that has lost context, please point it out specifically! Included in this new PR is another round of review for every logging site, looking at the potential loss of any context. Sometimes it may look like a loss of context at first glance, but there are several factors that aren't always obvious that I took into consideration. A partial list:
So please, if you see a place where you think context has been lost, please double check that one of the above doesn't replace the missing context/information, and if there's still a lack somewhere, then please call out each site. I've been through every logging site laboriously looking for places where I missed something, and I'd really rather not having to comb over everything yet again. :) Thank you! |
return | ||
} | ||
|
||
defer req.Body.Close() | ||
var ib = &ClinicInvite{} | ||
if err := json.NewDecoder(req.Body).Decode(ib); err != nil { | ||
a.logger.Errorw("error decoding invite", zap.Error(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of this info is included in the sendError
call.
@@ -85,37 +86,27 @@ func (a *Api) passwordReset(res http.ResponseWriter, req *http.Request, vars map | |||
a.logMetricAsServer("reset confirmation sent") | |||
} else { | |||
a.logMetricAsServer("reset confirmation failed to be sent") | |||
log.Print("Something happened generating a passwordReset email") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaving this out as a) addOrUpdateConfirmation
already does logging, and b) it's not useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean createAndSendNotification
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooops, yes. And I just double checked again to make sure, and yes, everywhere it returns false, it logs.
if err != nil { | ||
log.Printf("findResetConfirmation: error [%s]\n", err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The caller of this function will report the error, there' no need for duplications.
The same applies to all the logging in this function. The context of whether it wasn't found, or expired is maintained.
Here's the only place this method is called if you want to double check:
https://github.com/tidepool-org/hydrophone/blob/eric%2Fsend-invite-v2.2-logging/api/forgot.go#L138
// Find this confirmation | ||
// write error if it fails | ||
func (a *Api) addProfile(conf *models.Confirmation) error { | ||
if conf.CreatorId != "" { | ||
if err := a.seagull.GetCollection(conf.CreatorId, "profile", a.sl.TokenProvide(), &conf.Creator.Profile); err != nil { | ||
log.Printf("error getting the creators profile [%v] ", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double checked that the caller logs the error in all cases.
return | ||
} | ||
if conf == nil { | ||
statusErr := &status.StatusError{Status: status.NewStatus(http.StatusNotFound, statusInviteNotFoundMessage)} | ||
log.Println("AcceptInvite ", statusErr.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AcceptInvite will be logged by zap, as it logs the function name from which it was called.
@@ -163,7 +132,6 @@ func (a *Api) sendSignUp(res http.ResponseWriter, req *http.Request, vars map[st | |||
return | |||
} else { | |||
a.logMetric("signup confirmation failed to be sent", req) | |||
log.Print("Something happened generating a signup email") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not useful. Plus, createAndSendNotification
writes log messages that actually contain useful information.
@@ -232,7 +202,6 @@ func (a *Api) resendSignUp(res http.ResponseWriter, req *http.Request, vars map[ | |||
a.logMetricAsServer("signup confirmation re-sent") | |||
} else { | |||
a.logMetricAsServer("signup confirmation failed to be sent") | |||
log.Print("resendSignUp: Something happened trying to resend a signup email") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not useful. Plus, createAndSendNotification
writes log messages that actually contain useful information.
@@ -144,7 +145,6 @@ func (c *MongoStoreClient) FindConfirmation(ctx context.Context, confirmation *m | |||
opts := options.FindOne().SetSort(bson.D{{Key: "created", Value: -1}}) | |||
|
|||
if err = confirmationsCollection(c).FindOne(ctx, query, opts).Decode(&result); err != nil && err != mongo.ErrNoDocuments { | |||
log.Printf("FindConfirmation: something bad happened [%v]", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double checked that the caller logs these errors.
@@ -110,23 +110,7 @@ func (c *SesNotifier) Send(to []string, subject string, msg string) (int, string | |||
|
|||
// Display error messages if they occur. | |||
if err != nil { | |||
if aerr, ok := err.(awserr.Error); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of this information is output in the string representation of the AWS error, so nothing's lost here.
@@ -26,7 +25,6 @@ func NewHandler(store clients.StoreClient) events.EventHandler { | |||
} | |||
|
|||
func (h *handler) HandleDeleteUserEvent(payload events.DeleteUserEvent) error { | |||
log.Printf("Deleting confirmations for user %v", payload.UserID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed as not useful. I guess I could re-add it as a trace-level, if this is somehow useful to someone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. Since we are deleting user data it should even be logged at info level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix.
Requested in a previous code review. BACK-2500
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found some issues and some nitpicks
api/clinic.go
Outdated
if err != nil { | ||
a.sendError(res, http.StatusInternalServerError, STATUS_ERR_FINDING_CONFIRMATION, err) | ||
a.sendError(ctx, res, http.StatusInternalServerError, STATUS_ERR_FINDING_CONFIRMATION, err, | ||
fmt.Sprintf("clinic %s user already has or had an invite from %v", clinicId, inviterID)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's better if you pass clinicId
and inviterId
a structured arguments to the logger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I just copied the existing text, but I can do that.
api/clinicianInvites.go
Outdated
statusErr := a.sendClinicianConfirmation(req, confirmation) | ||
if statusErr != nil { | ||
a.sendError(res, statusErr.Code, statusErr.Reason, statusErr.Error()) | ||
code, msg, optionalErr := a.sendClinicianConfirmation(req, confirmation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having more than two arguments is a code smell. Consider wrapping all non-error return args in a struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out the code is superfluous anyway, removing it.
api/clinicianInvites.go
Outdated
if statusErr != nil { | ||
a.sendError(res, statusErr.Code, statusErr.Reason, statusErr.Error()) | ||
code, msg, optionalErr := a.sendClinicianConfirmation(req, confirmation) | ||
if code != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Magic value. What is 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's removed. :D
api/clinicianInvites.go
Outdated
} | ||
|
||
a.logMetric("clinician_invite_sent", req) | ||
return nil | ||
return 0, "", nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having more than two arguments is a code smell. There's already a struct that wraps a code and a message status.Status
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
api/forgot.go
Outdated
return | ||
} | ||
|
||
resetCnf.Email = email | ||
//there is nothing more to do other than notify the user | ||
resetCnf.UpdateStatus(models.StatusCompleted) | ||
a.logger(ctx).With(zap.String("email", email)).Info(STATUS_RESET_NO_ACCOUNT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a duplicate of L63, but with a different error level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reviewing what was there previously, I've removed this line.
api/hydrophoneApi.go
Outdated
c.Handle("/resend/signup/{useremail}", varsHandler(a.resendSignUp)).Methods("POST") | ||
c.Handle("/resend/invite/{inviteId}", varsHandler(a.ResendInvite)).Methods("PATCH") | ||
c.Handle("/resend/signup/{useremail}", vars(a.resendSignUp)).Methods("POST") | ||
c.Handle("/resend/invite/{inviteId}", uid(vars(a.ResendInvite))).Methods("PATCH") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary use of uid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix, thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was another on 247 that I've also removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...and I double checked the rest.
rtr.Handle("/v1/clinics/{clinicId}/invites/clinicians/{inviteId}", varsHandler(a.GetClinicianInvite)).Methods("GET") | ||
rtr.Handle("/v1/clinics/{clinicId}/invites/clinicians/{inviteId}", varsHandler(a.ResendClinicianInvite)).Methods("PATCH") | ||
rtr.Handle("/v1/clinics/{clinicId}/invites/clinicians/{inviteId}", varsHandler(a.CancelClinicianInvite)).Methods("DELETE") | ||
c.Handle("/v1/clinics/{clinicId}/invites/patients", vars(a.GetPatientInvites)).Methods("GET") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be worth adding clinicId
to the logger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will take a stab at this and see how it works.
api/hydrophoneApi.go
Outdated
nonErrs = append(nonErrs, extra) | ||
} | ||
} | ||
return nonErrs, errs, fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider creating a struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -26,7 +25,6 @@ func NewHandler(store clients.StoreClient) events.EventHandler { | |||
} | |||
|
|||
func (h *handler) HandleDeleteUserEvent(payload events.DeleteUserEvent) error { | |||
log.Printf("Deleting confirmations for user %v", payload.UserID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. Since we are deleting user data it should even be logged at info level.
api/patientInvites.go
Outdated
a.sendModelAsResWithStatus(res, statusErr, http.StatusInternalServerError) | ||
// addOrUpdateConfirmation logs and writes a response on errors | ||
if !a.addOrUpdateConfirmation(ctx, conf, res) { | ||
a.sendError(ctx, res, http.StatusInternalServerError, STATUS_ERR_SAVING_CONFIRMATION, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error is already sent at this point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, good catch.
- fixes a logic error (re-factor gone wrong) - some logging tweaks to level or structured variables - re-worked sendClinicianConfirmation down to 2 return values - removed findResetConfirmation (low value, confusing results) - re-worked splitExtrasAndErrorsAndFields to return a struct Next commit will have a few more related tweaks. BACK-2500
As requested in code review. The addUserIDToLogger method was made general-purpose. In addition, instead of using it only on those handlers that have a userId/clinicId parameter, it's used as a generic middleware on all requests. This small performance hit makes it so that any future endpoints that are added will automatically log userid and clinicid, plus the endpoint definition code is a bit cleaner, as it's not having to call uid helpers all over the place. BACK-2500
I believe I've addressed all of your comments to date, Todd. If you see something that I didn't respond to, call it out, maybe GitHub just hid it from me. :/ But I didn't ignore anything intentionally. |
api/clinicianInvites.go
Outdated
@@ -394,11 +394,11 @@ func (a *Api) sendClinicianConfirmation(req *http.Request, confirmation *models. | |||
} | |||
|
|||
if !a.createAndSendNotification(req, confirmation, emailContent) { | |||
return http.StatusInternalServerError, STATUS_ERR_SENDING_EMAIL, nil | |||
return STATUS_ERR_SENDING_EMAIL, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should always return an error if you return a message otherwise in some places the code may not return a response, e.g.:
msg, err := a.sendClinicianConfirmation(req, confirmation)
if err != nil {
a.sendError(ctx, res, http.StatusInternalServerError, msg, err)
return
}
As reported in code review, returning an error is the best indicator that something has gone wrong. BACK-2500
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I think extensive regression testing is required.
As requested originally in #126 then re-worked to build on top of #134
BACK-2500