-
Notifications
You must be signed in to change notification settings - Fork 361
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Retry Logic to Only Retry on Server-Side HTTP Errors #1390
base: main
Are you sure you want to change the base?
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've not reviewed it in full as I'm on mobile, but first we should refactor the tests to not use a third party assertion library
pkg/osv/osv_test.go
Outdated
"testing" | ||
"time" | ||
|
||
"github.com/stretchr/testify/assert" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the std libs for asserting rather than pull in a new package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've switched to standard library assertions for simplicity. Let me know if there’s anything more to address. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a good start - can you also make sure you've run scripts/run_lints.sh
and address anything it brings up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: this needs to be reverted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran the scripts/run_lints.sh
script, and the only issues flagged were spacing warnings outside of the code I wrote. Let me know if you'd like me to address those too.
func makeRetryRequest(action func() (*http.Response, error)) (*http.Response, error) { | ||
var resp *http.Response | ||
var err error | ||
|
||
for i := range maxRetryAttempts { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: I think you've changed this loop more than you need to
for i := range maxRetryAttempts
should be equivalent to what you've got currently- you've removed the random jitter, which is intentionally present to smooth the back-off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you were right. Sorry for removing Jitter in the function. Added back!
pkg/osv/osv.go
Outdated
@@ -158,7 +158,7 @@ func chunkBy[T any](items []T, chunkSize int) [][]T { | |||
|
|||
// checkResponseError checks if the response has an error. | |||
func checkResponseError(resp *http.Response) error { | |||
if resp.StatusCode == http.StatusOK { | |||
if resp.StatusCode >= 200 && resp.StatusCode < 300 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: I think we shouldn't change this line, since it's not needed for addressing #861
even though 2xx are success codes, we're only expecting the API currently to return a 200 and since this change isn't strictly needed for the rest of the change to work, I would lean towards not changing it for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been cargo-culting the HTTP retry logic from the examples in https://github.com/sethvargo/go-retry/blob/main/retry_test.go, for what it's worth, which treats 5xx errors as retryable, only. That said, I also recently saw https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429 in the wild, which could be retried with an exponential backoff approach.
pkg/osv/osv_test.go
Outdated
// Override the QueryEndpoint for testing | ||
originalQueryEndpoint := QueryEndpoint | ||
QueryEndpoint = server.URL | ||
defer func() { QueryEndpoint = originalQueryEndpoint }() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: there's no reason to do this, because QueryEndpoint
isn't used directly in the function you're testing
(that'll then mean you can make this test a parallel one)
pkg/osv/osv_test.go
Outdated
"time" | ||
) | ||
|
||
func TestRetryOn5xx(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: we should be testing for a couple of different status codes and orders
I'd recommend refactoring this into a table-based test, and we should have cases for at least:
- 200
- 4xx
- 5xx
- 5xx, then a 2xx
pkg/osv/osv_test.go
Outdated
|
||
log.Printf("TestRetryOn5xx: resp = %v, err = %v", resp, err) | ||
|
||
// Assertion: resp should be nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: these comments feel pretty redundant
Hi @G-Rath, |
would you mind posting a couple of those here? as there shouldn't be any output, and my local doesn't give any |
I'm not sure if this is what you are doing but If you're trying to run the script in PowerShell, it likely won’t work and will just return a blank output. Try using Git Bash instead.
and here are the warnings: I'm fairly certain that running |
@VishalGawade1 only the last one is about spacing, the others are legitimate issues to be addressed in your pull request, along with other changes from my review that you're yet to make such as restoring the The
I'm not since the script is a shell script not a powershell script 😅 edit: right I misread - yeah I develop in WSL so it's all going through linux; by "blank output" I mean it will give you this:
|
yeah, I assumed just to be safe 😄 |
@G-Rath |
Generally speaking I think your current implementation is a bit ... lets say I believe that's why the (that is to say, technically "yes it should go because it's not being used anymore", but I might as well frontfoot the introduction of a new similar function while we're at it 🙂) |
I totally understand, sorry about that, and I really appreciate your feedback. I'll work on making the code cleaner and more readable. I'll remove |
…, and cleared all lint warnings
Hi @G-Rath , |
go.mod
Outdated
@@ -26,6 +26,7 @@ require ( | |||
github.com/package-url/packageurl-go v0.1.3 | |||
github.com/pandatix/go-cvss v0.6.2 | |||
github.com/spdx/tools-golang v0.5.5 | |||
github.com/stretchr/testify v1.9.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still necessary here? It looks like this is no longer being used.
pkg/osv/osv.go
Outdated
@@ -16,7 +16,7 @@ import ( | |||
"golang.org/x/sync/errgroup" | |||
) | |||
|
|||
const ( | |||
var ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed these from const to var so previously I could adjust the endpoint, I will revert back to const
pkg/osv/osv.go
Outdated
func makeRetryRequest(action func() (*http.Response, error)) (*http.Response, error) { | ||
var resp *http.Response | ||
var err error | ||
const maxRetryAttempts = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two variables already exist at the top const section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I kept changing the variable so i wanted to use it locally. I will revert these changes as well
pkg/osv/osv.go
Outdated
break | ||
} | ||
backoff := time.Duration(i*i) * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear why jitter is changed here, the new implementation requires duplicating the jitter code? Can you revert this to the previous implementation
…ormat, and restored jitter implementation to avoid code duplication in makeRetryRequest function
Please review the changes. Due to the merge conflict, some modifications were added back during the restore in previous commit |
Changes Implemented
Fixes #861
Selective Retrying in osv.go:
Before: The retry logic did not differentiate between server-side and client-side HTTP errors, potentially leading to unnecessary retries on HTTP 4xx responses.
After: Updated the retry mechanism to only retry when the response status code is in the 500 range (HTTP 5xx). This prevents the system from retrying requests that are likely to fail due to client-side issues, thereby optimizing performance and reducing redundant network calls.
osv_test.go:
Verified that the updated retry logic correctly differentiates between HTTP 5xx and HTTP 4xx responses.
Ensured that retries are only attempted for HTTP 5xx errors by running and passing the TestRetryOn5xx test case.