Retry refactor to allow for batching to retry. #617

nicklas-dohrn · 2024-10-10T07:12:27Z

Description

This change mostly moves where the retry logic is applied.
Looking at what actually needs to be retried, retries are likely only needed for networking failures.
So the retry logic should run after the data has been serialised, for every protocol. This improves performance and keeps the fix close to where the problem occurs.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Testing performed?

Unit tests
Integration tests
Acceptance tests

Checklist:

This PR is being made against the main branch, or relevant version branch
I have made corresponding changes to the documentation
I have added testing for my changes

If you have any questions, or want to get attention for a PR or issue, please reach out on the #logging-and-metrics channel in the Cloud Foundry Slack.

ctlong

I have a number of stylistic concerns that I've left as comments throughout the code.

In general your approach seems sound. However, I would really like to see some unit tests added for this retry logic in order to validate that it performs as expected.

ctlong · 2024-11-07T00:41:59Z

src/pkg/egress/syslog/https_batch.go

@@ -15,6 +15,7 @@ const BATCHSIZE = 256 * 1024

 type HTTPSBatchWriter struct {
 	HTTPSWriter
+	*Retryer


Why embed *Retryer rather than using a named field to encapsulate the new struct via composition?

I think that I would prefer composition in this case because I don't see a good reason to expose the fields and methods of *Retryer directly.

I used a pointer here so that I could create the retryer in the writer factory layer and inject it. That way the writers themselves do not have to propagate the settings through to the retryer.
This might be a shortcoming due to my limited understanding of idiomatic Go.

ctlong · 2024-11-07T00:46:28Z

src/pkg/egress/syslog/retryer.go

+)
+
+// RetryWriter wraps a WriteCloser and will retry writes if the first fails.
+type Retryer struct {


Why not just represent Retryer as some new fields and methods in HTTPSBatchWriter since that's the only writer set to use it?

I don't have a strong opinion on this, but I think I lean toward just putting all this code into HTTPSBatchWriter so that it's all more self-contained.

I was thinking about that as well, but introducing the retryer the way I built it here has more than one benefit:

Retrying after stringifying the rfsyslog message seemed beneficial performance-wise.

Connection-aware retry logic for TLS-based writers might be a more efficient way to handle connectivity issues (retries are only needed because of connectivity issues).

The problem that the retry_writer retries parsing issues for syslog messages (#612) is also present for the tls/tcp writers. This could be fixed with a dedicated PR.

First retry implementation

f00fb11

nicklas-dohrn requested a review from a team as a code owner October 10, 2024 07:12

ctlong reviewed Nov 7, 2024

ctlong assigned nicklas-dohrn Nov 7, 2024

nicklas-dohrn Dec 2, 2024
