Support keep alive property #32

raytung · 2022-03-03T08:49:24Z

Hey 👋

There was a change to Fluentd's out_forward awhile ago to support keep alive behaviour fluent/fluentd#2393. Is there any chance of implementing this in this client as well? The use case is identical to the original issue in fluent/fluentd#2188 where we'd like our clients to rebalance its connections over to newly spawned fluentd replicas.

The text was updated successfully, but these errors were encountered:

jamiees2 · 2022-03-03T15:27:48Z

Hey! Just to make sure I understand the feature request correctly, you would like to pass a parameter to the socket that enables TCP keepalive on the socket?

I'm not super familiar with this feature, we don't use it in our production, so just wondering if anything else is needed.

raytung · 2022-03-03T18:20:45Z

Hey @jamiees2 thanks for your reply. The property name is kind of confusing, while it's called "keep alive" it's not really related to TCP keep alive. It's more akin to "tcp connection refresh interval".

To describe the problem a bit more in details, say we have something like this

Client replica [1...N] ---------> Load balancer ------> Fluentd replica A

In the event where fluentd is underload, we would scale it horizontally, say add 1 more replica. However, adding an extra fluentd replica doesn't currently help rebalance the load since our existing clients are still connected to the original fluentd replica.

The out_forward plugin's solution to this problem is basically adding a parameter to periodically disconnect and reconnect to upstream fluentd, thereby balancing the load overtime.

raytung · 2022-03-03T21:56:09Z

For what it's worth, when we tried simulating this behaviour by having a basic setInterval(() => { client.disconnect().then(client.connect()); });, we found that this client crashes our NodeJS process. I've created a separate ticket to track that issue #33

jamiees2 · 2022-03-16T10:18:04Z

I've been thinking about how it's best to implement this, this is made a little complex by acknowledgements, because we don't want to miss an acknowledgement during this phase. I think what we ideally want to do is:

On an interval, close the socket for writes
If acks are enabled, wait for the ack queue to empty out
Once acks have been completed, shut down the socket
Reconnect the socket

I think this can be implemented with a new state, similar to IDLE, but I'll need to think through this a bit. I'm going to cut a release though, with your fixes, so that you're able to use the simulated behavior :)

It might mishandle acks, and might conflict with the reconnection behavior, but if it works for you, that's great!

jamiees2 · 2022-03-16T10:21:49Z

I wonder if there are other approaches we're missing here, for example we could instead of a hard timeout have a count limit, which prevents us sending more than X messages without reconnecting the socket. Would we consider this more flexible?

Actually, I guess we could implement both in the same way, by having a setInterval that sets a flag

raytung · 2022-03-16T22:08:58Z

On an interval, close the socket for writes
If acks are enabled, wait for the ack queue to empty out
Once acks have been completed, shut down the socket
Reconnect the socket

Thinking out loud: One potential area to look out for is in the event where upstream fluentd is unresponsive, the acks may never come back, which means we'll be stuck waiting. Perhaps we could rely on a short ack timeout in this case, and if ack timeout we shutdown the socket and reconnect?

Another avenue to explore is probably to straight up ignore acks when we're reconnecting, we re-emit the events in ack queue after we reconnected. Would this make the implementation simpler? Looks like this is what Fluentd's own out_forward does https://github.com/fluent/fluentd/blob/2036c4a6d791de24165d9575243eb71e5914b3b8/lib/fluent/plugin/out_forward/socket_cache.rb#L70-L95

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support keep alive property #32

Support keep alive property #32

raytung commented Mar 3, 2022

jamiees2 commented Mar 3, 2022

raytung commented Mar 3, 2022 •

edited

Loading

raytung commented Mar 3, 2022 •

edited

Loading

jamiees2 commented Mar 16, 2022

jamiees2 commented Mar 16, 2022

raytung commented Mar 16, 2022

Support keep alive property #32

Support keep alive property #32

Comments

raytung commented Mar 3, 2022

jamiees2 commented Mar 3, 2022

raytung commented Mar 3, 2022 • edited Loading

raytung commented Mar 3, 2022 • edited Loading

jamiees2 commented Mar 16, 2022

jamiees2 commented Mar 16, 2022

raytung commented Mar 16, 2022

raytung commented Mar 3, 2022 •

edited

Loading

raytung commented Mar 3, 2022 •

edited

Loading