Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lease expiration on standalone Eureka server with self-presenvation enabled is broken #4101

Open
ViliusS opened this issue Jun 12, 2022 · 11 comments

Comments

@ViliusS
Copy link
Contributor

ViliusS commented Jun 12, 2022

It looks like lease expiration on standalone Eureka server with self-presenvation enabled is completely broken. I have the following configuration:

eureka.instance.hostname=localhost
eureka.client.serviceUrl.defaultZone=http://localhost:8761/eureka/
eureka.client.register-with-eureka=false
eureka.client.fetch-registry=false
eureka.server.renewalThresholdUpdateIntervalMs=60000
# this should be the same as eureka.instance.lease-renewal-interval-in-seconds on the clients
eureka.server.expected-client-renewal-interval-seconds=10

If I start completely empty server with this configuration it says:

c.n.e.r.PeerAwareInstanceRegistryImpl    : Got 1 instances from neighboring DS node
c.n.e.r.PeerAwareInstanceRegistryImpl    : Renew threshold is: 5

I tracked it down to defaultOpenForTrafficCount which by default is set to 1, changed to
eureka.instance.registry.defaultOpenForTrafficCount=0 . Now the server starts with:

c.n.e.r.PeerAwareInstanceRegistryImpl    : Got 0 instances from neighboring DS node
c.n.e.r.PeerAwareInstanceRegistryImpl    : Renew threshold is: 0

Issue however not solved because even if I add dozen of eureka clients the threshold never changes even when is actually recalculated. I get c.n.e.r.PeerAwareInstanceRegistryImpl : Current renewal threshold is : 0 in the logs every minute.

Hence, lease expiration never kicks in unless I completely disable self-preservation mode.

@ViliusS
Copy link
Contributor Author

ViliusS commented Jun 13, 2022

To add on top of that, looks like expectedNumberOfClientsSendingRenews is never passed to Eureka and is always calculated by Eureka itself automatically.

@OlgaMaciaszek
Copy link
Collaborator

Hello @ViliusS , please provide a minimal, complete, verifiable example that reproduces the issue.

@ViliusS
Copy link
Contributor Author

ViliusS commented Sep 3, 2022

Here you go
eureka.zip

Just run this Maven project which should start Eureka server in standalone mode. After that, connect any number of Eureka clients to it. In the logs you will always see: c.n.e.r.PeerAwareInstanceRegistryImpl : Current renewal threshold is : 0.

This doesn't allow leases to expire, unless you disable self preservation mode.

@OlgaMaciaszek
Copy link
Collaborator

For defaultOpenForTrafficCount the javadoc sais "Should be set to 0 for peer replicated eurekas". For non-peer-replicated eurekas it should be one. Why are you setting it to 0 for testing with one server instance?

@OlgaMaciaszek
Copy link
Collaborator

OlgaMaciaszek commented Sep 5, 2022

I have changed the value referenced above back to 0 and return the sample. Also, the value of expectedNumberOfClientsSendingRenews is passed initially while creating the registry, that is modified by the openForTraffic() method call. The javadoc for that method sais: "If PeerAwareInstanceRegistryImpl#openForTraffic(ApplicationInfoManager, int) is called with a zero argument, it means that leases are not automatically cancelled if the instance hasn't sent any renewals recently. This happens for a standalone server. It seems like a bad default, so we set it to the smallest non-zero value we can, so that any instances that subsequently register can bump up the threshold."
Since we're running a standalone server here, it seems to behave as documented.

@OlgaMaciaszek
Copy link
Collaborator

However, after I've run a client against the updated sample, when register is called, the value I set in expectedNumberOfClientsSendingRenews + 1 is being used to set the value of expectedNumberOfClientsSendingRenews before calling updateRenewsPerMinThreshold(), so that also behaves as expected.

@ViliusS
Copy link
Contributor Author

ViliusS commented Sep 5, 2022

defaultOpenForTrafficCount other than zero doesn't work if I change expected-client-renewal-interval-seconds to let's say 15 or 10 seconds. In such case, renewal threshold is always much higher than it should be for the lease expiration to work correctly.

So, maybe defaultOpenForTrafficCount is good for standard settings, but math definitely breaks somewhere with custom settings.

Btw, I don't understand why defaultOpenForTrafficCount javadoc says that it should be 1 for standalone eurekas. If standalone replica doesn't have any peers and if eureka.client.register-with-eureka=false and eureka.client.fetch-registry=false, like it should be for non-replica case, then it means Eureka doesn't have any clients during startup, does it?

@OlgaMaciaszek
Copy link
Collaborator

@ViliusS sorry for not getting back to you on this, however, the only thing we do here is pass the properties provided by the users to Netflix/Eureka's InstanceRegistry; defaultOpenForTrafficCount is only really used if the count of instances from peer node sync is 0, useful for non-replicated nodes to indicate that there's 1 instance; all the insatnce renewal and lease expiration logic is handled directly by https://github.com/Netflix/eureka, so if you see any issues with that, you may want to create an issue there instead.

@ViliusS
Copy link
Contributor Author

ViliusS commented Sep 30, 2024

@OlgaMaciaszek while waiting for the answer I've already filled an issue at Netflix/eureka#1459 , however I still think defaultOpenForTrafficCount default is wrong. The real problem is that defaultOpenForTrafficCount in Eureka is used for both, to indicate how many clients are there connected on startup AND it is also used in lease renewal/expiration math.

It was too long, so I don't remember the exact math, but the reason why setting it to 1 by default when there is really no clients connected is wrong, is that when used with expected-client-renewal-interval-seconds set to anything other than default (let's say 15 seconds) this makes renewal thresholds too large for lease expiration to work correctly.

So, in the end both fixes are needed:

  1. Eureka needs to fix lease expiration on standalone instances. Maybe don't use the same property for math, or improve renewal threshold math another way.
  2. Spring Cloud Netlifx module could improve documentation. I don't have strong opinion on defaultOpenForTrafficCount default value, however if a user wants to modify expected-client-renewal-interval-seconds I think two configuration options needs to be mentioned:
    a) set defaultOpenForTrafficCount to 0 and disable self-preservation when using standalone instance
    b) use peered instances
    Only these two options provide with correct lease expiration with custom expected-client-renewal-interval-seconds value.

@OlgaMaciaszek
Copy link
Collaborator

The defaultOpenForTrafficCount equal to 1 reflects the actual situation (the count of open instances being in reality 1) so the issue should be fixed in the other project rather than changing documentation.

@ViliusS
Copy link
Contributor Author

ViliusS commented Oct 2, 2024

The defaultOpenForTrafficCount equal to 1 reflects the actual situation (the count of open instances being in reality 1) so the issue should be fixed in the other project rather than changing documentation.

Is that really true? As far as I see this property is sent to count parameter of openForTraffic in Eureka. I didn't find exact documentation on the parameter itself, but from what I see in the code here https://github.com/Netflix/eureka/blob/015400c60d3dc730c3fc4871e9b586d3805cce0d/eureka-core/src/main/java/com/netflix/eureka/registry/PeerAwareInstanceRegistryImpl.java#L240 this is the total Eureka client count, not the Eureka instance count. Hence, in standalone Eureka instance on a startup with 0 clients connected it should be 0.

You can also see from here https://github.com/Netflix/eureka/blob/015400c60d3dc730c3fc4871e9b586d3805cce0d/eureka-server-governator/src/main/java/com/netflix/eureka/EurekaContextListener.java#L28 going down to https://github.com/Netflix/eureka/blob/015400c60d3dc730c3fc4871e9b586d3805cce0d/eureka-core/src/main/java/com/netflix/eureka/registry/PeerAwareInstanceRegistryImpl.java#L209-L237 that Eureka itself uses that parameter for sending total client count (even though it is confusedly named registryCount in some places).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants