NATS: Fix illegal resource version from storage: 0 (#274) #275
Thanks for the contribution! Please sign off your commit for DCO. @bruth, can you comment on whether this is the best way to do this?
acd594d to d71ad2e
Done!
FWIW, the bootstrap key is internal to k3s; I'm not sure this is the correct thing to create in kine. nats should already create a health check key; see kine/pkg/drivers/nats/backend.go, lines 116 to 125 in 3773672.
There seems to be some kind of race condition going on which causes the error. Here is an example of the Kine logs from the repro.
The constant string It should also have no effect as the value is immediately deleted. |
It sounds like the root cause here is that the health check key isn't getting created. Rather than creating and deleting another key, let's just fix the bug that's causing that key to not get created.
d71ad2e to d18e887
I was unable to identify why the method times out after the stream is created in NATS. Maybe someone with more in-depth knowledge of its inner workings can clarify it. Anyhow, adding some retry logic works around the problem. I have removed the previous bootstrap key creation when initializing the NATS backend in favour of the retry logic.
d18e887 to 3773672
Signed-off-by: J. Rovira <[email protected]>
Will take a look today.
The retry does work, but I am digging into why, once the KV bucket is created, the first write seemingly times out. It may be that the bucket is actually not yet ready to receive writes, but that should not be the case, so I will dig a bit deeper into it.
@bruth any update on this? I don't think we should fix this on the kine side if the issue is in nats.
The latest version of the NATS Go client appears to fix the issue. At least I can't reproduce the bug with it. I will open a PR to update the NATS deps for Kine.
I have tried the updated version from #281. From my testing, the issue still persists. It can be reproduced easily with just:
docker run --rm --network host nats:2.10.11 -js -DVV
Then compile and run the updated Kine:
git clone https://github.com/nats-io/kine
cd kine
git switch update-nats-versions
go build .
./kine --debug --endpoint "nats://?noEmbed=true"
Kine will display the same error.
Full logs: kine, nats-server
Any update on this? We are also facing a similar issue.
Found the root cause and updated my previous PR: #281. @jrovira-kumori feel free to validate on your end.
I am happy to confirm that the issue has been resolved! I have tested it again. Thanks for the help!
Fix for #274.
I am guessing this issue did not appear with K3s because the first command it runs when initializing is a CREATE /bootstrap/... request, so future COUNT ... calls would always return != 0. This is not the case with the Kubernetes KubeAPI, which reads before writing anything and gets stuck in a loop of illegal resource version from storage: 0.
This PR fixes the issue with the NATS driver by incrementing the BucketRevision number to 1 by creating and deleting a bootstrap key if a BucketRevision of 0 is detected at startup.
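As a rough illustration of the approach described above, the sketch below models a bucket as a tiny in-memory type whose revision counter advances on every write. It is not the NATS client API and not the code from this PR: `kv`, `Create`, `Delete`, `ensureNonZeroRevision`, and the key name are hypothetical, and the toy model assumes a delete also counts as a write. The only point is that a reader arriving after the call no longer sees revision 0.

```go
package main

import "errors"

var (
	errKeyExists   = errors.New("key already exists")
	errKeyNotFound = errors.New("key not found")
)

// kv is a hypothetical in-memory stand-in for a revisioned key-value bucket.
// Every successful write (create or delete) advances the revision counter.
type kv struct {
	revision uint64
	data     map[string][]byte
}

func newKV() *kv { return &kv{data: map[string][]byte{}} }

// Create stores value under key and bumps the revision; it fails if key exists.
func (b *kv) Create(key string, value []byte) error {
	if _, ok := b.data[key]; ok {
		return errKeyExists
	}
	b.data[key] = value
	b.revision++
	return nil
}

// Delete removes key and bumps the revision; it fails if key is absent.
func (b *kv) Delete(key string) error {
	if _, ok := b.data[key]; !ok {
		return errKeyNotFound
	}
	delete(b.data, key)
	b.revision++
	return nil
}

// ensureNonZeroRevision writes and removes a throwaway key when the bucket
// has never been written to, so later reads see a revision greater than 0.
// It does nothing if the bucket already has a non-zero revision.
func ensureNonZeroRevision(b *kv, key string) error {
	if b.revision != 0 {
		return nil
	}
	if err := b.Create(key, nil); err != nil {
		return err
	}
	return b.Delete(key)
}
```

Calling `ensureNonZeroRevision(newKV(), "/bootstrap/sketch")` leaves the bucket empty but with a non-zero revision, and calling it again is a no-op.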