How to reach 100% uptime with Hetzner load balancers? #383

AndrewBedscastle · 2024-07-22T20:28:08Z

With a "premium" cloud provider such as GKE, AWS, or Azure this is not an issue, but unfortunately on Hetzner it is:
The next maintenance is scheduled for load balancers: https://status.hetzner.com/incident/cd0ebfd2-8985-4aae-8be5-6548558c0f8c

The last maintenance was, I think, in December, and it did not just drop a few connections: it caused 20-30 minutes of downtime, which defeats the purpose of a load balancer. But we cannot change that.

What we did was deploy a second load balancer, together with an nginx controller, in a different region (e.g. Falkenstein and Helsinki).

That leads to another A record, but we made our clients aware of the additional HTTP entrypoint, as we are developing a service that needs to reach 100% uptime. We had that for more than a decade on Google App Engine, but we're leaving it now for various reasons.
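To make the "second entrypoint" idea concrete, here is a minimal client-side sketch in Python. It assumes each region is reachable under its own hostname and exposes a health URL; the hostnames and the `/healthz` path in the usage note are made-up placeholders, not our real setup. The client simply tries the entrypoints in order and uses the first one that answers.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, DNS failure, timeout, 4xx/5xx, broken response...
        return False


def pick_entrypoint(
    entrypoints: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first entrypoint whose probe succeeds, or None if all fail."""
    for url in entrypoints:
        if probe(url):
            return url
    return None
```

A client could call it as `pick_entrypoint(["https://fsn.example.com/healthz", "https://hel.example.com/healthz"])` before each batch of requests. The probe is injectable so the selection logic can be tested without any network access.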

Do you have a better idea to reach the best possible uptime?

Best regards

vitobotta · 2024-07-22T20:31:59Z

Maintainer

Oh, I wasn't aware of the upcoming maintenance for the load balancers, thanks for the heads-up. To be honest, so far I have been using Hetzner for personal, non-critical stuff. At work we have used GKE forever, and although I have proposed Hetzner to cut costs (using my tool), others prefer Google as it's supposed to be more reliable. So I haven't yet had a case where a brief downtime would cause me serious problems.

0 replies

AndrewBedscastle · 2024-07-22T20:42:30Z

Author

Yes, that's where Google is doing an insane job, using Maglev:
https://research.google/pubs/maglev-a-fast-and-reliable-software-network-load-balancer/

I just hope they (Hetzner) won't update every region concurrently; then our approach might work.
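A rough back-of-the-envelope sketch of why that matters, in Python. The 30 minutes per year is only an illustrative figure taken from the December incident, and the formula assumes the two regions fail independently, which is exactly what a concurrent maintenance window would break.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def availability(downtime_minutes_per_year: float) -> float:
    """Fraction of the year a single entrypoint is up."""
    return 1 - downtime_minutes_per_year / MINUTES_PER_YEAR


def combined_independent(a: float, b: float) -> float:
    """Availability of 'at least one of two entrypoints is up'.

    Only valid if the two outages are statistically independent.
    """
    return 1 - (1 - a) * (1 - b)


single = availability(30)                    # one region, 30 min/year down
both = combined_independent(single, single)  # two independent regions
print(f"single region: {single:.6%}")
print(f"two regions:   {both:.8%}")
```

If both regions are taken down in the same window, the combined figure falls back to that of a single region.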

I'm a bit surprised you use it only for personal stuff to be honest :-)

Our plan is to run production workloads (and we already do)

Do you offer paid (!!) consulting? The money we save using Hetzner instead of GAE / GKE....

There are no open questions atm, but these will come up for sure.

2 replies

vitobotta Jul 22, 2024
Maintainer

Yeah, I trust Hetzner enough, as I have been a customer for many years and I find it very reliable and performant considering the amazing prices. I would really like to use it at work too, I just need to convince the team to switch to Hetzner :)

IMO Hetzner is definitely fine for production workloads.

As for consulting, it really depends. If it's something that, say, mostly requires advice and can be done asynchronously, even via email or something, then it's easy and the answer is yes. But if it's something that would actively require meetings and/or me writing code and setting things up, then it depends; we'd need to discuss the details before I can commit to it or not. You can find my LinkedIn here https://vitobotta.com/ - get in touch so we can have a chat if/when you need :)

AndrewBedscastle Jul 24, 2024
Author

Thank you very much, I'll contact you (when something comes up)
Definitely no coding!
Only advice!

@ALL

Any other ideas concerning the 100% uptime approach?

jrudolph · 2024-08-03T13:42:29Z

As a workaround during downtime, you can work without Hetzner's load balancers: point your DNS directly at one of your nodes and configure the ingress to run on public port 443. Of course, that node then has to front all the HTTP (and SSL) traffic, so you need to make sure it can handle the load.

Real custom high availability solutions are hard to build because they often need layer 2/3 access (routing tables) to fail over the load balancers.

In general, 100% uptime is hard to achieve on Hetzner. I ran into multiple issues with their cloud infrastructure, often leading to half a day or more of downtime (but I don't have any critical systems; maybe if the pressure is higher and you are a bigger customer, it's easier to escalate).

1 reply

AndrewBedscastle Aug 4, 2024
Author

Thanks for your post. Could you elaborate on that a bit? What was down and for how long?
We've been using Hetzner for a year now, and all we had was a 20-minute downtime on the load balancer.

Btw, the last LB maintenance is over and we could not even see it in our monitoring (30s interval).
It seems to have been conducted much more smoothly than last time.
Did you notice any downtime?
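One caveat about the 30s interval, as a small Python sketch: an outage shorter than the probe interval can fall entirely between two probes, so "nothing in the monitoring" only rules out outages of 30s or longer. The model below is a simplification that assumes probes fire exactly at t = 0, 30, 60, ... seconds and each probe is instantaneous.

```python
import math


def probe_sees_outage(outage_start: float, outage_len: float, interval: float) -> bool:
    """Return True if a probe lands inside [outage_start, outage_start + outage_len).

    Probes are assumed to fire at t = 0, interval, 2 * interval, ...
    """
    # Time of the first probe at or after the start of the outage.
    next_probe = math.ceil(outage_start / interval) * interval
    return next_probe < outage_start + outage_len


# A 20 s outage starting 5 s after a probe ends before the next probe fires.
print(probe_sees_outage(outage_start=5, outage_len=20, interval=30))
# A 20-minute outage like the earlier one is always caught.
print(probe_sees_outage(outage_start=5, outage_len=20 * 60, interval=30))
```

Any outage of at least one full interval is guaranteed to be hit by some probe in this model; shorter ones are only caught if they happen to overlap a probe time.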

Best regards
