Update docs as per PR comments
petrutlucian94 committed Sep 25, 2024
1 parent a4e8828 commit 33ed437
Showing 2 changed files with 41 additions and 44 deletions.
78 changes: 36 additions & 42 deletions docs/src/snap/howto/2-node-ha.md
# 2-Node Active-Active High-Availability using Dqlite

## Rationale

High availability (HA) is a mandatory requirement for most production-grade
Kubernetes deployments, usually implying three or more nodes.

2-node HA clusters are sometimes preferred for cost savings and operational
efficiency considerations. Follow this guide to learn how Canonical Kubernetes
can achieve high availability with just two nodes while using the default
datastore, Dqlite.

Dqlite cannot achieve Raft quorum with fewer than three nodes: a two-node
cluster cannot tolerate any failure, since a Raft majority of two voters
requires both nodes. This means that Dqlite will not be able to replicate data
and the secondaries will simply forward the queries to the primary node.

In the event of a node failure, database recovery will require following the
steps in the [Dqlite recovery guide].

## Proposed solution

Since Dqlite data replication is not available in this situation, we propose
using synchronous block-level replication through
[Distributed Replicated Block Device] (DRBD).

The cluster monitoring and failover process will be handled by Pacemaker and
Corosync. After a node failure, the DRBD volume will be mounted on the standby
node, allowing access to the latest Dqlite database version.

Additional recovery steps are automated and invoked through Pacemaker.
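
Once the setup described below is in place, the node currently holding the
data can be identified with the standard DRBD and Pacemaker status commands
(``r0`` is the DRBD resource name used later in this guide):

```
# Show the DRBD role (Primary/Secondary) of the r0 resource.
sudo drbdadm status r0

# Show the Pacemaker view of the cluster and its resources.
sudo crm status
```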

## Alternatives

Another possible approach is to use PostgreSQL with Kine and logical
replication. However, it is outside the scope of this document.

See the [external datastore guide] for more information on how Canonical
Kubernetes can be configured to use other datastores.

### Prerequisites

* Ensure both nodes are part of the Kubernetes cluster.
See the [getting started] and [add/remove nodes] guides.
* The user associated with the HA service has SSH access to the peer node and
passwordless sudo configured. For simplicity, the default "ubuntu" user can
be used (passwordless SSH and sudo setup is sketched after this list).
* We recommend using static IP configuration.
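
For example, passwordless SSH to the peer and passwordless sudo for the
default ``ubuntu`` user could be configured as follows (the peer address is a
placeholder):

```
# Generate an SSH key and copy it to the peer node (hypothetical address).
ssh-keygen -t ed25519 -N '' -f ~/.ssh/id_ed25519
ssh-copy-id ubuntu@10.0.0.2

# Grant the ubuntu user passwordless sudo.
echo "ubuntu ALL=(ALL) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ubuntu
```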

The [2ha.sh script] automates most operations related to the 2-node HA scenario
and is included in the snap.

The first step is to install the required packages:

```
/snap/k8s/current/k8s/hack/2ha.sh install_packages
```

### DRBD
```
sudo systemctl start rc-local.service
```

Let's configure the DRBD block device that will hold the Dqlite data.
Ensure the correct node addresses are used.

```
# Disable the DRBD service; it will be managed through Pacemaker.
sudo drbdadm status
```
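
The full resource definition sits in the collapsed portion of this block. For
orientation, a minimal DRBD resource file for ``r0`` generally looks like the
following sketch (device, backing disk, hostnames and addresses are all
placeholders; the ``on`` names must match each node's ``uname -n``):

```
resource r0 {
  device /dev/drbd0;
  disk /dev/vdb;
  meta-disk internal;

  on node1 {
    address 10.0.0.1:7788;
  }
  on node2 {
    address 10.0.0.2:7788;
  }
}
```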

Let's create a mount point for the DRBD block device. Non-default mount points
need to be passed to the ``2ha.sh`` script mentioned above, see the script for
the full list of configurable parameters.
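
If the script reads these parameters from its environment (an assumption;
check the script itself for the actual mechanism), a non-default mount point
might be supplied like so:

```
# Hypothetical override: export a non-default mount point before
# invoking the script.
export DRBD_MOUNT_DIR=/mnt/drbd-k8s
/snap/k8s/current/k8s/hack/2ha.sh start_service
```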

```
DRBD_MOUNT_DIR=/mnt/drbd0
```

Let's define a Pacemaker resource for the DRBD block device, which
ensures that the block device will be mounted on the replica in case of a
primary node failure.

[Pacemaker fencing] (stonith) configuration is environment specific and thus
outside the scope of this guide. However, we highly recommend using fencing
if possible to reduce the risk of cluster split-brain situations.
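
The exact resource definition is collapsed below; an illustrative sketch of a
Pacemaker filesystem resource for the DRBD device, configured through a
``crm`` heredoc (the device, mount directory and filesystem type are
assumptions), might look like:

```
# Sketch: define fs_res, a Filesystem resource that mounts the DRBD
# device on the active node.
sudo crm configure <<EOF
primitive fs_res ocf:heartbeat:Filesystem \
    params device=/dev/drbd0 directory=/mnt/drbd0 fstype=ext4
commit
EOF
```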

Before moving forward, let's ensure that the DRBD Pacemaker resource runs on
the primary (voter) Dqlite node.

In this setup, only the primary node holds the latest Dqlite data, which will
be transferred to the DRBD device once the clustered service starts.
This is automatically handled by the ``2ha.sh start_service`` command.

```
sudo crm resource clear fs_res
```

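As an illustration, moving ``fs_res`` onto the primary node and then dropping
the temporary location constraint created by the move might look like this
(the node name is a placeholder):

```
# Pin fs_res to the primary Dqlite node (hypothetical node name).
sudo crm resource move fs_res node1

# Remove the temporary constraint once the resource is in place.
sudo crm resource clear fs_res
```
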
### Kubernetes services

We can now turn our attention to the Kubernetes services. Ensure that the k8s
snap services no longer start automatically. Instead, they will be managed by a
wrapper service.

```
done
```
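
Only the tail of that loop is visible above. A sketch of what the full loop
might look like (the way the k8s snap services are enumerated here is an
assumption):

```
# Stop each k8s snap service and disable its automatic startup, so the
# wrapper service controls the lifecycle instead.
for service in $(sudo snap services k8s | awk 'NR>1 {print $1}'); do
    sudo snap stop --disable $service
done
```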

The next step is to define the wrapper service. Add the following to
``/etc/systemd/system/2ha_k8s.service``. Note that the sample uses the
``ubuntu`` user, feel free to use a different one as long as the prerequisites
are met.

```
[Unit]
After=network.target pacemaker.service
User=ubuntu
Group=ubuntu
Type=oneshot
ExecStart=/bin/bash /snap/k8s/current/k8s/hack/2ha.sh start_service
ExecStop=/bin/bash -c 'sudo snap stop k8s'
RemainAfterExit=true
WantedBy=multi-user.target
```
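
After saving the unit file, reload systemd and enable the wrapper so it starts
on boot (standard systemd workflow; adjust if your deployment enables it
elsewhere):

```
sudo systemctl daemon-reload
sudo systemctl enable 2ha_k8s.service
```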

```{note}
The ``2ha.sh start_service`` command used by the service wrapper automatically
detects the expected Dqlite role based on the DRBD state and takes the
necessary steps to bootstrap the Dqlite state directories, synchronize with the
peer node (if available) and recover the database.
```

We need the ``2ha_k8s`` service to be restarted once a DRBD failover occurs.
For that, we are going to define a separate service that will be invoked by
Pacemaker. Create a file called
``/etc/systemd/system/2ha_k8s_failover.service`` containing the following:

```
[Unit]
sudo drbdadm connect r0
```
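
The fragment above belongs to the collapsed split-brain recovery section. For
reference, the standard manual DRBD split-brain resolution (deciding which
node's data to discard is an operator decision) is:

```
# On the node whose data will be discarded:
sudo drbdadm secondary r0
sudo drbdadm connect --discard-my-data r0

# On the surviving node, re-establish the connection:
sudo drbdadm connect r0
```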

<!--LINKS -->
[Distributed Replicated Block Device]: https://ubuntu.com/server/docs/distributed-replicated-block-device-drbd
[Dqlite recovery guide]: restore-quorum
[external datastore guide]: external-datastore
[2ha.sh script]: https://github.com/canonical/k8s-snap/blob/main/k8s/hack/2ha.sh
[getting started]: ../tutorial/getting-started
[add/remove nodes]: ../tutorial/add-remove-nodes
[Pacemaker fencing]: https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/fencing.html
7 changes: 5 additions & 2 deletions k8s/hack/2ha.sh
#!/bin/bash

# This script automates various operations on 2-node HA A-A Canonical K8s
# clusters that use the default datastore, Dqlite.
#
# Prerequisites:
# * required packages installed using the "install_packages" command.
# * initialized K8s cluster, both nodes joined
# * the current user has ssh access to the peer node.
# - used to handle K8s services and transfer Dqlite data
# * the current user has passwordless sudo enabled.
sourced=0
