Replicated Stateful Set (RSS) Operator

Project status: beta

Most major planned features have been completed and while no breaking API changes are currently planned, we reserve the right to address bugs and API changes in a backwards incompatible way before the project is declared stable. See upgrade guide for safe upgrade process.

We expect to consider the operator stable soon; backwards incompatible changes will not be made once the project reaches stability.

See the roadmap.

Overview

The replication operator manages application clusters deployed to Kubernetes and automates tasks related to seeding, and operating a cluster. It has been adapted from the etcd operator from the CoreOS folks.

Create and destroy
Resize
Failover

There are examples of different applications and specs for driving them

Read why the operator exists and how replication is managed.

Read RBAC docs for how to setup RBAC rules for the replication operator if RBAC is in place.

Read Developer Guide for setting up development environment if you want to contribute.

See the Resources and Labels doc for an overview of the resources created by the rss-operator.

Requirements

Kubernetes 1.9+

Demo - Replicated Database

Getting started

Build the image

The demo uses a CentOS 7 based image which contains a number of scripts that wrap the galera OCF resource agent.

You will need to create an image from https://github.com/beekhof/galera-container and push it to docker/quay.

Once complete, modify example/cluster.yaml to point to the image you just published.

Deploy the replication operator

See instructions on how to install/uninstall replication operator .

Create and destroy a cluster

$ kubectl create -f example/cluster.yaml

A 3 member cluster will be created.

$ kubectl get pods
NAME            READY     STATUS    RESTARTS   AGE
rss-example-0   1/1       Running   0          3m
rss-example-1   1/1       Running   0          3m
rss-example-2   1/1       Running   0          3m
rss-operator    1/1       Running   0          6m

In addition to the pod status, we can check the state of the application by examining the Kubernetes Custom Resource representing the cluster:

$ kubectl -n dummy get rss/example -o=jsonpath='{"Primaries: "}{.status.members.primary}{"\n"}{"Members:   "}{.status.members.ready}{"\n"}'
Primaries: [rss-example-0 rss-example-2]
Members:   [rss-example-0 rss-example-1 rss-example-2]

See client service for how to access clusters created by the replication operator.

Destroy a replicated cluster:

$ kubectl delete -f example/cluster.yaml

Resize a cluster

Create a cluster:

$ kubectl create -f example/cluster.yaml

In example/cluster.yaml the initial cluster size is 3. Modify the file and change size from 3 to 5.

$ cat example/cluster.yaml
apiVersion: clusterlabs.org/v1alpha1
kind: ReplicatedStatefulSet
metadata:
  name: example
spec:
  replicas: 5
  reconcileInterval: 30s
  pod:
    antiAffinity: true
    commands:
      sequence: 
        command: ["/sequence.sh"]
      primary: 
        command: ["/start.sh"]
      seed: 
        command: ["/seed.sh"]
      status: 
        timeout: 60s
        command: ["/check.sh"]
      stop: 
        command: ["/stop.sh"]
    containers:
    - name: dummy
      image: quay.io/beekhof/dummy:latest

Apply the size change to the cluster CR:

$ kubectl apply -f example/cluster.yaml

The replicated cluster will scale to 5 members (5 pods):

$ kubectl get pods
NAME            READY     STATUS    RESTARTS   AGE
rss-example-0   1/1       Running   0          5m
rss-example-1   1/1       Running   0          5m
rss-example-2   1/1       Running   0          5m
rss-example-3   1/1       Running   0          2m
rss-example-4   1/1       Running   0          2m
rss-operator    1/1       Running   0          9m

Similarly we can decrease the size of cluster from 5 back to 3 by changing the size field again and reapplying the change.

$ cat example/cluster.yaml
apiVersion: clusterlabs.org/v1alpha1
kind: ReplicatedStatefulSet
metadata:
  name: example
spec:
  replicas: 3
  reconcileInterval: 30s
  pod:
    antiAffinity: true
    commands:
      sequence: 
        command: ["/sequence.sh"]
      primary: 
        command: ["/start.sh"]
      seed: 
        command: ["/seed.sh"]
      status: 
        timeout: 60s
        command: ["/check.sh"]
      stop: 
        command: ["/stop.sh"]
    containers:
    - name: dummy
      image: quay.io/beekhof/dummy:latest

$ kubectl apply -f example/cluster.yaml

We should see that the cluster will eventually reduce to 3 pods:

$ kubectl get pods
NAME            READY     STATUS    RESTARTS   AGE
rss-example-0   1/1       Running   0          8m
rss-example-1   1/1       Running   0          8m
rss-example-2   1/1       Running   0          8m
rss-operator    1/1       Running   0          11m

Failover

If any members crash, the replication operator will automatically recover the failure. Let's walk through in the following steps.

Create a cluster:

$ kubectl create -f example/cluster.yaml

Wait until all three members are up. Simulate a member failure by deleting a pod:

$ kubectl delete pod rss-example-0 --now

The replication operator will recover the failure by recreating the pod rss-example-0 :

$ kubectl get pods
NAME            READY     STATUS    RESTARTS   AGE
rss-example-0   1/1       Running   0          41s
rss-example-1   1/1       Running   0          10m
rss-example-2   1/1       Running   0          10m
rss-operator    1/1       Running   0          13m

Destroy dummy cluster:

$ kubectl delete -f example/cluster.yaml

Replication operator recovery

If the replication operator restarts, it can recover its previous state. Let's walk through in the following steps.

$ kubectl create -f example/cluster.yaml

Wait until all three members are up. Then

$ kubectl delete -f example/operator.yaml
pod "rss-operator" deleted

$ kubectl delete pod rss-example-0  --now
pod "rss-example-0 " deleted

Then restart the replication operator. It should recover itself and the clusters it manages.

$ kubectl create -f example/operator.yaml
pod "rss-operator" created

$ kubectl get pods
NAME            READY     STATUS    RESTARTS   AGE
rss-example-0   1/1       Running   0          4m
rss-example-1   1/1       Running   0          13m
rss-example-2   1/1       Running   0          13m
rss-operator    1/1       Running   0          4m

Limitations

The replication operator only manages clusters created in the same namespace. Users need to create multiple operators in different namespaces to manage clusters in different namespaces.
Lights-out recovery of the replication operator currently requires shared storage. Backup and restore capability will be added in the future if there is sufficient interest.

Name		Name	Last commit message	Last commit date
Latest commit History 2,616 Commits
apps		apps
cmd/operator		cmd/operator
doc		doc
example		example
hack		hack
pkg		pkg
test		test
vendor		vendor
version		version
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
DCO		DCO
Dockerfile		Dockerfile
LICENSE		LICENSE
MAINTAINERS		MAINTAINERS
Makefile		Makefile
README.md		README.md
ROADMAP.md		ROADMAP.md
boilerplate.go.txt		boilerplate.go.txt
code-of-conduct.md		code-of-conduct.md
codecov.yaml		codecov.yaml
glide.lock		glide.lock
glide.yaml		glide.yaml
go-list.sh		go-list.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Replicated Stateful Set (RSS) Operator

Project status: beta

Overview

Requirements

Demo - Replicated Database

Getting started

Build the image

Deploy the replication operator

Create and destroy a cluster

Resize a cluster

Failover

Replication operator recovery

Limitations

About

Releases

Packages

Contributors 33

Languages

License

beekhof/rss-operator

Folders and files

Latest commit

History

Repository files navigation

Replicated Stateful Set (RSS) Operator

Project status: beta

Overview

Requirements

Demo - Replicated Database

Getting started

Build the image

Deploy the replication operator

Create and destroy a cluster

Resize a cluster

Failover

Replication operator recovery

Limitations

About

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

Packages 0

Contributors 33

Languages

Packages