Based on: https://www.cockroachlabs.com/docs/stable/orchestrate-cockroachdb-with-kubernetes-multi-cluster.html
- us-east1 (zones b, c, d): Moncks Corner, South Carolina, USA
- us-east4 (zones a, b, c): Ashburn, Northern Virginia, USA
- us-west1 (zones a, b, c): The Dalles, Oregon, USA
- us-west2 (zones a, b, c): Los Angeles, California, USA
- us-west3 (zones a, b, c): Salt Lake City, Utah, USA
- us-west4 (zones a, b, c): Las Vegas, Nevada, USA
If you need to increase GCP quotas, go to this URL: https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=
gcloud compute firewall-rules create allow-cockroach-internal \
--allow=tcp:26257 --source-ranges=10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
- gcloud is installed locally and configured for authentication and for the correct project
- kubectl is installed locally
MACHINETYPE=e2-standard-16
GCEREGION1=us-west1
GCEREGION2=us-west2
GCEREGION3=us-west3
ACCOUNT=$(gcloud config get-value account)
GCLOUDPROJECT=$(gcloud config get-value project)
CLUSTER1=gke_${GCLOUDPROJECT}_${GCEREGION1}_cockroachdb1
CLUSTER2=gke_${GCLOUDPROJECT}_${GCEREGION2}_cockroachdb2
CLUSTER3=gke_${GCLOUDPROJECT}_${GCEREGION3}_cockroachdb3
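Before creating anything, it can help to echo the derived values to catch typos early. A quick sketch using the variables above:

```shell
# Print the derived values; the context names must match what
# `kubectl config get-contexts` will show after the clusters are created
echo "account:  $ACCOUNT"
echo "project:  $GCLOUDPROJECT"
echo "context1: $CLUSTER1"
echo "context2: $CLUSTER2"
echo "context3: $CLUSTER3"
```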
gcloud container clusters create cockroachdb1 \
--region=$GCEREGION1 --machine-type=$MACHINETYPE --num-nodes=1 \
--cluster-ipv4-cidr=10.1.0.0/16 --node-locations=$GCEREGION1-a,$GCEREGION1-b,$GCEREGION1-c
gcloud container clusters create cockroachdb2 \
--region=$GCEREGION2 --machine-type=$MACHINETYPE --num-nodes=1 \
--cluster-ipv4-cidr=10.2.0.0/16 --node-locations=$GCEREGION2-a,$GCEREGION2-b,$GCEREGION2-c
gcloud container clusters create cockroachdb3 \
--region=$GCEREGION3 --machine-type=$MACHINETYPE --num-nodes=1 \
--cluster-ipv4-cidr=10.3.0.0/16 --node-locations=$GCEREGION3-a,$GCEREGION3-b,$GCEREGION3-c
Notes:
- I tried to use c2-standard-16, but it wasn't available in all zones, so I used e2-standard-16 instead.
- If you don't specify a machine type in the gcloud container clusters create command, it will default to e2-medium.
- I'm specifying explicit ranges here for the pod networks. I don't know whether GCP is smart enough to create pod networks that don't overlap between clusters, but being explicit means I know the ranges won't overlap, and I can be a little more certain about what's actually happening.
- When specifying --num-nodes=1, one node is created in each of the node-locations, so the commands above actually create 9 nodes (3 per cluster).
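The node math in that last note checks out with a one-liner (plain arithmetic, nothing cloud-side):

```shell
# 3 regional clusters x 3 node-locations each x --num-nodes=1 per location
CLUSTERS=3
LOCATIONS_PER_CLUSTER=3
NODES_PER_LOCATION=1
echo $(( CLUSTERS * LOCATIONS_PER_CLUSTER * NODES_PER_LOCATION ))  # prints 9
```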
kubectl config get-contexts
You should see all 3 GCP k8s clusters listed.
kubectl create clusterrolebinding $USER-cluster-admin-binding --clusterrole=cluster-admin --user=$ACCOUNT --context=$CLUSTER1
kubectl create clusterrolebinding $USER-cluster-admin-binding --clusterrole=cluster-admin --user=$ACCOUNT --context=$CLUSTER2
kubectl create clusterrolebinding $USER-cluster-admin-binding --clusterrole=cluster-admin --user=$ACCOUNT --context=$CLUSTER3
mkdir multiregion
cd $_
curl -OOOOOOOOO https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/multiregion/{README.md,client-secure.yaml,cluster-init-secure.yaml,cockroachdb-statefulset-secure.yaml,dns-lb.yaml,example-app-secure.yaml,external-name-svc.yaml,setup.py,teardown.py}
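A quick way to confirm all nine files actually downloaded (a sketch; it just checks for the filenames listed above):

```shell
# Confirm each expected file exists; prints MISSING for any that didn't download
for f in README.md client-secure.yaml cluster-init-secure.yaml \
         cockroachdb-statefulset-secure.yaml dns-lb.yaml example-app-secure.yaml \
         external-name-svc.yaml setup.py teardown.py; do
  [ -f "$f" ] && echo "ok      $f" || echo "MISSING $f"
done
```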
Update setup.py with the correct contexts and regions variables:
contexts = {
'us-west1': 'gke_cockroachlabs-hatcher-284815_us-west1_cockroachdb1',
'us-west2': 'gke_cockroachlabs-hatcher-284815_us-west2_cockroachdb2',
'us-west3': 'gke_cockroachlabs-hatcher-284815_us-west3_cockroachdb3',
}
regions = {
}
Update teardown.py with the correct contexts variable (same as above):
contexts = {
'us-west1': 'gke_cockroachlabs-hatcher-284815_us-west1_cockroachdb1',
'us-west2': 'gke_cockroachlabs-hatcher-284815_us-west2_cockroachdb2',
'us-west3': 'gke_cockroachlabs-hatcher-284815_us-west3_cockroachdb3',
}
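Rather than typing the dict by hand, you can generate it from the variables defined earlier and paste the output into both setup.py and teardown.py. This is just a sketch; the fallback values below are placeholders for when the variables aren't set, and your project name will differ.

```shell
# Emit a Python 'contexts' dict from the region/project variables
# (defaults are illustrative only)
GCLOUDPROJECT=${GCLOUDPROJECT:-my-gcp-project}
GCEREGION1=${GCEREGION1:-us-west1}
GCEREGION2=${GCEREGION2:-us-west2}
GCEREGION3=${GCEREGION3:-us-west3}
echo "contexts = {"
i=1
for r in "$GCEREGION1" "$GCEREGION2" "$GCEREGION3"; do
  echo "    '$r': 'gke_${GCLOUDPROJECT}_${r}_cockroachdb${i}',"
  i=$((i+1))
done
echo "}"
```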
To get the best performance, we want to make sure that we're:
- using SSD disks
- using larger drives, which give us better IOPS and throughput
- requesting enough CPU and memory that our k8s pods get distributed onto different k8s nodes, avoiding noisy-neighbor issues
Find this section (towards the bottom) and make the noted edits:
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes:
        - "ReadWriteOnce"
      # add this next line
      storageClassName: storage-class-ssd
      resources:
        requests:
          # change this next line from 100Gi to 1024Gi
          storage: 1024Gi
Find this section and make the noted edits:
      containers:
      - name: cockroachdb
        image: cockroachdb/cockroach:v20.1.4
        imagePullPolicy: IfNotPresent
        # add this next "resources" section
        resources:
          requests:
            cpu: "15"
            memory: "60G"
        ports:
cat << EOF > storage-class-ssd.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: storage-class-ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
EOF
kubectl create -f storage-class-ssd.yaml --context $CLUSTER1
kubectl create -f storage-class-ssd.yaml --context $CLUSTER2
kubectl create -f storage-class-ssd.yaml --context $CLUSTER3
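To double-check the storage class landed in all three clusters before running setup.py, you can loop over the contexts from earlier (a sketch):

```shell
# Each cluster should report the storage-class-ssd StorageClass
for ctx in "$CLUSTER1" "$CLUSTER2" "$CLUSTER3"; do
  kubectl get storageclass storage-class-ssd --context "$ctx"
done
```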
python2.7 setup.py
kubectl get pods --selector app=cockroachdb --all-namespaces --context $CLUSTER1
kubectl get pods --selector app=cockroachdb --all-namespaces --context $CLUSTER2
kubectl get pods --selector app=cockroachdb --all-namespaces --context $CLUSTER3
You should see 3/3 in the READY column for each of the CockroachDB pods.
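Instead of polling get pods by hand, you can block until the pods report Ready. This is a sketch; it assumes setup.py created one namespace per region, named after the region (which is how the port-forward command further down addresses the pods as well).

```shell
# Wait for the CockroachDB pods in each cluster; region name = namespace here
for pair in "$GCEREGION1:$CLUSTER1" "$GCEREGION2:$CLUSTER2" "$GCEREGION3:$CLUSTER3"; do
  region=${pair%%:*}
  ctx=${pair#*:}
  kubectl wait pods --selector app=cockroachdb --for=condition=Ready \
    --timeout=600s --namespace "$region" --context "$ctx"
done
```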
gcloud container node-pools create workloadnodes --cluster=cockroachdb1 --disk-type=pd-ssd --machine-type=$MACHINETYPE --num-nodes=1 --zone=$GCEREGION1 --node-locations=$GCEREGION1-a
gcloud container node-pools create workloadnodes --cluster=cockroachdb2 --disk-type=pd-ssd --machine-type=$MACHINETYPE --num-nodes=1 --zone=$GCEREGION2 --node-locations=$GCEREGION2-a
gcloud container node-pools create workloadnodes --cluster=cockroachdb3 --disk-type=pd-ssd --machine-type=$MACHINETYPE --num-nodes=1 --zone=$GCEREGION3 --node-locations=$GCEREGION3-a
kubectl create -f client-secure.yaml --context $CLUSTER1
kubectl create -f client-secure.yaml --context $CLUSTER2
kubectl create -f client-secure.yaml --context $CLUSTER3
kubectl exec -it cockroachdb-client-secure --context $CLUSTER1 --namespace default -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public
kubectl exec -it cockroachdb-client-secure --context $CLUSTER2 --namespace default -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public
kubectl exec -it cockroachdb-client-secure --context $CLUSTER3 --namespace default -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public
kubectl exec -it cockroachdb-client-secure --context $CLUSTER1 --namespace default \
-- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public \
--execute="CREATE USER roach WITH PASSWORD 'whateverpasswordyouwant'; GRANT admin TO roach;"
kubectl port-forward cockroachdb-0 8080 --context $CLUSTER1 --namespace $GCEREGION1
kubectl exec -it cockroachdb-client-secure --context $CLUSTER1 --namespace default \
-- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public \
--execute="SET CLUSTER SETTING cluster.organization = 'MultiRegionDemo'; SET CLUSTER SETTING enterprise.license = 'crl-0-?????????????';"
Note: you need the license in place to do the partitioning stuff in the next step.
time kubectl exec -it cockroachdb-client-secure --context $CLUSTER1 --namespace default -- \
./cockroach workload fixtures import tpcc \
--warehouses 1000 \
--partition-affinity=0 --partitions=3 --partition-strategy=leases --zones=us-west1,us-west2,us-west3 \
'postgresql://root@cockroachdb-public:26257?sslmode=verify-full&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key&sslrootcert=/cockroach-certs/ca.crt'
Note: I'm using cockroach workload fixtures import tpcc instead of cockroach workload init tpcc. By doing so, the data is imported rather than being inserted, which runs in about 30 minutes as opposed to 90 minutes.
Note: You only need to run the import once, from one of the clusters, not all three.
kubectl exec -it cockroachdb-client-secure --context $CLUSTER1 --namespace default -- \
./cockroach dump tpcc --dump-mode=schema --certs-dir=/cockroach-certs --host=cockroachdb-public > schema.txt
kubectl exec -it cockroachdb-client-secure --context $CLUSTER1 --namespace default -- \
./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public \
--execute="SHOW ALL ZONE CONFIGURATIONS;" > zoneconfigs.txt
kubectl exec -it cockroachdb-client-secure --context $CLUSTER1 --namespace default -- \
./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public \
--execute="SELECT COUNT(1), table_name, lease_holder_locality FROM [SHOW RANGES FROM DATABASE tpcc] GROUP BY 2,3 ORDER BY 3,2;" > ranges.txt
kubectl exec -it cockroachdb-client-secure --context $CLUSTER1 --namespace default -- \
./cockroach workload run tpcc \
--duration=60m --warehouses 1000 --ramp=180s \
--partition-affinity=0 --partitions=3 --partition-strategy=leases \
'postgresql://root@cockroachdb-public:26257?sslmode=verify-full&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key&sslrootcert=/cockroach-certs/ca.crt'
kubectl exec -it cockroachdb-client-secure --context $CLUSTER2 --namespace default -- \
./cockroach workload run tpcc \
--duration=60m --warehouses 1000 --ramp=180s \
--partition-affinity=1 --partitions=3 --partition-strategy=leases \
'postgresql://root@cockroachdb-public:26257?sslmode=verify-full&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key&sslrootcert=/cockroach-certs/ca.crt'
kubectl exec -it cockroachdb-client-secure --context $CLUSTER3 --namespace default -- \
./cockroach workload run tpcc \
--duration=60m --warehouses 1000 --ramp=180s \
--partition-affinity=2 --partitions=3 --partition-strategy=leases \
'postgresql://root@cockroachdb-public:26257?sslmode=verify-full&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key&sslrootcert=/cockroach-certs/ca.crt'
You can see the summary of the three workload apps after they've finished. And you can look at metrics from the Admin UI.
Your output should look like:
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
3600.0s 0 575026 159.7 98.2 109.1 184.5 251.7 1208.0
Audit check 9.2.1.7: PASS
Audit check 9.2.2.5.1: PASS
Audit check 9.2.2.5.2: PASS
Audit check 9.2.2.5.3: PASS
Audit check 9.2.2.5.4: PASS
Audit check 9.2.2.5.5: PASS
Audit check 9.2.2.5.6: PASS
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
3600.0s 4160.5 32.4% 128.1 117.4 159.4 226.5 268.4 1040.2
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
3600.0s 0 575130 159.8 72.3 71.3 142.6 201.3 2281.7
Audit check 9.2.1.7: PASS
Audit check 9.2.2.5.1: PASS
Audit check 9.2.2.5.2: PASS
Audit check 9.2.2.5.3: PASS
Audit check 9.2.2.5.4: PASS
Audit check 9.2.2.5.5: PASS
Audit check 9.2.2.5.6: PASS
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
3600.0s 4162.2 32.4% 94.1 88.1 130.0 176.2 218.1 2281.7
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
3600.0s 0 577825 160.5 70.8 58.7 176.2 234.9 1040.2
Audit check 9.2.1.7: PASS
Audit check 9.2.2.5.1: PASS
Audit check 9.2.2.5.2: PASS
Audit check 9.2.2.5.3: PASS
Audit check 9.2.2.5.4: PASS
Audit check 9.2.2.5.5: PASS
Audit check 9.2.2.5.6: PASS
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
3600.0s 4178.4 32.5% 90.1 88.1 130.0 184.5 243.3 1040.2
python2.7 teardown.py
kubectl delete storageclass storage-class-ssd --context $CLUSTER1
kubectl delete storageclass storage-class-ssd --context $CLUSTER2
kubectl delete storageclass storage-class-ssd --context $CLUSTER3
gcloud container clusters delete cockroachdb1 --region=$GCEREGION1 --quiet
gcloud container clusters delete cockroachdb2 --region=$GCEREGION2 --quiet
gcloud container clusters delete cockroachdb3 --region=$GCEREGION3 --quiet