Deploy Persistent Storage

Helm Install

If you have not already done so, install Helm and add the Akash repo by following the steps in this guide.

All steps in this section should be conducted from the Kubernetes control plane node on which Helm has been installed.
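
As an optional sanity check, you can confirm that both Helm and kubectl are usable from that node before proceeding:

helm version --short
kubectl get nodes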

Rook has published the following Helm charts for the Ceph storage provider:

  • Rook Ceph Operator: Starts the Ceph Operator, which will watch for Ceph CRs (custom resources)
  • Rook Ceph Cluster: Creates Ceph CRs that the operator will use to configure the cluster

The Helm charts are intended to simplify deployment and upgrades.

Persistent Storage Deployment

  • Note - if any issues are encountered during the Rook deployment, tear down the Rook-Ceph components via the steps listed here and begin anew.
  • Deployment typically takes approximately 10 minutes to complete.
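
While the deployment settles, one way to follow its progress (assuming the rook-ceph namespace used in the steps below) is to watch the pods until they reach Running or Completed:

kubectl -n rook-ceph get pods -w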

Migration procedure

If you already have the akash-rook helm chart installed, make sure to use the following documentation:

Rook Ceph repository

Add Repo

  • Add the Rook repo to Helm
helm repo add rook-release https://charts.rook.io/release
  • Expected/Example Result
# helm repo add rook-release https://charts.rook.io/release

"rook-release" has been added to your repositories

Verify Repo

  • Verify the Rook repo has been added
helm search repo rook-release --version v1.13.5
  • Expected/Example Result
# helm search repo rook-release --version v1.13.5

NAME                          	CHART VERSION	APP VERSION	DESCRIPTION                                       
rook-release/rook-ceph        	v1.13.5       	v1.13.5     	File, Block, and Object Storage Services for yo...
rook-release/rook-ceph-cluster	v1.13.5       	v1.13.5     	Manages a single Ceph cluster namespace for Rook

Deployment Steps

STEP 1 - Install Ceph Operator Helm Chart

TESTING

Scroll further for PRODUCTION

For additional Operator chart values refer to this page.

All In One Provisioner Replicas

For all-in-one deployments, you will likely want only one replica of the CSI provisioners.

  • Add the following to the rook-ceph-operator.values.yml file created in the subsequent step
  • By setting provisionerReplicas to 1, you ensure that only a single replica of the CSI provisioner is deployed. This defaults to 2 when it is not explicitly set.
csi:
  provisionerReplicas: 1

Default Resource Limits

You can disable the default resource limits with the following YAML config; this is useful when testing:

cat > rook-ceph-operator.values.yml << 'EOF'
resources:
csi:
  csiRBDProvisionerResource:
  csiRBDPluginResource:
  csiCephFSProvisionerResource:
  csiCephFSPluginResource:
  csiNFSProvisionerResource:
  csiNFSPluginResource:
EOF
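
If you want both the single-replica CSI provisioner and the disabled resource limits, the two snippets above can be combined into one file. A minimal sketch of such a rook-ceph-operator.values.yml (same keys as above, merged under csi):

cat > rook-ceph-operator.values.yml << 'EOF'
resources:
csi:
  provisionerReplicas: 1
  csiRBDProvisionerResource:
  csiRBDPluginResource:
  csiCephFSProvisionerResource:
  csiCephFSPluginResource:
  csiNFSProvisionerResource:
  csiNFSPluginResource:
EOF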

Install the Operator Chart

helm install --create-namespace -n rook-ceph rook-ceph rook-release/rook-ceph --version 1.13.5 -f rook-ceph-operator.values.yml

PRODUCTION

No customization is required by default.

  • Install the Operator chart:
helm install --create-namespace -n rook-ceph rook-ceph rook-release/rook-ceph --version 1.13.5
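
Whether you used the TESTING or PRODUCTION path, you can verify the operator started before moving on, for example by querying pods with the app=rook-ceph-operator label used by the operator deployment:

kubectl -n rook-ceph get pods -l app=rook-ceph-operator

The operator pod should reach the Running state within a couple of minutes.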

STEP 2 - Install Ceph Cluster Helm Chart

For additional Cluster chart values refer to this page.
For custom storage configuration refer to this example.

TESTING / ALL-IN-ONE SETUP

For a production multi-node setup, skip this section and scroll down to PRODUCTION SETUP

Preliminary Steps

  1. Device Filter: Update deviceFilter to match your specific disk configuration (see the example after this list).
  2. Storage Class: Modify the storageClass name from beta3 to an appropriate one, as outlined in the Storage Class Types table.
  3. Node Configuration: Under the nodes section, list the nodes designated for Ceph storage, replacing placeholders like node1, node2, etc., with your Kubernetes node names.
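
For the deviceFilter in step 1, a quick way to see which block devices exist on a storage node (run on the node itself) is:

lsblk -d -o NAME,SIZE,TYPE,ROTA

A filter such as "^nvme." (used in the example below) would then match devices named nvme0n1, nvme1n1, and so on.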

Configuration for All-in-One or Single Storage Node

When setting up an all-in-one production provider or a single storage node with multiple storage drives (minimum requirement: 3 drives, or 2 drives if osdsPerDevice is set to 2):

  1. Failure Domain: Set failureDomain to osd.
  2. Size Settings:
    • The size and osd_pool_default_size should always be set to osdsPerDevice + 1 when failureDomain is set to osd.
    • Set min_size and osd_pool_default_min_size to 2.
    • Set size and osd_pool_default_size to 3. Note: These can be set to 2 if you have a minimum of 3 drives and osdsPerDevice is 1.
  3. Resource Allocation: To ensure Ceph services receive sufficient resources, comment out or remove the resources: field before execution.
cat > rook-ceph-cluster.values.yml << 'EOF'
operatorNamespace: rook-ceph

configOverride: |
  [global]
  osd_pool_default_pg_autoscale_mode = on
  osd_pool_default_size = 1
  osd_pool_default_min_size = 1

cephClusterSpec:
  resources:

  mon:
    count: 1
  mgr:
    count: 1

  storage:
    useAllNodes: false
    useAllDevices: false
    deviceFilter: "^nvme."
    config:
      osdsPerDevice: "1"
    nodes:
    - name: "node1"
      config:

cephBlockPools:
  - name: akash-deployments
    spec:
      failureDomain: host
      replicated:
        size: 1
      parameters:
        min_size: "1"
        bulk: "true"
    storageClass:
      enabled: true
      name: beta3
      isDefault: true
      reclaimPolicy: Delete
      allowVolumeExpansion: true
      parameters:
        # RBD image format. Defaults to "2".
        imageFormat: "2"
        # RBD image features. Available for imageFormat: "2". CSI RBD currently supports only `layering` feature.
        imageFeatures: layering
        # The secrets contain Ceph admin credentials.
        csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
        csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
        csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
        csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
        csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
        csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
        # Specify the filesystem type of the volume. If not specified, csi-provisioner
        # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
        # in hyperconverged settings where the volume is mounted on the same node as the osds.
        csi.storage.k8s.io/fstype: ext4

# Do not create default Ceph file systems, object stores
cephFileSystems:
cephObjectStores:

# Spawn rook-ceph-tools, useful for troubleshooting
toolbox:
  enabled: true
  resources:
EOF
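
If you are instead applying the single-node/multi-drive guidance above (failureDomain set to osd), the relevant fields change along these lines. This is only a sketch, assuming 2 drives with osdsPerDevice set to 2 (so size = osdsPerDevice + 1 = 3); the remaining fields stay as in the file above:

configOverride: |
  [global]
  osd_pool_default_pg_autoscale_mode = on
  osd_pool_default_size = 3
  osd_pool_default_min_size = 2

cephClusterSpec:
  storage:
    config:
      osdsPerDevice: "2"

cephBlockPools:
  - name: akash-deployments
    spec:
      failureDomain: osd
      replicated:
        size: 3
      parameters:
        min_size: "2"
        bulk: "true"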

PRODUCTION SETUP

Core Configuration

  1. Device Filter: Update deviceFilter to match your disk specifications.
  2. Storage Class: Change the storageClass name from beta3 to a suitable one, as specified in the Storage Class Types table.
  3. OSDs Per Device: Adjust osdsPerDevice according to the guidelines provided in the aforementioned table.
  4. Node Configuration: In the nodes section, add your nodes for Ceph storage, ensuring to replace node1, node2, etc., with the actual names of your Kubernetes nodes.
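
For step 4, the exact node names can be taken from the cluster itself:

kubectl get nodes

Use the values from the NAME column in place of node1, node2, and node3.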

Configuration for a Single Storage Node

For a setup involving a single storage node with multiple storage drives (minimum: 3 drives, or 2 drives if osdsPerDevice = 2):

  1. Failure Domain: Set failureDomain to osd.
  2. Size Settings:
    • The size and osd_pool_default_size should always be set to osdsPerDevice + 1 when failureDomain is set to osd.
    • Set min_size and osd_pool_default_min_size to 2.
    • Set size and osd_pool_default_size to 3. Note: These can be set to 2 if you have a minimum of 3 drives and osdsPerDevice is 1.
cat > rook-ceph-cluster.values.yml << 'EOF'
operatorNamespace: rook-ceph

configOverride: |
  [global]
  osd_pool_default_pg_autoscale_mode = on
  osd_pool_default_size = 3
  osd_pool_default_min_size = 2

cephClusterSpec:

  mon:
    count: 3
  mgr:
    count: 2

  storage:
    useAllNodes: false
    useAllDevices: false
    deviceFilter: "^nvme."
    config:
      osdsPerDevice: "2"
    nodes:
    - name: "node1"
      config:
    - name: "node2"
      config:
    - name: "node3"
      config:

cephBlockPools:
  - name: akash-deployments
    spec:
      failureDomain: host
      replicated:
        size: 3
      parameters:
        min_size: "2"
        bulk: "true"
    storageClass:
      enabled: true
      name: beta3
      isDefault: true
      reclaimPolicy: Delete
      allowVolumeExpansion: true
      parameters:
        # RBD image format. Defaults to "2".
        imageFormat: "2"
        # RBD image features. Available for imageFormat: "2". CSI RBD currently supports only `layering` feature.
        imageFeatures: layering
        # The secrets contain Ceph admin credentials.
        csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
        csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
        csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
        csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
        csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
        csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
        # Specify the filesystem type of the volume. If not specified, csi-provisioner
        # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
        # in hyperconverged settings where the volume is mounted on the same node as the osds.
        csi.storage.k8s.io/fstype: ext4

# Do not create default Ceph file systems, object stores
cephFileSystems:
cephObjectStores:

# Spawn rook-ceph-tools, useful for troubleshooting
toolbox:
  enabled: true
  #resources:
EOF
  • Install the Cluster chart:
helm install --create-namespace -n rook-ceph rook-ceph-cluster \
   --set operatorNamespace=rook-ceph rook-release/rook-ceph-cluster --version 1.13.5 -f rook-ceph-cluster.values.yml
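
It can take on the order of 10 minutes for all components to come up after the chart is installed. One way to confirm the cluster is healthy is to query Ceph through the toolbox pod (the same pod-lookup pattern is used in STEP 4 below):

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph status

The output should eventually report HEALTH_OK, and kubectl get sc should now list the storageClass (beta3 in these examples).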

STEP 3 - Label the storageClass

This label is mandatory; it is used by Akash's inventory-operator to discover the storageClass.

  • Change beta3 to the storageClass you selected earlier
kubectl label sc beta3 akash.network=true
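  • Optionally, verify the label was applied (again replacing beta3 with your storageClass)
kubectl get sc beta3 --show-labels

The akash.network=true label should appear in the LABELS column.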

STEP 4 - Update Failure Domain (Single Storage Node or All-In-One Scenarios Only)

When running a single storage node or an all-in-one setup, make sure to change the failure domain from host to osd for the .mgr pool.

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash

ceph osd crush rule create-replicated replicated_rule_osd default osd
ceph osd pool set .mgr crush_rule replicated_rule_osd
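
Still inside the toolbox shell, you can optionally confirm that the pool now uses the new rule:

ceph osd pool get .mgr crush_rule

This should report the replicated_rule_osd rule created above.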