sesdev
is a CLI tool to deploy Ceph clusters (both the upstream and SUSE
downstream versions).
This tool uses Vagrant behind the scenes to create the VMs and run the deployment scripts.
- Installation
- Usage
- Create/deploy a Ceph cluster
- Bare bone cluster
- CaaSP (with or without Rook/Ceph/SES)
- k3s (with or without Rook/Ceph/SES)
- On a remote libvirt server via SSH
- Using salt instead of DeepSea/ceph-salt CLI
- With a FQDN environment
- Without the devel repo
- With an additional custom zypper repo
- With a set of custom zypper repos completely replacing the default repos
- With custom image paths
- With custom default roles
- config.yaml examples
- With wire encryption
- Deploying non-SUSE environments
- Introspect existing deployments
- List existing deployments
- SSH access to a cluster
- Copy files into and out of a cluster
- Services port-forwarding
- Replace ceph-salt
- Replace MGR modules
- Add a repo to a cluster
- Link two clusters together
- Temporarily stop a cluster
- Destroy a cluster
- Run "make check"
- Custom provisioning
- Create/deploy a Ceph cluster
- Common pitfalls
- Domain about to create is already taken
- Storage pool not found: no storage pool with matching name 'default'
- When sesdev deployments get destroyed, virtual networks get left behind
- sesdev destroy reported an error
- "Failed to connect socket" error when attempting to use remote libvirt server
- mount.nfs: Unknown error 521
- Problems accessing dashboard on remote sesdev
- Error creating IPv6 cluster
- Failed to initialize libnetcontrol
- Contributing
First, you need both QEMU and libvirt installed on the machine that will host the VMs created by sesdev (using Vagrant behind the scenes).
If you are on SUSE Linux Enterprise, make sure you have the Server Applications Module available on the system. (Links to internal repos are available here.)
Run the following commands as root:
# zypper -n install -t pattern kvm_server kvm_tools
# systemctl enable libvirtd
# systemctl restart libvirtd
If you are running libvirt on the same machine where you installed sesdev, add your user to the "libvirt" group to avoid "no polkit agent available" errors when vagrant attempts to connect to the libvirt daemon:
# groupadd libvirt
groupadd: group 'libvirt' already exists
# usermod -a -G libvirt $USER
Log out, and then log back in. You should now be a member of the "libvirt" group.
sesdev needs Vagrant to work. Vagrant can be installed in a number of ways, depending on your environment:
On very new OSes like these, Vagrant is included in the operating system's base
repos. Just install the vagrant
and vagrant-libvirt
packages.
For SLE-15-SP2, the packages are available via the SUSE Package Hub. (Links to internal repos are available here.)
To install Vagrant on these systems, run the following commands as root:
# zypper ar https://download.opensuse.org/repositories/Virtualization:/vagrant/<repo> vagrant_repo
# zypper ref
# zypper -n install vagrant vagrant-libvirt
Where <repo>
can be any of the openSUSE build targets currently enabled for
the Virtualization:vagrant/vagrant package in the openSUSE Build Service.
Be aware that Virtualization:vagrant
is a development project where updates
to the latest official openSUSE vagrant packages are prepared. That means the
vagrant packages in this repo will tend to be new and, sometimes, even broken.
In that case, read on to the next section.
If you find that, for whatever reason, you cannot get a working vagrant package from OBS, it is possible to install vagrant from the official RPMs published on the Hashicorp website.
To install vagrant and its libvirt plugin from Hashicorp, the following procedure has been known to work with vagrant 2.4.0 (run the commands as root):
- download vagrant RPM from https://releases.hashicorp.com/vagrant/
- install gcc, make and libvirt-devel (zypper install gcc make libvirt-devel)
- install vagrant (rpm -i <the RPM you just downloaded>)
- delete the file that causes libvirt plugin compilation to fail (rm /opt/vagrant/embedded/lib/libreadline.so.8)
Finally, run the following command as the user you run sesdev with:
vagrant plugin install vagrant-libvirt
Proceed to Install sesdev from source, then refer to the Usage chapter, below, for further information.
Run the following commands as root:
# dnf install qemu-common qemu-kvm libvirt-daemon-kvm \
libvirt-daemon libvirt-daemon-driver-qemu vagrant-libvirt
# systemctl enable libvirtd
# systemctl restart libvirtd
Proceed to Install sesdev from source, then refer to the Usage chapter, below, for further information.
sesdev is known to work on recent Ubuntu versions. Follow the instructions given in Install sesdev from source.
sesdev uses the libvirt API Python bindings, and these cannot be installed via pip unless the RPM packages "gcc", "python3-devel", and "libvirt-devel" are installed, first. Also, in order to clone the sesdev git repo, the "git-core" package is needed. So, before proceeding, make sure that all of these packages are installed in the system:
# zypper -n install gcc git-core libvirt-devel python3-devel python3-virtualenv
# apt-get install -y git gcc libvirt-dev \
virtualenv python3-dev python3-venv python3-virtualenv
# dnf install -y git-core gcc libvirt-devel \
python3-devel python3-virtualenv
Now you can proceed to clone the sesdev source code repo and bootstrap it:
$ git clone https://github.com/SUSE/sesdev.git
$ cd sesdev
$ ./bootstrap.sh
Before you can use sesdev
, you need to activate the Python virtual environment
created by the bootstrap.sh
script. The script tells you how to do this, but
we'll give the command here, anyway:
source venv/bin/activate
At this point, sesdev should be installed and ready to use: refer to the Usage chapter, below, for further information.
To leave the virtual environment, simply run:
deactivate
CAVEAT: Remember to re-run ./bootstrap.sh
after each git pull.
If you are preparing a code change for submission and would like to run the unit tests on it, follow the instructions below.
First, make sure you have installed sesdev from source following the instructions from here.
Second, make sure your virtualenv is active (source venv/bin/activate
).
At this point, run tox --version
to check if tox is already installed on your
system. If it is not, then run pip3 install tox
to install it in the Python
virtual environment.
Next, inspect the list of testing environments in tox.ini
and choose one or
more that you are interested in. Here is an example, but the actual output might
be different:
$ tox --listenvs
py3
lint
(This means you have two testing environments to choose from: py3
and lint
.)
Finally, run your chosen test environment(s):
tox -e py3
tox -e lint
If you don't know which testing environment to choose, the command tox
will
run all the testing environments.
First, generate the autocompletion code for the shell of your choice. This example assumes the bash shell, but zsh and fish are supported too and work analogously:
sesdev shell-completion bash > ~/.sesdev-completion.sh
Then source it in your shell's rc file, for bash that is ~/.bashrc
:
source ~/.sesdev-completion.sh
Run sesdev --help
or sesdev <command> --help
to get the available
options and description of the commands.
To create a single node Ceph cluster based on nautilus/leap-15.1 on your local system, run the following command:
$ sesdev create nautilus --single-node mini
The mini
argument is the ID of the deployment. It is optional: if you omit it,
sesdev will assign an ID as it sees fit. You can create many deployments by
giving them different IDs.
To create a multi-node Ceph cluster, you can specify the nodes and their roles
using the --roles
option.
The roles of each node are grouped in square brackets, separated by commas. The nodes are separated by commas, too.
The following roles can be assigned:
- master - The master node, running management components like the Salt master
- admin - signifying that the node should get ceph.conf and keyring [1]
- bootstrap - The node where cephadm bootstrap will be run
- client - Various Ceph client utilities
- nfs - NFS (Ganesha) gateway [2] [4]
- grafana - Grafana metrics visualization (requires Prometheus) [3]
- igw - iSCSI target gateway
- mds - CephFS MDS
- mgr - Ceph Manager instance
- mon - Ceph Monitor instance
- prometheus - Prometheus monitoring [3]
- rgw - Ceph Object Gateway
- storage - OSD storage daemon [3]
- suma - SUSE Manager (octopus only)
[1] CAVEAT: sesdev applies the admin
role to all nodes, regardless of whether
or not the user specified it explicitly on the command line or in config.yaml
.
[2] The nfs
role may also be used -- by itself on a dedicated VM -- when
deploying a CaaSP cluster. See Rook and CaaSP based Ceph
cluster for more information.
[3] Do not use the storage
role when deploying Rook/Ceph over CaaSP. See
Rook and CaaSP based Ceph cluster for more
information.
[4] Not currently supported by the octopus, pacific, or master deployment versions.
The following example will generate a cluster with four nodes: the master (Salt Master) node that is also running a MON daemon; a storage (OSD) node that will also run a MON, a MGR and an MDS and serve as the bootstrap node; another storage (OSD) node with MON, MGR, and MDS; and a fourth node that will run an iSCSI gateway, an NFS (Ganesha) gateway, and an RGW gateway.
$ sesdev create nautilus --roles="[master, mon], [bootstrap, storage, mon, mgr, mds], \
[storage, mon, mgr, mds], [igw, nfs, rgw]"
An important use case of sesdev is to create "bare bone" clusters: i.e., clusters with almost nothing running on them, but ready for manual testing of deployment procedures, or just playing around.
Some caveats apply:
- These caveats apply only to core (Ceph) deployment versions. Rook/CaaSP is different: see Rook and CaaSP based Ceph cluster for details.
- For nautilus and ses6, the only role required is master, and you can use --stop-before-deepsea-stage to control how many DeepSea stages are run.
- For octopus, ses7, ses7p, and pacific, the only roles required are master and bootstrap. While it is possible to stop the deployment script at various stages (see sesdev create octopus --help for details), in general sesdev will try to deploy Ceph services/daemons according to the roles given by the user.
- You can specify a node with no roles like so: []
- Ordinarily, a node gets extra disks ("OSD disks") only when the storage role is specified. However, to facilitate deployment of "bare bone" clusters, sesdev will also create and attach disks if the user explicitly gives the --num-disks option.
- Disks will not be created/attached to nodes that have only the master role and no other roles.
Example:
sesdev create octopus --roles="[master],[mon,mgr,bootstrap],[],[]" --num-disks 3
This will bootstrap an octopus cluster with:
- an "admin node" (
[master]
) - a bootstrap node (
[mon,mgr,bootstrap]
) - two empty nodes (
[]
) ready for "Day 2" operations
To create a CaaSP k8s cluster that has a loadbalancer node, 2 worker nodes, and a master node:
$ sesdev create caasp4
By default it just creates and configures a CaaSP cluster, and workers don't
have any disks unless the --deploy-ses
(see below) or --num-disks
options
are given.
To create workers with disks and without a loadbalancer
role:
$ sesdev create caasp4 --roles="[master], [worker], [worker]" --disk-size 6 --num-disks 2
Note: sesdev does not support sharing of roles on a single caasp4
node. Each
node must have one and only one role. However, it is still possible to deploy
a single-node cluster (see below). In this case the master node will also
function as a worker node even though the worker
role is not explicitly given.
For persistent storage, there are two options: either deploy SES with Rook (see
below), or specify an nfs
role -- always by itself on a dedicated node. In the
latter case, sesdev will create a node acting as an NFS server as well as an NFS
client pod in the CaaSP cluster, providing a persistent store for other
(containerized) applications.
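For example, a cluster with a dedicated NFS node (the nfs role by itself, per the rule above) might be created with:
$ sesdev create caasp4 --roles="[master], [worker], [worker], [nfs]"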
To have sesdev deploy Rook on the CaaSP cluster, give the --deploy-ses
option.
The default disk size is 8G, number of worker nodes 2, number of disks per
worker node 3:
$ sesdev create caasp4 --deploy-ses
Note: the storage
role should never be given in a caasp4
cluster. By
default, Rook will look for any spare block devices on worker nodes (i.e.,
all block devices but the first (OS disk) of any given worker) and create OSD
pods for them. Just be aware that sesdev will not create these "spare block
devices" unless you explicitly pass either the --num-disks
or the
--deploy-ses
option (or both).
To create a single-node CaaSP cluster, use --single-node
option. This may be
given in combination with --deploy-ses
, or by itself. For example, the
following command creates a CaaSP cluster on one node with four disks (8G) and
also deploys SES/Ceph on it, using Rook:
$ sesdev create caasp4 --single-node --deploy-ses
Note: since passing --single-node
without an explicit deployment name causes
the name to be set to DEPLOYMENT_VERSION-mini
, the resulting cluster from the
example above would be called caasp4-mini
.
To create a k3s cluster that has 4 worker
nodes and
a master
node:
$ sesdev create k3s
This uses curl -sfL https://get.k3s.io | sh -
to install k3s,
and curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
to install helm.
By default it just creates and configures a k3s cluster, and workers don't
have any disks unless the --deploy-ses
, --deploy-longhorn
(see below)
or --num-disks
options are given.
To have sesdev deploy Rook on the k3s cluster, give the --deploy-ses
option.
The default disk size is 8G, number of worker nodes 4, number of disks per
worker node 3:
$ sesdev create k3s --deploy-ses
To have sesdev deploy Longhorn instead of Ceph, give
the --deploy-longhorn
option. By default this will deploy 4 worker nodes
each with one additional 8G disk, mounted at /var/lib/longhorn, and will
install the latest stable version of Longhorn:
$ sesdev create k3s --deploy-longhorn
To deploy a specific version of Longhorn, use the --longhorn-version
option:
$ sesdev create k3s --deploy-longhorn --longhorn-version=1.4.1
Currently Longhorn deployments will only use one disk. If more are
specified using the --num-disks
option, only the first disk will be
mounted for use by Longhorn. All other additional disks will remain
untouched.
If you would like to start the cluster VMs on a remote server via libvirt/SSH,
create a configuration file $HOME/.sesdev/config.yaml
with the following
content:
libvirt_use_ssh: true
libvirt_user: <ssh_user>
libvirt_private_key_file: <private_key_file> # defaults to $HOME/.ssh/id_rsa
libvirt_host: <hostname|ip address>
Note that passwordless SSH access to this user@host combination needs to be configured and enabled.
By default, sesdev will use the DeepSea CLI to run the DeepSea Stages (nautilus, ses6) or the "ceph-salt" command to apply the ceph-salt Salt Formula (ses7, octopus, pacific).
If you would rather use Salt directly, give the --salt
option on the sesdev create
command line.
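For example, to deploy a ses7 cluster using Salt directly instead of the ceph-salt CLI:
$ sesdev create ses7 --salt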
In some cases you might want to deploy a Ceph cluster in an environment where
hostname
returns an FQDN and
hostname -s
returns the short hostname (defined as a string containing no .
characters).
DeepSea and ceph-salt should have no problem with this. You can tell sesdev
to set the hostname to the FQDN by passing the --fqdn
option to sesdev create
.
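For example, to deploy a single-node octopus cluster whose hostname is set to the FQDN:
$ sesdev create octopus --single-node --fqdn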
The "core" deployment targets (nautilus, ses6, octopus, ses7, ses7p, pacific) all have a concept of a "devel" repo where updates to the Ceph/storage-related packages are staged. Since users frequently want to install the "latest, greatest" packages, the "devel" repo is added to all nodes by default. However, there are times when this is not desired: when using sesdev to simulate update/upgrade scenarios, for example.
To deploy a Ceph cluster without the "devel" repo, give the --product
option
on the sesdev create
command line.
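For example, to deploy a ses7 cluster using only the released (product) repos, without the "devel" repo:
$ sesdev create ses7 --product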
Each deployment version (e.g. "octopus", "nautilus") is associated with a set of zypper repos which are added on each VM that is created.
There are times when you may need to add additional zypper repo(s)
to all the VMs prior to deployment. In such a case, add one or more --repo
options to the command line, e.g.:
$ sesdev create nautilus --single-node --repo [URL_OF_REPO]
By default, the custom repo(s) will be added with an elevated priority,
to ensure that packages from these repos will be installed even if higher
RPM versions of those packages exist. If this behavior is not desired,
add --no-repo-priority
to disable it.
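For example, to add a custom repo without the elevated priority:
$ sesdev create nautilus --single-node --repo [URL_OF_REPO] --no-repo-priority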
If the default zypper repos that are added to each VM prior to deployment are
completely wrong for your use case, you can override them via
~/.sesdev/config.yaml
.
To do this, you have to be familiar with two of sesdev's internal dictionaries:
OS_REPOS
and VERSION_DEVEL_REPOS
. The former specifies repos that are
added to all VMs with a given operating system, regardless of the Ceph version
being deployed, and the latter specifies additional repos that are added to VMs
depending on the Ceph version being deployed. Refer to seslib/__init__.py
for
the current defaults.
To override OS_REPOS
, add an os_repos:
stanza to your ~/.sesdev/config.yaml
.
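As an illustration only (the authoritative structure is the OS_REPOS dictionary in seslib/__init__.py, which maps each OS to named repo URLs; the repo name and URL below are placeholders), such a stanza might look like:
os_repos:
  leap-15.2:
    my_custom_repo: 'https://example.com/repos/leap-15.2'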
To override VERSION_DEVEL_REPOS
, add a version_devel_repos:
stanza to your ~/.sesdev/config.yaml
.
Please note that you need not copy-paste any parts of these internal dictionaries from the source code into your config. You can selectively override only those parts that you need. For example, the following config snippet will override the default additional repos for "octopus" deployments on "leap-15.2", but it will not change the defaults for any of the other deployment versions:
version_devel_repos:
octopus:
leap-15.2:
- 'https://download.opensuse.org/repositories/filesystems:/ceph:/octopus/openSUSE_Leap_15.2'
If you need a higher priority on one or more of the repos,
version_devel_repos
supports a "magic priority prefix" on the repo URL,
like so:
version_devel_repos:
octopus:
leap-15.2:
- '96!https://download.opensuse.org/repositories/filesystems:/ceph:/octopus/openSUSE_Leap_15.2'
This would cause the zypper repo to be added at priority 96.
In Ceph versions "octopus" and newer, the Ceph daemons run inside containers.
When the cluster is bootstrapped, a container image is downloaded from a remote
registry. The default image paths are set by the internal dictionaries
IMAGE_PATHS_DEVEL
and IMAGE_PATHS_PRODUCT
. You can specify a different
image path using the --image-path
option to e.g., sesdev create octopus
.
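For example, using the default devel image path for octopus (shown below) as the value:
$ sesdev create octopus --image-path registry.opensuse.org/filesystems/ceph/octopus/images/ceph/ceph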
If you would like to permanently specify a different image path for one or more
Ceph versions, you can override the defaults by adding a stanza like the
following to your ~/.sesdev/config.yaml
:
image_paths_devel:
octopus:
ceph: 'registry.opensuse.org/filesystems/ceph/octopus/images/ceph/ceph'
If there is a need to use a container registry mirror, it is possible
to override the registry location, and to disable SSL if required. For example,
a record similar to the following can be added to ~/.sesdev/config.yaml:
container_registry:
prefix: 'registry.suse.de'
location: '1.2.3.4:5000'
insecure: True
A custom libvirt Vagrant box image can be provided using an os_box record for each OS:
os_box:
sles-15-sp2: 'http://1.2.3.4/mirror/SLE-15-SP2/images/SLES15-SP2-Vagrant.x86_64-libvirt.box'
When the user does not give the --roles
option on the command line, sesdev
will use the default roles for the given deployment version. These defaults can
be changed by adding a version_default_roles
stanza to your ~/.sesdev/config.yaml
:
version_default_roles:
octopus:
- [master, mon, mgr, storage]
- [mon, mgr, storage]
- [mon, mgr, storage]
This is the default, so no tweaking of config.yaml is necessary. Just:
sesdev create octopus
Run sesdev create octopus
with the following options:
sesdev create octopus \
--repo-priority \
--repo https://download.opensuse.org/repositories/filesystems:/ceph:/octopus:/upstream/openSUSE_Leap_15.2 \
--image-path registry.opensuse.org/filesystems/ceph/octopus/upstream/images/ceph/ceph
Alternatively, add the following to your config.yaml
to always use these
options when deploying octopus
clusters:
version_devel_repos:
octopus:
leap-15.2:
- 'https://download.opensuse.org/repositories/filesystems:/ceph:/octopus/openSUSE_Leap_15.2'
- '96!https://download.opensuse.org/repositories/filesystems:/ceph:/octopus:/upstream/openSUSE_Leap_15.2'
image_paths_devel:
octopus:
ceph: 'registry.opensuse.org/filesystems/ceph/octopus/upstream/images/ceph/ceph'
Note: The elevated priority on the filesystems:ceph:octopus:upstream
repo is needed to ensure that the ceph package from that project gets installed
even if RPM evaluates its version number to be lower than that of the ceph
packages in the openSUSE Leap 15.2 base and filesystems:ceph:octopus
repos.
This is the default, so no tweaking of config.yaml is necessary. Just:
sesdev create ses7
Note that this will work even if there is no ceph package visible at https://build.suse.de/project/show/Devel:Storage:7.0 since it uses the installation media repo, not the "SLE_15_SP2" repo.
This is the default, so no tweaking of config.yaml is necessary. Just:
sesdev create ses7p
Note that this will work even if there is no ceph package visible at https://build.suse.de/project/show/Devel:Storage:7.0:Pacific since it uses the installation media repo, not the "SLE_15_SP3" repo.
The ceph package in Devel:Storage:7.0:CR
has the same version as
the one in filesystems:ceph:master:upstream
, so the procedure for
using it is similar:
sesdev create ses7 \
--repo-priority \
--repo http://download.suse.de/ibs/Devel:/Storage:/7.0:/CR/SLE_15_SP2/ \
--image-path registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph
Alternatively, add the following to your config.yaml
to always use
these options when deploying ses7
clusters:
version_devel_repos:
ses7:
sles-15-sp2:
- 'http://download.suse.de/ibs/SUSE:/SLE-15-SP2:/Update:/Products:/SES7/images/repo/SUSE-Enterprise-Storage-7-POOL-x86_64-Media1/'
- 'http://download.suse.de/ibs/Devel:/Storage:/7.0/images/repo/SUSE-Enterprise-Storage-7-POOL-x86_64-Media1/'
- '96!http://download.suse.de/ibs/Devel:/Storage:/7.0:/CR/SLE_15_SP2/'
image_paths_devel:
ses7:
ceph: 'registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph'
Note: The elevated priority on the Devel:Storage:7.0:CR
repo is needed to
ensure that the ceph package from that project gets installed even if RPM
evaluates its version number to be lower than that of the ceph packages in the
SES7 Product and Devel:Storage:7.0
repos.
The "octopus", "ses7", "ses7p", and "pacific" deployment versions can be told to use wire encryption (a feature of the Ceph Messenger v2), where Ceph encrypts its own network traffic.
In order to deploy a cluster with Messenger v2 encryption, we need to either prioritise 'secure' over 'crc' mode, or only provide 'secure' mode.
The specific ceph options used to accomplish this are:
ms_cluster_mode
ms_service_mode
ms_client_mode
By default all of these are set to crc secure
, which prioritises crc
over full encryption (secure
).
To tell sesdev to deploy a cluster with wire encryption active, provide one of the following two options:
- --msgr2-secure-mode : This sets the above 3 options to just 'secure'.
- --msgr2-prefer-secure : This changes the order to secure crc so secure is preferred over crc.
These only affect msgr2, so anything talking msgr1 (like the RBD and CephFS kernel clients) will be unencrypted.
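For example, to deploy a ses7 cluster that only allows fully encrypted msgr2 connections:
$ sesdev create ses7 --msgr2-secure-mode
or to prefer encryption while still allowing crc:
$ sesdev create ses7 --msgr2-prefer-secure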
sesdev has limited ability to deploy non-SUSE environments. Read on for details.
Ubuntu Bionic is supported with the octopus
deployment version. For example:
sesdev create octopus --os ubuntu-bionic
sesdev create octopus --single-node --os ubuntu-bionic
This will create Ubuntu 18.04 VMs and bootstrap a Ceph Octopus cluster on them
using cephadm bootstrap
. To stop the deployment before bootstrap, give the
--stop-before-cephadm-bootstrap
option.
Ubuntu Focal is supported with the octopus, pacific, quincy, and reef deployment versions, and this OS is the default for quincy and reef.
For example:
sesdev create pacific --os ubuntu-focal
sesdev create pacific --single-node --os ubuntu-focal
This will create Ubuntu 20.04 VMs and bootstrap a Ceph Pacific cluster on them
using cephadm bootstrap
. To stop the deployment before bootstrap, give the
--stop-before-cephadm-bootstrap
option.
When deploying Ceph Quincy, there is no need to specify the --os option:
sesdev create quincy
sesdev create quincy --single-node
Please note that the sesdev status
and sesdev show
commands take
a --format
option, which can be used to make the command produce JSON output
(easily parsable by computer programs) as opposed to the default format
(intended to be read by humans).
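For example, assuming the option accepts a json value:
$ sesdev status --format json <deployment_id>
$ sesdev show --format json <deployment_id>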
$ sesdev status
$ sesdev status <deployment_id> [NODE]
For example, if I want to see the status of all nodes in deployment "foo":
$ sesdev status foo
If I want to see the status of just node3 in deployment "foo":
$ sesdev status foo node3
The following command provides all details of a deployment, including the roles of all nodes:
$ sesdev show --detail <deployment_id>
If you need to find which node of a deployment contains role "foo", try this:
$ sesdev show --nodes-with-role=<role> <deployment_id>
$ sesdev ssh <deployment_id> [NODE]
Spawns an SSH shell to the master node, or to node NODE
if explicitly
specified. You can check the existing node names with the following command:
$ sesdev show <deployment_id>
sesdev
provides a subset of scp
functionality. For details, see:
$ sesdev scp --help
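A hypothetical invocation (assuming an scp-like NODE:PATH syntax; consult sesdev scp --help for the actual form) might look like:
$ sesdev scp <deployment_id> master:/etc/ceph/ceph.conf .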
It's possible to use an SSH tunnel to enable TCP port-forwarding for a service running in the cluster. Currently, the following services can be forwarded:
- dashboard - The Ceph Dashboard (nautilus and above)
- grafana - Grafana metrics dashboard
- suma - SUSE Manager (octopus only)
$ sesdev tunnel <deployment_id> dashboard
The command will output the URL that you can use to access the dashboard.
For deployments that used ceph-salt, it's possible to replace the ceph-salt installed by sesdev with a different one:
$ sesdev replace-ceph-salt --local <path> <deployment_id>
Assuming <path>
points to ceph-salt source code, the command will work
regardless of whether ceph-salt was originally installed from source or
from RPM.
It's possible to replace Ceph MGR modules with a version found in a GitHub PR, a git branch, or a local repository.
This can be helpful to test PRs in a cluster with all services enabled.
$ sesdev replace-mgr-modules <deployment_id> <pr>
A custom repo can be added to all nodes of a running cluster using the following command:
$ sesdev add-repo <deployment_id> <repo_url>
If the repo URL is omitted, the "devel" repo (as defined for the Ceph version deployed) will be added.
If you want to also update packages on all nodes to the versions in that repo,
give the --update
option. For example, one can test an update scenario by
deploying a cluster with the --product
option and then updating the cluster to
the packages staged in the "devel" project:
$ sesdev add-repo --update <deployment_id>
When sesdev deploys a Ceph cluster, the "public network" of the cluster points at a virtual network that was created by libvirt together with the cluster VMs. Although Ceph calls it the "public network", this network is actually private in the sense that, due to iptables rules created by libvirt, packets from this network cannot reach the "public networks" of other Ceph clusters deployed by sesdev, even though they are all on the same host (the libvirt host).
Under ordinary circumstances, this is a good thing because it prevents packets from one sesdev environment from reaching other sesdev environments. But there are times when one might wish the various libvirt networks were not so isolated from each other -- such as when trying to set up RGW Multisite, RBD Mirroring, or CephFS Snapshot Sync between two sesdev clusters.
If you need your clusters to be able to communicate with each other over the network and you are desperate enough to mess with iptables on the libvirt host to accomplish it, run the following commands as root on the libvirt host:
# iptables -F LIBVIRT_FWI
# iptables -A LIBVIRT_FWI -j ACCEPT
The LIBVIRT_FWI chain (jumped to from the FORWARD chain) contains the rules ensuring that Vagrant environments cannot see or communicate with one another over the network. The first command flushes the chain (deletes all these rules), and the second one replaces them with a single rule that unconditionally accepts any packets processed through this chain. This has the effect of completely opening up all libvirt VMs to communicate with all other libvirt VMs on the same host.
It can also be useful to add lines to /etc/hosts
and
/root/.ssh/authorized_keys
on the two clusters so nodes on the "other"
cluster can be referred to by their Fully Qualified Domain Names (FQDNs, e.g.
"master.octopus2.test") and to facilitate SSHing between the two clusters. This
can be accomplished very easily by issuing the following command:
$ sesdev link <deployment_id_1> <deployment_id_2>
where <deployment_id_1>
and <deployment_id_2>
are the deployment IDs of two
existing sesdev clusters.
A running cluster can be stopped by running the following command:
$ sesdev stop <deployment_id>
To remove a cluster (both the deployed VMs and the configuration), use the following command:
$ sesdev destroy <deployment_id>
It has been reported that vagrant-libvirt sometimes leaves networks behind when
destroying domains (i.e. the VMs associated with a sesdev deployment). If this
bothers you, sesdev destroy
has a --destroy-networks
option you can use.
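For example:
$ sesdev destroy --destroy-networks <deployment_id>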
If your libvirtd machine has enough memory, you can use sesdev to run "make check" in various environments. Use
$ sesdev create makecheck --help
to see the available options.
RAM CAVEAT: the default RAM amount for the makecheck environment might not be sufficient.
If you have plenty of memory on your libvirtd machine, running with higher
values of --ram
(the higher, the better) is recommended.
CPUS CAVEAT: using the --cpus option, it is also possible to increase the number
of (virtual) CPUs available for the build, but values greater than four have not
been well tested.
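For example, assuming --ram takes a value in gigabytes (check sesdev create makecheck --help to confirm):
$ sesdev create makecheck --ram 16 --cpus 4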
The sesdev create makecheck
command will (1) deploy a VM, (2) create an
"ordinary" (non-root) user with passwordless sudo privileges and, as this
user (3) clone the specified Ceph repo and check out the specified branch,
(4) run install-deps.sh
, and (5) run run-make-check.sh
.
The following sub-sections provide instructions on how to reproduce some common "make check" scenarios.
This is the default. Just:
$ sesdev create makecheck
$ sesdev create makecheck --os leap-15.2 --ceph-branch octopus
(It is not necessary to give --ceph-repo https://github.com/ceph/ceph
here,
since that is the default.)
$ sesdev create makecheck --os sles-15-sp2 \
--ceph-repo https://github.com/SUSE/ceph \
--ceph-branch ses7
More combinations are supported than are described here. Compiling
the respective sesdev create makecheck
commands for these environments is left
as an exercise for the reader.
If you would like to add configuration files or run arbitrary commands on each VM on
deployment, you can do so by providing these files in the
~/.sesdev/.user_provision directory.
Note that all configuration files are copied to all the VMs on deployment, and
the provision.sh file is likewise executed on all VMs on deployment.
To have configuration files added automatically to each VM, simply put them into the
~/.sesdev/.user_provision/config directory. All files in this directory will be
copied to /root on the hosts.
Create a file ~/.sesdev/.user_provision/config/.vimrc
and it will be copied to
/root/.vimrc
on each host on deployment, so you will always have your personal
Vim configuration on all hosts across all deployments.
Running any commands is achieved by creating a
~/.sesdev/.user_provision/provision.sh
file. The script will be executed after
the deployment of a VM has been successfully completed.
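A minimal sketch of such a script (purely illustrative; the zypper call assumes SUSE-based VMs and the package chosen is an arbitrary example):
#!/bin/bash
# ~/.sesdev/.user_provision/provision.sh -- executed on every VM after deployment
zypper -n install tmux
echo "alias ll='ls -l'" >> /root/.bashrc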
Custom provisioning can be triggered manually by issuing sesdev user-provision <deployment-id>
or sesdev user-provision <deployment-id> <host>.
The former command applies the custom provisioning to all VMs in the deployment,
whereas the latter applies it only to a single VM.
This section describes some common pitfalls and how to resolve them.
After deleting the ~/.sesdev
directory, sesdev create
fails because
Vagrant throws an error message containing the words "domain about to create is
already taken".
As described
here,
this typically occurs when the ~/.sesdev
directory is deleted. The libvirt
environment still has the domains, etc. whose metadata was deleted, and Vagrant
does not recognize the existing VM as one it created, even though the name is
identical.
As described here, this can be resolved by manually deleting all the domains (VMs) and volumes associated with the old deployment (note: the commands must be run as root):
# virsh list --all
# # see the names of the "offending" machines. For each, do:
# virsh destroy <THE_MACHINE>
# virsh undefine <THE_MACHINE>
# virsh vol-list default
# # For each of the volumes associated with one of the deleted machines, do:
# virsh vol-delete --pool default <THE_VOLUME>
You run sesdev create but it does nothing and gives you a traceback ending with
an error:
libvirt.libvirtError: Storage pool not found: no storage pool with matching name 'default'
For whatever reason, your libvirt deployment does not have a default pool defined. You can verify this by running the following command as root:
# virsh pool-list
In a working deployment, it says:
Name State Autostart
-------------------------------
default active no
but in this case the "default" storage pool is missing. (One user hit this when deploying sesdev on SLE-15-SP1.)
The "libvirt-daemon" RPM owns a directory /var/lib/libvirt/images
which is
intended to be associated with the default storage pool:
# rpm -qf /var/lib/libvirt/images
libvirt-daemon-5.1.0-lp151.7.6.1.x86_64
Assuming this directory exists and is empty, you can simply create a storage pool called "default" that points to this directory, and the issue will be resolved (run the commands as root):
# virsh pool-define /dev/stdin <<EOF
<pool type='dir'>
<name>default</name>
<target>
<path>/var/lib/libvirt/images</path>
</target>
</pool>
EOF
# virsh pool-start default
# virsh pool-autostart default
Credits to Federico Simoncelli for the resolution, which I took from his post here.
You create and destroy a sesdev deployment, perhaps even several times, and then you notice that virtual networks get left behind. For example, after several create/destroy cycles on deployment "foo":
# virsh net-list
Name State Autostart Persistent
----------------------------------------------------
foo0 active yes yes
foo1 active yes yes
foo10 active yes yes
foo2 active yes yes
foo3 active yes yes
foo4 active yes yes
foo5 active yes yes
foo6 active yes yes
foo7 active yes yes
foo8 active yes yes
foo9 active yes yes
vagrant-libvirt active no yes
It has been reported that vagrant-libvirt sometimes leaves networks behind when it destroys domains (i.e. the VMs associated with a sesdev deployment). We do not currently know why, or under what conditions, this happens.
If this behavior bothers you, sesdev destroy
has a --destroy-networks
option
you can use. Of course, sesdev destroy --destroy-networks
only works for the
network(s) associated with the VMs in the deployment being destroyed. To quickly
destroy a bunch of networks, construct a script like this one:
#!/bin/bash
read -r -d '' NETZ <<EOF
foo0
foo1
foo2
foo3
foo4
foo5
foo6
foo7
foo8
foo9
foo10
EOF
for net in $NETZ ; do
virsh net-destroy $net
virsh net-undefine $net
done
The script should be run as root on the libvirt server.
An unsupported, user-contributed version of this script -- contrib/nukenetz.sh
-- can be found in the source-code tree.
Also, read the next section for more relevant information.
You ran sesdev destroy
but there were errors and you suspect that a deployment
(or deployments) might not have been completely destroyed.
The command sesdev destroy
has been known to fail, leaving a deployment "not
completely destroyed".
A sesdev deployment DEP_ID
consists of several components:
- a subdirectory under ~/.sesdev/DEP_ID
- some number of libvirt domains
- some number of libvirt storage volumes in the default storage pool
- some number of libvirt networks
and the names of all the libvirt domains, volumes, and networks used by deployment
DEP_ID can be expected to begin with DEP_ID. For example, if the DEP_ID
is "octopus", the associated libvirt artifacts will have names starting with
"octopus".
Use the following commands to check for vestiges of your deployment:
sudo virsh list --all | grep '^ DEP_ID'
sudo virsh vol-list default | grep '^ DEP_ID'
sudo virsh net-list | grep '^ DEP_ID'
(cd ~/.sesdev ; ls -d1 */ | grep '^DEP_ID')
Then, assuming your libvirt instance is dedicated to sesdev and not
used for anything else, you could use the following commands to delete
everything you found, and that would clean up the partially destroyed
deployment:
sudo virsh destroy LIBVIRT_DOMAIN
sudo virsh undefine LIBVIRT_DOMAIN
sudo virsh vol-delete --pool default LIBVIRT_STORAGE_VOLUME
sudo virsh net-destroy LIBVIRT_NETWORK
sudo virsh net-undefine LIBVIRT_NETWORK
When attempting to create or list deployments on a remote libvirt/SSH server, sesdev barfs out a Python traceback ending in:
libvirt.libvirtError: Failed to connect socket to
'/var/run/libvirt/libvirt-sock': No such file or directory
When told to use remote libvirt/SSH, sesdev expects that there won't be any libvirtd instance running locally. This Python traceback is displayed when
- sesdev is configured to use remote libvirt/SSH, and
- libvirtd.service is running locally
Stop the local libvirtd.service.
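For example, as root on the local machine:
# systemctl stop libvirtd.service
(If your distribution uses socket activation for libvirt, the libvirtd sockets may need to be stopped as well.)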
When the --synced-folder
option is provided, the deployment fails with
something like:
mount -o vers=3,udp 192.168.xxx.xxx:/home/$USER/.sesdev/$NAME /$PATH
Stderr from the command:
mount.nfs: Unknown error 521
This indicates that your nfs-server is not working properly or hasn't started yet.
Please make sure that your nfs-server is up and running without errors:
# systemctl status nfs-server
If this doesn't report back with active
, please consider running:
# systemctl restart nfs-server
# systemctl enable nfs-server
I'm running sesdev on a remote machine and I want to access the dashboard of a cluster deployed by sesdev on that machine. Since the machine is remote, I can't just fire up a browser on it. I would like to point a browser that I have running locally (e.g. on a laptop) at the dashboard deployed by sesdev on the remote machine. I've tried a bunch of stuff, but I just can't seem to make it work.
There are two possible pitfalls you could be hitting. First: if you do
sesdev tunnel DEP_ID dashboard
sesdev will choose an IP address essentially at random. Your remote sesdev machine very likely has multiple IP addresses and, in accordance with Murphy's Law, sesdev is choosing an IP address which is not accessible from the machine where the browser is running.
However, even when specifying --local-address CORRECT_IP_ADDRESS
, it still
might not work if there are other dashboard instances (sesdev or bare metal)
running on the remote machine and already listening on the port where the newly
deployed dashboard is listening. In other words, there might be other dashboards
running on the machine that you're not aware of.
Things are further confused by the nomenclature of the sesdev tunnel
command.
What sesdev refers to as "local address/port" is actually the address/port on
the remote machine (remote to you, but local to sesdev itself). What it refers
to as "remote port" is the port that is being tunneled (the one inside the VM,
on which the dashboard is listening).
First, you have to be really sure that the "local IP address" you feed into the
sesdev tunnel
command is (1) a valid IP address of the sesdev machine that (2)
is accessible from the browser running on your local machine.
Once you are sure of the correct IP address, use sesdev ssh DEP_ID
to enter
the cluster and run
ceph mgr services
This will tell you the node where the dashboard is running, the port that it's listening on, and the protocol to use (http or https). Carefully write down all three pieces of information. Now, do:
sesdev tunnel DEP_ID \
--node NODE_WHERE_DASHBOARD_IS_RUNNING \
--remote-port PORT_WHERE_DASHBOARD_IS_LISTENING \
--local-address CORRECT_IP_ADDRESS \
--local-port ANY_ARBITRARY_HIGH_NUMBERED_PORT
The output of this command will say
You can now access the service in: CORRECT_IP_ADDRESS:ANY_ARBITRARY_HIGH_NUMBERED_PORT
Now, you probably can't just paste that URL into your browser, because the dashboard is likely using SSL (the default). Instead, refer to your notes to determine the protocol the dashboard is using (probably "https", but might be "http" if SSL is disabled), and then fashion a fully-qualified URL like so:
PROTOCOL://CORRECT_IP_ADDRESS:ANY_ARBITRARY_HIGH_NUMBERED_PORT
One final note: it's a good practice to use a different
ANY_ARBITRARY_HIGH_NUMBERED_PORT
every time you run sesdev tunnel
. This is
because of https://github.com/SUSE/sesdev/issues/276
.
I'm running sesdev create
with --ipv6
option, and I'm getting the following error:
Error while activating network: Call to virNetworkCreate failed: internal error:
Check the host setup: enabling IPv6 forwarding with RA routes without accept_ra
set to 2 is likely to cause routes loss. Interfaces to look at: enp0s25.
Set "Accept Router Advertisements" to 2 ("Overrule forwarding behaviour"), by running:
sysctl -w net.ipv6.conf.<if>.accept_ra=2
Where <if>
is the network interface from the error, or all
if you want to apply
the config to all network interfaces.
After starting libvirtd.service
, systemctl status libvirtd.service
says
Failed to intialize libnetcontrol. Management of interface devices is disabled
(Yes, it really says "intialize" instead of "initialize".)
At present, libvirtd works well together with the wicked network management
system. It does not work so well with NetworkManager, so if you see this
message it probably means you are using NetworkManager.
These two - wicked
and NetworkManager
- are mutually exclusive: you must
have one or the other, and you cannot have both at the same time.
The resolution is to disable NetworkManager
by enabling wicked
and
configuring it properly (i.e. so you don't experience any loss in network
connectivity).
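On openSUSE/SLE, the switch can typically be made with commands like the following (run as root, and only after your wicked configuration under /etc/sysconfig/network is in place, otherwise you may lose connectivity):
# systemctl disable --now NetworkManager
# systemctl enable --now wicked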
Refer to your operating system's documentation to learn how to configure
networking with wicked
. For example, for openSUSE Leap 15.2 you can refer
to Reference -> System -> Basic Networking:
https://doc.opensuse.org/documentation/leap/reference/html/book-opensuse-reference/cha-network.html
Once you have wicked
running without any loss of network connectivity, proceed
to restart libvirtd:
# systemctl restart libvirtd.service
After this, the Failed to intialize libnetcontrol
message should no longer
appear in the journal (log) of libvirtd.service
.
If you would like to submit a patch to sesdev, or otherwise participate in the
sesdev community, please read the files CONTRIBUTING.rst
and
CODE_OF_CONDUCT.md
in the top-level directory of the source code
distribution. These files can also be found on-line:
https://github.com/SUSE/sesdev/blob/master/CONTRIBUTING.rst https://github.com/SUSE/sesdev/blob/master/CODE_OF_CONDUCT.md