bos1.openelectronicslab.org

This is a repository for configuration files for the servers in the bos1.openelectronicslab.org subdomain.

General layout

kubeworker0 - kubeworker3
- VMs, paired to boletus0 - boletus3
kubecontroller0 - kubecontroller2
- VMs, paired to boletus0 - boletus2
testvm3 - running on boletus3
boletus0 - boletus3.bos1.openelectroncslab.org
- act as KVM hosts
- 16 core (32ht), 256GB RAM
- 7 TB SSD, 2x16TB hard disk
morchella0.bos1.openelectroncslab.org: management machine
- 6 ethernet ports: 1 to top of rack, 4 to boletus machines, 1 unused
- minimal ram & storage, no IPMI

Directory structure

group-vars: Ansible variable definitions for groups of machines.
host-vars: Ansible variable definitions for specific machines.
initial-boot: files for installing a basic OS on first boot
- morchella: usb setup drive for the firewall server
roles: Ansible roles
secrets: files involved in secrets management
- ssh-keys: ssh public keys for users and administrators
- gpg-keys: gpg public keys for users and administrators
- vault-keys: master keys for ansible-vault secrets. Each vault has a subdirectory containing a gpg-encrypted key for each user.
util: misc. utility scripts

IPMI/KVM access for boletus servers (workaround)

The boletus servers use an older version of the supermicro BMC software that requires an older (unsupported) version of Java to run. The current attempt to run this version of java in a container on morchella0 is not working for unclear reasons. We have a (temporary?) workaround for this using SSH tunneling and a container on the local machine. You will first need to install some required packages:

apt install socat psmisc docker.io

You may then want to add yourself to the docker group (if you skip this step, you will need to sudo the docker commands that follow):

# add yourself to the docker group
sudo adduser $USER docker
# reload group membership
# first change current GID to docker
newgrp docker
# reset GID to $USER, now will include docker
newgrp

Next, you will need to build the docker container with the old java vm and a compatible version of firefox:

cd roles/ipmi_viewer/files/ipmi-kvm-docker
docker image build -t ipmi_kvm_docker .

You can then run the container, for example:

docker container run --network="host" --user $(id -u):$(id -g) -it --rm \
    --volume /etc/hosts:/etc/hosts --volume /etc/group:/etc/group \
    --volume /etc/passwd:/etc/passwd ipmi_kvm_docker:latest

Leave this container running, and in a new console window run the following script to create an ssh tunnel from boletus0-ipmi through morchella to your local machine for the whole array of required ports for IPMI and with the java kvm (note than you can replace boletus0 with boletus1, etc.).

util/proxy_ipmi boletus0

At this point you should be able to open a local web browser and point it at localhost:8080 and see a vnc window containing an older version of firefox. If you connect to https://localhost in this older version of firefox, you should see be able to navigate the BMC web interface and open a java KVM.

This is far from an ideal solution, but other things I've tried (e.g. running a similar container on morchella0 or trying to use the default bridged networking and map the ports inside of the container rather than for the host) all seem to cause a "connection failed" error in the java kvm client. Something cleaner should be possible, but I haven't found it yet.

Setup notes

Secrets management

All shared secrets are stored in the git repository along with the other setup files. The secrets are encrypted as individual variables using ansible-vault. The vault keys are then encrypted using the GPG public keys of the Open Electronics Labs administrators and stored in the secrets/vault-keys/'VAULT NAME' directory.

GPG setup

Each user will need to generate a GPG key as follows (if they don't already have one). Run the following command, choose RSA/RSA keys with a 4096 bit key length, give it a good password, and choose an email address to associate the key with:

gpg --full-gen-key

You will then need to export your public key to a file:

gpg -a --output $USER.asc --export [your email address]

If you have multiple keys with the same email address the fingerprint can be used instead.

Place this file in the secrets/gpg-keys directory:

cp -iv $USER.asc secrets/gpg-keys
git add secrets/gpg-keys/$USER.asc
# git commit && git push

After pushing the key, have someone with access to the repository re-encrypt each of the vault keys using the new set of GPG keys, e.g.

secrets/re-encrypt-vault-key.sh vault0

The easiest way to share your gpg keys with another machine is just by copying the contents of the ~/.gnupg directory:

scp -r ~/.gnupg [remote host]:

You can also copy individual keys using the --export-secret-key and --import flags for GPG - see the documentation for more details.

GPG Agent Forwarding

Agent forwarding can be done with ssh. The syntax is along the lines of port forwarding, but with sockets paths, similar to this:

LOCAL_SOCKET=$(gpgconf --list-dir agent-extra-socket)
REMOTE_SOCKET=$(ssh $REMOTE_HOSTNAME gpgconf --list-dir agent-socket)
ssh $REMOTE_HOSTNAME rm $REMOTE_SOCKET
ssh -R $REMOTE_SOCKET:$LOCAL_SOCKET $REMOTE_HOSTNAME

The agent-extra-socket is a restricted socket. You may see errors like:

gpg: connection to agent is in restricted mode
gpg: setting pinentry mode 'loopback' failed: Forbidden

or:

gpg: public key decryption failed: Invalid IPC response

If this is the case, try using the non-restricted socket:

LOCAL_SOCKET=$(gpgconf --list-dir agent-socket)

If you have text-entry of the gpg passphrase, ensure ~/.gnupg/gpg.conf contains:

use-agent
pinentry-mode loopback

And ensure ~/.gnupg/gpg-agent.conf contains:

allow-loopback-pinentry

Restart the agent with:

echo RELOADAGENT | gpg-connect-agent

Creating new secrets

A new password (e.g. for an SQL admin account) can be generated with the command

secrets/generate-ansible-secret.sh

This will print out an Ansible variable declaration with an encrypted value that can be cut and pasted into Ansible code. The value itself will be a 45 character (upper and lower case letter) password, containing about 256 bits of entropy. Note that adding numbers, underscores, and dashes would only add about 0.3 bits of entropy per character, and thus the resulting passwords would only be about 2 characters shorter and would likely be more cumbersome and error-prone to type.

If random passwords of another form are needed, (e.g. of a different length) they can often easily be generated just by modifying the code that generate-ansible-secret.sh uses:

tr -dc A-Za-z < /dev/random | head -c 45 | \
    ansible-vault encrypt_string --stdin-name 'my_variable'

A similar approach can also be used for encrypting other secrets, e.g. TLS certificates:

cat my-ssl.key | \
    ansible-vault encrypt_string --stdin-name 'my_ssl_key_variable'

Secret rotation strategy

Automated rotation of secrets would be ideal, but requires service-specific code for each type of secret (e.g. LUKS passwords, SQL admin keys, etc) and thus probably too labor intensive for the first version. Instead, keys will have to be manually rotated.

To rotate the secrets, first a new vault, e.g. vault1, should be created with a new encryption key:

secrets/generate-vault-key.sh vault1

The ansible.cfg file should then be changed to include the new vault and use it as the default for encryption keys. For example, if the ansible.cfg file previously contained the two lines

vault_identity_list=vault0@secrets/vault-keys/vault0/decrypt.sh
vault_encrypt_identity=vault0

these could be changed to

vault_identity_list=vault0@secrets/vault-keys/vault0/decrypt.sh,vault1@secrets/vault-keys/vault1/decrypt.sh
vault_encrypt_identity=vault1

At this point the code will be able to use a mixture of secrets encrypted with the old and new vault keys, and new secrets will use the new vault keys by default. One can then gradually replace all of the secrets encrypted with the old vault key with new secrets encrypted with the new vault key. It should be easy to identify all of the old secrets by using tools like git grep and searching for the old vault name (vault0 in this example), but replacing them will probably require additional steps (e.g. replacing LUKS keys on an encrypted partition or updating passwords in config files and SQL databases).

Bootstrapping the servers

First, you'll need to create a bootable USB drive for morchella0:

ansible-playbook make_install_media.yml
sudo dd if=build/make_install_media/morchella0/autoinstall.iso \
    of=[usb device] bs=4M conv=fsync

Next, you'll need to boot the firewall from this drive, which will install a minimal Debian setup with an encrypted root partition and Dropbear client for remotely unlocking the drive during boot.

TODO

Current plan for development:

Phase 1: Build a foundation (done)
- Boletus running debian binary blobs only
- Support for VMs on boletus with fixed assignments between vms and physical machines.
- VMs running binary builds of open source software from relatively trusted sources.
- Distributed storage via ceph running on VMs (using Rook?)
  - space reserved on each Boletus machine for this purpose
  - encryption at rest (via ceph?)
- Kubernetes running via distro (k3s?) in 7 VMs
  - set up 3 controler vms and 4 worker vms
  - Get routing working between vms
  - install k3s on vms
  - join up machines into cluster
- VIP from ToR switch to cluster (e.g: ha proxy)
- Distributed storage on cluster using Rook
- Next cloud running in a way that will automatically recover (within 5 minutes is fine) from the failure of any boletus machine (e.g. by running in a container on kubernetes using shared ceph storage)
Phase 2: Collaboration environment
- backup infra
- move git.openelectronicslab.org into Kubernetes
- mumble in containers
- hacking environment (i.e.: migrate venus)
- overleaf
- hackmd?
- jitsi?
- monitoring, graphing
  - measure usage
Phase 3: Robustness
- infra to verify ansible builds in VMs prior to live run
- rebuild venus as gymnopilus0.ams1.openelectronicslab.org
- DNS (bos1 & ams1)
- build otp1.openelectronicslab.org
- alerting
- chaos monkey
  - use IPMI to kill boxes?
Phase 4: Make it right :-)
- start replacing binaries with automated repeatable build-from-source
Phase 5: Go pro
- vm migration
- ensure standard services use no more than 25% resources, so that anyone may use a different 25% use without need for discussion.
  - learn how to "nice" jobs (pre-emptable)

perfomance:

Ceph is much slower than local storage and may benefit from tuning. For example, running borg init -e none test.borg in the ceph-provisioned directory /var/lib/backups in the local-backups-toolbox container takes (26.8, 23.9, 25.2) seconds vs (0.674, 0.665, 0.664) seconds when run in /var/lib (in the containers local filesystem).

other

debug pmi_kvm_docker issue with the javaws console console redirect
- consider: https://github.com/sunfoxcz/ipmiview
- see also: https://github.com/ixs/kvm-cli
reproducible ISO builds
reproducible tftp builds
set BIOS and IPMI passwords
safety checks for commands like reinstall_os_via_pxe
group vars hackiness around pxe_mac and pxe_file variables
cleanup luks unlock ansible to not fail and recover but rather to test and store return codes

supporting physical machines

morchella wireguard and kvm setup
PXE boot setup
boletus PXE boot
boletus ansible setup
static IP setup
back-haul network setup
boletus firewall/reverse proxy setup (keepalived + haproxy?)
virtual machine hypervisor (kvm/virt-lib?)
network storage setup (ceph?)
encryption unlock to listen on multiple cards
fix qemu pxeboot test for (machine name)
infiniband setup
- sudo modprobe ib_ipoib
- sudo apt install -y opensm
- sudo ip addr add 10.0.0.10/24 dev ibp2s0
- sudo ip link set ibp2s0 up

virtual machine infrastructure

clustering setup (kubernetes?)
networking layer (calico?)
microservice mesh (istio?)
logging/metrics (Prometheus + Grafana?)

services

DNS (unbound?)
email openelectronicslab.org
email list management
video conf (jit.si?) and voice (mumble?)
mirrored/HA gitlab
nextcloud
file synching service (syncthing?)
hosted xvnc dev env
jupyter, tensorflow, gpu gpu hacking env

migration

move git.openelectronicslab.org into new infra
move logicgate.nl into our infra
create logicgate email
transform venus into cloud node
move tdf web content into our infra
move tdf email into our infra
current shared tdf will be replaced, old hardware will spin-down soon
- new hardware has IPMI and second unit; potential to be a 'sea1' location
- let's chat with keithr
move kendrickshaw.org website

improvements

document how the "temp" passphrase can be exploited if there is physical access to the machines
replace the "temp" passphrase with a secret
switch from iptables to nftables syntax in ansible scripts
add wireguard VPN accesses to management network etc.
establish a non-sucky way to repo-sync with big-name git repository sites

Name		Name	Last commit message	Last commit date
Latest commit History 218 Commits
ams1		ams1
bos1		bos1
files/manifests		files/manifests
notes		notes
roles		roles
secrets		secrets
templates/manifests		templates/manifests
util		util
.gitignore		.gitignore
ansible.cfg		ansible.cfg
apply_kube_manifests.yml		apply_kube_manifests.yml
install_prereqs.sh		install_prereqs.sh
make_install_media.yml		make_install_media.yml
readme.md		readme.md
reinstall_kubernetes.yml		reinstall_kubernetes.yml
reinstall_os_via_pxe.yml		reinstall_os_via_pxe.yml
reinstall_os_via_usb.yml		reinstall_os_via_usb.yml
site.yml		site.yml
unlock-luks.yaml		unlock-luks.yaml
upgrade-debian.yaml		upgrade-debian.yaml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

bos1.openelectronicslab.org

General layout

Directory structure

IPMI/KVM access for boletus servers (workaround)

Setup notes

Secrets management

GPG setup

GPG Agent Forwarding

Creating new secrets

Secret rotation strategy

Bootstrapping the servers

TODO

perfomance:

other

supporting physical machines

virtual machine infrastructure

services

migration

improvements

About

Releases

Packages

Contributors 4

Languages

OpenElectronicsLab/bos1.openelectronicslab.org

Folders and files

Latest commit

History

Repository files navigation

bos1.openelectronicslab.org

General layout

Directory structure

IPMI/KVM access for boletus servers (workaround)

Setup notes

Secrets management

GPG setup

GPG Agent Forwarding

Creating new secrets

Secret rotation strategy

Bootstrapping the servers

TODO

perfomance:

other

supporting physical machines

virtual machine infrastructure

services

migration

improvements

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages