Commit

WIP to be squashed properly
Julien Girardin committed Dec 14, 2023
1 parent 8181515 commit cd3fec6
Showing 14 changed files with 88 additions and 112 deletions.
38 changes: 12 additions & 26 deletions docs/guides/join_nodes.md
@@ -27,44 +27,30 @@ node-3
If you haven't provisioned a load balancer and need the local haproxy to be deployed:

```
ansible-playbook -i inventory enix.kubeadm.00_apiserver_proxy -l kube_control_plane:nodes-3
ansible-playbook -i inventory enix.kubeadm.00_apiserver_proxy -e limit=nodes-3
```
You can skip the `-l` argument if your cluster doesn't have pending changes you want to preserve on other nodes.
Don't forget to include all control plane nodes, or provisioning the apiserver proxy will fail.
You need to specify the `limit` variable via "extra-vars", because `-l` cannot really work in the context of ansible-kubeadm
(you need to connect to all the masters to get the IPs needed to configure the load balancer).

### Joining nodes

### Create bootstrap-token

Then create a bootstrap token by using the `bootstrap_token` tag.
Don't use a limit that skips control plane nodes.
You can join a node and skip changes on other nodes by specifying the `limit` variable.

```
ansible-playbook -i inventory.cfg enix.kubeadm.01_site -t bootstrap_token
ansible-playbook -i inventory.cfg enix.kubeadm.01_site -e limit=nodes-3
```

No need to retrieve it yourself; it will be discovered when joining the node.
The token is valid for 1 hour, so you don't need to repeat this step each time you try to join nodes.

### Joining nodes

You can join a node and skip other changes to the cluster by using the `join` tag.
With the tag, you can limit to hosts you want to join.

```
ansible-playbook -i inventory.cfg enix.kubeadm.01_site -t join -l nodes-3
```

## Alternative method
### Create bootstrap-token

You can merge the creation of the bootstrap token with the join action:
Then create a bootstrap token by using the `bootstrap_token` tag.
Don't use a limit that skips control plane nodes.

```
ansible-playbook -i inventory.cfg enix.kubeadm.01_site -t bootstrap_token,join -l kube_control_plane:node-3
ansible-playbook -i inventory.cfg enix.kubeadm.01_site -t bootstrap_token
```

Please note that you need to include at least one control plane node in the limit host pattern.
You can also skip the limit host pattern to apply to all nodes, as those steps are idempotent on their own: they will not mess with the current nodes.

# To join control-plane nodes
No need to retrieve it yourself; it will be discovered when joining the node.
The token is valid for 1 hour, so you don't need to repeat this step each time you try to join nodes.

There is no tag for this operation; you need to apply the entire playbook.
11 changes: 3 additions & 8 deletions playbooks/00_apiserver_proxy.yml
@@ -10,15 +10,10 @@
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
vars:
_control_plane: true
pre_tasks:
- name: 'Fail if not all master in the specified limit'
fail:
msg: 'Not all control_plane provided, ajust --limit to provid all control_plane'
when: groups[kube_cp_group|default("kube_control_plane")]|difference(ansible_play_hosts)|length > 0
roles:
- role: find_ip

- hosts: '{{ kube_cp_group|default("kube_control_plane") }}:{{ kube_worker_group|default("kube_workers") }}'
- hosts: '{{ kube_cp_group|default("kube_control_plane") }}:{{ kube_worker_group|default("kube_workers") }}{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
pre_tasks:
- include_role:
@@ -36,7 +31,7 @@
vars:
kubeadm_hook_list: ['post_apiserver_proxy']

- hosts: 'haproxy_upgrade_group:&{{ kube_cp_group|default("kube_control_plane") }}'
- hosts: 'haproxy_upgrade_group:&{{ kube_cp_group|default("kube_control_plane") }}{{ ":" ~ limit if limit is defined else "" }}'
serial: '{{ upgrade_cp_serial|default(1) }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
pre_tasks:
@@ -52,7 +47,7 @@
vars:
kubeadm_hook_list: ['post_proxy_upgrade_haproxy']

- hosts: 'haproxy_upgrade_group:&{{ kube_worker_group|default("kube_workers") }}'
- hosts: 'haproxy_upgrade_group:&{{ kube_worker_group|default("kube_workers") }}{{ ":" ~ limit if limit is defined else "" }}'
serial: '{{ upgrade_worker_serial|default(1) }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
pre_tasks:
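
Reviewer note on the `hosts:` suffix added throughout this playbook: the Jinja expression appends `:` followed by the value of `limit` when that variable is defined. The block below is a minimal Python sketch of that string construction only, not code from the repository; `render_hosts` is a hypothetical helper name and the host names are placeholders. In Ansible host patterns `:` is a union, `:&` an intersection and `:!` an exclusion, so whether the resulting pattern actually narrows a play depends on how the `limit` value itself is written.

```python
from typing import Optional


def render_hosts(base: str, limit: Optional[str] = None) -> str:
    """Mimic '<base>{{ ":" ~ limit if limit is defined else "" }}'."""
    # `limit is None` stands in for "the Ansible variable is undefined".
    return base + (":" + limit if limit is not None else "")


base = "kube_control_plane:kube_workers"
print(render_hosts(base))               # no limit: pattern is unchanged
print(render_hosts(base, "nodes-3"))    # plain value: appended after ':'
print(render_hosts(base, "&nodes-3"))   # value with '&': appended after ':'
```

It may be worth checking the rendered pattern with `ansible-playbook --list-hosts` before relying on a given `limit` value.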
16 changes: 8 additions & 8 deletions playbooks/01_site.yml
@@ -25,7 +25,7 @@
vars:
kubeadm_hook_list: ['post_preflight_cp']

- hosts: '{{ kube_cp_group|default("kube_control_plane") }}:{{ kube_worker_group|default("kube_workers") }}'
- hosts: '{{ kube_cp_group|default("kube_control_plane") }}:{{ kube_worker_group|default("kube_workers") }}{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
roles:
- role: find_ip
@@ -42,7 +42,7 @@
roles:
- role: process_reasons

- hosts: '{{ kube_cp_group|default("kube_control_plane") }}'
- hosts: '{{ kube_cp_group|default("kube_control_plane") }}{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
gather_facts: false
roles:
@@ -92,7 +92,7 @@
kubeadm_hook_list: ['post_config_update']

# This has to be overly cautious on package upgrade
- hosts: cp_upgrade
- hosts: 'cp_upgrade{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
gather_facts: false
pre_tasks:
@@ -118,7 +118,7 @@

# Upgrade control-plane nodes
- name: 'Upgrade to control plane nodes'
hosts: '{{ kube_cp_group|default("kube_control_plane") }}:&nodes_upgrade'
hosts: '{{ kube_cp_group|default("kube_control_plane") }}:&nodes_upgrade{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
serial: '{{ upgrade_cp_serial|default(1) }}'
gather_facts: false
@@ -147,7 +147,7 @@

# Upgrade worker nodes
- name: 'Upgrade to workers nodes'
hosts: '{{ kube_worker_group|default("kube_workers") }}:&nodes_upgrade'
hosts: '{{ kube_worker_group|default("kube_workers") }}:&nodes_upgrade{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
serial: '{{ upgrade_worker_serial|default(1) }}'
gather_facts: false
@@ -174,7 +174,7 @@

# Join control-plane nodes
- name: 'Join new control plane nodes'
hosts: '{{ kube_cp_group|default("kube_control_plane") }}'
hosts: '{{ kube_cp_group|default("kube_control_plane") }}{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
gather_facts: false
vars:
@@ -198,7 +198,7 @@

# Join worker nodes
- name: 'Join new workers nodes'
hosts: '{{ kube_worker_group|default("kube_workers") }}'
hosts: '{{ kube_worker_group|default("kube_workers") }}{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
gather_facts: false
tags: ['join']
@@ -218,7 +218,7 @@
kubeadm_hook_list: ['post_workers_join', 'post_nodes_join']

- name: 'Finally executing post_run hook on all hosts'
hosts: '{{ kube_cp_group|default("kube_control_plane") }}:{{ kube_worker_group|default("kube_workers") }}'
hosts: '{{ kube_cp_group|default("kube_control_plane") }}:{{ kube_worker_group|default("kube_workers") }}{{ ":" ~ limit if limit is defined else "" }}'
any_errors_fatal: '{{ any_errors_fatal|default(true) }}'
gather_facts: false
tasks:
1 change: 1 addition & 0 deletions roles/bootstrap_token/tasks/main.yaml
@@ -4,6 +4,7 @@
nodes_to_join: >-
{{ q('inventory_hostnames', kube_cp_group ~ ':' ~ kube_worker_group)
|map('extract', hostvars)
|selectattr('_kubelet_config_stat', 'defined')
|rejectattr('_kubelet_config_stat.stat.exists')
|map(attribute='inventory_hostname')|list }}
run_once: true
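
Reviewer note on the `nodes_to_join` filter chain: it keeps only hosts that have a `_kubelet_config_stat` variable and whose stat reports that the kubelet config does not exist. The Python below is a rough equivalent for illustration, not code from the role; the host names are made up, and the reading that hosts left out of a limited run never register `_kubelet_config_stat` is an assumption.

```python
def nodes_to_join(hostvars: dict) -> list:
    """Rough equivalent of selectattr(..., 'defined') then rejectattr('...stat.exists')."""
    return [
        name
        for name, host in hostvars.items()
        # selectattr('_kubelet_config_stat', 'defined')
        if "_kubelet_config_stat" in host
        # rejectattr('_kubelet_config_stat.stat.exists')
        and not host["_kubelet_config_stat"]["stat"]["exists"]
    ]


# Hypothetical hostvars: one joined node, one new node, one host without the variable.
hostvars = {
    "cp-1": {"_kubelet_config_stat": {"stat": {"exists": True}}},
    "node-1": {"_kubelet_config_stat": {"stat": {"exists": False}}},
    "node-2": {},
}
print(nodes_to_join(hostvars))  # ['node-1']
```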
3 changes: 3 additions & 0 deletions roles/common_vars/defaults/main.yml
@@ -6,3 +6,6 @@ kube_cp_group: kube_control_plane
kube_worker_group: kube_workers

cp_node: '{{ (groups.cp_running|default(groups[kube_cp_group]))|first }}'

_target_kube_version: '{{ hostvars[cp_node]._target_kube_version }}'
_target_kubeadm_version: '{{ hostvars[cp_node]._target_kubeadm_version }}'
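
Reviewer note on these defaults: `cp_node` resolves to the first host of the `cp_running` group when that group is defined, otherwise to the first host of the control plane group, and the two new variables read their values from that host's `hostvars`. A minimal Python sketch of the lookup follows; `pick_cp_node`, the host names and the version strings are hypothetical, and the case of a defined but empty `cp_running` group is not covered.

```python
def pick_cp_node(groups: dict, kube_cp_group: str = "kube_control_plane") -> str:
    """First host of cp_running if that group is defined, else first control plane host."""
    candidates = groups["cp_running"] if "cp_running" in groups else groups[kube_cp_group]
    return candidates[0]


# Hypothetical inventory data for illustration only.
groups = {"kube_control_plane": ["cp-1", "cp-2"], "cp_running": ["cp-2"]}
hostvars = {
    "cp-1": {"_target_kube_version": "1.23", "_target_kubeadm_version": "1.23"},
    "cp-2": {"_target_kube_version": "1.24", "_target_kubeadm_version": "1.24"},
}

cp_node = pick_cp_node(groups)
print(cp_node)                                    # cp-2
print(hostvars[cp_node]["_target_kube_version"])  # 1.24
```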
17 changes: 1 addition & 16 deletions roles/packages/meta/main.yml
@@ -1,19 +1,4 @@
---
dependencies:
- role: packages_common
galaxy_info:
author: Julien Girardin
description: Install kubernetes related packages
company: Enix
license: Apache
min_ansible_version: 2.7
platforms:
- name: Ubuntu
versions:
- 18.04
- 20.04
galaxy_tags:
- kubernetes
- kubeadm
- kubelet
- kubectl
- role: common_vars
2 changes: 1 addition & 1 deletion roles/upload_certs/tasks/main.yml
@@ -15,7 +15,7 @@
kubeadm init phase upload-certs
--upload-certs
--certificate-key {{ cert_encryption_key }}
no_log: '{{ sensitive_debug|bool }}'
no_log: '{{ not sensitive_debug|bool }}'
run_once: true
delegate_to: '{{ cp_node }}'
when: _cp_to_join|length > 0
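
Reviewer note on the `no_log` change: in Jinja the filter is applied before `not`, so the new expression reads as `not (sensitive_debug|bool)`. Output is therefore hidden unless `sensitive_debug` is truthy, where the previous expression hid it only when debugging was enabled. The sketch below spells this out in Python; `to_bool` is a simplified approximation of Ansible's `bool` filter written for this note, not the real implementation.

```python
def to_bool(value) -> bool:
    """Simplified stand-in for Ansible's `bool` filter (assumption, not the real code)."""
    if isinstance(value, bool):
        return value
    return str(value).lower() in ("yes", "on", "1", "true")


def no_log(sensitive_debug) -> bool:
    # Equivalent of '{{ not sensitive_debug|bool }}': filter first, then negation.
    return not to_bool(sensitive_debug)


for value in (False, "false", True, "yes"):
    print(repr(value), "->", no_log(value))
```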
22 changes: 8 additions & 14 deletions tests/conftest.py
@@ -95,8 +95,8 @@ def vagrant(tmpdir):
return LocalVagrant(inventory_dir_copy=tmpdir)


@then("Set cluster {variable} = {value}")
@given("The cluster {variable) = {value}")
@then(parsers.parse("Set cluster {variable} = {value}"))
@given(parsers.parse("The cluster {variable} = {value}"))
def cluster_set_param(provider, variable, value):
provider.vars[variable] = value
# Refresh infrastructure
@@ -160,7 +160,7 @@ def ansible_extra_args(request):

@when(
parsers.re(
r"I (?P<dry_run>dry-)?run the playbooks?:?\s+(?P<playbooks>.+?)(?P<with_err>\s+with error:?\s+)?(?(with_err)(?P<error>.+)|\Z)",
r"I (?P<dry_run>dry-)?run the playbooks?:?\s+(?P<arguments>.+?)(?P<with_err>\s+with error:?\s+)?(?(with_err)(?P<error>.+)|\Z)",
re.DOTALL,
)
)
@@ -171,26 +171,20 @@ def ansible_playbook(
galaxy_deps,
ansible_extra_args,
results,
playbooks,
arguments,
dry_run,
error,
):
if dry_run == "dry-":
dry_run = True
else:
dry_run = False
playbook_list = re.findall(r"[\w./]+", playbooks)
if not all(os.path.exists(p) for p in playbook_list):
playbook_list_subdir = [os.path.join("playbooks", p) for p in playbook_list]
if all(os.path.exists(p) for p in playbook_list_subdir):
playbook_list = playbook_list_subdir
else:
raise ValueError("All playbooks could not be found")
argument_list = re.findall(r"[^\s]+", arguments)
result = run_ansible_playbook(
virtualenv,
playbook_list,
ansible_extra_args=ansible_extra_args,
inventory=inventory,
arguments=argument_list,
ansible_extra_args=ansible_extra_args,
dry_run=dry_run,
)
if error:
@@ -206,9 +200,9 @@ def ansible_playbook(
def ansible_kubeadm(inventory, virtualenv, galaxy_deps, ansible_extra_args, results):
result = run_ansible_playbook(
virtualenv,
inventory,
["tests/playbooks/verify.yml"],
ansible_extra_args=ansible_extra_args,
inventory=inventory,
)
assert_ansible_error(result)

12 changes: 6 additions & 6 deletions tests/features/haproxy.feature
@@ -18,23 +18,23 @@ Feature: Haproxy
apiserver_proxy_use_docker: true
kube_version: 1.23
When I run the playbook tests/playbooks/prepare.yml
When I run the playbooks 00_apiserver_proxy.yml
01_site.yml
When I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml
When I run the playbook tests/playbooks/cni.yml
Then I should have a working cluster


When With those group_vars on group all:
apiserver_proxy_use_docker:
When I reset tasks counters
When I run the playbooks 00_apiserver_proxy.yml
01_site.yml
When I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml
with error:
As docker has been deprecated

When With those group_vars on group all:
apiserver_proxy_use_docker: false
When I reset tasks counters
When I run the playbooks 00_apiserver_proxy.yml
01_site.yml
When I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml
Then I should have a working cluster
12 changes: 6 additions & 6 deletions tests/features/install.feature
@@ -17,16 +17,16 @@ Feature: Install
cgroupDriver: "systemd"
kube_version: <version>
When I run the playbook tests/playbooks/prepare.yml
When I dry-run the playbooks 00_apiserver_proxy.yml
01_site.yml
When I run the playbooks 00_apiserver_proxy.yml
01_site.yml
When I dry-run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml
When I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml
When I run the playbook tests/playbooks/cni.yml
Then I should have a working cluster

When I reset tasks counters
And I run the playbooks 00_apiserver_proxy.yml
01_site.yml
And I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml
Then I should see no orange/yellow changed tasks

Examples:
25 changes: 15 additions & 10 deletions tests/features/join_nodes.feature
@@ -1,10 +1,10 @@
Feature: Upgrade
A test to upgrade a kubeadm cluster
Feature: Join Nodes
A test to join nodes to a kubeadm cluster

Scenario: Upgrade via ansible-kubeadm
Scenario: Join nodes via ansible-kubeadm
Given I want ansible 3
Given The cluster control_plane_count = 1
Given The cluster worker_count = 1
Given The cluster worker_count = 0
Given Some running VMs

When With those group_vars on group all:
@@ -18,14 +18,19 @@ Feature: Upgrade
cgroupDriver: "systemd"
kube_version: 1.23
When I run the playbook tests/playbooks/prepare.yml
When I run the playbooks 00_apiserver_proxy.yml
01_site.yml
When I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml
When I run the playbook tests/playbooks/cni.yml
Then I should have a working cluster

Then Set cluster worker_count = 2
Then Set cluster worker_count = 1
When I run the playbook tests/playbooks/prepare.yml
When I run the playbooks playbooks/01_site.yml -e "limit=*-node-1"
Then I should have a working cluster

When With those group_vars on group all: kube_version: 1.24
When I run the playbooks 00_apiserver_proxy.yml
01_site.yml
Then Set cluster control_plane_count = 2
When I run the playbook tests/playbooks/prepare.yml
When I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml

Then I should have a working cluster
8 changes: 4 additions & 4 deletions tests/features/upgrade.feature
@@ -18,13 +18,13 @@ Feature: Upgrade
kube_version: <from_version>
action_reasons_review_skip: true
When I run the playbook tests/playbooks/prepare.yml
When I run the playbooks 00_apiserver_proxy.yml
01_site.yml
When I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml
When I run the playbook tests/playbooks/cni.yml

When With those group_vars on group all: kube_version: <to_version>
When I run the playbooks 00_apiserver_proxy.yml
01_site.yml
When I run the playbooks playbooks/00_apiserver_proxy.yml
playbooks/01_site.yml

Then I should have a working cluster
