
Generate token #375

Merged: 4 commits, Nov 11, 2024
Conversation

@anon-software (Contributor)

If a token is not explicitly provided, let the first server generate a random one. Such a token is saved on the first server, and the playbook can retrieve it from there and store it as a fact. All other servers and agents can then use that token to join the cluster. It will be saved into their environment file as usual.

I tested this by creating a cluster of one server and then adding two more servers and one agent. Please let me know if I should try some other tests as well.
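
As a minimal sketch of that flow (illustrative task and variable names only, not the PR's exact tasks; it assumes the repository's `server` group and the standard k3s token path):

# Sketch: read the generated token on the first server, save it as a
# fact, and let every other node fetch it via hostvars.
- name: Read the token k3s generated on the first server
  ansible.builtin.slurp:
    src: /var/lib/rancher/k3s/server/token    # standard k3s token location
  register: server_token_b64                   # hypothetical variable name
  when: inventory_hostname == groups['server'][0]

- name: Save the token as a fact on the first server
  ansible.builtin.set_fact:
    token: "{{ server_token_b64.content | b64decode | trim }}"
  when: inventory_hostname == groups['server'][0]

- name: Get the token from the first server
  ansible.builtin.set_fact:
    token: "{{ hostvars[groups['server'][0]]['token'] }}"
  when: inventory_hostname != groups['server'][0]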

Changes

Linked Issues

#307

If a token is not explicitly provided, let the first server generate a
random one. Such a token is saved on the first server and the playbook
can retrieve it from there and store it as a fact. All other servers and
agents can use that token later to join the cluster. It will be saved
into their environment file as usual.

Signed-off-by: Marko Vukovic <[email protected]>

@dereknola (Member)

This has been tried before. You need to test the case of 3 servers all at the start. There are issues with the fact being correctly propagated to the other two servers: because bring-up on all the nodes happens asynchronously, you cannot guarantee (and when I tried to implement this, it failed) that the fact will even exist for the other 2 servers to see and join with.

@anon-software (Contributor, Author)

Maybe I do not understand how Ansible works then. Doesn't the task "Init first server node" from roles/k3s_server/tasks/main.yml run first and terminate before "Start other server if any and verify status" runs? The former task will save the token, and it will always be available for the others being set up in the latter, or at least that is how I thought it would work.

Setting up three servers all at once was actually the first test I ran, although it is possible that it was a fluke that it worked.
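
For context, under Ansible's default linear strategy that ordering does hold within one play: each task completes on every targeted host before the next task begins. A toy illustration (hypothetical play, not the repository's actual files):

- hosts: server
  strategy: linear    # the default; spelled out here for emphasis
  tasks:
    - name: Init first server node
      ansible.builtin.debug:
        msg: "token generated and saved here first"
      when: inventory_hostname == groups['server'][0]

    - name: Start other server if any and verify status
      ansible.builtin.debug:
        msg: "only starts after the previous task finished everywhere"
      when: inventory_hostname != groups['server'][0]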

@dereknola (Member)

If you got it working, that's great! I'm gonna pull down your PR and check it out sometime later today or Monday.

@dereknola (Member)

The CNCF requires that all commits be signed. Just follow the instructions: https://github.com/k3s-io/k3s-ansible/pull/375/checks?check_run_id=32818692853

Signed-off-by: Marko Vukovic <[email protected]>

@dereknola (Member) commented Nov 11, 2024

When testing with the vagrant file, I see the following error

TASK [k3s_agent : Get the token from the first server] *************************
fatal: [agent-0]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'token'\n\nThe error appears to be in '/home/derek/rancher/ansible-k3s/roles/k3s_agent/tasks/main.yml': line 38, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Get the token from the first server\n  ^ here\n"}

It's possible that the vagrant ansible provisioner works differently than a regular ansible-playbook deployment. I'm testing with my local Pi cluster. Will update.
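
One plausible reading of that traceback (an assumption, not confirmed in the thread): the Vagrant provisioner invokes ansible-playbook per machine, so the play that sets `token` on the first server never runs in the agent's invocation, leaving `hostvars` without the attribute. A guarded sketch of the lookup that fails above:

# Illustrative only; the PR's actual task may differ.
- name: Get the token from the first server
  ansible.builtin.set_fact:
    token: "{{ hostvars[groups['server'][0]]['token'] }}"
  when: hostvars[groups['server'][0]]['token'] is defined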

@dereknola (Member)

So, interesting results. For the 3-Pi cluster, the first time I tested with 3 servers, it installed fine. Then I ran the reset playbook and tried to run the site.yaml again. This time it also failed with

TASK [k3s_server : Get the token from the first server] *******************************************************************************************************************************
fatal: [192.168.1.91]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'token'\n\nThe error appears to be in '/home/derek/rancher/ansible-k3s/roles/k3s_server/tasks/main.yml': line 208, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n  block:\n    - name: Get the token from the first server\n      ^ here\n"}
fatal: [192.168.1.92]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'token'\n\nThe error appears to be in '/home/derek/rancher/ansible-k3s/roles/k3s_server/tasks/main.yml': line 208, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n  block:\n    - name: Get the token from the first server\n      ^ here\n"}

PLAY RECAP ****************************************************************************************************************************************************************************
192.168.1.90               : ok=21   changed=3    unreachable=0    failed=1    skipped=45   rescued=0    ignored=1   
192.168.1.91               : ok=21   changed=3    unreachable=0    failed=1    skipped=61   rescued=0    ignored=1   
192.168.1.92               : ok=21   changed=3    unreachable=0    failed=1    skipped=60   rescued=0    ignored=1   

This is the exact same issue I ran into the first time I attempted to implement auto generating tokens.

@dereknola (Member) commented Nov 11, 2024

Running a server + agent inventory on the Raspberry Pi cluster, the playbook works because those are separate roles, so they run sequentially (i.e. the server role gets executed, then the agent role). But the vagrant provisioner just runs everything in parallel, so this system will never work there.
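
The sequential behavior described here falls out of the playbook applying the roles in separate plays, roughly like this (a sketch of the shape, assuming the repository's group names, not the verbatim site playbook):

# Plays run strictly in order: the first play finishes on all servers
# before the second play touches any agent.
- hosts: server
  roles:
    - k3s_server

- hosts: agent
  roles:
    - k3s_agent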

I'm less concerned about whether the Vagrantfile works; that can just be noted in the Vagrantfile as "requires token". But the above errors on regular ssh nodes are a blocker on this PR.

You might want to look into https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_strategies.html#restricting-execution-with-throttle or other ways to control execution on nodes. It's possible there is some way of achieving this: if no token exists, run the next task throttled/sequentially to ensure the other nodes can find the token var.
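
A hedged sketch of that idea (illustrative names, not what the PR ultimately does): generate the token exactly once, before any node needs it, so the default linear strategy guarantees it exists when later tasks read it.

- name: Generate a cluster token if none was provided
  ansible.builtin.set_fact:
    token: "{{ lookup('ansible.builtin.password', '/dev/null length=32 chars=ascii_letters,digits') }}"
  run_once: true    # runs on the play's first host, assumed here to be groups['server'][0]
  when: token is not defined

- name: Pick up the token from the first server
  ansible.builtin.set_fact:
    token: "{{ hostvars[groups['server'][0]]['token'] }}"
  when: token is not defined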

@anon-software (Contributor, Author)

Do you still have the complete log of the playbook execution that you can attach here?

@dereknola (Member)

cluster.log

@dereknola (Member)

Okay, never mind, I just read my own error logs. Let me fix it.

@dereknola (Member)

I seem to have found a separate issue around "Copy K3s service file" needing extra_server_args to be defined. I had stripped down my inventory.yaml to be super simple. I will open a separate PR to address this issue.
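
A common fix for that class of failure (a sketch, assuming the variable is consumed in the service-file template; the separate PR may do it differently) is a safe default in the role:

# roles/k3s_server/defaults/main.yml (hypothetical placement):
# an empty default lets a stripped-down inventory render the service
# file without defining extra_server_args explicitly.
extra_server_args: ""

Alternatively, the Jinja template itself could render {{ extra_server_args | default('') }}.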

@anon-software (Contributor, Author)

OK, I shall push another commit to address the new batch of Lint errors.

@dereknola (Member) left a comment

Can you add a comment above https://github.com/k3s-io/k3s-ansible/blob/master/Vagrantfile#L31 noting that a token variable is required for the vagrant ansible provisioner?

I'm happy to accept this PR if it works on "real" ansible playbooks and vagrant is just weird.

Signed-off-by: Marko Vukovic <[email protected]>
The token is still required when using Vagrant.

Signed-off-by: Marko Vukovic <[email protected]>
@dereknola merged commit c10b84f into k3s-io:master on Nov 11, 2024. 2 checks passed.