In chapter 15, CML successfully creates the runner on GCP, but the workflow then hangs on the setup-runner step.
Behaviour
The CI/CD workflow starts on GitHub
CML creates the runner on GCP
The setup-runner step hangs while Terraform waits: level":"info","message":"iterative_cml_runner.runner: Still creating...
After 5-7 minutes, the GCP pod auto-terminates
The GitHub workflow keeps hanging at the setup-runner step, with Terraform still waiting
Below is the output of the runner pod:
> kubectl logs -f cml-bo4s2uhzqs-2qx6z08y-ig1rgwq0-lg67g
Failed to get unit file state for cml.service: No such file or directory
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 84.5M 100 84.5M 0 0 28.4M 0 0:00:02 0:00:02 --:--:-- 37.8M
bash: line 24: lsof: command not found
{"level":"info","message":"POST /repos/leonardcser/mlops-test/actions/runners/registration-token - 201 in 275ms"}
{"level":"info","message":"GET /repos/leonardcser/mlops-test/actions/runners?per_page=100 - 200 in 215ms"}
{"level":"warn","message":"Github Actions timeout has been updated from 72h to 35 days. Update your workflow accordingly to be able to restart it automatically."}
{"level":"info","message":"Preparing workdir /home/runner..."}
{"level":"info","message":"Launching github runner"}
{"level":"info","message":"Terraform 1.5.4"}
{"level":"info","message":"Plan: 0 to add, 0 to change, 0 to destroy."}
{"level":"info","message":"Apply complete! Resources: 0 added, 0 changed, 0 destroyed."}
{"level":"info","message":"Outputs: 0"}
{"level":"warn","message":"Error connecting to ACPI socket: connect ENOENT /var/run/acpid.socket. The acpid.service helps with instance termination detection."}
{"level":"info","message":"POST /repos/leonardcser/mlops-test/actions/runners/registration-token - 201 in 317ms"}
{"date":"2023-08-03T09:15:06.304Z","level":"info","message":"runner status","repo":"https://github.com/leonardcser/mlops-test","status":"ready"}
{"level":"info","message":"Unregistering runner cml-bo4s2uhzqs-2qx6z08y-ig1rgwq0..."}
{"level":"info","message":"GET /repos/leonardcser/mlops-test/actions/runners?per_page=100 - 200 in 277ms"}
{"level":"info","message":"DELETE /repos/leonardcser/mlops-test/actions/runners/23 - 204 in 360ms"}
{"level":"info","message":"\tSuccess"}
{"level":"info","message":"Waiting 10 seconds to destroy"}
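To make the runner's lifecycle easier to read in the pod output above, the JSON log lines can be separated from the curl and bash noise. The following is a minimal sketch in Python, assuming the pod log was saved as plain text. The helper names `parse_runner_log` and `lifecycle_events` are hypothetical and not part of CML; the sample lines are copied from the log above.

```python
import json


def parse_runner_log(text):
    """Return the JSON records found in mixed pod output, skipping non-JSON lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # curl progress meter, bash errors, etc.
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or malformed JSON line
        if isinstance(rec, dict):
            records.append(rec)
    return records


def lifecycle_events(records):
    """Keep warnings plus the status, unregister, and destroy messages."""
    keep = []
    for rec in records:
        msg = rec.get("message", "")
        if (
            rec.get("level") == "warn"
            or msg == "runner status"
            or msg.startswith("Unregistering runner")
            or msg.startswith("Waiting")
        ):
            keep.append(rec)
    return keep


# Sample lines copied from the pod output in this issue.
SAMPLE = r"""
bash: line 24: lsof: command not found
{"level":"info","message":"Launching github runner"}
{"level":"warn","message":"Error connecting to ACPI socket: connect ENOENT /var/run/acpid.socket. The acpid.service helps with instance termination detection."}
{"date":"2023-08-03T09:15:06.304Z","level":"info","message":"runner status","repo":"https://github.com/leonardcser/mlops-test","status":"ready"}
{"level":"info","message":"Unregistering runner cml-bo4s2uhzqs-2qx6z08y-ig1rgwq0..."}
{"level":"info","message":"Waiting 10 seconds to destroy"}
"""

for event in lifecycle_events(parse_runner_log(SAMPLE)):
    print(event.get("level"), "|", event.get("status", ""), "|", event["message"])
```

Run against the full log, this shows the sequence in order: the runner reports status ready, is then unregistered, and waits 10 seconds before being destroyed.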
We're moving away from CML as a k8s registration tool because of the aforementioned issue. CML development also appears to be in maintenance mode, with little recent activity.
We'll keep using CML for reporting in GitHub comments.
This output is similar to this issue on CML: iterative/cml#1332