It may be that I'm not using things correctly: I can't find any documentation that explains the differences between inactivating, freezing and decommissioning a host.
If I go through the following steps:
1. Mark a host inactive (via POST /api/inactive).
2. Stop the mesos-agent on it.
3. Start a new instance of mesos-agent on it (I'm using a Docker container to run Mesos, so I think it gets a new slave ID, but I'm not 100% sure).
4. Mark the host active again (via DELETE /api/inactive).
Then the slave remains in the decommissioned state and won't run any tasks.
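For anyone reproducing the steps above, here is a minimal sketch of the two inactive-list requests. It only builds the requests and does not send them. The base URL is a placeholder, and passing the hostname as a `host` query parameter is an assumption, so check it against your deployment's API documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; replace with your own scheduler address.
BASE_URL = "http://scheduler.example.com"


def inactive_request(hostname: str, active: bool) -> Request:
    """Build (but do not send) the request that marks a host inactive or active again."""
    # Assumption: the hostname is passed as a `host` query parameter.
    url = f"{BASE_URL}/api/inactive?{urlencode({'host': hostname})}"
    # POST adds the host to the inactive list, DELETE removes it again.
    method = "DELETE" if active else "POST"
    return Request(url, method=method)


req = inactive_request("agent-1.example.com", active=False)
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would actually send it.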
My goal is to be able to prevent new tasks running on a slave (so that once existing tasks die we can reboot/do maintenance on it - we use only on-demand tasks with finite lifetime), and later allow tasks to run on it again (possibly after doing maintenance on it). I've been using "inactive" rather than "freeze" because the former API works on hostnames, which means it can be set even if the mesos-agent isn't running at the time. But let me know what you advise for that.
So, inactive was something we created to deal with some EC2 impairment cases. We would frequently have cases where a host went impaired, came back, went impaired again, and cycled like that. The inactive marker was meant to make it so that any agent coming in with that hostname is automatically marked as decommissioned, to save tasks from being launched on an impaired/cycling host like that. The reactivation here essentially just removes the host from a 'blocked' list of hosts.
Other definitions:
Freeze - don't launch new tasks on a host, but leave any that are already running alone
Decommission - don't launch new tasks on a host, and also move any that are currently running on the host elsewhere
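The two definitions above, plus the normal active state, can be summarised as a small lookup table. This is only an illustrative sketch: the names `AgentState`, `Effect` and `EFFECTS` are made up for this example and are not part of any real API.

```python
from dataclasses import dataclass
from enum import Enum


class AgentState(Enum):
    ACTIVE = "active"
    FROZEN = "frozen"
    DECOMMISSIONED = "decommissioned"


@dataclass(frozen=True)
class Effect:
    launches_new_tasks: bool   # may the scheduler place new tasks on the host?
    moves_running_tasks: bool  # are already-running tasks moved elsewhere?


# Hypothetical summary of the definitions given in this thread.
EFFECTS = {
    AgentState.ACTIVE: Effect(launches_new_tasks=True, moves_running_tasks=False),
    AgentState.FROZEN: Effect(launches_new_tasks=False, moves_running_tasks=False),
    AgentState.DECOMMISSIONED: Effect(launches_new_tasks=False, moves_running_tasks=True),
}

for state, effect in EFFECTS.items():
    print(f"{state.value}: new tasks={effect.launches_new_tasks}, "
          f"moves running tasks={effect.moves_running_tasks}")
```

For the "let existing tasks finish, then do maintenance" goal, the frozen row is the one that matches.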
If you just use decommission, since it is done by slave ID, a new agent coming into the cluster with a new ID will be in the active state. To clean up any agents that are in the inactive + decommissioned state you mentioned, you can remove the host from the inactive list first, then 'reactivate' the agent in the UI. We can update the docs to make this clearer.
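To make the interaction concrete, here is a toy model of the behaviour described in this thread. It is not the real scheduler: the `Cluster` class and its method names are hypothetical, and it only encodes two rules stated above (an agent registering from an inactive hostname starts out decommissioned, and removing the hostname from the inactive list does not change agents that already exist).

```python
class Cluster:
    """Toy model of the behaviour described in this thread, not the real scheduler."""

    def __init__(self):
        self.inactive_hosts = set()  # hostnames on the 'blocked' list
        self.agent_state = {}        # agent ID -> "active" or "decommissioned"

    def mark_inactive(self, hostname):
        self.inactive_hosts.add(hostname)

    def mark_active(self, hostname):
        # Only removes the hostname from the list; existing agents keep their state.
        self.inactive_hosts.discard(hostname)

    def register_agent(self, agent_id, hostname):
        # Agents arriving from a blocked hostname are decommissioned on arrival.
        blocked = hostname in self.inactive_hosts
        self.agent_state[agent_id] = "decommissioned" if blocked else "active"

    def reactivate(self, agent_id):
        # Stands in for the 'reactivate' action in the UI.
        self.agent_state[agent_id] = "active"


cluster = Cluster()
cluster.mark_inactive("host-a")
cluster.register_agent("agent-2", "host-a")  # restarted agent with a new ID
cluster.mark_active("host-a")
stuck = cluster.agent_state["agent-2"]       # the state reported in this issue
cluster.reactivate("agent-2")
fixed = cluster.agent_state["agent-2"]
print(stuck, "->", fixed)  # prints: decommissioned -> active
```

The order matters in this model for the same reason as in the advice above: if the agent re-registers while the hostname is still on the inactive list, it is decommissioned again.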
bmerry changed the title from "Re-activating a host doesn't re-enable the slave" to "Document inactivate/freeze/decommission procedures" on Jul 20, 2020