-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLOC-1194] Add forceable container removal if needed #1753
base: master
Are you sure you want to change the base?
Changes from all commits
624f877
5caa920
d60b09a
5d7c1a7
8b2ca51
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -208,13 +208,14 @@ def exists(unit_name): | |
otherwise ``False``. | ||
""" | ||
|
||
def remove(unit_name): | ||
def remove(unit_name, force): | ||
""" | ||
Stop and delete the given unit. | ||
|
||
This can be done multiple times in a row for the same unit. | ||
|
||
:param unicode unit_name: The name of the unit to stop. | ||
:param bool force: Force removal of the unit. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And add a note about the default behaviour here. |
||
|
||
:return: ``Deferred`` that fires once the unit has been stopped | ||
and removed. | ||
|
@@ -271,7 +272,7 @@ def add(self, unit_name, image_name, ports=frozenset(), environment=None, | |
def exists(self, unit_name): | ||
return succeed(unit_name in self._units) | ||
|
||
def remove(self, unit_name): | ||
def remove(self, unit_name, force=False): | ||
if unit_name in self._units: | ||
del self._units[unit_name] | ||
return succeed(None) | ||
|
@@ -546,7 +547,7 @@ def _blocking_container_runs(self, container_name): | |
).write() | ||
return result['State']['Running'] | ||
|
||
def remove(self, unit_name): | ||
def remove(self, unit_name, force=False): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This method is an implementation of an interface method,
|
||
container_name = self._to_container_name(unit_name) | ||
|
||
def _remove(): | ||
|
@@ -608,7 +609,7 @@ def _remove(): | |
message_type="flocker:docker:container_remove", | ||
container=container_name | ||
).write() | ||
self._client.remove_container(container_name) | ||
self._client.remove_container(container_name, force=force) | ||
Message.new( | ||
message_type="flocker:docker:container_removed", | ||
container=container_name | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -99,7 +99,7 @@ def start_container(self, unit_name, | |
environment=None, volumes=(), | ||
mem_limit=None, cpu_shares=None, | ||
restart_policy=RestartNever(), | ||
command_line=None): | ||
command_line=None, force=False): | ||
""" | ||
Start a unit and wait until it reaches the `active` state or the | ||
supplied `expected_state`. | ||
|
@@ -114,6 +114,8 @@ def start_container(self, unit_name, | |
:param cpu_shares: See ``IDockerClient.add``. | ||
:param restart_policy: See ``IDockerClient.add``. | ||
:param command_line: See ``IDockerClient.add``. | ||
:param bool force: Force container removal via SIGKILL | ||
during cleanup. | ||
|
||
:return: ``Deferred`` that fires with the ``DockerClient`` when | ||
the unit reaches the expected state. | ||
|
@@ -130,14 +132,67 @@ def start_container(self, unit_name, | |
restart_policy=restart_policy, | ||
command_line=command_line, | ||
) | ||
self.addCleanup(client.remove, unit_name) | ||
self.addCleanup(client.remove, unit_name, force) | ||
|
||
d.addCallback(lambda _: wait_for_unit_state(client, unit_name, | ||
expected_states)) | ||
d.addCallback(lambda _: client) | ||
|
||
return d | ||
|
||
def test_force_remove(self): | ||
""" | ||
``DockerClient.remove`` can forcefully remove a running container | ||
when the ``force`` parameter is set to True. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Something to note about this test is that, though it does exercise the new I skimmed the Docker issue and didn't notice the section that suggests force=True as a work-around for this. A reference to that suggestion, as well as to the issue as a whole somewhere in this code (perhaps near the callers that pass |
||
|
||
This is intended to help avoid an intermittent error in other | ||
tests caused by a Docker behaviour where an "device busy" error | ||
occurs when attempting to remove a container. | ||
|
||
In many cases, it is reported that forced removal of the container | ||
via the Docker API or "-f" CLI switch is a work-around that prevents | ||
this error, though is not a solution to the underlying problem. | ||
|
||
See https://github.com/docker/docker/issues/9665 | ||
See http://blog.hashbangbash.com/2014/11/ | ||
docker-devicemapper-fix-for-device-or-resource-busy-ebusy/ | ||
See FLOC-1194. | ||
""" | ||
client = DockerClient() | ||
unit_name = u'test-force-remove' | ||
expected_states = (u'active',) | ||
d = client.add( | ||
unit_name=unit_name, | ||
image_name=u"openshift/busybox-http-app" | ||
) | ||
d.addCallback( | ||
lambda _: wait_for_unit_state( | ||
client, unit_name, expected_states | ||
) | ||
) | ||
|
||
def check_is_listed(_): | ||
listed = client.list() | ||
|
||
def check_unit(results): | ||
self.assertIn(unit_name, [unit.name for unit in results]) | ||
listed.addCallback(check_unit) | ||
return listed | ||
|
||
d.addCallback(check_is_listed) | ||
d.addCallback(lambda _: client.remove(unit_name, force=True)) | ||
|
||
def check_is_removed(_): | ||
listed = client.list() | ||
|
||
def check_unit(results): | ||
self.assertNotIn(unit_name, [unit.name for unit in results]) | ||
listed.addCallback(check_unit) | ||
return listed | ||
|
||
d.addCallback(check_is_removed) | ||
return d | ||
|
||
def test_default_base_url(self): | ||
""" | ||
``DockerClient`` instantiated with a default base URL for a socket | ||
|
@@ -356,6 +411,7 @@ def image_built(image_name): | |
unit_name=unit_name, | ||
image_name=image_name, | ||
environment=Environment(variables=expected_variables), | ||
force=True, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the only real change in behavior is in the test suite then, hopefully, our test suite will stop failing spuriously with this problem. What about the actual implementation, though? If this failure can strike test code that's removing a container, it seems plausible it could strike Is there a reason to think this isn't the case? Or does this problem eventually solve itself so that, as long as you're retrying (as the looping approach of the convergence agent will do) the removal does eventually succeed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, what about other tests that create and then remove containers (ie, all the other tests using Couldn't they all also fail with this error? The error doesn't seem particularly container-with-environment-variables specific at first glance. If any There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Plausibly, yes - but the issue FLOC-1194 is only concerned with the environment variables test and is a "fix flaky test" category issue. So it possibly makes sense to not alter the behaviour of other tests using the Client change in this branch until such time as they are also identified as intermittently failing on the same device busy error. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
No, it is plausible this error could strike
Maybe; I've seen at least one report (anecdote) that it was necessary for the host running Docker to be rebooted to resolve the problem, but again this seems to vary. There's not even a guarantee that force removal via Docker's API prevents the device busy error, I've seen reports of it happening even with the |
||
) | ||
d.addCallback(image_built) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd have thought
zope.verify
would have caught this, but the signature in the interface should match the signature in the implementation, so the interface ought to haveforce=False
.