-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLOC-1194] Add forceable container removal if needed #1753
base: master
Are you sure you want to change the base?
Changes from 2 commits
624f877
5caa920
d60b09a
5d7c1a7
8b2ca51
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -99,7 +99,7 @@ def start_container(self, unit_name, | |
environment=None, volumes=(), | ||
mem_limit=None, cpu_shares=None, | ||
restart_policy=RestartNever(), | ||
command_line=None): | ||
command_line=None, force=False): | ||
""" | ||
Start a unit and wait until it reaches the `active` state or the | ||
supplied `expected_state`. | ||
|
@@ -114,6 +114,8 @@ def start_container(self, unit_name, | |
:param cpu_shares: See ``IDockerClient.add``. | ||
:param restart_policy: See ``IDockerClient.add``. | ||
:param command_line: See ``IDockerClient.add``. | ||
:param bool force: Force container removal via SIGKILL | ||
during cleanup. | ||
|
||
:return: ``Deferred`` that fires with the ``DockerClient`` when | ||
the unit reaches the expected state. | ||
|
@@ -130,14 +132,54 @@ def start_container(self, unit_name, | |
restart_policy=restart_policy, | ||
command_line=command_line, | ||
) | ||
self.addCleanup(client.remove, unit_name) | ||
self.addCleanup(client.remove, unit_name, force) | ||
|
||
d.addCallback(lambda _: wait_for_unit_state(client, unit_name, | ||
expected_states)) | ||
d.addCallback(lambda _: client) | ||
|
||
return d | ||
|
||
def test_force_remove(self): | ||
""" | ||
``DockerClient.remove`` can forcefully remove a running container | ||
when the ``force`` parameter is set to True. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Something to note about this test is that, though it does exercise the new I skimmed the Docker issue and didn't notice the section that suggests force=True as a work-around for this. A reference to that suggestion, as well as to the issue as a whole somewhere in this code (perhaps near the callers that pass |
||
""" | ||
client = DockerClient() | ||
unit_name = u'test-force-remove' | ||
expected_states = (u'active',) | ||
d = client.add( | ||
unit_name=unit_name, | ||
image_name=u'clusterhq/flask' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This image and flask are incidental complexity for this test. The detail of which container to run doesn't really matter to what the test is concerned with testing. All of the code from Flask that will run when this test runs is likewise irrelevant. Depending on the execution environment of the test suite, it's also possible this will result in a Docker image being pulled from Dockerhub. The incidental complexity increases the maintenance cost of the test method. The Dockerhub dependency increases the chances that this will fail intermittently due to transient network or Dockerhub conditions. A way to address the incidental complexity is to contain it and then isolate it from this particular test. Add an attribute or method to the class responsible for giving back the name of an image that has the desired properties - in this case, an image that runs indefinitely, I think. A way to address the external dependency is to build a custom image for this test as we do elsewhere. I think a busybox-based container that runs forever should be pretty straightforward. |
||
) | ||
d.addCallback( | ||
lambda _: wait_for_unit_state( | ||
client, unit_name, expected_states | ||
) | ||
) | ||
|
||
def check_is_listed(_): | ||
listed = client.list() | ||
|
||
def check_unit(results): | ||
self.assertTrue(unit_name in [unit.name for unit in results]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This provides a better failure message because it has both the "needle" and the "haystack". |
||
listed.addCallback(check_unit) | ||
return listed | ||
|
||
d.addCallback(check_is_listed) | ||
d.addCallback(lambda _: client.remove(unit_name, force=True)) | ||
|
||
def check_is_removed(_): | ||
listed = client.list() | ||
|
||
def check_unit(results): | ||
self.assertFalse(unit_name in [unit.name for unit in results]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Likewise. |
||
listed.addCallback(check_unit) | ||
return listed | ||
|
||
d.addCallback(check_is_removed) | ||
return d | ||
|
||
def test_default_base_url(self): | ||
""" | ||
``DockerClient`` instantiated with a default base URL for a socket | ||
|
@@ -356,6 +398,7 @@ def image_built(image_name): | |
unit_name=unit_name, | ||
image_name=image_name, | ||
environment=Environment(variables=expected_variables), | ||
force=True, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the only real change in behavior is in the test suite then, hopefully, our test suite will stop failing spuriously with this problem. What about the actual implementation, though? If this failure can strike test code that's removing a container, it seems plausible it could strike Is there a reason to think this isn't the case? Or does this problem eventually solve itself so that, as long as you're retrying (as the looping approach of the convergence agent will do) the removal does eventually succeed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, what about other tests that create and then remove containers (ie, all the other tests using Couldn't they all also fail with this error? The error doesn't seem particularly container-with-environment-variables specific at first glance. If any There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Plausibly, yes - but the issue FLOC-1194 is only concerned with the environment variables test and is a "fix flaky test" category issue. So it possibly makes sense to not alter the behaviour of other tests using the Client change in this branch until such time as they are also identified as intermittently failing on the same device busy error. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
No, it is plausible this error could strike
Maybe; I've seen at least one report (anecdote) that it was necessary for the host running Docker to be rebooted to resolve the problem, but again this seems to vary. There's not even a guarantee that force removal via Docker's API prevents the device busy error, I've seen reports of it happening even with the |
||
) | ||
d.addCallback(image_built) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is an implementation of an interface method,
IDockerClient.remove
.make_idockerclient_tests
, needs to have tests for the new functionality added.