Accessing plugin dependencies that are contained within a docker #186
-
I really REALLY like this idea! I had brought this up in the context of some of our other workflows, like the profiling recipe (Broad-only link 1, Broad-only link 2), but it has obvious advantages for plugins as well. Both the Docker and the Singularity Python APIs (which is what the code snippet I linked below is from) have minimal dependencies (Singularity looks like it may not even actually have any?); we'd need to dig into license compatibility, but subprocess obviously works too. Certainly performance is going to be a major thing requiring investigation - startup times may be non-trivial, and is the system going to go crazy if we're spinning up containers everywhere? But rather than having two plugins, I wonder if we could just have a single plugin with a toggle switch ("Docker/Python"); it seems like that would be ideal.
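For reference, a minimal sketch of what a call through the Docker Python API (the docker-py package) could look like for this use case - the image name is reused from the subprocess snippet further down, the mount path is made up, and none of this is existing plugin code:

```python
# Minimal sketch using the docker-py package (pip install docker).
# Image name and mount path are illustrative only, not existing plugin code.
import docker

client = docker.from_env()
logs = client.containers.run(
    "ctromanscoia/cellpose:0.1",
    "cellpose --dir /img --pretrained_model nuclei --verbose",
    volumes={"/absolute/path/to/img": {"bind": "/img", "mode": "rw"}},
    remove=True,  # same effect as --rm in the CLI call
)
print(logs.decode())
```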
-
When running deep learning-based plugins, the installation can be tricky. Moreover, running StarDist and Cellpose in the same environment is also problematic, given their TensorFlow and PyTorch dependencies, respectively.
So, does it make sense to have a version of these plugins that interacts with a Docker image (e.g. RunCellposeDocker)? That is, StarDist/Cellpose are contained within a Docker image, and images are passed to the container via a mounted volume.
I quickly tried this with Cellpose, using subprocess to pull and run the image, and it works well (there's also a Python API for Docker). In this process, the input image is saved with skimage to the mounted directory, segmented using the Cellpose CLI, and the output is then read back into the plugin with skimage (see the sketch below). As a result, the only lift for users would be installing Docker, which is quite easy. Of course, saving and reading files in the middle of a module is a bit awkward, and slower than the create-an-environment approach, but if a user just wants access to Cellpose segmentation alongside CellProfiler's other modules, I think this makes it as easy as possible. You can also imagine a scenario in which multiple plugin modules with conflicting dependencies work together, e.g. StarDist + Cellpose + DeepProfiler in the same pipeline, each calling out to its respective Docker container.
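To make that save → segment → read loop concrete, here's a rough sketch. The paths, the image tag, the --save_tif flag, and the *_cp_masks output filename are assumptions for illustration (check what your Cellpose version actually writes), not the exact code I ran:

```python
# Rough sketch of the save -> run-in-Docker -> read-back loop described above.
# Paths, image tag, CLI flags, and output filename are illustrative assumptions.
import subprocess
from pathlib import Path

import skimage.io


def segment_with_cellpose_docker(image, work_dir="/tmp/cellpose_io"):
    work_dir = Path(work_dir).resolve()
    work_dir.mkdir(parents=True, exist_ok=True)

    # 1. Save the input image into the directory that will be mounted in the container.
    skimage.io.imsave(work_dir / "input.tif", image)

    # 2. Run the Cellpose CLI inside the container against the mounted directory.
    subprocess.check_call([
        "docker", "run", "--rm",
        "-v", f"{work_dir}:/img",
        "ctromanscoia/cellpose:0.1",
        "cellpose", "--dir", "/img",
        "--pretrained_model", "nuclei", "--save_tif", "--verbose",
    ])

    # 3. Read the mask back in (filename assumes Cellpose's *_cp_masks naming).
    return skimage.io.imread(work_dir / "input_cp_masks.tif")
```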
Another limitation is that the Docker image will take considerably more disk space than the equivalent packages in the user's home Python environment. However, I'd also think this downside is negligible given the ease of use of the Docker approach.
You could even abstract this idea further by having some RunDocker module where you input a Docker image name, the command you want to run, and the data that will be passed to it (a rough sketch of what such a helper could look like follows).
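As an illustration of that abstraction - the helper name, signature, and example arguments are hypothetical, not an existing CellProfiler module:

```python
# Hypothetical sketch of a generic "RunDocker"-style helper: the caller supplies
# the image, the command, and a host directory to mount. Not existing CellProfiler API.
import subprocess
from pathlib import Path


def run_docker(image, command, host_dir, container_dir="/data"):
    """Run `command` inside `image` with `host_dir` mounted at `container_dir`."""
    mount = f"{Path(host_dir).resolve()}:{container_dir}"  # Docker needs an absolute host path
    subprocess.check_call(["docker", "run", "--rm", "-v", mount, image, *command])


# Example: the Cellpose call from the snippet below, expressed via the helper.
run_docker(
    "ctromanscoia/cellpose:0.1",
    ["cellpose", "--dir", "/data", "--pretrained_model", "nuclei", "--verbose"],
    host_dir="./img",
)
```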
Here's what the subprocess docker run looks like for Cellpose:
subprocess.check_call("docker run -it --rm -v ${PWD}/img:/img ctromanscoia/cellpose:0.1 cellpose --dir /img --pretrained_model nuclei --verbose", shell=True)