micro-sam as an nf-core module #698

Open
kbestak opened this issue Sep 24, 2024 · 17 comments

Comments

@kbestak

kbestak commented Sep 24, 2024

Hi,

We would be very interested in adding micro-sam to the nf-core framework (https://nf-co.re), where we are pushing for spatial omics and microscopy tools and workflows. nf-core is a community that introduces standards and tools to facilitate the development of Nextflow pipelines. We would like to have an official nf-core module for micro-sam automatic detection and to integrate it into the nf-core/molkart (https://nf-co.re/molkart/) and nf-core/mcmicro (https://nf-co.re/mcmicro/) pipelines.

To adhere to nf-core guidelines, we would need:

  1. A CLI for automatic cell segmentation that takes in a multichannel image and an optional custom model, and outputs a labeled image
  2. A Docker image, preferably through Biocontainers (see Cellpose as an example: https://github.com/BioContainers/containers/tree/master/cellpose), and a bioconda package (not a priority)

The Docker image would require CPU support (we're still working on GPU support options, but it looks most likely that separate images will be needed depending on the drivers).

I would greatly appreciate any help on how to get started with the points above and thanks in advance!

@constantinpape
Contributor

Hi @kbestak and thanks for your interest!

Regarding the CLI: we don't have automatic segmentation exposed in our CLI yet, but this is straightforward to implement. We can provide a python script for it in the next few days, and we can then include this into the micro_sam CLI in the next release.

taking in a multichannel image

Note that we support either single or three channel images. Support for a different number of channels is possible in principle, but requires some data processing or model adaptations.

Regarding the docker container: we don't have the capacity to build this ourselves, but are happy to support you in building this. To set up the micro_sam installation you would need to follow these steps:

  • Set up a conda environment from this file (this installs the CPU version)
  • Clone micro_sam and install it via pip.

(and bioconda, not priority)

All our dependencies are on conda-forge. Porting these to bioconda would be a major effort that doesn't make sense.

@kbestak
Author

kbestak commented Sep 24, 2024

Hi @constantinpape, thank you for such a quick response!
That all sounds great, thank you! I'll keep an eye out for the CLI script, after which I'll test whether a Docker build works with some example images.

@kbestak
Author

kbestak commented Sep 27, 2024

A short update: starting from the mamba installation, it was extremely easy to get a hosted Docker container built as a Seqera container, available as: community.wave.seqera.io/library/micro_sam:1.0.1--8aeb5052332a2952

@constantinpape
Contributor

Fyi @kbestak , we have implemented the CLI and it is in the dev branch here if you already want to check it out.
We will try to make a release with it soon, but need to fix some other napari compatibility issues first.

@anwai98
Contributor

anwai98 commented Oct 21, 2024

Hi @kbestak,
We have made a release at 1.1.0 (including the CLI for automatic segmentation as discussed).

@kbestak
Author

kbestak commented Oct 21, 2024

Hi @anwai98, thanks a lot for making this!

I might have found an issue with the try/except block around the napari.utils import.

First, for full context, here is the build page for the container I'm using for testing: https://wave.seqera.io/view/builds/bd-2ef5d1866f4d0325_1

Attempting to run the following:

docker run community.wave.seqera.io/library/micro_sam:1.1.1--2ef5d1866f4d0325 micro_sam.automatic_segmentation

results in the OSError below, whereas the except block linked above only catches ImportError:

WARNING: Could not load OpenGL library.
Traceback (most recent call last):
  File "/opt/conda/bin/micro_sam.automatic_segmentation", line 6, in <module>
    from micro_sam.automatic_segmentation import main
  File "/opt/conda/lib/python3.11/site-packages/micro_sam/automatic_segmentation.py", line 8, in <module>
    from . import util
  File "/opt/conda/lib/python3.11/site-packages/micro_sam/util.py", line 42, in <module>
    from napari.utils import progress as tqdm
  File "/opt/conda/lib/python3.11/site-packages/napari/utils/__init__.py", line 3, in <module>
    from napari.utils.colormaps.colormap import (
  File "/opt/conda/lib/python3.11/site-packages/napari/utils/colormaps/__init__.py", line 2, in <module>
    from napari.utils.colormaps.colormap import (
  File "/opt/conda/lib/python3.11/site-packages/napari/utils/colormaps/colormap.py", line 19, in <module>
    from napari.utils.color import ColorArray
  File "/opt/conda/lib/python3.11/site-packages/napari/utils/color.py", line 8, in <module>
    from napari.utils.colormaps.standardize_color import transform_color
  File "/opt/conda/lib/python3.11/site-packages/napari/utils/colormaps/standardize_color.py", line 28, in <module>
    from vispy.color import ColorArray, get_color_dict, get_color_names
  File "/opt/conda/lib/python3.11/site-packages/vispy/color/__init__.py", line 12, in <module>
    from .colormap import (Colormap, BaseColormap,  # noqa
  File "/opt/conda/lib/python3.11/site-packages/vispy/color/colormap.py", line 14, in <module>
    import vispy.gloo
  File "/opt/conda/lib/python3.11/site-packages/vispy/gloo/__init__.py", line 47, in <module>
    from . import gl  # noqa
    ^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vispy/gloo/gl/__init__.py", line 230, in <module>
    from . import es2 as default_backend  # noqa
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vispy/gloo/gl/es2.py", line 48, in <module>
    raise OSError('GL ES 2.0 library not found')
OSError: GL ES 2.0 library not found
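
One way to make an optional import like this robust is to treat OSError the same as ImportError and fall back to another progress-bar backend. The following is a minimal sketch of that idea, not the actual micro_sam code; the helper `first_working` and the loader functions are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable


def first_working(loaders: Iterable[Callable[[], object]]) -> object:
    """Return the result of the first loader that imports successfully."""
    last_error = None
    for load in loaders:
        try:
            return load()
        except (ImportError, OSError) as error:
            # A package can be installed and still fail at import time with
            # OSError, e.g. when no OpenGL library is present in a container.
            last_error = error
    raise ImportError("no progress-bar backend could be loaded") from last_error


def _napari_progress():
    from napari.utils import progress
    return progress


def _tqdm_progress():
    from tqdm import tqdm
    return tqdm


def _plain_iterator():
    # Last resort: no progress bar, just hand the iterable back.
    return lambda iterable, **kwargs: iterable


# Try napari first, then tqdm, then a plain pass-through.
progress = first_working([_napari_progress, _tqdm_progress, _plain_iterator])
```

With this ordering, a headless container without OpenGL would end up on tqdm (or the plain iterator) instead of crashing at import time.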

Thanks again!

@anwai98
Contributor

anwai98 commented Oct 21, 2024

Hi @kbestak,

Thanks for following up on this.

Can you uninstall napari from your hosted container and try again? (napari isn't a necessary dependency for using the micro_sam library stand-alone).

Let us know if it works!

@kbestak
Author

kbestak commented Oct 21, 2024

Hi @anwai98, thanks for the tip!

excluding napari, magicgui and pyqt from the environment list, cloning the repo into the Docker image and adding pip install -e . fixed all issues and the segmentation works on a small test example!
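
For anyone repeating this, the filtering step can be sketched in a few lines of Python. This is only an illustration under assumptions: the dependency list below is hypothetical and is not the content of the real micro_sam environment file, and the helper names are made up for this example.

```python
import re

# Packages excluded in the comment above.
GUI_PACKAGES = {"napari", "magicgui", "pyqt"}


def package_name(spec: str) -> str:
    """Extract the bare package name from a conda spec such as 'napari>=0.5'."""
    return re.split(r"[\s<>=!~]", spec.strip(), maxsplit=1)[0].lower()


def drop_gui_dependencies(dependencies, excluded=GUI_PACKAGES):
    """Return the dependencies whose package name is not in `excluded`.

    Non-string entries (e.g. a nested pip section) are kept unchanged.
    """
    return [
        spec
        for spec in dependencies
        if not isinstance(spec, str) or package_name(spec) not in excluded
    ]


# Hypothetical dependency list, for illustration only.
example = ["python>=3.10", "napari>=0.5.0", "magicgui", "pyqt", "zarr<3", "tqdm"]
print(drop_gui_dependencies(example))
```

The filtered list can then be written back to the environment file that the Docker build uses.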

Next week, I'll be working on the nf-core module during an nf-core hackathon prior to the Nextflow Summit.

I have three additional questions:

  1. By any chance, is there a parameter I could set to exclude artefacts around the nuclei in the example below? (I only specified input and output paths)
[image]

  2. Is there a way to specify a cache directory for the models so that they don't get downloaded for each instance?

  3. Is there a recommended usage of the tile_shape and halo arguments? When attempting to use them, I run into the following issue:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/zarr/hierarchy.py", line 538, in __getattr__
    return self.__getitem__(item)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/zarr/hierarchy.py", line 511, in __getitem__
    raise KeyError(item)
KeyError: 'ndim'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/bin/micro_sam.automatic_segmentation", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/micro_sam/micro_sam/automatic_segmentation.py", line 235, in main
    automatic_instance_segmentation(
  File "/micro_sam/micro_sam/automatic_segmentation.py", line 123, in automatic_instance_segmentation
    segmenter.initialize(image=image_data, image_embeddings=image_embeddings)
  File "/opt/conda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/micro_sam/micro_sam/instance_segmentation.py", line 486, in initialize
    util.set_precomputed(self._predictor, image_embeddings, i=i)
  File "/micro_sam/micro_sam/util.py", line 899, in set_precomputed
    assert features.ndim in (4, 5), f"{features.ndim}"
           ^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/zarr/hierarchy.py", line 540, in __getattr__
    raise AttributeError from e
AttributeError
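
The traceback shows the attribute lookup for `ndim` going through zarr's `__getattr__`, which forwards to `__getitem__` and turns the resulting KeyError into a bare AttributeError. That suggests `features` was a container-like object rather than an array at that point. The sketch below mimics that behaviour with a stand-in class; `GroupLike`, `describe` and the key name are hypothetical and this is not zarr's or micro_sam's code.

```python
class GroupLike:
    """Stand-in for a container that forwards attribute access to item access."""

    def __init__(self, items):
        self._items = dict(items)

    def __getitem__(self, key):
        if key not in self._items:
            raise KeyError(key)
        return self._items[key]

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__getitem__(name)
        except KeyError as error:
            # Mirrors the pattern in the traceback: an AttributeError with
            # no message, chained to the KeyError.
            raise AttributeError from error


def describe(features):
    """Report whether `features` looks like an array (has `ndim`) or not."""
    ndim = getattr(features, "ndim", None)
    if ndim is None:
        return f"{type(features).__name__} has no 'ndim'; expected an array"
    return f"array-like with ndim={ndim}"


group = GroupLike({"features": [0.0, 1.0]})  # hypothetical key name
print(describe(group))
```

A check like `describe` gives a readable message where the bare `assert features.ndim in (4, 5)` only surfaces an empty AttributeError.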

Thank you so much for all the help!

@anwai98
Contributor

anwai98 commented Oct 21, 2024

Hi @kbestak,

excluding napari, magicgui and pyqt from the environment list, cloning the repo into the Docker image and adding pip install -e . fixed all issues and the segmentation works on a small test example!

Nice! (in case there are any specific requirements to build the image, let us know)

Next week, I'll be working on the nf-core module during an nf-core hackathon prior to the Nextflow Summit.

That sounds great, thanks!

  1. By any chance, is there a parameter I could set to exclude artefacts around the nuclei in the example below? (I only specified input and output paths)

Could you tell me which model you are using (i.e. the argument you pass to -m / --model_type, or the default value)? That will help me understand which automatic segmentation method is running and think about a plausible solution.

  2. Is there a way to specify a cache directory for the models so that they don't get downloaded for each instance?

By default, micro-sam caches the downloaded models in one of these system-dependent directories. Could it be that each run of the CLI from the image caches the model in some temporary storage location? (You can verify this by checking the filepath for checkpoint_path.) In any case, this can be overridden from the automatic segmentation CLI by passing the filepath to a manually downloaded model checkpoint (using -c / --checkpoint).

  3. Is there a recommended usage of the tile_shape and halo arguments?

The recommendation for tile and halo shapes is to provide 2d tile and halo shapes as a sequence of per-axis spatial shape values, e.g. for a tile shape of (512, 512) with a halo shape of (128, 128), the arguments would be as follows: --tile_shape 512 512 --halo 128 128.
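
For use from a wrapper such as a pipeline module, the options mentioned above can be assembled programmatically. The following is a minimal sketch that only builds the argument list and uses only the flags named in this thread (--model_type, --checkpoint, --tile_shape, --halo). The helper `build_command` is hypothetical, the input and output arguments are passed through unchanged via `io_args` because their flag names are not given here, and the checkpoint path is a placeholder.

```python
from typing import List, Optional, Sequence


def build_command(
    io_args: Sequence[str],
    model_type: Optional[str] = None,
    checkpoint: Optional[str] = None,
    tile_shape: Optional[Sequence[int]] = None,
    halo: Optional[Sequence[int]] = None,
) -> List[str]:
    """Assemble an argument list for micro_sam.automatic_segmentation."""
    cmd = ["micro_sam.automatic_segmentation", *io_args]
    if model_type is not None:
        cmd += ["--model_type", model_type]
    if checkpoint is not None:
        # Pointing at a manually downloaded checkpoint avoids re-downloading.
        cmd += ["--checkpoint", checkpoint]
    for flag, shape in (("--tile_shape", tile_shape), ("--halo", halo)):
        if shape is None:
            continue
        if len(shape) != 2:
            raise ValueError(f"{flag} expects two values, one per spatial axis")
        cmd += [flag, *(str(int(value)) for value in shape)]
    return cmd


# Placeholder checkpoint path; io_args left empty for illustration.
print(build_command([], checkpoint="/models/checkpoint.pt",
                    tile_shape=(512, 512), halo=(128, 128)))
```

The resulting list can be handed to `subprocess.run` once the input and output arguments are filled in.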

@CaroAMN

CaroAMN commented Oct 28, 2024

Hi @anwai98,
I'm working on creating the nf-core module for micro-sam during the Nextflow hackathon.

Right now I am trying to use the --tile_shape and --halo options, and I run into the same issue as @kbestak.

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/zarr/hierarchy.py", line 538, in __getattr__
    return self.__getitem__(item)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/zarr/hierarchy.py", line 511, in __getitem__
    raise KeyError(item)
KeyError: 'ndim'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/bin/micro_sam.automatic_segmentation", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/micro_sam/micro_sam/automatic_segmentation.py", line 235, in main
    automatic_instance_segmentation(
  File "/micro_sam/micro_sam/automatic_segmentation.py", line 123, in automatic_instance_segmentation
    segmenter.initialize(image=image_data, image_embeddings=image_embeddings)
  File "/opt/conda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/micro_sam/micro_sam/instance_segmentation.py", line 486, in initialize
    util.set_precomputed(self._predictor, image_embeddings, i=i)
  File "/micro_sam/micro_sam/util.py", line 899, in set_precomputed
    assert features.ndim in (4, 5), f"{features.ndim}"
           ^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/zarr/hierarchy.py", line 540, in __getattr__
    raise AttributeError from e
AttributeError

I used the following values:
--tile_shape 512 512 --halo 128 128
--tile_shape 1024 1024 --halo 256 256

I tested it on a 2302 x 1800 image.
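
To get a feel for what these settings mean on an image of this size, here is a rough back-of-the-envelope sketch. It assumes a plain grid of tiles with the halo added on each side of a tile; this is an assumption for illustration and may differ from how micro_sam tiles internally. The function names are hypothetical.

```python
import math


def tile_grid(image_shape, tile_shape):
    """Number of tiles per axis needed to cover the image (ceil division)."""
    return tuple(math.ceil(size / tile) for size, tile in zip(image_shape, tile_shape))


def padded_tile_shape(tile_shape, halo):
    """Shape of one tile including the halo on both sides of each axis."""
    return tuple(tile + 2 * h for tile, h in zip(tile_shape, halo))


image_shape = (2302, 1800)
settings = [((512, 512), (128, 128)), ((1024, 1024), (256, 256))]
for tile_shape, halo in settings:
    grid = tile_grid(image_shape, tile_shape)
    print(tile_shape, halo, "grid:", grid, "tiles:", math.prod(grid),
          "padded tile:", padded_tile_shape(tile_shape, halo))
```

Under these assumptions, the first setting covers the image with a 5 x 4 grid of tiles and the second with a 3 x 2 grid.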

I appreciate any help on this, or a hint about what I'm missing here :) Thank you in advance!

@anwai98
Contributor

anwai98 commented Oct 28, 2024

Hi @CaroAMN,

Can you share with me the shape of your input arrays? (I see the spatial shapes you provided, is it with multiple channels, etc)

EDIT: Hmm I have another hypothesis. Let me check it out and come back to you!

@CaroAMN

CaroAMN commented Oct 28, 2024

Hi @CaroAMN,

Can you share with me the shape of your input arrays? (I see the spatial shapes you provided, is it with multiple channels, etc)

I tried it with one single channel .tif with this shape (2302, 1800)

EDIT: Hmm I have another hypothesis. Let me check it out and come back to you!

Thanks :) !

@anwai98
Contributor

anwai98 commented Oct 28, 2024

Hi @CaroAMN, can you send me the entire command which is triggered for running the automatic segmentation CLI? (I'll try to reproduce it)

EDIT: Ahha no worries. I managed to reproduce it 😉. I'll get back to you!

@anwai98
Contributor

anwai98 commented Oct 29, 2024

Hi @CaroAMN,
The issue you observed should be fixed now (thanks to you and @kbestak for spotting).
Let us know if you come across any other issues!

@constantinpape
Contributor

Hi @kbestak and @CaroAMN ,
is the nf-core module of micro_sam available now?

  • If yes: Could you provide a link to it, so that we can reference it in our documentation?
  • If no: What is the issue / how can we help?

@CaroAMN

CaroAMN commented Dec 23, 2024

Hi @kbestak and @CaroAMN , is the nf-core module of micro_sam available now?

Hi, not yet, unfortunately. I'm in the process of pushing the container to the Biocontainers repository. The only problem for me has been finding time to work on it, but that should get better from next week on. I will keep you updated :)

@constantinpape
Contributor

Thanks for the quick response @CaroAMN !

Maybe relevant for you: we updated the installation instructions and we only depend on conda-forge now, not on the pytorch channel; see https://computational-cell-analytics.github.io/micro-sam/micro_sam.html#from-conda for details.
This should simplify the overall installation.

Let us know if you run into any further issues where we could help.
