0.4.0

@github-actions github-actions released this 05 Dec 20:54
· 9 commits to refs/heads/master since this release
f89b4ef
Improve print statements for Polaris application @elaubsch (#82)

This PR makes the print statements for prediction progress more informative. It adds a progress bar for the spot detection prediction and removes a less useful progress bar for gene assignment.

Add skip round functionality to spot prediction in Polaris @elaubsch (#68)

This PR adds a check for rounds with no labeling in the defined codebook. If detected, these rounds are skipped during spot detection to prevent hallucinated spots.
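
A minimal sketch of how such a check could look, assuming the codebook maps each gene name to a flat list of 0/1 bits ordered round by round. The helper name `find_unlabeled_rounds` and the bit layout are assumptions made for illustration, not the Polaris implementation.

```python
def find_unlabeled_rounds(codebook, rounds, channels):
    """Return indices of rounds in which no gene has a 1 in any channel."""
    skipped = []
    for r in range(rounds):
        start, stop = r * channels, (r + 1) * channels
        # A round is unlabeled if every barcode is all zeros in its channel slice.
        if all(sum(bits[start:stop]) == 0 for bits in codebook.values()):
            skipped.append(r)
    return skipped


# Hypothetical codebook: 3 rounds x 2 channels, round 1 is never labeled.
codebook = {
    "GeneA": [1, 0, 0, 0, 0, 1],
    "GeneB": [0, 1, 0, 0, 1, 0],
}
skipped_rounds = find_unlabeled_rounds(codebook, rounds=3, channels=2)
print(skipped_rounds)  # [1]
```

Images from the returned rounds would then be left out of spot detection, so no spots can be predicted where the codebook expects none.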

Update copyright to 2023 @elaubsch (#48)

This PR updates the copyright to 2023.

Tighten CI configuration @rossbar (#46)

A couple minor tweaks to the CI configuration to (hopefully) prevent duplication and unnecessary runs on experimental branches. Specifically:

  • 0ce6c0c limits CI so that runs are only triggered when either 1) a PR to master is opened, or 2) there is a new push to a branch from which a PR already originates. This should prevent CI from running when commits are pushed up to non-master branches.
  • 48f4bc7 adds a check which will cancel any in-progress jobs when new changes are pushed up. This can help e.g. when you push two commits up in rapid succession. By default, GH will queue up the 2nd set of jobs and wait for the first set to finish. With this option, the first set of jobs will be cancelled and the 2nd job will start immediately.

The motivation for these changes is to reduce the CI load in private repositories.

Add back deleted function and bug fix @elaubsch (#35)

This PR adds back the ca_to_adjacency_matrix which was removed in a previous PR. It also fixes a bug in the graph visualization functions.

🚀 Features

Move masking function outside of Polaris application @elaubsch (#81)

This PR moves the abstracted _mask_spots to be outside of the application. The function is now available in results_utils.py. The unit tests and example notebook for Polaris have been updated.

Add authentication to SpotDetection @elaubsch (#75)

This PR adds authentication to SpotDetection by using fetch_data to download the spot detection model upon instantiation. fetch_data requires a DeepCell API key.

For this reason, a large number of the tests in polaris_tests.py have been temporarily removed, because models cannot be downloaded in the GitHub actions test environment without an API key.

Pixel-wise decoding for Polaris @elaubsch (#73)

This PR introduces an algorithmic change to Polaris that increases the number of pixels sent through the SpotDecoding application. Previously, we performed peak finding to determine which pixels were decoded. With this change, we threshold the spot probability image and decode all pixels above a certain spot probability. Then, we create a mask of all the pixels decoded to genes and apply this mask to the spot probability image. We then perform peak finding on this masked image to call the gene locations. This change is a compromise between Polaris' original method and the pixel-wise decoding methods that are common in MERFISH analysis pipelines. We found that this method increases the number of spots decoded to genes by Polaris, while yielding results with a better correlation to bulk sequencing data.
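
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Polaris code: nested lists stand in for image arrays, `decode_pixel` is a hypothetical stand-in for the SpotDecoding application, the comparison with the threshold is strict, and peaks are taken as maxima of a 3x3 neighborhood.

```python
def pixelwise_spots(prob, decode_pixel, threshold):
    """Threshold, decode every candidate pixel, mask, then find peaks."""
    rows, cols = len(prob), len(prob[0])
    # 1. Decode every pixel whose spot probability exceeds the threshold.
    genes = {}
    for y in range(rows):
        for x in range(cols):
            if prob[y][x] > threshold:
                gene = decode_pixel(y, x)
                if gene is not None:
                    genes[(y, x)] = gene
    # 2. Keep probabilities only at pixels that were decoded to a gene.
    masked = [[prob[y][x] if (y, x) in genes else 0.0 for x in range(cols)]
              for y in range(rows)]
    # 3. Peak finding: keep pixels that are the maximum of their 3x3 neighborhood.
    spots = []
    for (y, x), gene in genes.items():
        neighbors = [masked[j][i]
                     for j in range(max(0, y - 1), min(rows, y + 2))
                     for i in range(max(0, x - 1), min(cols, x + 2))]
        if masked[y][x] == max(neighbors):
            spots.append((y, x, gene))
    return spots


prob = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.6, 0.0],
    [0.1, 0.6, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.8],
]
# Hypothetical decoder: the blob in the top-left decodes to GeneA, the rest to nothing.
spots = pixelwise_spots(prob, lambda y, x: "GeneA" if y < 3 else None, threshold=0.5)
print(spots)  # [(1, 1, 'GeneA')]
```

Three pixels of the blob are decoded, but only its brightest pixel is called as a gene location. The isolated bright pixel at the bottom right passes the threshold, fails to decode, and is removed by the mask before peak finding.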

This change redefines the threshold parameter of the predict method. Instead of being used in peak finding, this parameter is now used to create a mask for tissue area. Therefore, the default value has been changed.

This PR also includes changes to the output of Polaris that are unrelated to the algorithmic change. Polaris previously returned df_spots and df_intensities, but now these two outputs have been concatenated column-wise to yield a single DataFrame.

Add function for barcode assignment to cells for optical pooled screens @elaubsch (#70)

This PR adds a function that processes Polaris predictions for barcode assignment to cells for optical pooled screens. Unit tests have been added for this function.

Add get_cell_counts to results utils @elaubsch (#67)

This PR adds the function get_cell_counts to results_utils. This utility function converts the Polaris output format to a gene expression per cell table for compatibility with downstream analysis packages like scanpy and Seurat. Unit tests for this function have been added.
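
As a rough picture of the conversion, the sketch below pivots per-spot records into a cell-by-gene count table. It is not the get_cell_counts implementation: the function name `count_genes_per_cell` and the record keys `cell` and `gene` are hypothetical, and plain dictionaries stand in for DataFrames.

```python
from collections import Counter


def count_genes_per_cell(spots):
    """Pivot per-spot records into a {cell: {gene: count}} table."""
    counts = Counter((s["cell"], s["gene"]) for s in spots)
    cells = sorted({cell for cell, _ in counts})
    genes = sorted({gene for _, gene in counts})
    # Every cell gets an entry for every gene, with 0 where nothing was detected.
    return {cell: {gene: counts[(cell, gene)] for gene in genes} for cell in cells}


spots = [
    {"cell": 1, "gene": "A"},
    {"cell": 1, "gene": "A"},
    {"cell": 1, "gene": "B"},
    {"cell": 2, "gene": "B"},
]
table = count_genes_per_cell(spots)
print(table)  # {1: {'A': 2, 'B': 1}, 2: {'A': 0, 'B': 1}}
```

A dense table of this shape, with cells as rows and genes as columns, is the kind of input that single-cell analysis packages generally expect.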

This PR also adds an example notebook to demonstrate how to use get_cell_counts to generate an input for scanpy.

Add gene scatter plot to results utils @elaubsch (#65)

This PR adds an additional function gene_scatter to results_utils. This function creates a scatter plot with Plotly to visualize the location of decoded genes. It also adds some additional arguments to expression_correlation to offer more control over the plot appearance and input.

Add arguments to expression_correlation @elaubsch (#64)

This PR adds arguments to expression_correlation:

  1. log: Boolean that determines whether to create the scatter plot in log space.
  2. exclude_genes: List of outlier genes excluded from the plot.
  3. exclude_zeros: Boolean that determines whether zero counts from control and experimental sets are excluded.
  4. eps: A small epsilon value added to the counts to avoid errors taking the log of zero counts.
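
To show how these four arguments could interact, here is a small sketch of the data preparation only, without plotting. The helper `prepare_counts`, its defaults, and the choice to drop a gene when either count is zero are assumptions, not the behavior of expression_correlation itself.

```python
import math


def prepare_counts(control, experiment, log=False, exclude_genes=(),
                   exclude_zeros=False, eps=1e-5):
    """Return {gene: (control_value, experiment_value)} ready for a scatter plot."""
    pairs = {}
    for gene in sorted(set(control) & set(experiment)):
        x, y = control[gene], experiment[gene]
        if gene in exclude_genes:
            continue  # outlier genes are dropped entirely
        if exclude_zeros and (x == 0 or y == 0):
            continue  # assumption: a zero in either set removes the gene
        if log:
            # eps keeps log10 defined when a count is zero.
            x, y = math.log10(x + eps), math.log10(y + eps)
        pairs[gene] = (x, y)
    return pairs


control = {"A": 100, "B": 0, "C": 10}
experiment = {"A": 1000, "B": 5, "C": 10}

with_zeros = prepare_counts(control, experiment, log=True,
                            exclude_genes=["C"], eps=1.0)
without_zeros = prepare_counts(control, experiment, exclude_zeros=True)
print(sorted(with_zeros))     # ['A', 'B']
print(sorted(without_zeros))  # ['A', 'C']
```

Without eps, taking the log of gene B's zero control count would raise a math domain error, which is the failure the eps argument guards against.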

This function yields a figure and is not tested, so no additional testing has been added to cover the code added for these arguments.

This PR also includes minor changes to docstrings in results_utils.py.

Add results_utils functions @elaubsch (#63)

This PR adds a few utility functions for processing and visualizing the results output by Polaris. Many of these functions require plotly, which has been added to the requirements file. Unit tests have been added for the functions added in this PR that don't output a figure. This PR also includes a minor change to a Polaris application docstring.

Add masking of bright background objects @elaubsch (#57)

This PR adds a function, _mask_spots, to the Polaris application, which creates a mask for bright fluorescent objects in the background image of a FISH sample. It then marks all detected spots inside this mask as 'masked', so they can be excluded from downstream analysis. This information is added to the decoding_result and appears as an additional column in the main Polaris output.
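
The idea can be sketched as below. This is a simplified illustration, not _mask_spots itself: the mask is assumed to be a plain intensity threshold on the background image, nested lists stand in for image arrays, and the function name `label_masked_spots` and the record keys are hypothetical.

```python
def label_masked_spots(spots, background, mask_threshold):
    """Flag each (y, x) spot that falls on a bright background object."""
    labeled = []
    for y, x in spots:
        # Assumption: "bright" means the background intensity exceeds the threshold.
        masked = background[y][x] > mask_threshold
        labeled.append({"y": y, "x": x, "masked": masked})
    return labeled


background = [
    [900, 100, 100],
    [100, 100, 100],
    [100, 100, 120],
]
labeled = label_masked_spots([(0, 0), (2, 2)], background, mask_threshold=500)
print([spot["masked"] for spot in labeled])  # [True, False]
```

Flagging the spots rather than deleting them keeps the full set of detections in the output, and the decision to exclude them is left to downstream analysis.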

Because of the increasing complexity of the Polaris prediction inputs, a function, _validate_prediction_input, was added to Polaris. This method checks the shapes of the inputs spots_image, segmentation_image, and background_image. It also checks the values of threshold and mask_threshold.

Test cases have been added to cover these two new functions.

Add mixed barcode rescue to SpotDecoding @elaubsch (#56)

This PR adds a function _rescue_mixed_spots to the SpotDecoding application. This function addresses the case of mixed barcodes caused by spatial crowding of spots. An argument rescue_mixed has been added to the predict method of SpotDecoding to toggle this function. A test case has been added to cover this function. Print statements have been added to make the prediction more verbose, so that the amount of error correction is more obvious.

The function _rescue_spots has been refactored to _rescue_errors, because there are now two methods for rescuing spots. The exposed argument rescue_spots has also been changed to rescue_errors.

Regardless of error correction, two items have been added to the dictionary returned by SpotDetection.predict.

  1. spot_index indexes the spots, because _rescue_mixed_spots introduces the case where two gene assignments can be made for the same spot. In that case, a new entry is added to the output with the same index as the original spot.

  2. source details the origin of a prediction. Its values can be:

    • 'prediction' from SpotDetection.predict
    • 'error rescue' from rescue_errors
    • 'mixed rescue' from _rescue_mixed_spots

Add Bernoulli to decoding distributions @elaubsch (#54)

This PR adds Bernoulli as an option for decoding distributions. Bernoulli has the same options for numbers of parameters as Relaxed Bernoulli, so the arguments for defining the model distribution have changed. There is a new argument distribution which has valid values ['Gaussian', 'Bernoulli', 'Relaxed Bernoulli']. This is a departure from the previous logic where Relaxed Bernoulli was implied unless params_mode was set to 'Gaussian'. The argument params_mode has the same valid values, except 'Gaussian'.

This PR also adds a _validate_spots_intensities function which will verify the spots_intensities_vec input into the SpotDecoding application, because the different distribution options have different requirements for input values. This function aims to return an error message that will be more interpretable to the user than the PyTorch message. Tests have been added for invalid examples of spots_intensities_vec. Logic has been added to the Polaris application to input the correctly pre-processed spot intensities into the SpotDecoding application. The singleplex version of the app now returns the original pixel values at the spot locations, which resolves #16.

Rescue spots during spot decoding @elaubsch (#52)

This PR adds a function to the SpotDecoding application that rescues the spots whose probability values have a Hamming distance of 1 from a barcode in the codebook. A parameter, rescue_spots, has been added to the predict method of the SpotDecoding application, which determines if the rescuing function is applied to the predictions.
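
A minimal sketch of Hamming-distance rescue follows. The details are assumptions rather than the SpotDecoding implementation: probabilities are binarized at 0.5, a spot is rescued only when exactly one barcode is at distance 1, and the names `hamming` and `rescue_spot` are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def rescue_spot(probs, codebook, cutoff=0.5):
    """Return the gene whose barcode is one bit away from the spot, else None."""
    bits = [int(p > cutoff) for p in probs]
    matches = [gene for gene, code in codebook.items() if hamming(bits, code) == 1]
    # Assumption: ambiguous spots (several barcodes at distance 1) stay unassigned.
    return matches[0] if len(matches) == 1 else None


codebook = {"GeneA": [1, 0, 1, 0], "GeneB": [0, 1, 0, 1]}
rescued = rescue_spot([0.9, 0.1, 0.2, 0.1], codebook)
unrescued = rescue_spot([0.9, 0.9, 0.1, 0.1], codebook)
print(rescued, unrescued)  # GeneA None
```

The first spot is missing a single bit of GeneA's barcode and is rescued. The second is two bits away from both barcodes and is left alone.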

Validate codebook in SpotDecoding @elaubsch (#53)

This PR adds a function to the SpotDecoding application that checks the format of the codebook (df_barcodes) during instantiation of the application. This function requires the following criteria:

  1. The codebook is a Pandas DataFrame
  2. The first column contains the gene names and has the column name 'Gene'
  3. The length of the barcodes is equal to the product of the arguments rounds and channels
  4. The barcodes only contain 0s and 1s
  5. The codebook does not already contain 'Background' and 'Unknown' entries, because they are added automatically
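
The criteria above could be checked along these lines. To keep the sketch free of dependencies, a header list plus a list of rows stands in for the Pandas DataFrame, so criterion 1 becomes a type check on that stand-in. The function name `validate_codebook` and the error messages are hypothetical.

```python
def validate_codebook(columns, rows, rounds, channels):
    """Raise an error if the stand-in codebook violates any of the five criteria."""
    if not isinstance(columns, list) or not isinstance(rows, list):
        raise TypeError("codebook must be a header list and a list of rows")
    if not columns or columns[0] != "Gene":
        raise ValueError("the first column must be named 'Gene'")
    n_bits = rounds * channels
    if len(columns) - 1 != n_bits:
        raise ValueError("barcode length must equal rounds * channels")
    for gene, *bits in rows:
        if gene in ("Background", "Unknown"):
            raise ValueError(f"'{gene}' is added automatically; remove it from the codebook")
        if len(bits) != n_bits or any(b not in (0, 1) for b in bits):
            raise ValueError(f"barcode for {gene} must be {n_bits} values of 0 or 1")


columns = ["Gene", "r0c0", "r0c1", "r1c0", "r1c1"]
rows = [["GeneA", 1, 0, 0, 1], ["GeneB", 0, 1, 1, 0]]
validate_codebook(columns, rows, rounds=2, channels=2)  # passes silently

try:
    validate_codebook(columns, [["Unknown", 1, 0, 0, 1]], rounds=2, channels=2)
except ValueError as err:
    message = str(err)
    print(message)
```

Running the checks at instantiation means a malformed codebook fails early with a readable message, not later during decoding.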

Unit tests have been added to cover these criteria. Before this PR, the SpotDecoding application expected a column 'code_name', containing the gene names. This name has been refactored to 'Gene' which is more accurate/specific.

Add mixture of Gaussians to decoding distributions @elaubsch (#49)

This PR adds mixture of Gaussians as an option in params_mode. These changes include:

  • Addition of decoding functions adapted from PoSTcode (https://github.com/gerstung-lab/postcode).
  • Exposing params_mode as an argument for the Polaris application
  • Addition of logic to use raw pixel values when params_mode=='Gaussian' and probability values otherwise
  • More extensive testing of applications and decoding functions

Add decoding functionality to Polaris @xuefei-wang (#36)

Add decoding part to the repo. Spot decoding has its own application, and is also wrapped into Polaris.
Tests are also provided.

🐛 Bug Fixes

Fix bug for multi-batch predictions with pixel-wise decoding @elaubsch (#74)

This PR addresses a bug that affected multi-batch predictions for Polaris' new pixel-wise decoding method. Now, Polaris iterates through the batch index to find the local maxima in the masked spot probability image.

Remove one data validation check from Polaris @elaubsch (#66)

This PR removes a data validation check that contains an error from Polaris. The check enforces that the segmentation image has one channel when Mesmer predictions usually have two.

Bug fix for output_to_df for mixed rescue @elaubsch (#62)

This PR addresses a bug in the input for output_to_df caused by the additional spots added during mixed rescue.

Update application docstrings @elaubsch (#59)

This PR updates the docstrings for the SpotDetection, SpotDecoding, and Polaris applications. It also contains a bug fix for _validate_spots_intensities.

Fix bug in Polaris Bernoulli predictions @elaubsch (#55)

This PR fixes a bug in the Polaris predict function that arises when decoding with a Bernoulli distribution. It adds a test case to catch this condition.

This PR also applies some of the suggested set syntax changes from #54.

Fix bug in multi-batch E-step for spot decoding @elaubsch (#51)

This PR contains a few changes in the Relaxed Bernoulli and Gaussian E-step functions for multi-batch predictions. The primary bug was in the RB E-step function, which didn't correctly index the data before prediction when the number of spots exceeded the batch size. The other key bug arose when the number of spots was exactly divisible by the batch size. Other small changes include renaming variables for clarity. Tests have been added for multi-batch predictions and for the case where the number of spots is divisible by the batch size.
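
The original E-step code is not reproduced here, so the following is only a generic sketch of batch indexing that handles both edge cases mentioned above. The helper name `iter_batches` is hypothetical, and the comment about an off-by-one batch count describes one common way such a bug can arise, not necessarily the one that was fixed.

```python
import math


def iter_batches(n_items, batch_size):
    """Yield (start, stop) index pairs covering range(n_items) with no empty batch."""
    # Ceiling division gives the right count even when n_items is an exact
    # multiple of batch_size; n_items // batch_size + 1 would add an empty batch.
    n_batches = math.ceil(n_items / batch_size)
    for b in range(n_batches):
        start = b * batch_size
        yield start, min(start + batch_size, n_items)


print(list(iter_batches(10, 4)))  # [(0, 4), (4, 8), (8, 10)]
print(list(iter_batches(8, 4)))   # [(0, 4), (4, 8)]
```

Each batch would then be sliced as `data[start:stop]` before prediction, so every spot is processed exactly once.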

Allow blank images in training data set @elaubsch (#42)

This PR addresses a known bug in subpixel_distance_transform that prevented images without spots from being included in the training data set for the spot detection model. subpixel_distance_transform now checks the length of input point_list and if its length is zero, it returns a null result. This PR also adds a test for the no spots case for subpixel_distance_transform.

Add command to install deepcell spots in docker container @msschwartz21 (#41)

Currently deepcell-spots does not get installed in the docker container unless you mount the spots folder into the correct folder in the running container. This PR adds the installation to the Dockerfile so that the package is always available.

Update pytest and rm pytest-pep8 dependency. @rossbar (#40)

pytest-pep8 is no longer supported, and pytest<6 is likely the source of the issues with coveralls.

I will re-add some form of linting to CI to replace pytest-pep8 in the near future (this applies to all the deepcell-* libraries). See also vanvalenlab/deepcell-toolbox#128.

🧰 Maintenance

Update DEEPCELL_VERSION in README @elaubsch (#77)

This PR updates the specified DeepCell version in the Docker build command in the README.

Update example data download in nbs @elaubsch (#76)

This PR updates the data download to use SpotNetExampleData and SpotNet in the example notebooks for this repo.

Update example notebooks @elaubsch (#72)

This PR updates the example notebooks, including a notebook demonstrating training a spot detection model, using the applications, and exporting the results from the Polaris application.

Add tests for results visualization functions @elaubsch (#71)

This PR adds tests for results visualization functions. It also adds the plotly requirement to setup.py and statsmodels to requirements.txt and setup.py. It also removes hamming_dist_hist because the input df_spots requires manipulation beyond the output of Polaris.

Update decoding probability threshold @elaubsch (#69)

This PR updates the threshold probability for barcode assignment during decoding. The new value (0.95) has been determined to yield more consistent/accurate results than the previous value (0.5) in experiments across multiple datasets.
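
The effect of the stricter threshold can be illustrated with a short sketch. It assumes that a spot whose most probable class falls below the threshold is labeled 'Unknown' and that the comparison is inclusive. The function name `assign_barcode` is hypothetical, and the argument name is borrowed from the pred_prob_thresh rename listed elsewhere in these notes.

```python
def assign_barcode(class_probs, pred_prob_thresh=0.95):
    """Return the most probable class, or 'Unknown' if it is not confident enough."""
    gene, prob = max(class_probs.items(), key=lambda item: item[1])
    return gene if prob >= pred_prob_thresh else "Unknown"


probs = {"GeneA": 0.7, "GeneB": 0.2, "Background": 0.1}
old_call = assign_barcode(probs, pred_prob_thresh=0.5)
new_call = assign_barcode(probs)
print(old_call, new_call)  # GeneA Unknown
```

A moderately confident assignment that the old value would accept is now withheld.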

Replace thres_prob with pred_prob_thresh @elaubsch (#61)

This PR replaces thres_prob with pred_prob_thresh for clarity. This variable name is more distinguishable from other thresholds in this code base.

Create utils module in deepcell_spots @elaubsch (#60)

This PR reorganizes the utility functions in deepcell_spots. It creates a module, deepcell_spots.utils, to which data_utils, preprocessing_utils, postprocessing_utils, and utils (refactored to augmentation_utils) have been added.

Update application docstrings @elaubsch (#59)

This PR updates the docstrings for the SpotDetection, SpotDecoding, and Polaris applications. It also contains a bug fix for _validate_spots_intensities.

Update README and example notebooks @elaubsch (#50)

This PR updates the README and example notebooks to reflect recent changes to the SpotDecoding and Polaris applications.

Standardize application classes @elaubsch (#58)

This PR makes changes to the application classes to bring them closer to the pattern established in deepcell.applications. This update involves two major changes:

  1. It moves the functionality of the predict method of Polaris to a _predict method. Therefore, predict is now a wrapper for _predict.
  2. It removes Application from deepcell_applications as the base class for SpotDecoding, because this application does not need any of the same methods as a normal segmentation application.

Bump default action versions. @rossbar (#47)

Bump action versions to keep CI current, c.f. vanvalenlab/deepcell-tf#653

Maintenance to clip and threshold application arguments @elaubsch (#44)

This PR addresses a naming discrepancy between the SpotDetection and Polaris applications for the clip and threshold parameters. The default value for clip in SpotDetection and Polaris has now been set to True, because this setting gives better spot detection results on a wider range of images.

It also removes a line passing a threshold argument into the SpotDetection application inside the Polaris application. There are two reasons for this: (1) the SpotDetection application in Polaris is instantiated with postprocessing_fn=None, so the threshold argument would not be used, and (2) the default value for threshold in the SpotDetection application prevents an error from being raised about its value.

Update docstring for decoding functions @elaubsch (#45)

This PR addresses some discrepancies between the default values for arguments and the default values stated in the docstrings in decoding_functions.py.

Update pinned deepcell version to 0.12.4 in Dockerfile @elaubsch (#43)

This PR updates the pinned deepcell version in the Dockerfile. This version determines the base image used to build the deepcell-spots image.

Maintenance of DotNetLosses scripts @elaubsch (#37)

This PR removes unused variables in DotNetLosses and makes the style more consistent throughout the script.

Update Python version in README @elaubsch (#34)

This PR updates the Python version in the Docker run command in the README.

📚️ Documentation

Update DEEPCELL_VERSION in README @elaubsch (#77)

This PR updates the specified DeepCell version in the Docker build command in the README.