lordjoe/spark-ms-clustering

spark-ms-clustering

Introduction

The spark-ms-clustering application is the Spark implementation of the newly developed mass spectra clustering algorithm. It is used to cluster massive amounts of mass spectrometry data held in public resources such as the PRIDE repository for MS/MS-based proteomics data.

The spark-ms-clustering application relies on the spectra-cluster clustering API. All implementations of relevant clustering algorithms can be found there.

The following descriptions are based only on the clustering pipeline used to create the PRIDE Cluster resource.

Getting help

If you have questions or need additional help, create an issue: https://github.com/bigbio/spark-ms-clustering/issues

Giving your feedback

Please give us your feedback, including error reports, suggestions for improvements, and new feature requests. You can do so by opening a new issue in our issues section.

How to cite

Please cite this library using one of the following publications:

  • Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets. Griss J, Perez-Riverol Y, Lewis S, Tabb DL, Dianes JA, Del-Toro N, Rurik M, Walzer MW, Kohlbacher O, Hermjakob H, Wang R, Vizcaíno JA. Nat Methods. 2016 Aug;13(8):651-656. Epub 2016 Jun 27.

Contribute

We welcome all contributions submitted as pull requests.

License

This project is available under the Apache 2 open source software (OSS) license.
