Tracing the dependencies of open source software mentioned in the biomedical literature

borisveytsman/SoftwareImpactHackathon2023_Tracing_dependencies

Exploring the dependencies of the CZI mentions dataset

[Figure: zoomed_out]

Exploring the Dependencies of Mentioned Software in the CZI Software Mentions Dataset

[Figure: scatter plot of number of mentions (x-axis) against Katz score (y-axis), colored by ecosystem]

We construct a graph of dependencies between software packages mentioned in the CZI Software Mentions Dataset. We then use the Katz centrality score to rank the importance of each software package. The data is available as Brown, E. M. (2023). A Dependency Graph for 460,000 Papers and Their Software Mentions from the CZI Software Mentions Dataset (1.0.0) [Data set]. CZI Research Software Hackathon. Zenodo. https://doi.org/10.5281/zenodo.10048132.
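As a rough illustration of the ranking step, the sketch below computes Katz centrality on a tiny, made-up dependency graph with plain power iteration. It is not the code used to produce the Zenodo data set. The edge direction (dependent package pointing to its dependency), the `alpha` and `beta` values, and every package name in `toy_edges` except `six` are assumptions chosen for the example.

```python
from collections import defaultdict


def katz_centrality(edges, alpha=0.1, beta=1.0, max_iter=1000, tol=1e-9):
    """Katz centrality via power iteration.

    edges: (dependent, dependency) pairs. This direction is an assumption:
    a package gains score from the packages that depend on it.
    """
    edges = list(edges)
    nodes = sorted({name for edge in edges for name in edge})

    # For every package, collect the packages that depend on it.
    dependents = defaultdict(list)
    for dependent, dependency in edges:
        dependents[dependency].append(dependent)

    score = {name: 0.0 for name in nodes}
    for _ in range(max_iter):
        # x_i = alpha * sum(x_j for each j that depends on i) + beta
        new_score = {
            name: alpha * sum(score[d] for d in dependents[name]) + beta
            for name in nodes
        }
        delta = sum(abs(new_score[name] - score[name]) for name in nodes)
        score = new_score
        if delta < tol:
            break
    return score


# Hypothetical toy graph; only the shape matters, not the names.
toy_edges = [
    ("app_a", "lib_x"),
    ("app_b", "lib_x"),
    ("lib_x", "six"),
    ("app_c", "six"),
]

scores = katz_centrality(toy_edges)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph `six` ranks first even though only two packages depend on it directly, because one of them (`lib_x`) is itself depended upon. That transitive effect is what lets a package with few or no mentions still score highly.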

We find some interesting examples of "most important" software, with the caveat that some packages are labelled with the wrong ecosystem:

  • PACE is the most mentioned software but may not be the most "critical" / connected
  • VELVET is seemingly the "most critical" / connected software that has very few mentions
  • PERMANOVA is seemingly a "true" hit (mentioned and identified in the correct ecosystem): it is incredibly important and also has a number of mentions.

Software That is Important but has No Mentions in the Literature

None of the following packages are ever mentioned in the literature.

PyPI - six: Python 2 and 3 compatibility utilities.

[Figure: PyPI six is the most important]

Bioconductor - BiocIO: a package for basic file handling and some formats

[Figure: BiocIO is the most important]

CRAN - isoband: An R package to generate contour lines and polygons.

[Figure: isoband is the most important]
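One way to surface packages like these is to keep only the packages with zero mentions and take the highest-scoring one per ecosystem. The sketch below assumes each package is a record with `name`, `ecosystem`, `katz` and `mentions` fields. Those field names, the placeholder package names, and all the numbers are made up for illustration and are not taken from the data set.

```python
def top_unmentioned(packages):
    """Return the highest-Katz package with zero mentions in each ecosystem."""
    best = {}
    for pkg in packages:
        if pkg["mentions"] > 0:
            continue  # only packages never mentioned in the literature
        current = best.get(pkg["ecosystem"])
        if current is None or pkg["katz"] > current["katz"]:
            best[pkg["ecosystem"]] = pkg
    return {ecosystem: pkg["name"] for ecosystem, pkg in best.items()}


# Hypothetical records; names and values are placeholders.
toy_packages = [
    {"name": "pkg_a", "ecosystem": "PyPI", "katz": 0.95, "mentions": 12},
    {"name": "pkg_b", "ecosystem": "PyPI", "katz": 0.90, "mentions": 0},
    {"name": "pkg_c", "ecosystem": "PyPI", "katz": 0.40, "mentions": 0},
    {"name": "pkg_d", "ecosystem": "CRAN", "katz": 0.70, "mentions": 0},
    {"name": "pkg_e", "ecosystem": "CRAN", "katz": 0.80, "mentions": 3},
]

unmentioned_leaders = top_unmentioned(toy_packages)
print(unmentioned_leaders)
```

Here `pkg_a` and `pkg_e` have the highest scores in their ecosystems but are skipped because they are mentioned at least once.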

Exploring the Dependencies of Imported Software within Notebooks from the Combined CZI Software Mentions Dataset and

About this project

This repository was developed as part of the Mapping the Impact of Research Software in Science hackathon hosted by the Chan Zuckerberg Initiative (CZI). By participating in this hackathon, owners of this repository acknowledge the following:

  1. The code for this project is hosted by the project contributors in a repository created from a template generated by CZI. The purpose of this template is to help ensure that repositories adhere to the hackathon’s project naming conventions and licensing recommendations. CZI does not claim any ownership or intellectual property on the outputs of the hackathon. This repository allows the contributing teams to maintain ownership of code after the project, and indicates that the code produced is not a CZI product, and CZI does not assume responsibility for assuring the legality, usability, safety, or security of the code produced.
  2. This project is published under an MIT license.

Code of Conduct

Contributions to this project are subject to CZI’s Contributor Covenant code of conduct. By participating, contributors are expected to uphold this code of conduct.

Reporting Security Issues

If you believe you have found a security issue, please disclose it responsibly by contacting the repository owner via the ‘security’ tab above.

Licenses

Licenses are annotated according to the REUSE Specification v3.0. Please see the single files or respective .license files for the actual licenses.

Generally,

  • code is licensed under the MIT license
  • documents are licensed under CC-BY-4.0
  • some data files and other files are licensed under CC0-1.0

Cite this project

To cite this project, please use the metadata in CITATION.cff. You can also copy and paste an APA-formatted string, or a BibTeX entry directly from the "Cite this repository" widget on GitHub.
