Multimodal Datasets

mudatasets provides public multimodal datasets, with a primary focus on multimodal omics data.

MuData library | MuData documentation

Installation

# Stable, with muon
pip install "mudatasets[muon]"
# Dev
pip install git+https://github.com/gtca/mudatasets

Getting started

import mudatasets as mds

Find available datasets

mds.list_datasets()

Load a dataset

mdata = mds.load("pbmc3k_multiome")
print(mdata)

Some common arguments for .load() are:

  • data_dir= sets the directory where the dataset is saved (~/mudatasets/ by default)
  • with_info=True also returns the dataset description as a dictionary, as a second return value (False by default)
  • backed=True reads the data in backed mode; this applies only to .h5mu and .h5ad files (True by default)
  • files= downloads only the specified files from the dataset
  • full=True downloads all the files defined for the dataset (False by default)
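As a minimal sketch of how these arguments combine, the helper below requests the dataset description together with the data and reads the data into memory rather than in backed mode. The helper name `load_with_description` and its `load` parameter are hypothetical and only there for illustration; it assumes that `data_dir=` accepts a path string and that the description comes back as the second element of the returned pair.

```python
def load_with_description(name, data_dir=None, load=None):
    """Return (mdata, info) for a dataset, reading the data into memory.

    Hypothetical helper, not part of mudatasets.
    """
    if load is None:
        # Imported lazily so the helper can be defined without mudatasets installed.
        from mudatasets import load

    # with_info=True asks for the description as a second return value;
    # backed=False overrides the default backed reading mode.
    kwargs = {"with_info": True, "backed": False}
    if data_dir is not None:
        # Assumption: data_dir accepts a plain path string.
        kwargs["data_dir"] = data_dir
    return load(name, **kwargs)
```

Calling `mdata, info = load_with_description("pbmc3k_multiome")` would then download the dataset to the default location on first use.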

Get dataset info

mds.info("pbmc3k_multiome")

List dataset file names

mds.list_files("pbmc3k_multiome")
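The file names can be combined with the files= argument of .load() to fetch only part of a dataset. The sketch below is a hedged illustration: `load_matching_files` and its `mds` parameter are hypothetical, and it assumes that `list_files()` returns an iterable of file-name strings and that `files=` accepts a list of such names.

```python
def load_matching_files(name, suffix=".h5mu", mds=None):
    """Load only the dataset files whose names end with `suffix`.

    Hypothetical helper, not part of mudatasets.
    """
    if mds is None:
        # Imported lazily so the helper can be defined without mudatasets installed.
        import mudatasets as mds

    # Assumption: list_files() returns an iterable of file-name strings.
    wanted = [f for f in mds.list_files(name) if str(f).endswith(suffix)]
    if not wanted:
        raise ValueError(f"no files ending with {suffix!r} in dataset {name!r}")

    # Assumption: files= accepts a list of file names.
    return mds.load(name, files=wanted)
```

For example, `load_matching_files("pbmc3k_multiome", suffix=".h5mu")` would skip any other files defined for the dataset.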

Webpage with all the files

mds.serve_webpage(port=8000)

This command launches a local server that serves a simple, temporarily created HTML page at http://localhost:8000 listing the files across all of the datasets.