dcd-map: Map MaveDB data to computable and interoperable variant objects

This library implements a novel method for mapping MaveDB scoreset data to GA4GH Variation Representation Specification (VRS 2.0) objects, enhancing interoperability for genomic medicine applications. See Arbesfeld et al. (2023) for a preprint edition of the mapping manuscript, or download the resulting mappings directly.

Prerequisites

Universal Transcript Archive (UTA): see README for setup instructions. Users with access to Docker on their local devices can use the available Docker image; otherwise, start a relatively recent (version 14+) PostgreSQL instance and add data from the available database dump.
SeqRepo: see README for setup instructions. The SeqRepo data directory must be writeable; see the specific instructions for details.
Gene Normalizer: see documentation for data setup instructions.
blat: must be available on the local PATH and executable by the user. Alternatively, set its location manually with the BLAT_BIN_PATH environment variable. See the UCSC Genome Browser FAQ for download instructions.
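
As a quick pre-flight check for the blat requirement, the lookup described above can be sketched in Python. This is a minimal illustration, not part of dcd-mapping: `find_blat` is a hypothetical helper, and the order in which it checks PATH and BLAT_BIN_PATH is an assumption based on the wording above rather than on the tool's actual behavior.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def find_blat(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the path to an executable blat binary, or None if none is found.

    Hypothetical helper (not a dcd-mapping API). Assumed lookup order:
    first the PATH, then the BLAT_BIN_PATH environment variable.
    """
    env = os.environ if env is None else env

    # Look for an executable named "blat" on the PATH.
    on_path = shutil.which("blat", path=env.get("PATH"))
    if on_path:
        return Path(on_path)

    # Fall back to the manually configured location, if any.
    override = env.get("BLAT_BIN_PATH")
    if override:
        candidate = Path(override).expanduser()
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return candidate

    return None


if __name__ == "__main__":
    blat = find_blat()
    if blat is None:
        print("blat not found: add it to PATH or set BLAT_BIN_PATH")
    else:
        print(f"Using blat at {blat}")
```

Running this before the first mapping job gives an early, readable failure instead of an error partway through alignment.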

Installation

Install from PyPI:

python3 -m pip install dcd-mapping

Usage

Use the dcd-map command with a scoreset URN, e.g.:

$ dcd-map urn:mavedb:00000083-c-1

Output is saved in the format <URN>_mapping_results_<ISO datetime>.json in the directory specified by the environment variable MAVEDB_STORAGE_DIR, or ~/.local/share/dcd-mapping by default.
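
Because each run writes a new timestamped file, it can be handy to pick up the most recent result for a URN programmatically. The sketch below assumes the filename pattern is exactly as described above, with the URN appearing literally in the filename. The helpers `storage_dir`, `latest_mapping_file`, and `load_latest_mapping` are hypothetical and not part of dcd-mapping. Files are ordered by modification time rather than by parsing the timestamp, since the exact datetime format is not specified here.

```python
import json
import os
from glob import escape
from pathlib import Path
from typing import Any, Optional


def storage_dir() -> Path:
    """Resolve the output directory: MAVEDB_STORAGE_DIR, else the default."""
    configured = os.environ.get("MAVEDB_STORAGE_DIR")
    if configured:
        return Path(configured).expanduser()
    return Path.home() / ".local" / "share" / "dcd-mapping"


def latest_mapping_file(urn: str, directory: Optional[Path] = None) -> Optional[Path]:
    """Return the most recently modified mapping result for a URN, if any."""
    directory = storage_dir() if directory is None else Path(directory)
    if not directory.is_dir():
        return None
    # Assumed pattern: <URN>_mapping_results_<ISO datetime>.json
    pattern = f"{escape(urn)}_mapping_results_*.json"
    matches = sorted(directory.glob(pattern), key=lambda p: p.stat().st_mtime)
    return matches[-1] if matches else None


def load_latest_mapping(urn: str, directory: Optional[Path] = None) -> Optional[Any]:
    """Parse the newest mapping result as JSON, or return None if absent."""
    path = latest_mapping_file(urn, directory)
    if path is None:
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    result = load_latest_mapping("urn:mavedb:00000083-c-1")
    print("no results found" if result is None else type(result).__name__)
```

The structure of the loaded JSON is not assumed here; consult schema.json in the repository for the layout of the mapping results.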

Use dcd-map --help to see other available options.

Notebooks

Notebooks for manuscript data analysis and figure generation are provided within notebooks/analysis. See notebooks/analysis/README.md for more information.

Development

Clone the repo

git clone https://github.com/ave-dcd/dcd_mapping
cd dcd_mapping

Create and activate a virtual environment

python3 -m virtualenv venv
source venv/bin/activate

Install as editable and with developer dependencies

python3 -m pip install -e '.[dev,tests]'

Add pre-commit hooks

pre-commit install

Run tests with pytest

pytest
