Add more work
jsstevenson committed Dec 8, 2023
1 parent 09adaf9 commit bd23e5f
Showing 4 changed files with 46 additions and 21 deletions.
4 changes: 2 additions & 2 deletions cool_seq_tool/data/data_downloads.py
@@ -25,7 +25,7 @@ def __init__(self) -> None:

def get_mane_summary(self) -> Path:
"""Identify latest MANE summary data. If unavailable locally, download from
source.
`NCBI FTP server <https://ftp.ncbi.nlm.nih.gov/refseq/MANE/MANE_human/current/>`_.
:return: path to MANE summary file
"""
@@ -52,7 +52,7 @@ def get_mane_summary(self) -> Path:

def get_lrg_refseq_gene_data(self) -> Path:
"""Identify latest LRG RefSeq Gene file. If unavailable locally, download from
source.
`NCBI FTP server <https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/RefSeqGene/>`_.
:return: path to acquired LRG RefSeq Gene data file
"""
1 change: 0 additions & 1 deletion docs/source/index.rst
@@ -15,7 +15,6 @@ Description here.
:maxdepth: 2

Installation<install>
Overview<overview>
Usage<usage>
API Reference<reference/index>
Contributing<contributing>
13 changes: 0 additions & 13 deletions docs/source/overview.rst

This file was deleted.

49 changes: 44 additions & 5 deletions docs/source/usage.rst
@@ -1,15 +1,54 @@
Usage
=====

.. _configuration:
Cool-Seq-Tool provides easy access to, and useful operations on, a selection of important genomic resources. Modules are divided into three groups:

* :ref:`Data sources <sources_modules_api_index>`, for basic acquisition and setup for a data source via Python

* :ref:`Data handlers <handlers_modules_api_index>`, for additional operations on top of existing sources

* :ref:`Data mappers <mappers_modules_api_index>`, for functions that incorporate multiple sources/handlers to produce output

Configuration
-------------
A core :py:class:`CoolSeqTool <cool_seq_tool.app.CoolSeqTool>` class encapsulates all of their functions and can be used for easy initialization:

Programmatic access
-------------------
.. code-block:: pycon

   >>> from cool_seq_tool.app import CoolSeqTool
   >>> cst = CoolSeqTool()

.. _configuration:

REST server
-----------

Possibly staged for deletion?


Environment configuration
-------------------------

Individual classes accept initialization arguments that configure their data sources. In general, the same parameters can also be set through environment variables, which is useful in, e.g., a cloud deployment.

.. list-table::
:widths: 25 100
:header-rows: 1

* - Variable
- Description
* - ``LRG_REFSEQGENE_PATH``
- Path to LRG_RefSeqGene file. Used in :py:class:`TranscriptMappings <cool_seq_tool.sources.transcript_mappings.TranscriptMappings>` to provide mappings between gene symbols and RefSeq/Ensembl transcript accessions. If not defined, defaults to the most recent version (formatted as ``data/LRG_RefSeqGene_YYYYMMDD``) within the Cool-Seq-Tool library directory.
* - ``TRANSCRIPT_MAPPINGS_PATH``
- Path to transcript mapping file generated from `Ensembl BioMart <http://www.ensembl.org/biomart/martview>`_. Used in :py:class:`TranscriptMappings <cool_seq_tool.sources.transcript_mappings.TranscriptMappings>`. If not defined, uses a copy of the file that is bundled within the Cool-Seq-Tool installation. See the :ref:`contributor instructions <build_transcript_mappings_tsv>` for information on manually rebuilding it.
* - ``MANE_SUMMARY_PATH``
- Path to MANE Summary file. Used in :py:class:`MANETranscriptMappings <cool_seq_tool.sources.mane_transcript_mappings.MANETranscriptMappings>` to provide MANE transcript annotations. If not defined, defaults to the most recent version (formatted as ``data/MANE.GRCh38vX.X.summary.txt``) within the Cool-Seq-Tool library directory.
* - ``SEQREPO_ROOT_DIR``
- Path to SeqRepo directory (i.e. contains ``aliases.sqlite3`` database file, and ``sequences`` directory). Used by :py:class:`SeqRepoAccess <cool_seq_tool.handlers.seqrepo_access.SeqRepoAccess>`. If not defined, defaults to ``/usr/local/share/seqrepo/latest``.
* - ``UTA_DB_URL``
- A `libpq connection string <https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING>`_, i.e. of the form ``postgresql://<user>:<password>@<host>:<port>/<database>/<schema>``, used by the :py:class:`cool_seq_tool.sources.uta_database.UTADatabase` class. By default, it is set to ``postgresql://uta_admin:uta@localhost:5433/uta/uta_20210129b``.
* - ``LIFTOVER_CHAIN_37_TO_38``
- A path to a `chainfile <https://genome.ucsc.edu/goldenPath/help/chain.html>`_ for lifting from GRCh37 to GRCh38. Used by :py:class:`cool_seq_tool.sources.uta_database.UTADatabase` as input to `pyliftover <https://pypi.org/project/pyliftover/>`_. If not provided, pyliftover will fetch it automatically from UCSC.
* - ``LIFTOVER_CHAIN_38_TO_37``
- A path to a `chainfile <https://genome.ucsc.edu/goldenPath/help/chain.html>`_ for lifting from GRCh38 to GRCh37. Used by :py:class:`cool_seq_tool.sources.uta_database.UTADatabase` as input to `pyliftover <https://pypi.org/project/pyliftover/>`_. If not provided, pyliftover will fetch it automatically from UCSC.
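
As a rough illustration of how the variables above might be resolved, the following sketch reads a setting from the environment and falls back to the default listed in the table. The helper names ``resolve_setting`` and ``split_uta_url`` are hypothetical and are not part of Cool-Seq-Tool, and the ``/data/seqrepo/latest`` path is a placeholder. The sketch also assumes the ``UTA_DB_URL`` value follows the ``<database>/<schema>`` path layout described above.

```python
import os
from urllib.parse import urlparse

# Defaults copied from the table above. The helpers below are hypothetical
# illustrations and are not part of the Cool-Seq-Tool API.
DEFAULTS = {
    "SEQREPO_ROOT_DIR": "/usr/local/share/seqrepo/latest",
    "UTA_DB_URL": "postgresql://uta_admin:uta@localhost:5433/uta/uta_20210129b",
}


def resolve_setting(name: str) -> str:
    """Return the environment value for ``name``, else the documented default."""
    return os.environ.get(name, DEFAULTS[name])


def split_uta_url(url: str) -> dict:
    """Split a UTA URL of the documented form into its components."""
    parsed = urlparse(url)
    # The path has the form "/<database>/<schema>".
    database, schema = parsed.path.lstrip("/").split("/", 1)
    return {
        "user": parsed.username,
        "host": parsed.hostname,
        "port": parsed.port,
        "database": database,
        "schema": schema,
    }


# Override one variable before any class that reads it is initialized.
os.environ["SEQREPO_ROOT_DIR"] = "/data/seqrepo/latest"  # placeholder path
seqrepo_dir = resolve_setting("SEQREPO_ROOT_DIR")
uta_parts = split_uta_url(DEFAULTS["UTA_DB_URL"])
```

Setting the variables in the process environment before constructing ``CoolSeqTool`` keeps deployment-specific paths out of application code. Whether a given class reads its variable at import time or at initialization is not stated here, so setting them as early as possible is the safer choice.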

Schema support
--------------
