Add identifiers to all FamPlex entries #87

Draft
wants to merge 19 commits into
base: master
101 changes: 101 additions & 0 deletions .gitignore
@@ -0,0 +1,101 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# IPython Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# dotenv
.env

# virtualenv
venv/
ENV/

# Spyder project settings
.spyderproject

# Rope project settings
.ropeproject

# PyCharm project settings
.idea/*

*.pickle
*.gpickle

scratch
scratch/*

.pytest_cache
.DS_Store
16 changes: 6 additions & 10 deletions .travis.yml
@@ -4,25 +4,21 @@ python:
before_install:
- sudo apt-get install graphviz
- pwd
- pip install future requests pygraphviz
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
pip install functools32;
fi
- pip install future requests pygraphviz bel_resources
- git clone https://github.com/sorgerlab/indra.git
script:
- pwd
- echo $TRAVIS_BUILD_DIR
- export PYTHONPATH=$PYTHONPATH:$TRAVIS_BUILD_DIR/indra
- python check_references.py
- python export/relations_graph.py
- python famplex/check_references.py
- python famplex/export/export_relations_graph.py
# test if exports have been updated
# bel_resources must be installed to generate belns exports
# generate exports then take git diff and look only at lines that changed
# lines containing date, timestamp, or version should be ignored
- pip install bel_resources
- python export/obo.py
- python export/hgnc_ids.py
- python export/belns.py
- python famplex/export/export_obo.py
- python famplex/export/hgnc_ids.py
- python famplex/export/export_belns.py
- export belns_diff=$(git diff -U0 export/famplex.belns | egrep "^[\+-][^\+-]")
- export belns=$(echo "$belns_diff" | egrep -v "^[\+-](VersionString|CreatedDateTime)")
- export obo_diff=$(git diff -U0 export/famplex.obo | egrep "^[\+-][^\+-]")
12 changes: 8 additions & 4 deletions README.md
@@ -31,6 +31,10 @@ the FamPlex namespace.

* ```entities.csv```. A registry of the families and complexes defined in the
FamPlex namespace.

* ```descriptions.csv```. Descriptions and citations for some entities. Contains
three columns: the FamPlex name, comma-separated reference CURIEs, and a
textual description.

* ```grounding_map.csv```. Explicit mapping of text strings to identifiers in
biological databases.
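
The three-column layout of ```descriptions.csv``` listed above could be read roughly as follows. This is only a sketch: it assumes the file has no header row and that a field containing commas (such as the list of reference CURIEs) is quoted in the usual CSV way, neither of which is confirmed here. The `parse_descriptions` helper, the entity names, and the `EXAMPLE:` CURIEs are hypothetical.

```python
import csv
import io


def parse_descriptions(lines):
    """Return {FamPlex name: {"references": [...], "description": str}}."""
    entries = {}
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        name, refs, text = row
        # The second column holds zero or more comma-separated CURIEs
        curies = [r.strip() for r in refs.split(",") if r.strip()]
        entries[name] = {"references": curies, "description": text}
    return entries


# Hypothetical rows, for illustration only (not real FamPlex data)
sample = io.StringIO(
    'Family_X,"EXAMPLE:1,EXAMPLE:2",An illustrative family.\n'
    "Complex_Y,,An entry with no references.\n"
)
entries = parse_descriptions(sample)
```

An empty second column yields an empty reference list rather than an error, which seems the safer default given that only some entities carry citations.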
@@ -61,7 +65,7 @@ relationships, including sub-families (families within families) and complexes
consisting of families of related subunits (e.g., PI3K, NF-kB).

The ```relations.csv``` file consists of five columns: (1) the namespace for
the subject (e.g., ```HGNC``` for gene names, ```UP``` for Uniprot, or
the subject (e.g., ```HGNC``` for gene names, ```UP``` for UniProt, or
```FPLX``` for the FamPlex namespace), (2) the identifier for the subject,
(3) the relationship (```isa``` or ```partof```), (4) the namespace for the
object, and (5) the identifier for the object.
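
As a minimal sketch of how these five columns might be consumed, the following groups the direct children of each object entity. It assumes a plain comma-separated file with no header row, which is not stated here. The `load_children` helper and the sample rows are hypothetical and do not come from the FamPlex code base.

```python
import csv
import io
from collections import defaultdict


def load_children(lines):
    """Map each (namespace, id) object to the subjects directly related to it."""
    children = defaultdict(list)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        subj_ns, subj_id, rel, obj_ns, obj_id = row
        children[(obj_ns, obj_id)].append((subj_ns, subj_id, rel))
    return children


# Hypothetical rows, for illustration only (not real FamPlex data)
sample = io.StringIO(
    "HGNC,GENE_A,isa,FPLX,Family_X\n"
    "HGNC,GENE_B,isa,FPLX,Family_X\n"
    "FPLX,Family_X,partof,FPLX,Complex_Y\n"
)
children = load_children(sample)
```

Keying on the (namespace, identifier) pair rather than the identifier alone keeps entries from different namespaces apart even if their identifiers happen to coincide.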
@@ -87,10 +91,10 @@ cancer.

Entities are grounded to the following databases:

* Genes/proteins: [Uniprot](http://www.uniprot.org)
* Genes/proteins: [UniProt](http://www.uniprot.org)

* Chemicals: [PubChem](https://pubchem.ncbi.nlm.nih.gov/),
[CHEBI](https://www.ebi.ac.uk/chebi/), and [HMDB](http://www.hmdb.ca/) (for
[ChEBI](https://www.ebi.ac.uk/chebi/), and [HMDB](http://www.hmdb.ca/) (for
metabolites)

* Biological processes: [GO](http://geneontology.org/) and
@@ -99,7 +103,7 @@ Entities are grounded to the following databases:
* Protein families and named complexes: grounded to entities defined within
the FamPlex repository in the ```entities.csv``` and ```relations.csv```
files, and to identifiers in [PFAM](http://pfam.xfam.org/)
and [Interpro](https://www.ebi.ac.uk/interpro/) when possible.
and [InterPro](https://www.ebi.ac.uk/interpro/) when possible.

## Gene prefixes
