<ul>
<li style="font-weight: 400;">
<p><strong>Introduction</strong></p>
<p><span style="font-weight: 400;">This website implements an interactive pathway map (IPM) built using INDRA, an automated model assembly system for molecular biology. The goal of INDRA-IPM is to allow users to build, contextualize, and share biological pathway models by describing them in natural language.</span></p>
<br />
<p><span style="font-weight: 400;">The visualization aims to display pathways in a style similar to that used by biologists in textbooks and presentations. In addition, we offer a layer of contextualization and an interactive user interface:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Nodes represent biological entities mentioned in text.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Nodes representing families are subdivided into pie charts.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Wild-type genes are colored green.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Mutated genes are colored orange.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">The intensity of each color corresponds to the expression level of a gene in CCLE.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Clicking any node provides additional context by linking out to CiteAb, HGNC, and UniProt.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Clicking any edge allows a user to filter the sources and targets of that edge and to request supporting evidence, drawn from the literature and curated databases, that is stored in INDRA DB.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">The force-directed active layout, which snaps nodes to their positions, can be disabled by toggling the “Forces” button.</span></li>
</ul>
<br />
<p><span style="font-weight: 400;">The site initially displays a pre-built model that demonstrates all of these features. This model, the RAS Pathway Map, was drawn by Dr. Frank McCormick in collaboration with the NCI RAS Initiative community.</span></p>
<br />
<p><strong>Building models from text</strong></p>
<p><span style="font-weight: 400;">Users can define their own biological models in text under the “Build” tab. As an example, the tab is initially populated with the text needed to build the NCI RAS Pathway Map. A full list of the mechanistic relationships that can be represented by INDRA (and therefore INDRA-IPM) can be found at https://indra.readthedocs.io/en/latest/modules/statements.html, and examples of models described in natural language (processed via the TRIPS system and assembled by INDRA) can be found in Gyori, Bachman, et al. (2017).</span></p>
<br />
<p><span style="font-weight: 400;">Users should note that the natural language processing systems are fairly robust but not without limitations. Proper grammar and punctuation should be used. The reading systems do not treat newlines as sentence separators and may return erroneous output for sentences that do not end with a period.</span></p>
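<p><span style="font-weight: 400;">Because newlines are not treated as sentence separators, it can help to check model text before pasting it into the “Build” tab. The following is a minimal sketch of such a check, not part of INDRA-IPM itself. It assumes each line holds one sentence; the function name and the example sentences are hypothetical.</span></p>

```python
def normalize_model_text(text):
    """Join one-sentence-per-line text into period-terminated sentences."""
    sentences = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Skip blank lines; they carry no sentence.
            continue
        # Assumption: every non-empty line is a single sentence.
        if not line.endswith("."):
            line += "."
        sentences.append(line)
    # Separate sentences with spaces, since newlines are not separators.
    return " ".join(sentences)


# Hypothetical input in which the first sentence lacks its period.
example = "KRAS activates BRAF\nBRAF phosphorylates MAP2K1."
print(normalize_model_text(example))
```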
<br />
<p><span style="font-weight: 400;">The recognition and grounding of named entities (proteins, etc.) to database identifiers is done automatically. Nevertheless, using standardized names such as HGNC symbols (as opposed to informal synonyms) is preferred to avoid ambiguity. To normalize node names in the pathway map, the IPM performs name standardization, in which entities mentioned by their synonyms are normalized to standard names such as HGNC symbols (for instance, MEK1, Map2k1 and Mek1 are all normalized to the standard symbol MAP2K1). Clicking a node opens a tooltip that links out to databases (HGNC, UniProt, CiteAb) and shows the original text from which the standardized node was created.</span></p>
<br />
<p><span style="font-weight: 400;">INDRA-IPM also recognizes protein families and complexes and grounds them in the FamPlex ontology (</span><a href="https://github.com/sorgerlab/famplex/"><span style="font-weight: 400;">https://github.com/sorgerlab/famplex/</span></a><span style="font-weight: 400;">). In some cases, the name of a specific gene is ambiguous with the name of a family it belongs to. For example, “JUN” in text is grounded to the JUN family, which also includes the JUN gene. In this case, the user can use a synonym such as “c-JUN”, which refers to the single entity, to reference only the gene and not the family.</span></p>
<br />
<p><span style="font-weight: 400;">Two reading systems are available to users. The REACH reader developed by the CLU Lab at the University of Arizona (https://github.com/clulab/reach) is an information extraction system for the biomedical domain, which aims to read scientific literature and extract cancer signaling pathways. We recommend that users try REACH first due to its speed. The TRIPS/DRUM system (http://trips.ihmc.us/parser/cgi/drum) developed by IHMC may offer greater mechanistic detail in some use cases (for instance, it supports recognizing complex molecular conditions such as “BRAF-V600E not bound to Vemurafenib”), but it takes significantly longer to run.</span></p>
<br />
<p><strong>Contextualizing Models</strong></p>
<p><span style="font-weight: 400;">Users are able to project data from the Cancer Cell Line Encyclopedia (CCLE) onto their pathway maps. This is done automatically when the IPM is first loaded (using the LOXIMVI skin cancer cell line), and the context can be changed to any other CCLE cell line in the Model Options dialogue panel. Wild-type genes are colored green, while mutated genes are colored orange. Color intensity indicates the relative level of expression. Context is unavailable for gray nodes because they were not measured in CCLE.</span></p>
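<p><span style="font-weight: 400;">The coloring rules above can be summarized in a short sketch. This is an illustration only, not the IPM's actual implementation: the function name, the linear scaling of intensity, and the 0 to 1 intensity range are all assumptions made for the example.</span></p>

```python
def node_color(mutated, expression, max_expression):
    """Return a (hue, intensity) pair for a gene node.

    expression is None when the gene was not measured in CCLE.
    max_expression is assumed to be a positive reference value.
    """
    if expression is None:
        # No measurement, so no context can be shown for this node.
        return ("gray", None)
    hue = "orange" if mutated else "green"
    # Assumption: intensity scales linearly with expression, clipped to [0, 1].
    intensity = max(0.0, min(1.0, expression / max_expression))
    return (hue, intensity)


# Hypothetical values for three nodes.
print(node_color(False, 5.0, 10.0))
print(node_color(True, 10.0, 10.0))
print(node_color(False, None, 10.0))
```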
<br />
<p><strong>Sharing Models</strong></p>
<p><span style="font-weight: 400;">Users can share models using the NDEx network sharing website (</span><a href="http://ndexbio.org"><span style="font-weight: 400;">http://ndexbio.org</span></a><span style="font-weight: 400;">). To upload the current model, click the “NDEX” button at the bottom of the interface, then click “Upload”. A link to NDEx will appear once the upload is complete.</span></p>
<br />
<p><span style="font-weight: 400;">To load a model, enter the unique key at the end of this link (e.g., </span><span style="font-weight: 400;">9b901d8f-2e2d-11e9-9f06-0ac135e8bacf) into the Load field. Alternatively, share the link in the address bar (e.g., pathwaymap.indra.bio/?uuid=9b901d8f-2e2d-11e9-9f06-0ac135e8bacf), which sends a user to the IPM website and immediately loads the shared model. Shared models preserve their text description, INDRA statements, graph layout, cell line context, and any evidence retrieved from INDRA DB.</span></p>
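<p><span style="font-weight: 400;">As a rough sketch of how the unique key relates to the address-bar link, the following Python snippet extracts the key from a link of the form shown above and rebuilds the link from a key. The function names are hypothetical, and the https scheme is added only so the standard library can parse a link copied without one.</span></p>

```python
import uuid
from urllib.parse import parse_qs, urlparse


def extract_model_uuid(link):
    """Return the model key from a link carrying a uuid query parameter."""
    # Assumption: links copied without a scheme get one added for parsing.
    if "://" not in link:
        link = "https://" + link
    values = parse_qs(urlparse(link).query).get("uuid")
    if not values:
        raise ValueError("link has no uuid parameter")
    # uuid.UUID raises ValueError if the key is not a well-formed UUID.
    return str(uuid.UUID(values[0]))


def build_share_link(model_uuid):
    """Rebuild the address-bar form of the link for a given key."""
    return "pathwaymap.indra.bio/?uuid=" + str(uuid.UUID(model_uuid))


key = extract_model_uuid(
    "pathwaymap.indra.bio/?uuid=9b901d8f-2e2d-11e9-9f06-0ac135e8bacf"
)
print(key)
print(build_share_link(key))
```

<p><span style="font-weight: 400;">Validating the key before sharing catches truncated or mistyped links early, before they are entered into the Load field.</span></p>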
<br />
<p><strong>Exporting Models</strong></p>
<p><span style="font-weight: 400;">Users can export models in a variety of formats.</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">INDRA JSON will export the model statements in JSON format. These can be imported into INDRA or processed separately. The INDRA JSON format is specified at </span><a href="https://github.com/sorgerlab/indra/blob/master/indra/resources/statements_schema.json"><span style="font-weight: 400;">https://github.com/sorgerlab/indra/blob/master/indra/resources/statements_schema.json</span></a></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">PySB, SBML, BNGL, and Kappa will export executable models in these formats. These modeling formalisms allow parameterizing and simulating models, and evaluating them against time-course data. More information about these formats and the tools supporting them is available at the following places:</span></li>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">PySB: </span><a href="http://pysb.org/"><span style="font-weight: 400;">http://pysb.org/</span></a></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">SBML: </span><a href="http://sbml.org/"><span style="font-weight: 400;">http://sbml.org/</span></a></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">BNGL: </span><a href="http://visualizlab.org/rulebender/index.html"><span style="font-weight: 400;">http://visualizlab.org/rulebender/index.html</span></a><span style="font-weight: 400;"> and </span><a href="https://www.csb.pitt.edu/Faculty/Faeder/?page_id=409"><span style="font-weight: 400;">https://www.csb.pitt.edu/Faculty/Faeder/?page_id=409</span></a></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Kappa: </span><a href="https://kappalanguage.org/"><span style="font-weight: 400;">https://kappalanguage.org/</span></a></li>
</ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">SBGN will export a model in the Systems Biology Graphical Notation format. Documentation and tools supporting SBGN are available at: http://sbgn.github.io/sbgn/. Note that layout information is not included in exported SBGN models; however, tools such as Newt (http://web.newteditor.org/) have built-in layout algorithms.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">CX will export a model in the .cx format, which can be opened in Cytoscape 3 and also uploaded to NDEx.</span></li>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Cytoscape enables network visualization and provides access to a large ecosystem of analysis plugins; more information is available at: </span><a href="https://cytoscape.org/cy3.html"><span style="font-weight: 400;">https://cytoscape.org/cy3.html</span></a></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">NDEx is a network sharing and versioning website with a programmatic API for accessing networks: </span><a href="http://ndexbio.org/"><span style="font-weight: 400;">http://ndexbio.org/</span></a></li>
</ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">PNG will export a high-resolution .png image of the current graph. This feature is useful for taking snapshots of a pathway map for inclusion in documents or presentations.</span></li>
</ul>
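<p><span style="font-weight: 400;">As an example of processing an INDRA JSON export separately from INDRA, the sketch below counts statements by type using only the Python standard library. It assumes the export is a JSON list of statement objects that each carry a “type” field; the function name is hypothetical, and the sample data is hand-written and heavily simplified, not real IPM output.</span></p>

```python
import json
from collections import Counter


def count_statement_types(json_text):
    """Count statements by type in an INDRA JSON export."""
    # Assumption: the export is a JSON list of statement objects,
    # each with a "type" field naming the kind of mechanism.
    statements = json.loads(json_text)
    return Counter(stmt["type"] for stmt in statements)


# Hand-written, simplified sample; real exports contain many more fields.
sample = json.dumps([
    {"type": "Phosphorylation"},
    {"type": "Activation"},
    {"type": "Phosphorylation"},
])

for stmt_type, count in count_statement_types(sample).most_common():
    print(stmt_type, count)
```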
<br />
<p><span style="font-weight: 400;">To simplify the user interface, only PNG export is available on mobile devices with limited screen width.</span></p>
<br />
<p><strong>Funding</strong></p>
<p><span style="font-weight: 400;">This work was funded by ARO Grants W911NF‐14‐1‐0397 and W911NF‐15‐1‐0544 under the DARPA Big Mechanism and Communicating with Computers programs, and by NIGMS Grant P50GM107618.</span></p>
<br />
<p><strong>Privacy</strong></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Our API backend receives user-generated requests such as those for reading, contextualization, and NDEx sharing.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Our server logs the IP addresses which make requests to the API.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">The data from some user requests is forwarded to external APIs such as TRIPS (reading), cBioPortal (contextualization), and NDEx (sharing) in order to implement these functions.</span></li>
</ul>
<br /><br /><br /></li>
</ul>