Enrichment with g:GOSt

This page is an explanation, reference and self-reminder of which parameters the enrichment app uses and how they affect the pathways displayed. Some of this information about the g:Profiler web service is undocumented, difficult to discern from the help docs, or cobbled together from various sources (e.g. the contact desk).

Parameters sent to g:GOSt

Under the hood, the enrichment app wraps g:GOSt to find pathways from a gene query list.

https://github.com/PathwayCommons/app-ui/blob/80666a3ddb15180c3d51de48e2d82bb723554f08/src/server/external-services/gprofiler/gprofiler.js#L12-L31

Pathway collections

Parameters sf_GO:BP and sf_REAC are boolean flags that select Gene Ontology Biological Process and Reactome pathways, respectively, for inclusion in enrichment analysis. Parameter no_iea is a boolean flag that controls GO assignments 'Inferred from Electronic Annotation' (IEA): when it is set, those electronic annotations are excluded.

To include other collections, use the following flags (undocumented; obtained via the help desk):

sf_GO - includes BP, CC, MF subcategories of GO
sf_GO:BP - includes GO biological process terms (if used together with sf_GO, the intersection is applied, i.e. sf_GO:BP and sf_GO together give only sf_GO:BP terms)
sf_GO:CC - includes GO cellular component terms
sf_GO:MF - includes GO molecular function terms
sf_KEGG - includes KEGG pathways
sf_REAC - includes Reactome pathways
sf_TF - includes transcription factor predictions from TRANSFAC
sf_MI - includes miRBase predictions
sf_HPA - includes Human Protein Atlas data
sf_CORUM - includes CORUM protein complexes
sf_HP - includes Human Phenotype Ontology terms
sf_BIOGRID - includes BioGRID protein-protein interactions
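
The flags above can be assembled into the data-source part of a query. The following is a minimal sketch, not the app's actual code: the helper name build_source_params is hypothetical, the 1/0 encoding of booleans is assumed from the way the other parameters on this page are described, and the no_iea default is a placeholder because this page does not state the value the app sends.

```python
# Hypothetical helper: builds only the data-source portion of a g:GOSt query.
# Flag names are taken from the list above; everything else is illustrative.

KNOWN_SOURCE_FLAGS = {
    "sf_GO", "sf_GO:BP", "sf_GO:CC", "sf_GO:MF", "sf_KEGG", "sf_REAC",
    "sf_TF", "sf_MI", "sf_HPA", "sf_CORUM", "sf_HP", "sf_BIOGRID",
}


def build_source_params(sources, no_iea=False):
    """Return a dict of parameters selecting the given collections."""
    unknown = set(sources) - KNOWN_SOURCE_FLAGS
    if unknown:
        raise ValueError(f"Unknown source flags: {sorted(unknown)}")
    # Assumption: boolean flags are sent as 1 (on) or 0 (off).
    params = {flag: 1 for flag in sources}
    params["no_iea"] = 1 if no_iea else 0
    return params


# The two collections the enrichment app selects, per the text above.
app_like_params = build_source_params(["sf_GO:BP", "sf_REAC"])
```

Rejecting unknown flags up front is a design choice for the sketch: because the flags are undocumented, a typo would otherwise pass silently and simply return fewer results.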

p-value thresholds

The following parameters combine to set the significance cutoff:

threshold_algo (enum) set to fdr uses the Benjamini-Hochberg procedure to derive adjusted p-values for each pathway, a la the R stats function p.adjust
significant (boolean) set to 0 disables the default significance threshold filter associated with threshold_algo
user_thr (real) set to 0.05 uses this value as the adjusted p-value threshold, so that only pathways with adjusted p-values below it are returned
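
To make the effect of these settings concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment followed by the 0.05 cutoff. It mirrors what R's p.adjust does with method "BH"; whether g:GOSt computes its fdr values in exactly this way is an assumption. The pathway names and p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Visit p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four hypothetical pathways.
raw = {"pathway_a": 0.001, "pathway_b": 0.02, "pathway_c": 0.04, "pathway_d": 0.5}
names = list(raw)
adjusted = dict(zip(names, bh_adjust([raw[n] for n in names])))

USER_THR = 0.05  # plays the role of user_thr
kept = [n for n in names if adjusted[n] < USER_THR]
```

In this example pathway_c has a raw p-value of 0.04 but an adjusted value of about 0.053, so it is dropped; only pathway_a and pathway_b pass the threshold.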

What comes out in the app

The app is very opinionated. We set a hard threshold for adjusted p-values (0.05) and fix the data sources (GO: BP and Reactome). There is some room in the analysis to declare the gene set sizes and, in the visualization, to set the edge similarity threshold, but that's it.
