Gene Ontology (GO)-based autoencoder for embedding single-cell RNA-seq data.
In general, the ontoencoder takes three inputs: X, y, and a topology (the Gene Ontology, or any directed acyclic graph you supply).
Please refer to notebooks/TopoNet* for examples of supervised learning and to notebooks/OntoEncoder* for unsupervised learning.
Any single-cell RNA-seq dataset can be log-normalized and saved as .h5ad with the scanpy package.
The processing steps are recorded in notebooks/GSE71585-single cell.ipynb.
The processed data are stored in /cellar/users/hsher/ontoencoder/notebooks/tasic.h5ad (accessible to the Ideker lab).
The topology should be stored in the DCell format. See here for an example.
This topology file can be converted to an OntoEncoder/TopoNet-compatible format using ontoencoder.topology.topo_reader().
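For orientation, the sketch below parses a DCell-style topology with the standard library only. It assumes the three-column, tab-separated layout commonly used by DCell (parent term, child term or gene, and a relation that is either `default` for term-to-term edges or `gene` for term-to-gene annotations). The function `read_dcell_edges` and the identifiers in the example are hypothetical. This is not the implementation of `topo_reader()`, whose signature and output are not shown here.

```python
import io
from collections import defaultdict


def read_dcell_edges(handle):
    """Parse DCell-style lines: parent<TAB>child<TAB>relation.

    Hypothetical helper; assumes relation is "default" or "gene".
    """
    term_children = defaultdict(list)  # term -> child terms
    term_genes = defaultdict(list)     # term -> directly annotated genes
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parent, child, relation = line.split("\t")
        if relation == "gene":
            term_genes[parent].append(child)
        else:  # "default" marks a term-to-term edge
            term_children[parent].append(child)
    return dict(term_children), dict(term_genes)


# Made-up three-term ontology, used only to show the assumed layout.
example = io.StringIO(
    "GO:root\tGO:a\tdefault\n"
    "GO:root\tGO:b\tdefault\n"
    "GO:a\tgene1\tgene\n"
    "GO:a\tgene2\tgene\n"
    "GO:b\tgene2\tgene\n"
)
term_children, term_genes = read_dcell_edges(example)
print(term_children)
print(term_genes)
```

Keeping term-to-term edges separate from term-to-gene annotations mirrors how the hierarchy (the DAG) and the gene inputs play different roles in an ontology-structured network.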
Please refer to OntoPrune [https://github.com/algaebrown/ontoPrune] for more information.
I haven't implemented that.
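The environment.yml file lists the dependencies and can typically be used to recreate the environment with `conda env create -f environment.yml`. Refer to here for how to install the same conda environment.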