diff --git a/_posts/2023-11-09-deep-connectome-clustering.md b/_posts/2023-11-09-deep-connectome-clustering.md
index fea1067f..d5737f00 100644
--- a/_posts/2023-11-09-deep-connectome-clustering.md
+++ b/_posts/2023-11-09-deep-connectome-clustering.md
@@ -14,6 +14,9 @@ authors:
- name: Max Filter
affiliations:
name: MIT
+ - name: Eric Liu
+ affiliations:
+ name: MIT
# must be the exact same name as your blogpost
bibliography: 2023-11-09-deep-connectome-clustering.bib
@@ -22,25 +25,28 @@ bibliography: 2023-11-09-deep-connectome-clustering.bib
# - make sure that TOC names match the actual section names
# for hyperlinks within the post to work correctly.
toc:
- - name: Connectomooes, and what they can teach us
- - name: Unsupervised graph representation learning
- - name: Proposed research questions and methods
+ - name: Motivation
+ - name: Background
+ - name: Methods
+ - name: Experiments
+ - name: Discussion
+ - name: Conclusion
# Below is an example of injecting additional post-specific styles.
# This is used in the 'Layouts' section of this post.
# If you use this post as a template, delete this _styles block.
---
-## Connectomes, and what they can teach us
+## Motivation
{% include figure.html path="assets/img/2023-11-09-deep-connectome-clustering/fruit-fly-connectome.png" class="img-fluid" %}
The fruit fly connectome.
-Everything you've ever learned, every memory you have, and every behavior that defines you is stored somewhere in the neurons and synapses of your brain. The emerging field of connectomics seeks to build connectomes–or neuron graphs–that map the connections between all neurons in the brains of increasingly complex animals, with the goal of leveraging graph structure to gain insights into the functions of specific neurons, and eventually the behaviors that emerge from their interactions. This, as you can imagine, is quite a difficult task, but progress over the last few years has been promising.
+Everything you've ever learned, every memory you have, and every behavior that defines you is stored somewhere in the neurons and synapses of your big, beautiful brain. The emerging field of connectomics seeks to build connectomes–or neuron graphs–that map the connections between all neurons in the brains of increasingly complex animals, with the goal of leveraging graph structure to gain insights into the functions of specific neurons, and eventually the behaviors that emerge from their interactions. This, as you can imagine, is quite a difficult task, but progress over the last few years has been promising.
-Now, you might be asking yourself at this point, can you really predict the functions of neurons based on their neighbors in the connectome? A paper published by Yan et al. in 2017 asked this same question, searching for an answer in a roundworm (C. elegans) connectome. In their investigation, they discovered a neuron whose behavior had not been previously characterized, which they hypothesized was necessary for locomotion. They tested this hypothesis by ablating the neuron on a living C. elegans, and to the dismay of that poor roundworm, found that it was indeed necessary.
+Now, you might be asking yourself: can you really predict the functions of neurons based on their neighbors in the connectome? A paper published by Yan et al. in 2017 asked this same question, searching for an answer in a roundworm (C. elegans) connectome. In their investigation, they discovered a neuron whose behavior had not been previously characterized, which they hypothesized was necessary for locomotion. They tested this hypothesis by ablating the neuron in a living C. elegans, and to the dismay of that poor roundworm, found that it was indeed necessary.
Although impressive, the C. elegans connectome has only ~300 neurons, compared with the ~100,000,000,000 in the human brain; however, this year (2023):
@@ -51,7 +57,17 @@ This is exciting because the fruit fly dataset presents an opportunity to identi
Furthermore, current efforts to map connectomes of increasingly complex animals makes it desirable to have algorithms that are **able to scale** and handle that additional complexity, with the hopes of one day discovering the algorithms that give rise to consciousness.
-## Unsupervised graph representation learning
+## Background
+
+### Can we learn about human brains by studying connectomes of simpler organisms?
+
+The primate brain exhibits a surprising degree of specialization, particularly for social objects. For instance, neurons in the fusiform face area (FFA) in the IT cortex appear to fire only in response to faces. Furthermore, individuals with lesions in or damage to this area lose their ability to recognize faces. In fact, there is evidence of rudimentary face perception even in newborn infants with limited access to visual “training data,” who preferentially look at photos of faces and other face-like arrangements, like inverted triangles (two vertices being the eyes and the third the mouth). While there may not exist a grandmother cell that can recognize your grandmother, there certainly seems to be at least some engineered specialization in the brain. Cognitive scientists theorize that there is a set of core systems for representing objects, actions, number, space, and conspecifics (other people!), together constituting what we might call “common sense,” which may help determine the blueprint of the human brain down to the genetic level. Notably, facial recognition exhibits substantial genetic heritability (over 60%!) and appears to be uncorrelated with general intelligence. We might imagine that there is a set of capabilities, including social cognition, that were so critical for human behavior that our brains evolved over hundreds of thousands of years to “hard code” certain structures, like the FFA, to help scaffold them. After all, another person’s face is an important signal for processes like mate selection, friendship formation, and theory of mind. The human brain and the cognitive processes it supports are evolutionary products. Even more importantly, the brain seems to be specialized in some ways but to behave flexibly in others. Through the scientific process, how good an understanding can we reach of the complex organ sitting between our ears? To what degree are the neuronal assemblages in our brain specialized? How do the communications amongst these neurons grant us our incredible cognitive capabilities?
+
+In 1982, neuroscientist David Marr proposed three levels of analysis for studying complex systems like the human mind: the computational level (what task is the system designed to solve?), the algorithmic level (how does the system solve it?), and the implementation level (where and how is the algorithm implemented in the system hardware?). At one end of the spectrum, we might think about characterizing the computational capabilities of human cognition, like object recognition. At the other end, we might be interested in how object recognition is implemented in the brain itself, in all of its fleshy glory–how an incoming visual signal is processed by composites of receptive fields in the retina (biological “Gabor filters”) and fed to neurons in the primary and secondary visual areas of the cerebral cortex, for instance. In recent years, scientists have developed an interest in understanding the implementation level at an extremely high resolution by charting the connectome–the comprehensive map of all neural connections in the brain. However, if the grandmother cell is too simplistic a model for knowledge representation in the human brain, then the human connectome may offer an overly complex view. It seems easy to get lost in the wilderness of its approximately 100 billion neurons and the hundreds of trillions of synapses which connect them! How can we begin to approach this overwhelming terra incognita?
+
+We might instead consider studying the connectome of a much simpler model organism, like the transparent 1mm-long nematode Caenorhabditis elegans, with which we share an estimated 20-71% of our genes. Or maybe even the fruit fly Drosophila melanogaster, 60% of whose genes can also be found in the human genome (Max Planck). Even the study of such model organisms necessitates adding structure to complex, often unlabeled, relational data. And while the larval fruit fly brain is orders of magnitude less complex than our own, there are still over 3,000 neurons and half a million synapses to explore (Winding et al., 2023). Luckily, mankind’s toolkit for studying graph-like data is well-equipped.
+
+### Unsupervised graph representation learning
The problem of subdividing neurons in a connectome into types based on their synaptic connectivity is a problem of unsupervised graph representation learning, which seeks to find a low-dimensional embedding of nodes in a graph such that similar neurons are close together in the embedding space.
@@ -68,15 +84,103 @@ Spectral embedding is a popular and general machine learning approach that uses
Thus, it stands to reason that deep learning might offer more insights into the functions of neurons in the fruit fly connectome, or at the very least, that exploring the differences between the spectral embedding found by Winding et al. and the embeddings discovered by deep learning methods might provide intuition as to how the methods differ on real datasets.
-## Proposed research questions and methods
+In this project, we explore the differences between functional neuron clusters in the fruit fly connectome identified via spectral embedding by Winding et al. and those learned with deep learning. Specifically, we are interested in how spectral embedding clusters differ from embeddings learned by Variational Graph Auto-Encoders (VGAEs), a more recent architecture proposed by Kipf and Welling; Max Welling is also a co-author of the Variational Auto-Encoder (VAE) paper. VGAEs sit at an interesting intersection of graph neural networks (GNNs) and VAEs, both of which we explored in class, and comparing this technique to spectral embedding is relevant because of our previous discussions of spectral decomposition in class with respect to network scalability and RNN weights.
+
+We hypothesize that deep learning techniques would be better suited to learning graph embeddings of connectomes because they are able to incorporate additional information about neurons (such as the neurotransmitters released at synapses between neurons) and are able to learn a nonlinear embedding space that more accurately represents the topological structure of that particular connectome, learning to weight the connections between some neurons above others.
+
+Before we discuss the experiments, however, we first describe Spectral Embedding and Variational Graph Autoencoders in more detail and compare the two methods.
+
+## Methods
+
+### Spectral Embedding
+
+One classical approach for understanding graph-like data comes from a class of spectral methods which use pairwise distance measures between data points to embed and cluster data. Spectral methods offer two obvious advantages when compared to other machine learning approaches. One, we can straightforwardly perform clustering for datasets which are inherently relational, like the connectome, where it is not immediately clear how a method like k-means can be used when we only have access to the relationships between data points (the “edges”) and not the node-level features themselves. Two, spectral methods are **nonlinear**, and don’t rely on measures like squared Euclidean distance, which can be misleading for data which are tangled in high dimensions, but which exhibit a lower **intrinsic** dimensionality.
+
+So, how does spectral embedding work, exactly? In short, an adjacency matrix is first calculated from the original dataset and used to compute the graph Laplacian. Next, a normalized graph Laplacian is eigen-decomposed, and a small number of its eigenvectors define a lower dimensional embedding space on which simpler linear clustering algorithms, like k-means, can be used to identify untangled clusters of the original data.
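+
+As a rough sketch of these steps in symbols (the notation is ours; we assume an undirected graph with symmetric adjacency matrix *A* and diagonal degree matrix *D*, with every node having at least one edge so that *D* is invertible, and the exact variant used in a given paper may differ), the symmetric normalized Laplacian and its eigendecomposition are:
+
+$$
+L_{\mathrm{sym}} = I - D^{-1/2} A D^{-1/2}, \qquad L_{\mathrm{sym}}\, u_k = \lambda_k u_k, \qquad 0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n
+$$
+
+Stacking the eigenvectors with the *d* smallest eigenvalues as the columns of an *n* × *d* matrix gives the embedding: row *i* is the *d*-dimensional representation of node *i*, and k-means can then be run on these rows.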
+
+This class of methods makes few assumptions about the data (including cluster shape) and can be adjusted to be less noise sensitive–for example, by performing a t-step random walk across the affinity matrix for the data, as in diffusion mapping. An added benefit is that under the hood, spectral embedding can be performed by a series of linear algebra calculations, making it fast on small and medium-sized graphs. However, as with many unsupervised learning methods, clustering based on spectral embeddings is difficult to scale–in our case, due to the eigen-decomposition step of the graph Laplacian.
+
+
+### Variational Graph Autoencoders
+
+Although Spectral Embedding is still very popular, in recent years more attention has been paid to the burgeoning field of geometric deep learning, a set of ideas which aim to solve prediction or embedding tasks by taking into account the relational structure between data points. One example is the variational graph auto-encoder (VGAE), which learns to embed a complex object like a network into a low-dimensional, well-behaved latent space. Kipf and Welling (2016) propose an encoder using a two-layer graph convolutional network, which performs convolutions across local subgraphs of the input network data (not unlike convolution on images, where the graph is a grid!). The graph is projected onto a low dimensional latent space, which is regularized toward a standard normal prior through the optimization of a variational lower bound loss, and the adjacency matrix is then reconstructed using an inner product between latent variables. They show that this method achieves competitive results on a link prediction task when compared to other methods like spectral clustering and DeepWalk, a random walk-based representation learning algorithm.
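+
+In symbols, our reading of the model in Kipf and Welling (2016) is roughly the following, where *X* is the node feature matrix, *A* the adjacency matrix, *N* the number of nodes, and each row of *Z* is the latent vector of one node (the name "sigmoid" stands for the logistic function, to avoid a clash with the standard deviations):
+
+$$
+\begin{aligned}
+q(Z \mid X, A) &= \prod_{i=1}^{N} \mathcal{N}\!\left(z_i \mid \mu_i, \operatorname{diag}(\sigma_i^2)\right), \qquad \mu = \mathrm{GCN}_{\mu}(X, A), \quad \log \sigma = \mathrm{GCN}_{\sigma}(X, A) \\
+p(A_{ij} = 1 \mid z_i, z_j) &= \operatorname{sigmoid}\!\left(z_i^{\top} z_j\right) \\
+\mathcal{L} &= \mathbb{E}_{q(Z \mid X, A)}\!\left[\log p(A \mid Z)\right] - \mathrm{KL}\!\left[\, q(Z \mid X, A) \,\Vert\, p(Z) \,\right], \qquad p(Z) = \prod_{i=1}^{N} \mathcal{N}(z_i \mid 0, I)
+\end{aligned}
+$$
+
+The first line is the graph convolutional encoder, the second is the inner-product decoder, and the third is the variational lower bound that is maximized during training. The non-variational GAE drops the sampling step and the KL term and uses the encoder output directly as the embedding.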
+
+On the other hand, some have found that spectral embedding leads to clearer separability in low dimensional representation spaces for text data compared to neural embedding approaches like node2vec, which reportedly achieve state-of-the-art (sota) scores for multilabel classification and link prediction in other datasets. In addition, it appears that simple modifications to GNN-free architectures, like performing an error correlation correction on the training data and smoothing predictions on the test data, lead to sota-comparable performance. There are even concerns that the performance of geometric deep learning approaches is inflated, particularly in tasks like multi-label node classification, due to the assumption that the number of labels for test data is known to researchers.
+
+Thus, it remains unclear in what circumstances relatively novel geometric deep learning approaches do better than established and widely-explored methods like spectral learning, particularly for novel data like the connectome. In this work, we attempt to gain deeper insights into which method is better suited to the task of connectome modeling, with the hope of learning which method should be applied to future connectomes, such as that of the mouse and eventually the human.
+
+{% include figure.html path="assets/img/2023-11-09-deep-connectome-clustering/background_visual.jpg" class="img-fluid" %}
+
+ Spectral Clustering (von Luxburg, 2007; Park, Jeon, & Pedrycz, 2014) vs (V)GAEs (Kipf & Welling, 2016): A Story in Pictures
+
+
+## Experiments
+
+Now that we have a good idea of how these methods compare to each other in terms of implementation, we explore them from an experimental perspective. Through our experiments, we try to quantitatively and qualitatively address the question of how connectome clusters learned by VGAE compare to the spectral clusters found in the paper. To answer this question, we use the fruit fly connectome adjacency matrix provided by Winding et al. as our primary dataset.
+
+### Experiment 1: Link Prediction
+
+One common way to compare unsupervised graph representation learning algorithms is through a link prediction task, where a model is trained on a subset of the edges of a graph, and then must correctly predict the existence (or non-existence) of edges provided in a test set. If the model has learned a good, compressed representation of the underlying graph data structure, then it will be able to accurately predict both where missing test edges belong, and where they do not.
+
+{% include figure.html path="assets/img/2023-11-09-deep-connectome-clustering/link-prediction-task.png" class="img-fluid" %}
+
+ A link prediction task. Green lines correspond to the training data, which contains positive samples of edges that are present in the graph and negative samples of edges that are not present in the graph. The test set in red corresponds to the remainder of the positive and negative samples in the graph.
+
+
+We evaluate the models by computing the area under the ROC curve (AUC), where the ROC curve plots the true positive rate against the false positive rate. A completely random classifier that does not learn anything about the underlying graph structure would get an AUC of 0.5, while a perfect classifier would have an AUC of 1.0.
+
+Another metric we use to evaluate the models is average precision (AP), which summarizes the precision-recall curve in a single number and reflects how consistently the model ranks true edges above non-edges.
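+
+For concreteness, these metrics can be written as follows using their standard definitions (we assume the model assigns a score *s* to every candidate edge; the AP formula is the step-wise sum used by common libraries such as scikit-learn, which is an assumption about the implementation rather than something stated in this post):
+
+$$
+\begin{aligned}
+\mathrm{TPR} &= \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN} \\
+\mathrm{AUC} &= \Pr\!\left[\, s(e^{+}) > s(e^{-}) \,\right] \\
+\mathrm{AP} &= \sum_{n} \left(R_n - R_{n-1}\right) P_n
+\end{aligned}
+$$
+
+Here *e*+ is a randomly chosen true edge and *e*− a randomly chosen non-edge (setting ties aside), and *P* and *R* with subscript *n* are the precision and recall at the *n*-th score threshold.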
+
+In addition to comparing the models with these metrics, we also explore how robust they are to decreasing dimensionalities of the latent space. We hypothesize that if a model is able to maintain high AUC and AP, even at very low-dimensional embedding spaces, then it is likely better at capturing the structure of the connectome and is more likely to be able to scale to larger datasets, like that of the human brain one day.
+
+Running this experiment yields the following curves, where the x-axis shows the dimensionality of the latent space, and the y-axis shows the AUCs and APs of the respective models.
+
+{% include figure.html path="assets/img/2023-11-09-deep-connectome-clustering/link-prediction-auc-ap.png" class="img-fluid" %}
+
+From this experiment, we find that both the Graph Autoencoder (GAE) and Variational Graph Autoencoder (VGAE) perform at least as well as spectral embedding in terms of AUC and AP, and better at low dimensionalities, indicating that they might be better suited to capturing the nuances in the fruit fly connectome. At the dimensionality used for spectral embedding in Winding et al., d=24, the models have comparable performance, but as we reduce the dimensionality of the learned embedding, spectral embedding quickly breaks down, reaching an AUC of 0.52 at a dimensionality of 2. Since a score of 0.5 corresponds to a random model, spectral embedding captures almost no meaningful structure in the data at that dimensionality. Winding et al. get around this by only using spectral embedding to obtain a latent space of size 24 and then applying a hierarchical clustering algorithm inspired by Gaussian Mixture Models, but the simplicity and robustness of the GAE models suggest that they may be better suited to modeling the types of functional neurons present in the connectomes of animals.
+
+### Experiment 2: VGAE Latent Exploration
+
+Although the link-prediction experiment gives us a quantitative comparison of the models, we also believe it is important to explore the latent embeddings learned by GAE to see how they qualitatively compare with the learned embeddings used in the Winding et al. work. After observing that the GAE was robust to a latent space of size 2, we decided to look specifically at whether there were any similarities between the clusters found by the GAE with the 2-d embedding and the level 7 clusters published by Winding et al. Also, although the GAE showed better overall performance, we decided to specifically explore the Variational GAE because we expect its latent manifold to be similar to that of a Variational Autoencoder.
+
+To this end, we first trained a Variational GAE with a 2-d latent space on the full fruit fly connectome and extracted the latent embedding of each node in the connectome.
+
+With this latent embedding, we first visualized the latent space using colors corresponding to the 93 clusters identified by Winding et al. Regions of a single color in the learned VGAE latent space mean that the VGAE identified the same cluster that was identified in the Winding et al. paper, while regions where many colors are mixed mean that the VGAE found a different grouping than spectral embedding.
+
+{% include figure.html path="assets/img/2023-11-09-deep-connectome-clustering/explore_cluster.png" class="img-fluid" %}
+
+ Coloring the VGAE latent space by the level 7 clusters found by Winding et al. Black points correspond to neurons that were not assigned a cluster by Winding et al.
+
+
+As seen in the figure above, although the VGAE projects directly to a 2-d latent space without any additional clustering step, the learned embedding still shares many similarities with the clusters that Winding et al. obtained from a 24-dimensional spectral embedding followed by Gaussian Mixture Model hierarchical clustering. This suggests that a direct 2-d VGAE embedding captures much of the same information as the more involved pipeline of spectral embedding followed by hierarchical clustering.
+
+We further explored the learned latent space by looking at whether the learned embedding had any correlation with the cell types identified in the larval fruit fly connectome. Since the VGAE only had information about the structure of the graph, clusters of similar colors in this figure mean that cells of that type share structural properties, such as a similar degree or connections to similar types of upstream or downstream neurons.
+
+We use the same color palette as the Winding et al. paper so that cell types in the level 7 clusters of the Winding et al. paper can be directly compared to the learned VGAE latent embedding.
+
+{% include figure.html path="assets/img/2023-11-09-deep-connectome-clustering/clustering-cell-type.png" class="img-fluid" %}
+
+ Coloring the Winding et al. level 7 clusters (left) and VGAE latent space (right) by cell types. This information was not provided to either algorithm during training, so clusters of the same cell type mean that the type can be inferred from structure alone.
+
+
+As seen in the figure above, both the spectral embedding and VGAE latent spaces capture knowledge about cell types when trained purely on the graph structure. We believe this is because cells of the same type have similar properties in terms of the types of neighboring neurons they connect to in the connectome, and they may also share special properties like a higher degree.
+
+In particular, it is interesting that sensory neurons and Kenyon cells are very well captured by both embeddings, and that MBIN cells and sensory neurons are clustered together by both the spectral embedding algorithm and the VGAE.
+
+## Discussion
+
+Our preliminary investigations show that deep learning algorithms such as Graph Autoencoders (GAEs) and Variational Graph Autoencoders (VGAEs) are able to capture at least as much nuance and information about function as spectral embedding algorithms. In addition, they come with the following advantages:
+
+1. With their current implementation, they can easily be run on a GPU, while common spectral embedding algorithms in libraries such as scikit-learn are only designed to work on CPUs. Since we take a deep learning approach, our GNN method can be trained on batches optimized via Adam, while spectral embedding only works if the entire adjacency matrix fits in memory. This makes deep learning methods **better able to scale to larger datasets** such as the mouse connectome that may come in the next few years.
+2. As shown in experiment 2, GAEs and Variational GAEs are **able to directly learn a robust embedding into a 2-d space** without any additional clustering, making interpretation easy and fast. Because spectral embedding performs only marginally better than a random algorithm at such low dimensions while VGAEs retain high performance, we suspect that VGAEs capture some additional nuance of the graph structure that spectral embedding is not able to encode.
+3. Comparing the 2-d embeddings of VGAE to the clustered 24-d spectral embeddings found in Winding et al., we find that even when compressing to such a low-dimensional space, the semantic information captured does in fact match that of spectral embedding at a higher dimensional space. Coloring by cell type shows that it also **captures information about the function of neurons**, with similar neuron types being clustered together even when they are located all over the brain, such as Kenyon cells. Cells of the same type likely serve similar functions, so in this respect, VGAE is able to capture information about the function of cells using only knowledge of the graph structure.
+
+However, VGAE does not come without its **limitations**. One large limitation we found while implementing the architecture is that it currently requires graphs to be **undirected**, so we had to remove information about the direction of connections between neurons for this work. Connectomes are inherently directed, so we likely missed some key information about the function of neurons by removing this directional nature of the connectome. Although this is not explored in our work, one simple way to partially recover this information would be to add features to each node corresponding to the in-degree and out-degree of each neuron.
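+
+As a sketch of that idea (not something we implemented; the convention that entry (*i*, *j*) of *A* counts synapses from neuron *i* to neuron *j*, and the particular symmetrization shown, are assumptions made only for illustration), the two degree features and one possible undirected adjacency matrix are:
+
+$$
+d^{\mathrm{out}}_i = \sum_{j} A_{ij}, \qquad d^{\mathrm{in}}_i = \sum_{j} A_{ji}, \qquad x_i = \left[\, d^{\mathrm{in}}_i,\; d^{\mathrm{out}}_i \,\right], \qquad \tilde{A} = \max\!\left(A, A^{\top}\right)
+$$
+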
-In this project, I would like to explore the differences between functional neuron clusters in the fruit fly connectome identified via spectral embedding by Winding et al. and deep learning. Specifically, I am interested in exploring how spectral embedding clusters differ from embeddings learned by Variational Graph Auto-Encooders (GVAE), which are a more recent architecture proposed by one of the co-authors of the Variational Auto-Encoders (VAE) paper, Max Welling. I believe GVAEs are an interesting intersection of graph neural networks (GNNs) and VAEs, both of which we explored in class, and that comparing this technique to spectral embedding is also relevant to our learning, because spectral decomposition has been discussed in class with respect to network scalability and RNN weights. My hypothesis is that a deep learning technique would be better suited to learning graph embeddings of connectomes because they are able to incorporate additional information about neurons (such as the neurotransmitters released at synapses between neurons) and are able to learn a nonlinear embedding space that more accurately represents the topological structure of that particular connectome, learning to weight the connections between some neurons above others.
+This brings us to another limitation of our study, which is that we did not explore **adding features to neurons** in our connectome with the VGAE algorithm. Past work on GAEs has shown that adding features leads to better model results and makes the model better able to capture relevant structures in the data. We did not feel that this would be a fair comparison with Winding et al., because spectral embedding methods are not able to include the additional node features that one would get for free when mapping the connectome, but we believe that including these features in the GAE structure would lead to an even better representation of the underlying dataset. Examples of these "free" features that would help us predict functions of neurons include 1) the hemisphere the cell belongs to (e.g., not in fruit flies, but in humans language is largely associated with the left hemisphere), 2) the axon I/O ratio, and 3) the dendrite output-input ratio of a neuron.
-My proposed research questions that I'd like my project to address are:
+One final limitation is that our **model only trains on a single connectome**. This means that we aren't able to capture the variation of connectomes within a species. Maybe one day we will be able to scan the connectomes of people in the same way that we are able to sequence their genomes, but that day is likely still far away. We might be able to mitigate this by using the generative component of the VGAE to create brains that are physically feasible given the structure of a single connectome, but this would be hard to test. Since we are currently only looking at the connectome of a single species, we likely aren't capturing an embedding space that finds functionally similar neurons in different animals such as C. elegans, which we may be able to do in future work.
-- How do unsupervised deep learning approaches for clustering graph nodes based on structural similarity compare to more traditional machine learning approaches like spectral embedding?
-- How does the theory of Graph Variational Autoencoders combine what we learned about VAEs and graph neural networks? Since both VAE and VGAE have the same co-author, I assume the theory is similar.
-- Which methods are more efficient and would scale better to large datasets (e.g. the mouse connectome)?
-- How do connectome clusters learned by GVAE compare to the spectral clusters found in the paper?
+## Conclusion
-My project would make use of the fruit fly connectome adjacency matrix provided by Winding et al. as its primary dataset.
\ No newline at end of file
+In this work, we asked whether deep learning techniques like Variational Graph Autoencoders could learn something about the functions of cells in a connectome using only the graph structure. We found that VGAE did in fact capture relevant structures of the graph, even in the undirected case. It performed similarly to spectral embedding, even when embedding directly into a visualizable 2-d latent space. In the future, we may be able to learn about neurons that serve the same purpose across species, or learn about the underlying low level syntactic structures, like for-loops or data types, that our brain uses to encode consciousness, vision, and more.
\ No newline at end of file
diff --git a/assets/bibliography/2023-11-09-deep-connectome-clustering.bib b/assets/bibliography/2023-11-09-deep-connectome-clustering.bib
index 05378fac..1463cba0 100644
--- a/assets/bibliography/2023-11-09-deep-connectome-clustering.bib
+++ b/assets/bibliography/2023-11-09-deep-connectome-clustering.bib
@@ -62,4 +62,356 @@ @online{januszewski2023google
url = {https://blog.research.google/2023/09/google-research-embarks-on-effort-to.html},
year={2023},
publisher={Google Research}
-}
\ No newline at end of file
+}
+
+@article{gross_genealogy_2002,
+ title = {Genealogy of the "grandmother cell"},
+ volume = {8},
+ issn = {1073-8584},
+ doi = {10.1177/107385802237175},
+ abstract = {A "grandmother cell" is a hypothetical neuron that responds only to a highly complex, specific, and meaningful stimulus, such as the image of one's grandmother. The term originated in a parable Jerry Lettvin told in 1967. A similar concept had been systematically developed a few years earlier by Jerzy Konorski who called such cells "gnostic" units. This essay discusses the origin, influence, and current status of these terms and of the alternative view that complex stimuli are represented by the pattern of firing across ensembles of neurons.},
+ language = {eng},
+ number = {5},
+ journal = {The Neuroscientist: A Review Journal Bringing Neurobiology, Neurology and Psychiatry},
+ author = {Gross, Charles G.},
+ month = oct,
+ year = {2002},
+ pmid = {12374433},
+ keywords = {Animals, History, 20th Century, Humans, Models, Neurological, Neurons, Neurophysiology, Perception, Poland, Temporal Lobe},
+ pages = {512--518},
+}
+
+@incollection{ricci_hierarchical_2020,
+ address = {New York, NY},
+ title = {Hierarchical Models of the Visual System},
+ isbn = {978-1-4614-7320-6},
+ url = {https://doi.org/10.1007/978-1-4614-7320-6_345-2},
+ language = {en},
+ urldate = {2023-12-11},
+ booktitle = {Encyclopedia of {Computational} {Neuroscience}},
+ publisher = {Springer},
+ author = {Ricci, Matthew and Serre, Thomas},
+ editor = {Jaeger, Dieter and Jung, Ranu},
+ year = {2020},
+ doi = {10.1007/978-1-4614-7320-6_345-2},
+ pages = {1--14},
+}
+
+@article{xu_cortisol_2019,
+ title = {Cortisol Excess-Mediated Mitochondrial Damage Induced Hippocampal Neuronal Apoptosis in Mice Following Cold Exposure},
+ volume = {8},
+ issn = {2073-4409},
+ url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6627841/},
+ doi = {10.3390/cells8060612},
+ abstract = {Cold stress can induce neuronal apoptosis in the hippocampus, but the internal mechanism involving neuronal loss induced by cold stress is not clear. In vivo, male and female C57BL/6 mice were exposed to 4 °C, 3 h per day for 1 week. In vitro, HT22 cells were treated with different concentrations of cortisol (CORT) for 3 h. In vivo, CORT levels in the hippocampus were measured using ELISA, western blotting, and immunohistochemistry to assess the neuronal population and oxidation of the hippocampus. In vitro, western blotting, immunofluorescence, flow cytometry, transmission electron microscopy, and other methods were used to characterize the mechanism of mitochondrial damage induced by CORT. The phenomena of excessive CORT-mediated oxidation stress and neuronal apoptosis were shown in mouse hippocampus tissue following cold exposure, involving mitochondrial oxidative stress and endogenous apoptotic pathway activation. These processes were mediated by acetylation of lysine 9 of histone 3, resulting in upregulation involving Adenosine 5‘-monophosphate (AMP)-activated protein kinase (APMK) phosphorylation and translocation of Nrf2 to the nucleus. In addition, oxidation in male mice was more severe. These findings provide a new understanding of the underlying mechanisms of the cold stress response and explain the apoptosis process induced by CORT, which may influence the selection of animal models in future stress-related studies.},
+ number = {6},
+ urldate = {2023-12-11},
+ journal = {Cells},
+ author = {Xu, Bin and Lang, Li-min and Li, Shi-Ze and Guo, Jing-Ru and Wang, Jian-Fa and Wang, Di and Zhang, Li-Ping and Yang, Huan-Min and Lian, Shuai},
+ month = jun,
+ year = {2019},
+ pmid = {31216749},
+ pmcid = {PMC6627841},
+ pages = {612},
+ file = {PubMed Central Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\3HXJZQV3\\Xu et al. - 2019 - Cortisol Excess-Mediated Mitochondrial Damage Indu.pdf:application/pdf},
+}
+
+@article{coifman_geometric_2005,
+ title = {Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps},
+ volume = {102},
+ issn = {0027-8424},
+ shorttitle = {Geometric diffusions as a tool for harmonic analysis and structure definition of data},
+ url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1140422/},
+ doi = {10.1073/pnas.0500334102},
+ number = {21},
+ urldate = {2023-12-12},
+ journal = {Proceedings of the National Academy of Sciences of the United States of America},
+ author = {Coifman, R. R. and Lafon, S. and Lee, A. B. and Maggioni, M. and Nadler, B. and Warner, F. and Zucker, S. W.},
+ month = may,
+ year = {2005},
+ pmid = {15899970},
+ pmcid = {PMC1140422},
+ pages = {7426--7431},
+ file = {PubMed Central Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\FB7369L7\\Coifman et al. - 2005 - Geometric diffusions as a tool for harmonic analys.pdf:application/pdf},
+}
+
+@article{sporns_human_2005,
+ title = {The Human Connectome: A Structural Description of the Human Brain},
+ volume = {1},
+ issn = {1553-734X},
+ shorttitle = {The {Human} {Connectome}},
+ url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1239902/},
+ doi = {10.1371/journal.pcbi.0010042},
+ abstract = {The connection matrix of the human brain (the human “connectome”) represents an indispensable foundation for basic and applied neurobiological research. However, the network of anatomical connections linking the neuronal elements of the human brain is still largely unknown. While some databases or collations of large-scale anatomical connection patterns exist for other mammalian species, there is currently no connection matrix of the human brain, nor is there a coordinated research effort to collect, archive, and disseminate this important information. We propose a research strategy to achieve this goal, and discuss its potential impact.},
+ number = {4},
+ urldate = {2023-12-12},
+ journal = {PLoS Computational Biology},
+ author = {Sporns, Olaf and Tononi, Giulio and Kötter, Rolf},
+ month = sep,
+ year = {2005},
+ pmid = {16201007},
+ pmcid = {PMC1239902},
+ pages = {e42},
+ file = {PubMed Central Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\BYL3FNMR\\Sporns et al. - 2005 - The Human Connectome A Structural Description of .pdf:application/pdf},
+}
+
+@article{lai_identification_2000,
+ title = {Identification of Novel Human Genes Evolutionarily Conserved in Caenorhabditis elegans by Comparative Proteomics},
+ volume = {10},
+ issn = {1088-9051},
+ url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC310876/},
+ abstract = {Modern biomedical research greatly benefits from large-scale genome-sequencing projects ranging from studies of viruses, bacteria, and yeast to multicellular organisms, like Caenorhabditis elegans. Comparative genomic studies offer a vast array of prospects for identification and functional annotation of human ortholog genes. We presented a novel comparative proteomic approach for assembling human gene contigs and assisting gene discovery. The C. elegans proteome was used as an alignment template to assist in novel human gene identification from human EST nucleotide databases. Among the available 18,452 C. elegans protein sequences, our results indicate that at least 83\% (15,344 sequences) of C. elegans proteome has human homologous genes, with 7,954 records of C. elegans proteins matching known human gene transcripts. Only 11\% or less of C. elegans proteome contains nematode-specific genes. We found that the remaining 7,390 sequences might lead to discoveries of novel human genes, and over 150 putative full-length human gene transcripts were assembled upon further database analyses., [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AF132936–AF132973, AF151799–AF151909, and AF152097.]},
+ number = {5},
+ urldate = {2023-12-12},
+ journal = {Genome Research},
+ author = {Lai, Chun-Hung and Chou, Chang-Yuan and Ch'ang, Lan-Yang and Liu, Chung-Shyan and Lin, Wen-chang},
+ month = may,
+ year = {2000},
+ pmid = {10810093},
+ pmcid = {PMC310876},
+ pages = {703--713},
+ file = {PubMed Central Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\23AMD565\\Lai et al. - 2000 - Identification of Novel Human Genes Evolutionarily.pdf:application/pdf},
+}
+
+@misc{noauthor_facts_nodate,
+ title = {Facts},
+ url = {https://www.mpg.de/10973625/why-do-scientists-investigate-fruit-flies},
+ abstract = {The high genetic similarity with mammals and its high fidelity make Drosophila to a popular model organism for scientists},
+ language = {en},
+ urldate = {2023-12-12},
+ file = {Snapshot:C\:\\Users\\eliu\\Zotero\\storage\\CQ7HSWWH\\why-do-scientists-investigate-fruit-flies.html:text/html},
+}
+
+@article{kanwisher_fusiform_2006,
+ title = {The fusiform face area: a cortical region specialized for the perception of faces},
+ volume = {361},
+ issn = {0962-8436},
+ shorttitle = {The fusiform face area},
+ url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1857737/},
+ doi = {10.1098/rstb.2006.1934},
+ abstract = {Faces are among the most important visual stimuli we perceive, informing us not only about a person's identity, but also about their mood, sex, age and direction of gaze. The ability to extract this information within a fraction of a second of viewing a face is important for normal social interactions and has probably played a critical role in the survival of our primate ancestors. Considerable evidence from behavioural, neuropsychological and neurophysiological investigations supports the hypothesis that humans have specialized cognitive and neural mechanisms dedicated to the perception of faces (the face-specificity hypothesis). Here, we review the literature on a region of the human brain that appears to play a key role in face perception, known as the fusiform face area (FFA)., outlines the theoretical background for much of this work. The face-specificity hypothesis falls squarely on one side of a longstanding debate in the fields of cognitive science and cognitive neuroscience concerning the extent to which the mind/brain is composed of: (i) special-purpose (‘domain-specific’) mechanisms, each dedicated to processing a specific kind of information (e.g. faces, according to the face-specificity hypothesis), versus (ii) general-purpose (‘domain-general’) mechanisms, each capable of operating on any kind of information. Face perception has long served both as one of the prime candidates of a domain-specific process and as a key target for attack by proponents of domain-general theories of brain and mind. briefly reviews the prior literature on face perception from behaviour and neurophysiology. This work supports the face-specificity hypothesis and argues against its domain-general alternatives (the individuation hypothesis, the expertise hypothesis and others)., outlines the more recent evidence on this debate from brain imaging, focusing particularly on the FFA. 
+We review the evidence that the FFA is selectively engaged in face perception, by addressing (and rebutting) five of the most widely discussed alternatives to this hypothesis. In , we consider recent findings that are beginning to provide clues into the computations conducted in the FFA and the nature of the representations the FFA extracts from faces. We argue that the FFA is engaged both in detecting faces and in extracting the necessary perceptual information to recognize them, and that the properties of the FFA mirror previously identified behavioural signatures of face-specific processing (e.g. the face-inversion effect)., asks how the computations and representations in the FFA differ from those occurring in other nearby regions of cortex that respond strongly to faces and objects. The evidence indicates clear functional dissociations between these regions, demonstrating that the FFA shows not only functional specificity but also area specificity. We end by speculating in on some of the broader questions raised by current research on the FFA, including the developmental origins of this region and the question of whether faces are unique versus whether similarly specialized mechanisms also exist for other domains of high-level perception and cognition.},
+ number = {1476},
+ urldate = {2023-12-12},
+ journal = {Philosophical Transactions of the Royal Society B: Biological Sciences},
+ author = {Kanwisher, Nancy and Yovel, Galit},
+ month = dec,
+ year = {2006},
+ pmid = {17118927},
+ pmcid = {PMC1857737},
+ pages = {2109--2128},
+ file = {PubMed Central Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\QEZ2AHFH\\Kanwisher and Yovel - 2006 - The fusiform face area a cortical region speciali.pdf:application/pdf},
+}
+
+@article{spelke_core_2007,
+ title = {Core knowledge},
+ volume = {10},
+ issn = {1363-755X, 1467-7687},
+ url = {https://onlinelibrary.wiley.com/doi/10.1111/j.1467-7687.2007.00569.x},
+ doi = {10.1111/j.1467-7687.2007.00569.x},
+ abstract = {Human cognition is founded, in part, on four systems for representing objects, actions, number, and space. It may be based, as well, on a fifth system for representing social partners. Each system has deep roots in human phylogeny and ontogeny, and it guides and shapes the mental lives of adults. Converging research on human infants, non-human primates, children and adults in diverse cultures can aid both understanding of these systems and attempts to overcome their limits.},
+ language = {en},
+ number = {1},
+ urldate = {2023-12-12},
+ journal = {Developmental Science},
+ author = {Spelke, Elizabeth S. and Kinzler, Katherine D.},
+ month = jan,
+ year = {2007},
+ pages = {89--96},
+ file = {Spelke and Kinzler - 2007 - Core knowledge.pdf:C\:\\Users\\eliu\\Zotero\\storage\\H2TPHI7R\\Spelke and Kinzler - 2007 - Core knowledge.pdf:application/pdf},
+}
+
+@article{otsuka_face_2014,
+ title = {Face recognition in infants: A review of behavioral and near-infrared spectroscopic studies},
+ volume = {56},
+ copyright = {© 2013 Japanese Psychological Association},
+ issn = {1468-5884},
+ shorttitle = {Face recognition in infants},
+ url = {https://onlinelibrary.wiley.com/doi/abs/10.1111/jpr.12024},
+ doi = {10.1111/jpr.12024},
+ abstract = {Recent developmental studies investigating face recognition ability in infants’ have provided evidence not only that infants show selective attention to faces, but also that they can discriminate between faces from birth, and that biases in face processing such as the face inversion and other race effects exist even in infancy. Studies measuring the hemodynamic responses to facial images in the infants’ brain using near-infrared spectroscopy (NIRS) have also reported differential cortical activity in response to face and nonface images in infants. This paper will review recent findings on infants face recognition provided by both behavioral studies and neuroimaging studies using NIRS. These converging lines of evidence point to the early onset of face recognition ability in infancy.},
+ language = {en},
+ number = {1},
+ urldate = {2023-12-12},
+ journal = {Japanese Psychological Research},
+ author = {Otsuka, Yumiko},
+ year = {2014},
+ note = {\_eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1111/jpr.12024},
+ keywords = {face recognition, fNIRS, infants, preference},
+ pages = {76--90},
+ file = {Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\ZXLV3D2S\\Otsuka - 2014 - Face recognition in infants A review of behaviora.pdf:application/pdf;Snapshot:C\:\\Users\\eliu\\Zotero\\storage\\9S9ZPW57\\jpr.html:text/html},
+}
+
+@article{gochin_neural_1994,
+ title = {Neural ensemble coding in inferior temporal cortex},
+ volume = {71},
+ issn = {0022-3077},
+ url = {https://journals.physiology.org/doi/abs/10.1152/jn.1994.71.6.2325},
+ doi = {10.1152/jn.1994.71.6.2325},
+ abstract = {1. Isolated, single-neuron extracellular potentials were recorded sequentially in area TE of the inferior temporal cortex (IT) of two macaque monkeys (n = 58 and n = 41 neurons). Data were obtained while the animals were performing a paired-associate task. The task utilized five stimuli and eight stimulus pairings (4 correct and 4 incorrect). Data were evaluated as average spike rate during experimental epochs of 100 or 400 ms. Single-unit and population characteristics were measured using a form of linear discriminant analysis and information theoretic measures. To evaluate the significance of covariance on population code measures, additional data consisting of simultaneous recordings from {\textless} or = 8 isolated neurons (n = 37) were obtained from a third macaque monkey that was passively viewing visual stimuli. 2. On average, 43\% of IT neurons were activated by any of the stimuli used (60\% if those inhibited also are included). Yet the neurons were rather unique in the relative magnitude of their responses to each stimulus in the test set. These results suggest that information may be represented in IT by the pattern of activity across neurons and that the representation is not sparsely coded. It is further suggested that the representation scheme may have similarities to DNA or computer codes wherein a coding element is not a local parametric descriptor. This is a departure from the V1 representation, which appears to be both local and parametric. It is also different from theories of IT representation that suggest a constructive basis set or “alphabet”. From this view, determination of stimulus discrimination capacity in IT should be evaluated by measures of population activity patterns. 3. Evaluation of small groups of simultaneously recorded neurons obtained during a fixation task suggests that little information about visual stimuli is conveyed by covariance of activity in IT when a 100-ms time scale is used as in this study. 
+This finding is consistent with a prior report, by Gochin et al., which used a 1-ms time scale and failed to find neural activity coherence or oscillations dependent on stimuli. 4. Population-stimulus-discrimination capacity measures were influenced by the number of neurons and to some extent the number and type of stimuli. 5. Information conveyed by individual neurons (mutual information) averaged 0.26 bits. The distribution of information values was unimodal and is therefore more consistent with a distributed than a local coding scheme.(ABSTRACT TRUNCATED AT 400 WORDS)},
+ number = {6},
+ urldate = {2023-12-12},
+ journal = {Journal of Neurophysiology},
+ author = {Gochin, P. M. and Colombo, M. and Dorfman, G. A. and Gerstein, G. L. and Gross, C. G.},
+ month = jun,
+ year = {1994},
+ note = {Publisher: American Physiological Society},
+ pages = {2325--2337},
+ file = {Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\FG3IIB64\\Gochin et al. - 1994 - Neural ensemble coding in inferior temporal cortex.pdf:application/pdf},
+}
+
+@article{mcclamrock_marrs_1991,
+ title = {Marr's three levels: A re-evaluation},
+ volume = {1},
+ issn = {1572-8641},
+ shorttitle = {Marr's three levels},
+ url = {https://doi.org/10.1007/BF00361036},
+ doi = {10.1007/BF00361036},
+ abstract = {Marr's account of the analysis of complex information-processing tasks as having three levels — the levels of computational theory, representation and algorithm, and hardware implementation — is reconsidered. I argue that the notion of “level” here runs together two distinctive sort of explanatory shifts — that of grain and that of contextual function. I then offer a revision of the account which avoids this problem, and suggest how this might play a role in the practice of theory evaluation.},
+ language = {en},
+ number = {2},
+ urldate = {2023-12-12},
+ journal = {Minds and Machines},
+ author = {McClamrock, Ron},
+ month = may,
+ year = {1991},
+ keywords = {decomposition, explanation, function, Levels},
+ pages = {185--196},
+ file = {Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\7GRLEEM8\\McClamrock - 1991 - Marr's three levels A re-evaluation.pdf:application/pdf},
+}
+
+@article{winding_connectome_2023,
+ title = {The connectome of an insect brain},
+ volume = {379},
+ url = {https://www.science.org/doi/10.1126/science.add9330},
+ doi = {10.1126/science.add9330},
+ abstract = {Brains contain networks of interconnected neurons and so knowing the network architecture is essential for understanding brain function. We therefore mapped the synaptic-resolution connectome of an entire insect brain (Drosophila larva) with rich behavior, including learning, value computation, and action selection, comprising 3016 neurons and 548,000 synapses. We characterized neuron types, hubs, feedforward and feedback pathways, as well as cross-hemisphere and brain-nerve cord interactions. We found pervasive multisensory and interhemispheric integration, highly recurrent architecture, abundant feedback from descending neurons, and multiple novel circuit motifs. The brain’s most recurrent circuits comprised the input and output neurons of the learning center. Some structural features, including multilayer shortcuts and nested recurrent loops, resembled state-of-the-art deep learning architectures. The identified brain architecture provides a basis for future experimental and theoretical studies of neural circuits.},
+ number = {6636},
+ urldate = {2023-12-12},
+ journal = {Science},
+ author = {Winding, Michael and Pedigo, Benjamin D. and Barnes, Christopher L. and Patsolic, Heather G. and Park, Youngser and Kazimiers, Tom and Fushiki, Akira and Andrade, Ingrid V. and Khandelwal, Avinash and Valdes-Aleman, Javier and Li, Feng and Randel, Nadine and Barsotti, Elizabeth and Correia, Ana and Fetter, Richard D. and Hartenstein, Volker and Priebe, Carey E. and Vogelstein, Joshua T. and Cardona, Albert and Zlatic, Marta},
+ month = mar,
+ year = {2023},
+ note = {Publisher: American Association for the Advancement of Science},
+ pages = {eadd9330},
+ file = {Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\ZI3WMQVN\\Winding et al. - 2023 - The connectome of an insect brain.pdf:application/pdf},
+}
+
+@article{shakeshaft_genetic_2015,
+ title = {Genetic specificity of face recognition},
+ volume = {112},
+ issn = {0027-8424},
+ url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4611634/},
+ doi = {10.1073/pnas.1421881112},
+ abstract = {Diverse cognitive abilities have typically been found to intercorrelate highly and to be strongly influenced by genetics. Recent twin studies have suggested that the ability to recognize human faces is an exception: it is similarly highly heritable, but largely uncorrelated with other abilities. However, assessing genetic relationships—the degree to which traits are influenced by the same genes—requires very large samples, which have not previously been available. This study, using data from more than 2,000 twins, shows for the first time, to our knowledge, that the genetic influences on face recognition are almost entirely unique. This finding provides strong support for the view that face recognition is “special” and may ultimately illuminate the nature of cognitive abilities in general., Specific cognitive abilities in diverse domains are typically found to be highly heritable and substantially correlated with general cognitive ability (g), both phenotypically and genetically. Recent twin studies have found the ability to memorize and recognize faces to be an exception, being similarly heritable but phenotypically substantially uncorrelated both with g and with general object recognition. However, the genetic relationships between face recognition and other abilities (the extent to which they share a common genetic etiology) cannot be determined from phenotypic associations. In this, to our knowledge, first study of the genetic associations between face recognition and other domains, 2,000 18- and 19-year-old United Kingdom twins completed tests assessing their face recognition, object recognition, and general cognitive abilities. Results confirmed the substantial heritability of face recognition (61\%), and multivariate genetic analyses found that most of this genetic influence is unique and not shared with other cognitive abilities.},
+ number = {41},
+ urldate = {2023-12-12},
+ journal = {Proceedings of the National Academy of Sciences of the United States of America},
+ author = {Shakeshaft, Nicholas G. and Plomin, Robert},
+ month = oct,
+ year = {2015},
+ pmid = {26417086},
+ pmcid = {PMC4611634},
+ pages = {12887--12892},
+ file = {PubMed Central Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\3BWFRQWD\\Shakeshaft and Plomin - 2015 - Genetic specificity of face recognition.pdf:application/pdf},
+}
+
+@article{leeds_comparing_2013,
+ title = {Comparing visual representations across human fMRI and computational vision},
+ volume = {13},
+ issn = {1534-7362},
+ url = {https://doi.org/10.1167/13.13.25},
+ doi = {10.1167/13.13.25},
+ abstract = {Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation.},
+ number = {13},
+ urldate = {2023-12-12},
+ journal = {Journal of Vision},
+ author = {Leeds, Daniel D. and Seibert, Darren A. and Pyles, John A. and Tarr, Michael J.},
+ month = nov,
+ year = {2013},
+ pages = {25},
+ file = {Full Text:C\:\\Users\\eliu\\Zotero\\storage\\PEEDSIUP\\Leeds et al. - 2013 - Comparing visual representations across human fMRI.pdf:application/pdf;Snapshot:C\:\\Users\\eliu\\Zotero\\storage\\8H96G4NZ\\article.html:text/html},
+}
+
+@misc{von_luxburg_tutorial_2007,
+ title = {A Tutorial on Spectral Clustering},
+ url = {http://arxiv.org/abs/0711.0189},
+ abstract = {In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. On the first glance spectral clustering appears slightly mysterious, and it is not obvious to see why it works at all and what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.},
+ urldate = {2023-12-12},
+ publisher = {arXiv},
+ author = {von Luxburg, Ulrike},
+ month = nov,
+ year = {2007},
+ note = {arXiv:0711.0189 [cs]},
+ keywords = {Computer Science - Data Structures and Algorithms, Computer Science - Machine Learning},
+ file = {arXiv.org Snapshot:C\:\\Users\\eliu\\Zotero\\storage\\UW9MRBMW\\0711.html:text/html;Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\QD2NUMFN\\von Luxburg - 2007 - A Tutorial on Spectral Clustering.pdf:application/pdf},
+}
+
+@article{park_spectral_2014,
+ title = {Spectral clustering with physical intuition on spring–mass dynamics},
+ volume = {351},
+ issn = {0016-0032},
+ url = {https://www.sciencedirect.com/science/article/pii/S0016003214000532},
+ doi = {10.1016/j.jfranklin.2014.02.017},
+ abstract = {In this paper, we provide a new insight into clustering with a spring–mass dynamics, and propose a resulting hierarchical clustering algorithm. To realize the spectral graph partitioning as clustering, we model a weighted graph of a data set as a mass–spring dynamical system, where we regard a cluster as an oscillating single entity of a data set with similar properties. And then, we describe how oscillation modes are related with eigenvectors of a graph Laplacian matrix of the data set. In each step of the clustering, we select a group of clusters, which has the biggest number of constituent clusters. This group is divided into sub-clusters by examining an eigenvector minimizing a cost function, which is formed in such a way that subdivided clusters will be balanced with large size. To find k clusters out of non-spherical or complex data, we first transform the data into spherical clusters located on the unit sphere positioned in the (k−1)-dimensional space. In the sequel, we use the previous procedure to these transformed data. The computational experiments demonstrate that the proposed method works quite well on a variety of data sets, although its performance degrades with the degree of overlapping of data sets.},
+ number = {6},
+ urldate = {2023-12-12},
+ journal = {Journal of the Franklin Institute},
+ author = {Park, Jinho and Jeon, Moongu and Pedrycz, Witold},
+ month = jun,
+ year = {2014},
+ pages = {3245--3268},
+ file = {ScienceDirect Snapshot:C\:\\Users\\eliu\\Zotero\\storage\\Q8P772RM\\S0016003214000532.html:text/html},
+}
+
+@article{palamuttam_evaluating_nodate,
+ title = {Evaluating Network Embeddings: Node2Vec vs Spectral Clustering vs GCN},
+ abstract = {Node classification on popular social network datasets in the graph setting arises in various real world networks. Being able to label a particular entity in a graph based on its neighboring graph structure and predicting relationships between entities plays an important role in analyzing social networks and content on the web. With the resurgence of machine learning, supervised learning problems have been shown to accomplish a number of different tasks from classifying animals to determining the model of cars [7]. These methods have been further extended by the resurgence of Deep Learning to accomplish the same tasks such as classifying images with higher accuracy. However, it is noted by Leskovec et al. [2] that prediction tasks on graphs require careful feature engineering. In this paper we adapt work by Kipf et al [4][5] in order to leverage Graph Convolutional Neural Networks as a means of evaluating spectral embeddings in comparison to embeddings generated by the Node2Vec algorithm.},
+ language = {en},
+ author = {Palamuttam, Rahul and Mall, Serra},
+ file = {Palamuttam and Mall - Evaluating Network Embeddings Node2Vec vs Spectra.pdf:C\:\\Users\\eliu\\Zotero\\storage\\MYU4TY47\\Palamuttam and Mall - Evaluating Network Embeddings Node2Vec vs Spectra.pdf:application/pdf},
+}
+
+@misc{grover_node2vec_2016,
+ title = {node2vec: Scalable Feature Learning for Networks},
+ shorttitle = {node2vec},
+ url = {http://arxiv.org/abs/1607.00653},
+ abstract = {Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.},
+ urldate = {2023-12-12},
+ publisher = {arXiv},
+ author = {Grover, Aditya and Leskovec, Jure},
+ month = jul,
+ year = {2016},
+ note = {arXiv:1607.00653 [cs, stat]},
+ keywords = {Computer Science - Machine Learning, Computer Science - Social and Information Networks, Statistics - Machine Learning},
+ file = {arXiv.org Snapshot:C\:\\Users\\eliu\\Zotero\\storage\\JTV22CVH\\1607.html:text/html;Full Text PDF:C\:\\Users\\eliu\\Zotero\\storage\\9D7IEEW7\\Grover and Leskovec - 2016 - node2vec Scalable Feature Learning for Networks.pdf:application/pdf},
+}
+
+@misc{huang_combining_2020,
+ title = {Combining Label Propagation and Simple Models Out-performs Graph Neural Networks},
+ url = {http://arxiv.org/abs/2010.13993},
+ doi = {10.48550/arXiv.2010.13993},
+ abstract = {Graph Neural Networks (GNNs) are the predominant technique for learning over graphs. However, there is relatively little understanding of why GNNs are successful in practice and whether they are necessary for good performance. Here, we show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs by combining shallow models that ignore the graph structure with two simple post-processing steps that exploit correlation in the label structure: (i) an "error correlation" that spreads residual errors in training data to correct errors in test data and (ii) a "prediction correlation" that smooths the predictions on the test data. We call this overall procedure Correct and Smooth (C\&S), and the post-processing steps are implemented via simple modifications to standard label propagation techniques from early graph-based semi-supervised learning methods. Our approach exceeds or nearly matches the performance of state-of-the-art GNNs on a wide variety of benchmarks, with just a small fraction of the parameters and orders of magnitude faster runtime. For instance, we exceed the best known GNN performance on the OGB-Products dataset with 137 times fewer parameters and greater than 100 times less training time. The performance of our methods highlights how directly incorporating label information into the learning algorithm (as was done in traditional techniques) yields easy and substantial performance gains. We can also incorporate our techniques into big GNN models, providing modest gains. Our code for the OGB results is at https://github.com/Chillee/CorrectAndSmooth.},
+ urldate = {2023-12-12},
+ publisher = {arXiv},
+ author = {Huang, Qian and He, Horace and Singh, Abhay and Lim, Ser-Nam and Benson, Austin R.},
+ month = nov,
+ year = {2020},
+ note = {arXiv:2010.13993 [cs]},
+ keywords = {Computer Science - Machine Learning, Computer Science - Social and Information Networks},
+ file = {arXiv Fulltext PDF:C\:\\Users\\eliu\\Zotero\\storage\\KAJCKCBD\\Huang et al. - 2020 - Combining Label Propagation and Simple Models Out-.pdf:application/pdf;arXiv.org Snapshot:C\:\\Users\\eliu\\Zotero\\storage\\MP6Z4TNI\\2010.html:text/html},
+}
+
+@misc{lin_use_2021,
+ title = {On the Use of Unrealistic Predictions in Hundreds of Papers Evaluating Graph Representations},
+ url = {http://arxiv.org/abs/2112.04274},
+ doi = {10.48550/arXiv.2112.04274},
+ abstract = {Prediction using the ground truth sounds like an oxymoron in machine learning. However, such an unrealistic setting was used in hundreds, if not thousands of papers in the area of finding graph representations. To evaluate the multi-label problem of node classification by using the obtained representations, many works assume in the prediction stage that the number of labels of each test instance is known. In practice such ground truth information is rarely available, but we point out that such an inappropriate setting is now ubiquitous in this research area. We detailedly investigate why the situation occurs. Our analysis indicates that with unrealistic information, the performance is likely over-estimated. To see why suitable predictions were not used, we identify difficulties in applying some multi-label techniques. For the use in future studies, we propose simple and effective settings without using practically unknown information. Finally, we take this chance to conduct a fair and serious comparison of major graph-representation learning methods on multi-label node classification.},
+ urldate = {2023-12-12},
+ publisher = {arXiv},
+ author = {Lin, Li-Chung and Liu, Cheng-Hung and Chen, Chih-Ming and Hsu, Kai-Chin and Wu, I.-Feng and Tsai, Ming-Feng and Lin, Chih-Jen},
+ month = dec,
+ year = {2021},
+ note = {arXiv:2112.04274 [cs]},
+ keywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning},
+ file = {arXiv Fulltext PDF:C\:\\Users\\eliu\\Zotero\\storage\\6AMYQ534\\Lin et al. - 2021 - On the Use of Unrealistic Predictions in Hundreds .pdf:application/pdf;arXiv.org Snapshot:C\:\\Users\\eliu\\Zotero\\storage\\YXR35ABC\\2112.html:text/html},
+}
diff --git a/assets/img/2023-11-09-deep-connectome-clustering/background_visual.jpg b/assets/img/2023-11-09-deep-connectome-clustering/background_visual.jpg
new file mode 100644
index 00000000..a51d9a36
Binary files /dev/null and b/assets/img/2023-11-09-deep-connectome-clustering/background_visual.jpg differ
diff --git a/assets/img/2023-11-09-deep-connectome-clustering/clustering-cell-type.png b/assets/img/2023-11-09-deep-connectome-clustering/clustering-cell-type.png
new file mode 100644
index 00000000..401376f7
Binary files /dev/null and b/assets/img/2023-11-09-deep-connectome-clustering/clustering-cell-type.png differ
diff --git a/assets/img/2023-11-09-deep-connectome-clustering/explore_cluster.png b/assets/img/2023-11-09-deep-connectome-clustering/explore_cluster.png
new file mode 100644
index 00000000..b95c844f
Binary files /dev/null and b/assets/img/2023-11-09-deep-connectome-clustering/explore_cluster.png differ
diff --git a/assets/img/2023-11-09-deep-connectome-clustering/link-prediction-auc-ap.png b/assets/img/2023-11-09-deep-connectome-clustering/link-prediction-auc-ap.png
new file mode 100644
index 00000000..69a38d78
Binary files /dev/null and b/assets/img/2023-11-09-deep-connectome-clustering/link-prediction-auc-ap.png differ
diff --git a/assets/img/2023-11-09-deep-connectome-clustering/link-prediction-task.png b/assets/img/2023-11-09-deep-connectome-clustering/link-prediction-task.png
new file mode 100644
index 00000000..ab9bf798
Binary files /dev/null and b/assets/img/2023-11-09-deep-connectome-clustering/link-prediction-task.png differ