generated from jtr13/cctemplate
-
Notifications
You must be signed in to change notification settings - Fork 78
/
igraph-in-r.Rmd
283 lines (211 loc) · 11.2 KB
/
igraph-in-r.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
# igraph in r
Mouwei Lin (lm3756) and Linhao Yu (ly2590)
This will be a very brief introduction about a R package called igraph, which is a complex network analysis tool in R. It will contain commonly used and important functions. We believe if you look at this cheat sheet, you can handle most of the network visualization problems.
```{r, include=FALSE}
# this prevents package loading message from appearing in the rendered version of your problem set
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
```
## 1. igraph Basics
First we download and install the package.
```{r}
#install.packages('igraph')
library(igraph) #this is a must install package
#install.packages('devtools')
#library(devtools) #this is a must install package
#install_github('kassambara/navdata')
library(navdata) #this is a must install package
#install.packages('tidyverse')
library(tidyverse) #this is a must install package
#install.packages('igraphdata')
library(igraphdata) #this is a must install package
```
We will use an open source data set named **phone.call** from **navdata**.
```{r}
data("phone.call")
head(phone.call)
```
### 1.1 Create igraph Network Object
igraph has its specific network (graph) object called 'igraph'. The simplest way to create an 'igraph' object is to use **graph.formula()** and specify every node and edge individually.
```{r}
graph.formula(A-B-C-D,A-E-F,Z-D-X)
```
But in real world, we usually have a lot of data. Therefore, it is not a good idea to define every node and edge manually. More generally, we will use these two methods:
```{r}
# prepare the data
name<-data.frame(c(phone.call$source,phone.call$destination))
nodes<-name%>%
distinct()%>%
mutate(location=c("western","western","central","nordic","southeastern",
"southeastern","southeastern","southern","sourthern",
"western","western","central","central","central","central","central"))
colnames(nodes)<-c("label","location")
edges<-phone.call%>%
rename(from=source,to=destination,weight=n.call)
```
#### (1) build igraph object from dataframe:
**graph_from_data_frame()**
To use this method, we need two dataframes, one is edge frame, the other is vertices frame.
```{r}
net_pc<-graph_from_data_frame(
d=edges,vertices=nodes,
directed=TRUE)
```
we can see that the graph is created. We can use **V()** or **E()** to visit vertices or edges, and **as_edgelist(net, names=T)**, **as_adjacency_matrix(net, attr="weight")** to catch edges list and adjacent matrix:
```{r}
V(net_pc)
V(net_pc)$location
E(net_pc)
as_edgelist(net_pc, names=T)
as_adjacency_matrix(net_pc, attr="weight")
```
#### (2) build igraph object from adjacent matrix:
**graph_from_adjacency_matrix()**
If the graph is using
```{r}
adjacent_matrix<-as_adjacency_matrix(net_pc, attr="weight")
net_am<-graph_from_adjacency_matrix(adjacent_matrix)
```
now we simply plot it to take a look
```{r}
plot(net_pc)
plot(net_am)
```
### 1.2 Basic igraph Visualization Instructions
The plot function in igraph is very strong and it has a lot of parameters to make the network beautiful and clear. In this section we will only give a general introduction to important visualization methods, we will have a more detailed introduction in the next section.
A large number of parameters are used to display various properties of nodes, edges and graphs. The parameters related to nodes start with **vertex.XXX**, and the parameters related to edges start with **edge.XXX**
In addition to specifying the parameters of nodes and edges in **plot()**, you can also use the previously mentioned **V()** and **E()** to add the corresponding properties directly into the igraph object. The difference between the two methods is that the parameters specified in **plot()** do not change the properties of the plot. For example, we first specify the color of the node according to the position, and the width of the edge according to the weight (these two attributes will be saved in the net_pc object), and then specify the size of the node in the parameters of **plot()** (proportional to the degree of the node, The node degree is the number of edges connected to this node), the size and position of the node marker, the color of the edge, the size of the arrow, and the degree of curvature of the edge.
```{r}
# Calculate node's degree
deg<-degree(net_pc,mode="all")
# Set up the color
vcolor<-c("orange","red","lightblue","tomato","yellow")
# Set specific node's Color
V(net_pc)$color<-vcolor[factor(V(net_pc)$location)]
# Set specific edge's weight
E(net_pc)$width<-E(net_pc)$weight/2
# Set up vertex.size, vertex.label.cex & dist, edge color & arrow size & curve in graph
plot(net_pc,vertex.size=3*deg,
vertex.label.cex=.7,vertex.label.dist=1,
edge.color="gray50",edge.arrow.size=.4, edge.curved=.1)
# Add legend
legend(x=-1.5,y=1.5,levels(factor(V(net_pc)$location)),pch=21,col="#777777",pt.bg=vcolor)
```
### 1.3 Network Layout
Network layout refers to the method of determining the coordinates of each node in the network.
A variety of layout algorithms are provided in igraph. Among them, Force-directed layout algorithms are the most useful. Force-directed layouts try to get an aesthetically pleasing graph where the edges are similar in length and cross as little as possible. They model graphics as a physical system. Nodes are “charged particles” that repel each other when they get too close. These edges act as springs, attracting connected nodes together. As a result, nodes are evenly distributed in the illustrated area, and the layout is intuitive as nodes that share more connections are closer to each other. The disadvantage of these algorithms is that they are slow and therefore less frequently used in graphs larger than **1000** vertices.
When using force-directed layout, you can use the **niter** parameter to control the number of iterations to perform. The default setting is 500 iterations. For large graphs, you can lower this number to get results faster and check that they are reasonable.
**Fruchterman-Reingold** is the most widely used Force-directed layout method:
```{r}
# Fruchterman-Reingold layout method
l <- layout_with_fr(net_pc)
plot(net_pc, layout=l)
```
The Fruchterman Reingold layout is random and different every run will result in slightly different layout configurations. Saving the layout in the object l allows us to obtain the exact same result multiple times (it is also possible to specify a random state by setting seed **seed()**)
```{r}
# All the layout methods in igraph
layouts <- grep("^layout_", ls("package:igraph"), value=TRUE)[-1]
layouts
```
There are 18 methods of layout in igraph, we won't go into detail for each layout method because most of them are not widely used except Fruchterman Reingold layout. However, we will give an example to show how those layouts look like:
```{r}
layouts <- grep("^layout_", ls("package:igraph"), value=TRUE)[-1]
# Remove layouts that do not apply to our graph.
layouts <- layouts[!grepl("bipartite|merge|norm|sugiyama|tree", layouts)]
par(mfrow=c(5,3), mar=c(1,1,1,1))
for (layout in layouts) {
print(layout)
l <- do.call(layout, list(net_pc))
plot(net_pc, vertex.label="",edge.arrow.mode=0,
layout=l,main=layout) }
```
### 2. Decorating igraph visualizations
### 2.1 Details for decorating igraph visualization
In this section, we still use the example above, but we customize several parameters to see their influence for visualization. \
First, we customize the nodes parameters.
```{r}
set.seed(111)
l <- layout_with_fr(net_pc)
plot(net_pc,
vertex.color = 'pink', # The color of nodes
vertex.frame.color = 'lightblue', # The color of node frames
vertex.shape = c('circle','rectangle'), # The color of node shapes
vertex.size = 25, # The size of a node
vertex.size2 = 15, # For rectangle, we need two parameters to specify its shape
layout = l)
```
Next, we customize the node labels.
```{r}
plot(net_pc,
vertex.color = 'pink',
vertex.frame.color = 'lightblue',
vertex.shape = c('circle','rectangle'),
vertex.size = 25,
vertex.size2 = 15,
vertex.label = 1:length(V(net_pc)), # Change labels to numbers
vertex.label.family = 'Helvetica', # Change the font family of labels
vertex.label.font = 3, # Change the font to italic
vertex.label.cex = 0.8, # Change the size of labels
vertex.label.dist = 0.5, # Change the distance between labels and node frames
layout = l)
```
Apart from nodes, we can also modify edge features while plotting.
```{r}
plot(net_pc,
vertex.color = 'pink',
vertex.frame.color = 'lightblue',
vertex.shape = c('circle','rectangle'),
vertex.size = 25,
vertex.size2 = 15,
vertex.label = 1:length(V(net_pc)),
vertex.label.family = 'Helvetica',
vertex.label.font = 3,
vertex.label.cex = 0.8,
vertex.label.dist = 0.5,
edge.color = 'lightblue', # Color of edges
edge.width = 3, # Width of edges
edge.arrow.size = 0.8, # Size of arrows
edge.arrow.width = 0.8, # Width of arrows
edge.lty = 4, # Line types of edges (4: dot dash)
edge.curved = 0.5, # Curvature of edges
layout = l)
```
### 2.2 Example for advanced igraph visualization
In this section, we introduce an advanced network visualization using Zachary's karate club network dataset. This is a widely-used dataset for network analysis. In this dataset, every node represents a member of the karate club and edges represent members' social connection. During Zachary's study, the administrator "John. A." and the coach "Mr. Hi" had a conflict which led to a split of the club. Now, we want to use igraph to visualize the social network of this karate club. \
First, we plot the network without further decoration.
```{r}
set.seed(111)
data(karate)
l <- layout_with_fr(karate)
igraph.options(vertex.size = 10)
par(mfrow = c(1,1))
plot(karate,
layout = l)
```
Next, we highlight the two leaders ("John. A." and "Mr. Hi") in the network by using rectangles.
```{r}
# Decoration
V(karate)$label <- sub("Actor ","", V(karate)$name)
# Two leaders get shapes different from club members
V(karate)$shape <- "circle"
V(karate)[c("Mr Hi", "John A")]$shape <- "rectangle"
V(karate)$size <- 20
V(karate)$size2 <- 15
plot(karate,
vertex.label = V(karate)$label,
layout = l)
```
Finally, we divide the edges into three categories and use different colors for them. Remember the karate club is split into two factions, so the edges can be divided into: edges inside faction 1, edges inside faction 2, and edges connecting both factions. \
Besides, we also set up the edge width according to its weight.
```{r}
# Define factions
F1<-V(karate)[Faction==1]
F2<-V(karate)[Faction==2]
# Set up edge colors according to factions
E(karate)[F1 %--% F1]$color<-"darkgoldenrod2"
E(karate)[F2 %--% F2]$color<-"lightblue"
E(karate)[F1 %--% F2]$color<-"brown"
# Set up edge width according to weights
E(karate)$width=E(karate)$weight
# Plot the decorated graph, using same layout.
plot(karate,layout=l)
```