-
Notifications
You must be signed in to change notification settings - Fork 4
/
08-spatial-interaction.Rmd
799 lines (611 loc) · 40.8 KB
/
08-spatial-interaction.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
# Spatial Interaction Models{#SpatialInteraction}
![](images/image_break.png){width=100%}
In the [Spatial Networks](#SpatialNetworks) section of this document we cover most of the simple network models for generating spatial networks based on absolute distance, configurations of locations, and territories. There is one general class of spatial network model that we described briefly in the Brughmans and Peeples (2023) book but did not cover in detail as the specifics require considerably more discussion. This includes a wide variety of spatial interaction models such as gravity models, truncated power functions, radiation models, and similar custom derivations of these approaches. In general, a spatial interaction model is a formal mathematical model that is used to predict the movement of people (or other sorts of entities) between origins and destinations. These models typically use information on the relative sizes or "attractiveness" of origins and destinations or other additional information on volumes of flows in and out. Such models have long been popular in geography, economics, and other fields for doing things like predicting the amount of trade between cities or nations or predicting or improving the location of services in a given geographic extent. Statistical spatial interaction models have been used in archaeology as well for both empirical and simulation studies (e.g., Bevan and Wilson 2013; Evans et al. 2011; Gauthier 2020; Paliou and Bevan 2016; Rihll and Wilson 1987) though they have not had nearly the impact they have had in other fields. We suggest that there is considerable potential for these models, in particular in contexts where we have other independent information for evaluating network flows across a study area.
In this section, we briefly outline a few common spatial interaction models and provide examples. For additional detailed overview and examples of these models there are several useful publications (see [Evans et al. 2011](https://www.researchgate.net/publication/277221754_Interactions_In_Space_For_Archaeological_Models); [Rivers et al. 2011](https://plato.tp.ph.ic.ac.uk/~time/networks/arch/BevanRewriteFigTableInText110727.pdf); and [Amati et al. 2018](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5797198/)).
## Simple Gravity Models{#GravityModel}
We'll start here with the very simple gravity model. This model is built on the notion that the "mass" of population at different origins and destinations creates attractive forces between them that is further influenced by the space and travel costs between them. This model takes many forms but the simplest can be written as:
$$x_{ij} = cv_iv_jf(d_{ij})$$
where
* $x_{ij}$ is the number or strength of connection between nodes $i$ and $j$
* $c$ is a proportionality constant (gravitational constant) which balances the units of the formula. For our purposes we can largely ignore this value as it changes all absolute values but not relative values between nodes.
* $v_i$ and $v_j$ are attributes of nodes $i$ and $j$ contributing to their "mass" (attractiveness). This could be population or resource catchment or area or anything other factor that likely influences the attractiveness of the node in terms of the interaction that is the focus of the network.
* $f(d_{ij})$ is the cost or "deterrence" function defining the travel costs between $i$ and $j$. Frequently, this cost function is further specified using inverse power law or exponential decay functions defined respectively by the equations below:
$$f(d_{ij}) = \frac{1}{(1+\beta d{ij})^\gamma} \text{ and } f(d_{ij}) = \frac{1}{e^{\beta d{ij}}}$$
where $\beta$ is a scaling factor for both models sand $\gamma$ determines the weight of the tail in the power law distribution.
There are numerous different configurations of the simple gravity model in the literature. Some versions add exponents to $V_i$ or $V_j$ to vary the importance of inflow and outflow independently, other versions use different derivations of deterrence, and many define inflows and outflows using different sources of empirical information with additional terms to scale the units. Calculating models like these is relatively easy but determining how to parameterize such models is typically the hard part as we discuss further below.
In order to demonstrate simple gravity models, we're going to use a regional data set of Late Intermediate periods Wankarani sites in the Bolivian Altiplano provided online on the [Comparative Archaeology Database at the University of Pittsburgh](http://www.cadb.pitt.edu/mcandrews/index.html) (McAndrews 2005). The details of this database and all of the variables are described at the link provided. This data set has more variables than we will use here but this provides a good example to explore because it includes location and size information for all sites.
First, let's read in the data and create a quick plot showing site locations with points scaled by site area (as a measure of potential attractiveness and outflow potential). We will then select sites dating to the Late Intermediate period for analysis. We further limit consideration to habitation sites. [Download the data here to follow along](data/Wankarani_siteinfo.csv) and [download the base map here](data/bolivia.Rdata).
```{r, fig.width=7, fig.height=3, warning=F, message=F}
wankarani <- read.csv("data/Wankarani_siteinfo.csv")
wankarani <- wankarani[which(wankarani$Period == "Late Intermediate"), ]
wankarani <- wankarani[which(wankarani$Type == "habitation"), ]
load("data/bolivia.Rdata")
library(ggmap)
library(sf)
# Convert attribute location data to sf coordinates and change
# map projection
locations_sf <-
st_as_sf(wankarani, coords = c("Easting", "Northing"), crs = 32721)
loc_trans <- st_transform(locations_sf, crs = 4326)
coord1 <- do.call(rbind, st_geometry(loc_trans)) %>%
tibble::as_tibble() %>%
setNames(c("lon", "lat"))
xy <- as.data.frame(coord1)
colnames(xy) <- c("x", "y")
# Plot original data on map
ggmap(base_bolivia, darken = 0.35) +
geom_point(
data = xy,
aes(x, y, size = wankarani$Area),
color = "red",
alpha = 0.8,
show.legend = FALSE
) +
theme_void()
```
As this plot shows, there is one site that is considerably larger than the others and a few other clusters of large sites in the eastern portion of the study area.
Now we're going to build a small function called `grav_mod` that includes 3 arguments:
* **`attract`** which is the measure of settlement attractiveness in the network. In our example here we will use area as a measure of attractiveness for both the sending and receiving site, though this can be varied.
* **`B`** which is our $\beta$ parameter for the exponential decay cost function outlined above.
* **`d`** which is a matrix of distances among our nodes. Note that this could be something other than physical distance like travel time as well. It helps to have this in units that don't result in very large numbers to keep the output manageable (though the actual absolute numbers don't matter)
```{r}
grav_mod <- function(attract, B, d) {
res <- matrix(0, length(attract), length(attract))
for (i in seq_len(length(attract))) {
for (j in seq_len(length(attract))) {
res[i, j] <-
attract[i] * attract[j] * exp(-B * d[i,j])
}
}
diag(res) <- 0
return(res)
}
```
Now let's try an example using the Wankarani data. For this first example we will set `B` to `1`. We'll take a look at the results by first, creating a heat map of the gravity model for every pair of nodes using the `superheat` package. We will then plot the site size against the estimated flow from our model with both axes transformed to base-10 logarithms to see how those variables relate. We use the packages `scales` to provide exponential notation for the axis labels.
```{r, message=F, warning=F, fig.height = 7, fig.width = 7}
# First calculate a distance matrix. We divide by
# 1000 so results are in kilometers.
d_mat <- as.matrix(dist(wankarani[, 5:6])) / 1000
test1 <-
grav_mod(attract = wankarani$Area,
B = 1,
d = d_mat)
library(superheat)
superheat(test1)
library(scales)
df <- data.frame(Flow = rowSums(test1), Area = wankarani$Area)
ggplot(data = df) +
geom_point(aes(x = Area, y = Flow)) +
scale_x_log10(
breaks = trans_breaks("log10", function(x)
10 ^ x),
labels = trans_format("log10", math_format(10 ^ .x))
) +
scale_y_log10(
breaks = trans_breaks("log10", function(x)
10 ^ x),
labels = trans_format("log10", math_format(10 ^ .x))
) +
theme_bw()
```
As these plots illustrate, there are clusters in the estimated flows among nodes, which is not surprising given that the sites themselves form a few clusters. Further, the plot comparing area to flow shows that there is a roughly positive linear relationship between the two but there is also variation with nodes with more or less flow than would be expected based on distance alone.
Next, let's plot a network showing all connections scaled and colored based on their strength in terms of estimated flow and with nodes scaled based on weighted degree (the total volume of flow incident on that node). The blue colored ties are weaker and the yellow colored ties are stronger. For the sake of visual clarity we omit the 25% weakest edges.
```{r, cache=T, message=F,warning=F, fig.width = 7, fig.height = 3}
library(statnet)
library(igraph)
library(ggraph)
sel_edges <- event2dichot(test1, method = "quantile", thresh = 0.25)
test1_plot <- test1 * sel_edges
net <-
graph_from_adjacency_matrix(test1_plot, weighted = TRUE)
# Extract edge list from network object
edgelist <- get.edgelist(net)
# Create data frame of beginning and ending points of edges
edges <- data.frame(xy[edgelist[, 1], ], xy[edgelist[, 2], ])
colnames(edges) <- c("X1", "Y1", "X2", "Y2")
# Calculate weighted degree
dg_grav <- rowSums(test1) / 10000
# Plot data on map
ggmap(base_bolivia, darken = 0.35) +
geom_segment(
data = edges,
aes(
x = X1,
y = Y1,
xend = X2,
yend = Y2,
col = log(E(net)$weight),
alpha = log(E(net)$weight)),
show.legend = FALSE
) +
scale_alpha_continuous(range = c(0, 0.5)) +
scale_color_viridis() +
geom_point(
data = xy,
aes(x, y, size = dg_grav),
alpha = 0.8,
color = "red",
show.legend = FALSE
) +
theme_void()
```
As this plot shows, there are areas characterized by higher and lower flow throughout this study area. The largest site in the study area (shown in the first map above) is characterized by a high weighted degree but there are other smaller sites that also have high weighted degree, especially in the eastern half of the study area.
Let's now take a look at the same data again, this time we set `B` or $\beta$ to `0.1`.
```{r, message=F, warning=F, cache=T, fig.height = 7, fig.width=7}
# First calculate a distance matrix. We divide by
# 1000 so results are in kilometers.
d_mat <- as.matrix(dist(wankarani[, 5:6])) / 1000
test2 <-
grav_mod(
attract = wankarani$Area,
B = 0.1,
d = d_mat
)
superheat(test2)
df <- data.frame(Flow = rowSums(test2), Area = wankarani$Area)
ggplot(data = df) +
geom_point(aes(x = Area, y = Flow)) +
scale_x_log10(
breaks = trans_breaks("log10", function(x)
10 ^ x),
labels = trans_format("log10", math_format(10 ^ .x))
) +
scale_y_log10(
breaks = trans_breaks("log10", function(x)
10 ^ x),
labels = trans_format("log10", math_format(10 ^ .x))
) +
theme_bw()
```
```{r, message=F, warning=F, cache=T, fig.height = 3, fig.width=7}
sel_edges <- event2dichot(test2, method = "quantile", thresh = 0.25)
test2_plot <- test2 * sel_edges
net2 <-
graph_from_adjacency_matrix(test2_plot, mode = "undirected",
weighted = TRUE)
# Extract edge list from network object
edgelist <- get.edgelist(net2)
# Create data frame of beginning and ending points of edges
edges <- data.frame(xy[edgelist[, 1], ], xy[edgelist[, 2], ])
colnames(edges) <- c("X1", "Y1", "X2", "Y2")
# Calculate weighted degree
dg <- rowSums(test2) / 10000
# Plot data on map
ggmap(base_bolivia, darken = 0.35) +
geom_segment(
data = edges,
aes(
x = X1,
y = Y1,
xend = X2,
yend = Y2,
col = log(E(net2)$weight),
alpha = log(E(net2)$weight)),
show.legend = FALSE
) +
scale_alpha_continuous(range = c(0, 0.5)) +
scale_color_viridis() +
geom_point(
data = xy,
aes(x, y, size = dg),
alpha = 0.8,
color = "red",
show.legend = FALSE
) +
theme_void()
```
When we lower the `B` $\beta$ parameter we get a stronger linear relationship between site area and flow. Further, when we look at the network, we see a more even degree distribution (though there are still nodes with higher degree) and more distributed edge weights across the network, though the high values are still. concentrated in the eastern cluster.
### Parameterizing the Gravity Model{#ParameterizeGravity}
The simple model above uses just a single parameter $\beta$ and is largely based on empirical information on the distance among settlements and their sizes. The basic assumption here is that larger sizes "attract" more flow and also have more flow to provide to other sites. Distance is also important and the $\beta$ parameter determines the decay rate of distance as the plot below illustrates.
```{r, fig.width = 7, warning=F, message=F}
brk <- seq(0.01, 1, by = 0.01)
out <- as.data.frame(matrix(0, length(brk), 6))
out[, 1] <- brk
for (i in seq_len(length(brk))) {
out[i, 2] <- exp(-brk[i])
out[i, 3] <- exp(-brk[i] * 2)
out[i, 4] <- exp(-brk[i] * 5)
out[i, 5] <- exp(-brk[i] * 10)
out[i, 6] <- exp(-brk[i] * 20)
}
colnames(out) <- c("beta", "D=1", "D=2", "D=5", "D=10", "D=20")
library(reshape2)
df <- melt(out[, 2:6])
df$beta <- rep(brk, 5)
ggplot(data = df) +
geom_line(aes(x = beta, y = value, color = variable)) +
ylab("Decay") +
xlab(expression( ~ beta)) +
theme_bw()
```
As we see in the plot above, the decay rate varies at different values of $\beta$ and also in relation to the distance between points. How then, do we select the appropriate $\beta$ in a model like this? There are a few ways to address this question depending on data availability and the nature of the issue such networks are being used to address. First, do we have some independent measure of network flow among our sites? For example, perhaps we could take information on the number or diversity of trade wares recovered at each site. We might expect sites with greater flow to have higher numbers or more diverse trade ware assemblages. We could evaluate this proposition using regression models and determine which $\beta$ provides the best fit for the data and theory. Often, however, when we are working with simple gravity models we have somewhat limited data and cannot make such direct comparisons. It is possible that we have a theoretical expectation for the shape of the decay curve as shown above (note that distances could also be things like cost-distance here so perhaps we have a notion of maximal travel times or a caloric budget for movement) and we can certainly factor that into our model. As we will see below, however, there are alternatives to the simple gravity model that provide additional avenues for evaluating model fit.
## The Rihll and Wilson "Retail" Model{#RihllWilson}
One of the most popular extensions of the gravity model in archaeology was published by Rihll and Wilson in 1987 for their study of Greek city states in the 9th through 8th centuries B.C. They used a spatial interaction model that is sometimes called the "retail" model. This approach was originally designed for assessing the likely flows of resources into retail shops and how that can lead to shop growth/increased income and the establishment of a small number of super-centers that receive a large share of the overall available flow (often at the expense of smaller shops). When thinking about this model in a settlement context, the "flows" are people and resources and the growth of highly central nodes in the network is used to approximate the development of settlement hierarchy and the growth of large settlement centers.
One of the big advantages of this model is that it only requires the locations of origins and destinations and some measure of the cost of movement among sites. From that, an iterative approach is used to model the growth or decline of nodes based on their configurations in cost space and a couple of user provided parameters. Versions of this model have been used in a number of archaeological studies (e.g., [Bevan and Wilson 2013](https://www.researchgate.net/publication/257154745_Models_of_settlement_hierarchy_based_on_partial_evidence); [Evans and Rivers 2016](https://arxiv.org/abs/1611.07839); [Filet 2017](https://www.frontiersin.org/articles/10.3389/fdigh.2016.00010/full); [Rihll and Wilson 1987](https://www.persee.fr/doc/hism_0982-1783_1987_num_2_1_1300)). Here we present a version of the model inspired by the recent work by these scholars. The code below was based in part on an [R function](https://rdrr.io/github/CRC1266-A2/moin/src/R/Rhill_Wilson_algebraic.R) created as part of an [ISAAKiel](https://isaakiel.github.io/) Summer School program.
Let's first formally describe the Rihll and Wilson model. The model of interaction $T_{ij}$ among a set of sites $k$ can be represented as:
$$T_{ij} = \frac{O_iW_j^\alpha e^{-\beta c_{ij}}}{\Sigma_k W_k^\alpha e^{-\beta c_{jk}}}$$
where
* $T_{ij}$ is the matrix of flows or interaction between nodes $i$ and $j$ (which may or may not be the same set)
* $O_i$ is the estimated weight of flow or interaction out of the origins $i$
* $W_j$ is the estimated weight of flow or interaction into the destinations $j$. In most archaeological applications, this is used to represent some measurement of settlement size.
* $\alpha$ is a parameter which defines the importance of resource flow into destinations. Numbers greater than 1 essentially model increasing returns to scale for every unit of flow.
* $e^{-\beta c_{ij}}$ is the "deterrence" function where $e$ is an exponential ($exp(-\beta c_{ij})$), $c_ij$ is a measure of the cost of travel between nodes $i$ and $j$ and $\beta$ is the decay parameter that describes the rate of decay in interaction at increasing distance.
* $\Sigma_k W_k^\alpha e^{-\beta c_{jk}}$ is the sum for all nodes $k$ of the expression using the $W_j$, $\alpha$, and deterrence function terms as described above.
In the case here (and in many archaeological examples) we start with a simple naive assumption of flow by setting all values of $O$ and $W$ equal to 1. Our goal, indeed, is to estimate $W$ for all nodes iteratively. In order to do this after each calculation of $T_{ij}$ we calculate two new values $D_j$ and $\Delta W_j$ defined as:
$$\begin{aligned} D_j & = \Sigma_i T_{ij}\\
\\
\Delta W_j & = \epsilon(D_j - KW_j) \end{aligned}$$
where
* $D_j$ is a vector of values defined as the sum weights incident on a given node (column sums of $T_{ij}$)
* $\Delta W_j$ is the change in values of $W$ (or estimated settlement size)
* $\epsilon$ is a control parameter that determines how quickly $W$ can change at each iterative step.
* $K$ is a factor that is used to convert size $W$ to the sum of flows $D_j$
For the purposes of our examples here, we set $K$ to 1 to assume that sum of flows and size are in equal units and we set $\epsilon$ to 0.01 so that the model does not converge too rapidly.
By way of example, we will use the original Greek city states data used by Rihll and Wilson (1987) and [put online](https://figshare.com/articles/dataset/Locations_of_108_Archaeic_Greek_settlements_used_in_Rihll_and_Wilson_1987_/868961) by Tim Evans based on his own work using these data and related spatial interaction models. [Download the data here to follow along](data/Rihll_Wilson.csv) and [download the Greece basemap here]("data/greece.Rdata").
Let's first map the Greek city states data.
```{r, fig.width = 7, warning=F, message=F, cache=T}
library(sf)
library(ggmap)
load("data/greece.Rdata")
dat <- read.csv(file = "data/Rihll_Wilson.csv")
locs <-
st_as_sf(dat,
coords = c("Longitude_E", "Latitude_N"),
crs = 4326)
ggmap(map) +
geom_point(data = dat, aes(x = Longitude_E, y = Latitude_N)) +
theme_void()
```
In the code below, we define a vector for the initial values of $O_i$ and $W_j$ setting the value to 1 for every site. We then define an initial $\alpha = 1.05$ (to suggest that flows in provide slight increasing returns) and $\beta = 0.1$ (which is a low decay rate such that long distance connections will retain some importance). The code then iteratively calculates values of $T_{ij}$ until the sum of weights stops changing by more than some very low threshold for a number of time steps. In the chunk of code below we calculate $T_{ij}$ and then plot histogram of $W_j$ in final state of model.
```{r, warning=F, message=F, cache=T}
# Set model parameters and initial variable states
Oi <- rep(1, nrow(dat))
Wj <- rep(1, nrow(dat))
alpha <- 1.05
beta <- 0.1
eps <- 0.01
K <- 1
# Define distance among points in kilometers. Because our
# points are in geographic coordinates we use the distm
# function. See the section on Spatial Networks for more.
library(geosphere)
d <- as.matrix(distm(dat[, c(7, 6)])) / 1000
# Dj is initial set as a vector of 1s like Wj
Dj <- Wj
# Create objects for keeping track of the number
# of iterations and the conditions required to end
# the loop.
end_condition <- 1
iter <- 0
# Define the deterrence function as a exponential
det <- exp(-beta * d)
# Create while loop that will continue to iterate Tij
# until 10,000 iterations or until the end_condition
# object is less than the threshold indicated.
while (!(end_condition < 1e-5) & iter < 10000) {
# Set Wj to Dj
Wj <- Dj
# Calculate Tij as indicated above
Tij <-
apply(det * Oi %o% Wj ^ alpha, 2, '/',
(Wj ^ alpha %*% det))
# Calculate change in W using equation above
delta_W <- eps * (colSums(Tij) - (K * Wj))
# Calculate new Dj
Dj <- delta_W + Wj
# Add to iterator and check for end conditions
iter <- iter + 1
end_condition <- sum((Dj - Wj) ^ 2)
}
hist(Wj, breaks = 15)
```
As this shows, the Rihll and Wilson model generates flow weights with a heavy tailed distribution with these parameters. This means that a small number of nodes are receiving lots of flow but most are receiving very little.
In order to look at this geographically, Rihll and Wilson defined what they call "Terminal Sites". Terminal sites in the network are nodes where the total flow of inputs into the site $Wj$ is bigger than the largest flow out of the site. Let's define our terminal sites for our model run above and then examine them.
```{r, warning=F, message=F}
terminal_sites <- NULL
for (i in 1:109) terminal_sites[i] <- sum(Tij[-i, i]) > max(Tij[i, ])
knitr::kable(dat[terminal_sites,])
```
Interesting, our terminal sites include many historically important and larger centers (such as Athens, Thebes, and Megara), despite the fact that we included no information about size in our model.
Now let's map them. Points are scaled based on the weight of inflow and terminal sites are colored in blue.
```{r, warning=F, message=F}
ggmap(map) +
geom_point(
data = dat,
aes(
x = Longitude_E,
y = Latitude_N,
size = Wj,
color = terminal_sites
),
show.legend = FALSE
) +
theme_void()
```
As this map illustrates, the terminal sites are roughly evenly distributed across the study area rather than clustered together. Now, let's see what happens when we vary our parameters $\alpha$ and $\beta$.
For the next map, we will set $alpha = 1.15$ and leave $\beta$ as is. To make it easier to calculate everything, we're going to call an R script using `source()` that includes the full function and outputs $W_j$, $T_{ij}$, the number of iterations, and a logical vector indicating which nodes are terminals.
```{r, warning=F, message=F, cache=T}
source("scripts/rihll_wilson.R")
rw2 <- rihll_wilson(alpha = 1.15, beta = 0.1, dist_mat = d)
ggmap(map) +
geom_point(
data = dat,
aes(
x = Longitude_E,
y = Latitude_N,
size = rw2$Wj,
color = rw2$terminals
),
show.legend = FALSE
) +
theme_void()
knitr::kable(dat[rw2$terminals,])
```
Increasing $\alpha$ increases the importance of inflow so we end up with fewer terminal sites. Notably, we are still retaining many large and historically important cities despite not including information on site size in our model.
Now let's set $\alpha = 1.05$ and increase $\beta = 0.35$:
```{r, warning=F, message=F, cache=T}
rw3 <- rihll_wilson(alpha = 1.05, beta = 0.35, dist_mat = d)
ggmap(map) +
geom_point(
data = dat,
aes(
x = Longitude_E,
y = Latitude_N,
size = rw3$Wj,
color = rw3$terminals
),
show.legend = FALSE
) +
theme_void()
knitr::kable(dat[rw3$terminals,])
```
As this map illustrates, increasing $\beta$ increases the distance decay meaning that local interactions are more important leading to a more even distribution of $W_j$ values and more terminal sites (which are further somewhat spatially clustered).
### Parameterizing the Retail Model{#ParameterizingRetail}
How might we select appropriate values for $\alpha$ and $\beta$ in our model? The approach Rihll and Wilson and many subsequent researchers have taken (see Bevan and Wilson 2013; Filet 2017) is to use our knowledge of the archaeological record and regional settlement patterns and to select a model that is most consistent with that knowledge. Our model creates more or fewer highly central nodes depending on how we set our parameters. As we noted above, the terminal nodes we defined consistently include historically important and large sites like Athens suggesting that our model is likely doing something reasonable. One potential approach would be to run comparisons for a plausible range of values for $\alpha$ and $\beta$ and to evaluate relationships with our own archaeological knowledge of settlement hierarchy and which sites/places are defined as terminal sites or highly central places in our model.
In order to test a broader range of parameter values and their impact on the number of terminal sites, we have created a function that takes the parameters, data, and distance matrix and outputs just the number of terminals.
Let's run this for a range of plausible parameter values:
```{block, type = "rmdwarning"}
If you attempt to run this on your own note that it takes quite a while to complete.
```
```{r, cache = T, fig.width=7, fig.height=7, warning=F, message=F, eval=F}
source("scripts/terminals_by_par.R")
alpha_ls <- seq(0.90, 1.25, by = 0.01)
beta_ls <- seq(0.05, 0.40, by = 0.01)
res <- matrix(NA, length(alpha_ls), length(beta_ls))
row.names(res) <- alpha_ls
colnames(res) <- beta_ls
for (i in seq_len(length(alpha_ls))) {
for (j in seq_len(length(beta_ls))) {
res[i, j] <-
terminals_by_par(
alpha = alpha_ls[i],
beta = beta_ls[j],
dist_mat = d
)
}
}
```
In case you want to see the data but don't want to wait for the above chunk to run, we have created an Rdata object with the output. Let's load those data and plot them as a heat map/tile plot:
```{r, fig.width = 7, fig.height=6}
load(file = "data/retail_pars.Rdata")
library(reshape2)
library(ggraph)
res_df <- melt(res)
ggplot(data = res_df) +
geom_tile(aes(x = Var2, y = Var1, fill = value)) +
scale_fill_viridis(option = "turbo") +
xlab(expression( ~ beta)) +
ylab(expression( ~ alpha)) +
theme_bw()
```
As this plot shows low values for both $\alpha$ and $\beta$ tend to generate networks with lots of terminals but the relationship between these parameters is not linear. Based on this and our knowledge of the archaeological record, we could make an argument for evaluating a particular combination of parameters but there is certainly no single way to make that decision. To see further expansions of such an approach that attempts to deal with incomplete survey data and other kinds of considerations of settlement prominence, see the published example by [Bevan and Wilson (2013)](https://www.researchgate.net/publication/257154745_Models_of_settlement_hierarchy_based_on_partial_evidence).
## Truncated Power Functions{#TruncatedPower}
Another similar spatial interaction model was used in a study by [Menze and Ur (2012)](https://dash.harvard.edu/handle/1/8523994) in their exploration of networks in northern Mesopotamia. Their model is quite similar to the simple gravity model we saw above but with a couple of additional parameters and constraints. We leave the details of the approach to the published article but briefly describe their model here. This truncated power function requires information on settlement location, some measure of size or "attraction," and three parameter values. Edge interaction $E_{ij}$ in this model can be formally defined as:
$$E_{ij}(\alpha,\beta,\gamma) = V_i^\gamma V_j^\gamma d_{ij}^{-\alpha} e^{(-d_{ij}/\beta)}$$
where
* $V$ is some measure of the attractiveness of node $i$ or $j$, typically defined in terms of settlement size.
* $d_{ij}$ is the distance between nodes $i$ and $j$. Again, this can use measures of distance other than simple Euclidean distances.
* $\alpha$ is a constraint on the distance between nodes.
* $\beta$ is the physical distance across which distance decay should be considered (defined in units of $d$).
* $\gamma$ is used to define the importance of $V$ on interaction where values above 1 suggest increasing returns to scale.
The model output $E_{ij}$ is, according to Menze and Ur, meant to approximate the movement of people among nodes across the landscape. In order to evaluate this function, we replicate the results of the Menze and Ur paper with one small change. We drop the bottom 50% smallest sites from consideration due to the large sample size to keep run times manageable (but this could certainly be changed in the code below). We use the replication data set provided by Menze and Ur online [here](https://dataverse.harvard.edu/dataset.xhtml;jsessionid=adeb8ff43c833f1efe447dc9e8ba?persistentId=hdl%3A1902.1%2F17731&version=&q=&fileTypeGroupFacet=%22Text%22&fileAccess=&fileTag=&fileSortField=type&fileSortOrder=).
Let's read in the data, omit the rows without site volume estimates, and then remove the lowest 50% of sites in terms of volume values. We then plot the sites with points scaled by site volume. [Download the data here to follow along](data/menze_ur_sites.csv).
```{r, fig.width = 7, fig.height = 5}
mesop <- read.csv("data/menze_ur_sites.csv")
mesop <- mesop[-which(is.na(mesop$volume)),]
mesop <- mesop[which(mesop$volume > quantile(mesop$volume, 0.5)),]
ggplot(mesop) +
geom_point(aes(x = x, y = y, size = volume),
show.legend = FALSE) +
scale_size_continuous(range = c(1, 3)) +
theme_void()
```
And here we implement the truncated power approach rolled into a function called `truncated_power`. We use the values selected as optimal for the Menze and Ur (2012) paper.
```{block, type="rmdwarning"}
Note that the block below takes several minutes to run.
```
```{r}
d <- as.matrix(dist(mesop[, 1:2])) / 1000
truncated_power <- function (V, d, a, y, B) {
temp <- matrix(0, nrow(d), ncol(d))
for (i in seq_len(nrow(d))) {
for (j in seq_len(ncol(d))) {
temp[i, j] <- V[i] ^ y * V[j] ^ y * d[i, j] ^ -a * exp(-d[i, j] / B)
if (temp[i, j] == Inf) {
temp[i, j] <- 0
}
}
}
return(temp)
}
res_mat <- truncated_power(V = mesop$volume, d = d, a = 1, y = 1, B = 4)
```
We can now plot the sites again, this time with points scaled based on the total volume of flow incident on each node.
```{r, fig.width = 7, fig.height = 5}
edge_flow <- rowSums(res_mat)
ggplot(mesop) +
geom_point(aes(
x = x,
y = y,
size = edge_flow,
alpha = edge_flow
),
show.legend = FALSE) +
scale_size_continuous(range = c(1, 3)) +
theme_void()
```
If we compare the plot above to figure 8 in Menze and Ur (2012) we see highly central sites in the same locations suggesting that we've reasonably approximated their results even though we are using a slightly different sample. Further, as the next plot shows, if we remove the few sites that are isolated in our sample (due to us removing the bottom 50% of sites) we also see the same strong linear correlation between the log of site volume and the log of our measure of interaction.
```{r, fig.width = 5, fig.height =5}
rem_low <- which(edge_flow > quantile(edge_flow, 0.01))
library(ggplot2)
library(scales)
df <- data.frame(Volume = mesop$volume[rem_low], Interaction = edge_flow[rem_low])
ggplot(data = df) +
geom_point(aes(x = Volume, y = Interaction)) +
scale_x_log10(
breaks = trans_breaks("log10", function(x)
10 ^ x),
labels = trans_format("log10", math_format(10 ^ .x))
) +
scale_y_log10(
breaks = trans_breaks("log10", function(x)
10 ^ x),
labels = trans_format("log10", math_format(10 ^ .x))
) +
theme_bw()
```
We could go about parameterizing this truncated power function in much the same way that we saw with the models above (i.e., testing values and evaluating results against the archaeological pattern). Indeed that is what Menze and Ur do but with a slight twist on what we've seen so far. In their case, they are lucky enough to have remotely sensed data on actual trails among sites for some portion of their study area. How they selected model parameters is by testing a range of values for all parameters and selecting the set that produced the closest match between site to site network edges and the orientations of actual observed trails (there are some methodological details I'm glossing over here so refer to the article for more). As this illustrates, there are many potential ways to select model parameters based on empirical information.
## Radiation Models{#RadiationModels}
In 2012 Filippo Simini and colleagues ([Simini et al. 2012](https://dspace.mit.edu/handle/1721.1/77896)) presented a new model, designed specifically to model human geographic mobility called the radiation model. This model was created explicitly as an alternative to various gravity models and, in certain cases, was demonstrated to generate improved empirical predictions of human population movement between origins and destinations. This model shares some basic features with gravity models but importantly, the approach includes no parameters at all. Instead, this model uses simply measures of population at a set of sites and the distances between them. That is all that is required so this model is relatively simple and could likely be applied in many archaeological cases. Let's take a look at how it is formally defined:
$$T_{ij} = T_i \frac{m_in_j}{(m_i + s_{ij})(m_i + n_j + s_{ij})}$$
where
* $T_i$ is the total number of "commuters" or migrating individuals from node $i$.
* $m_i$ and $n_j$ are the population estimates of nodes $i$ and $j$ respectively.
* $s_{ij}$ is the total population in a circle centered at node $i$ and touching node $j$ excluding the populations of both $i$ and $j$.
We have defined a function for calculating radiation among a set of sites using just two inputs:
* **`pop`** is a vector of population values.
* **`d_mat`** is a distance matrix among all nodes.
A script containing this function can also be downloaded [here](scripts/radiation.R)
```{r}
radiation <- function(pop, d_mat) {
## create square matrix with rows and columns for every site
out <-
matrix(0, length(pop), length(pop))
for (i in seq_len(length(pop))) {
# start loop on rows
for (j in seq_len(length(pop))) {
# start loop on columns
if (i == j)
next()
# skip diagonal of matrix
m <- pop[i] # set population value for site i
n <- pop[j] # set population value for site j
# find radius as distance between sites i and j
r_ij <-
d_mat[i, j]
# find all sites within the distance from i to j centered on i
sel_circle <-
which(d_mat[i, ] <= r_ij)
# remove the site i and j from list
sel_circle <-
sel_circle[-which(sel_circle %in% c(i, j))]
s <- sum(pop[sel_circle]) # sum population within radius
# calculate T_i and output to matrix
temp <-
pop[i] * ((m * n) / ((m + s) * (m + n + s)))
if (is.na(temp)) temp <- 0
out[i, j] <- temp
}
}
return(out)
}
```
In order to test this model, we will use the same Wankarani settlement data we used above for the simple gravity model. We will use site area divided by 500 as our proxy for population here. Again we limit our sample to Late Intermediate period sites and habitations. [Download the data here to follow along](data/Wankarani_siteinfo.csv).
```{r}
wankarani <- read.csv("data/Wankarani_siteinfo.csv")
wankarani <- wankarani[which(wankarani$Period == "Late Intermediate"), ]
wankarani <- wankarani[which(wankarani$Type == "habitation"), ]
d <- as.matrix(dist(wankarani[, 5:6]))
rad_test <- radiation(pop = wankarani$Area / 500, d_mat = d)
```
Now let's plot the resulting network with each node scaled by the total incident flow (row sums of the output of the function above). We plot network edges with weights indicated by color (blue indicates low weight and yellow indicates high weight).
```{r, fig.width = 7, fig.height = 3, warning=F, message=F}
library(igraph)
library(ggraph)
library(sf)
library(ggmap)
load("data/bolivia.Rdata")
locations_sf <-
st_as_sf(wankarani, coords = c("Easting", "Northing"), crs = 32721)
loc_trans <- st_transform(locations_sf, crs = 4326)
coord1 <- do.call(rbind, st_geometry(loc_trans)) %>%
tibble::as_tibble() %>%
setNames(c("lon", "lat"))
xy <- as.data.frame(coord1)
colnames(xy) <- c("x", "y")
net <-
graph_from_adjacency_matrix(rad_test, mode = "undirected",
weighted = TRUE)
# Extract edge list from network object
edgelist <- get.edgelist(net)
# Create data frame of beginning and ending points of edges
edges <-
data.frame(xy[edgelist[, 1], ],
xy[edgelist[, 2], ])
colnames(edges) <- c("X1", "Y1", "X2", "Y2")
# Calculate weighted degree
dg <- rowSums(rad_test)
# Plot data on map
ggmap(base_bolivia, darken = 0.35) +
geom_segment(
data = edges,
aes(
x = X1,
y = Y1,
xend = X2,
yend = Y2,
col = log(E(net)$weight),
alpha = log(E(net)$weight)
),
show.legend = FALSE
) +
scale_alpha_continuous(range = c(0, 0.5)) +
scale_color_viridis() +
geom_point(
data = xy,
aes(x, y, size = dg),
alpha = 0.8,
color = "red",
show.legend = FALSE
) +
theme_void()
```
This map shows clusters of higher and lower edge weights and again variation in total weighted degree (with higher values in the east). The results are similar, but not identical to the output of the simple gravity model.
```{r, fig.width = 7, fig.heigh = 7, warning=F, message=F}
dg_grav <- rowSums(grav_mod(
attract = wankarani$Area / 1000,
B = 1,
d = d_mat
))
dg_rad <- rowSums(rad_test)
library(ggplot2)
library(scales)
df <-
data.frame(Radiation = dg_rad, Gravity = dg_grav)
ggplot(data = df) +
geom_point(aes(x = Radiation, y = Gravity)) +
scale_x_log10(
breaks = trans_breaks("log10", function(x)
10 ^ x),
labels = trans_format("log10", math_format(10 ^ .x))
) +
scale_y_log10(
breaks = trans_breaks("log10", function(x)
10 ^ x),
labels = trans_format("log10", math_format(10 ^ .x))
) +
theme_bw()
```
We are aware of few published examples of the use of radiation models for archaeological cases, but there is certainly potential (see Evans 2016).
## Other Spatial Interaction Models{#OtherModels}
There are many other spatial interaction models we haven't covered here. Most are fairly similar in that they take information on site size, perhaps other relevant archaeological information, and a few user selected parameters to model flows across edges and sometimes to iteratively predict sizes of nodes, the weights of flows, or both. Other common models we haven't covered here include the XTENT model (Renfrew and Level 1979; see Ducke and Suchowska 2021 for an example with code for GRASS GIS) and various derivations of MaxEnt (or maximum entropy) models. Another approach that merits mention here is the [ariadne model](https://figshare.com/articles/dataset/ariadne/97746) designed by Tim Evans and used in collaboration with Ray Rivers, Carl Knappett, and others. This model provides a means for predicting site features and estimating optimal networks based on location and very general size information (or other archaeological features). This model has features that make it particularly useful for generating directed spatial networks (see [Evans et al. 2011](https://plato.tp.ph.ic.ac.uk/~time/networks/arch/interactionsArxivSubmissionV2.pdf)). Although there is a basic R implementation for the ariadne model developed by the ISAAKiel team [available here](https://rdrr.io/github/CRC1266-A2/moin/src/R/hamiltonian.R) the computational constraints make this function unfeasible in R for all but very small networks. Instead, if you are interested in applying the ariadne model, we suggest you use the original Java program created by Tim Evans and [available here](https://figshare.com/articles/dataset/ariadne/97746).