Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New vignette #66

Merged
merged 10 commits into from
Dec 10, 2024
Merged
Binary file added .DS_Store
Binary file not shown.
5 changes: 4 additions & 1 deletion R/dist_pres_vs_bg.R
Original file line number Diff line number Diff line change
Expand Up @@ -37,8 +37,11 @@ dist_pres_vs_bg <- function(
if (inherits(.data, "sf")) {
.data <- .data %>% sf::st_drop_geometry()
}
# subset to only columns which are numeric
# subset to only columns which are numeric and check for NAs
num_vars <- names(.data)[!names(.data) %in% .col]
if (any(is.na(.data[num_vars]))) {
stop("NAs in the dataframe")
}
dist_vec <- numeric()
for (i_var in num_vars) {
vals_list <- list(
Expand Down
4 changes: 4 additions & 0 deletions R/thin_by_cell.R
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@
#' Further thinning can be achieved by aggregating cells in the raster
#' before thinning, as achieved by setting `agg_fact` > 1 (aggregation works in a
#' manner equivalent to [terra::aggregate()]).
#' Note that if `data` is an `sf` object, the function will transform the coordinates
#' to the same projection as the `raster` (recommended); if `data` is a data.frame, it is up
#' to the user to ensure that the coordinates are in the correct units.
#'
#' @param data An [`sf::sf`] data frame, or a data frame with coordinate variables.
#' These can be defined in `coords`, unless they have standard names
Expand All @@ -32,10 +35,11 @@
"use `thin_by_cell_time()`"
)
}
if (inherits(raster, "stars")) raster <- as(raster, "SpatRaster")

Check warning on line 38 in R/thin_by_cell.R

View check run for this annotation

Codecov / codecov/patch

R/thin_by_cell.R#L38

Added line #L38 was not covered by tests
# add type checks for these parameters
return_sf <- FALSE # flag whether we need to return an sf object
if (inherits(data, "sf")) {
data <- data %>% sf::st_transform(terra::crs(raster))
if (all(c("X", "Y") %in% names(data))) {
if (!all(data[, c("X", "Y")] %>% sf::st_drop_geometry() %>% as.matrix() == sf::st_coordinates(data)) |
any(is.na(data[, c("X", "Y")]))) {
Expand Down
7 changes: 5 additions & 2 deletions R/thin_by_cell_time.R
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,9 @@
#' Further spatial thinning can be achieved by aggregating cells in the raster
#' before thinning, as achieved by setting `agg_fact` > 1 (aggregation works in a
#' manner equivalent to [terra::aggregate()]).
#' Note that if `data` is an `sf` object, the function will transform the coordinates
#' to the same projection as the `raster` (recommended); if `data` is a data.frame, it is up
#' to the user to ensure that the coordinates are in the correct units.
#'
#' @param data An [`sf::sf`] data frame, or a data frame with coordinate variables.
#' These can be defined in `coords`, unless they have standard names
Expand Down Expand Up @@ -49,14 +52,14 @@ thin_by_cell_time <- function(data, raster, coords = NULL, time_col = "time",
if (inherits(raster, "SpatRasterDataset")) {
raster <- raster[[1]]
}

if(inherits(raster, "stars")) {
d <- stars::st_dimensions(raster)
time <- stars::st_get_dimension_values(raster, "time")
raster <- as(raster, "SpatRaster")
terra::time(raster, tstep = d$time$refsys) <- time
}

time_steps <- terra::time(raster)

if (any(is.na(time_steps))) {
Expand Down
17 changes: 16 additions & 1 deletion R/thin_by_dist.R
Original file line number Diff line number Diff line change
Expand Up @@ -19,13 +19,15 @@
#' `c("longitude", "latitude")`, or `c("lon", "lat")`
#' @param dist_min Minimum distance between points (in units appropriate for
#' the projection, or meters for lonlat data).
#' @param dist_method method to compute distance, either "euclidean" or "great_circle".
#' Defaults to "great_circle", which is more accurate but takes slightly longer.
#' @returns An object of class [`sf::sf`] or [`data.frame`], the same as "data".
#' @export
#' @importFrom rlang :=

# This code is an adaptation of spThin to work on sf objects

thin_by_dist <- function(data, dist_min, coords = NULL) {
thin_by_dist <- function(data, dist_min, coords = NULL, dist_method = c("great_circle", "euclidean")) {
return_dataframe <-
FALSE # flag whether we need to return a data.frame
if (!inherits(data, "sf")) {
Expand All @@ -35,6 +37,14 @@ thin_by_dist <- function(data, dist_min, coords = NULL) {
return_dataframe <- TRUE
}

#use the proper method of distance calculation by changing projection if necessary
dist_method <- match.arg(dist_method)
if (dist_method == "great_circle") {
# store the original projection
original_crs <- sf::st_crs(data)
data <- sf::st_transform(data, 4326)
}

# compute distances with sf, using the appropriate units for the projection
dist_mat <- sf::st_distance(data)
units(dist_min) <- units(dist_mat)
Expand Down Expand Up @@ -85,6 +95,11 @@ thin_by_dist <- function(data, dist_min, coords = NULL) {

## Subset the original dataset
thinned_points <- data[points_to_keep, ]
# if we used great circle distances, we need to transform back
if (dist_method == "great_circle") {
thinned_points <- sf::st_transform(thinned_points, original_crs)
}

if (return_dataframe) {
thinned_points <- thinned_points %>%
dplyr::bind_cols(sf::st_coordinates(thinned_points)) %>% # re-add coordinates
Expand Down
18 changes: 17 additions & 1 deletion R/thin_by_dist_time.R
Original file line number Diff line number Diff line change
Expand Up @@ -26,14 +26,16 @@
#' @param dist_min Minimum distance between points (in units appropriate for
#' the projection, or meters for lonlat data).
#' @param interval_min Minimum time interval between points, in days.
#' @param dist_method method to compute distance, either "euclidean" or "great_circle".
#' Defaults to "great_circle", which is more accurate but takes slightly longer.
#' @returns An object of class [`sf::sf`] or [`data.frame`], the same as "data".
#' @export
#' @importFrom rlang :=

# This code is an adaptation of spThin to work on sf objects

thin_by_dist_time <- function(data, dist_min, interval_min, coords = NULL,
time_col = "time", lubridate_fun = c) {
time_col = "time", lubridate_fun = c, dist_method = c("great_circle", "euclidean")) {
return_dataframe <-
FALSE # flag whether we need to return a data.frame
# cast to sf if needed
Expand All @@ -44,6 +46,14 @@ thin_by_dist_time <- function(data, dist_min, interval_min, coords = NULL,
return_dataframe <- TRUE
}

#use the proper method of distance calculation by changing projection if necessary
dist_method <- match.arg(dist_method)
if (dist_method == "great_circle") {
# store the original projection
original_crs <- sf::st_crs(data)
data <- sf::st_transform(data, 4326)
}

# create a vector of times formatted as proper dates
time_lub <- data[, time_col] %>%
as.data.frame() %>%
Expand Down Expand Up @@ -108,6 +118,12 @@ thin_by_dist_time <- function(data, dist_min, interval_min, coords = NULL,

## Subset the original dataset
thinned_points <- data[points_to_keep, ]

# if we used great circle distances, we need to transform back
if (dist_method == "great_circle") {
thinned_points <- sf::st_transform(thinned_points, original_crs)
}

if (return_dataframe) {
thinned_points <- thinned_points %>%
dplyr::bind_cols(sf::st_coordinates(thinned_points)) %>% # re-add coordinates
Expand Down
133 changes: 133 additions & 0 deletions data-raw/projections.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,133 @@
---
title: "Projecting your map"
author: "Andrea"
date: "2024-11-22"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Map projections in R

We start with a raster object that has a geographic coordinate reference system (CRS) and we want to project it to a different CRS. We will use the `terra` package to do this. As an example, we will use a map of the Iberian peninsula:


```{r}
library(pastclim)
download_dataset(dataset = "WorldClim_2.1_10m")
land_mask <-
get_land_mask(time_ce = 1985, dataset = "WorldClim_2.1_10m")

# Iberia peninsula extension
iberia_poly <-
terra::vect(
"POLYGON((-9.8 43.3,-7.8 44.1,-2.0 43.7,3.6 42.5,3.8 41.5,1.3 40.8,0.3 39.5,
0.9 38.6,-0.4 37.5,-1.6 36.7,-2.3 36.3,-4.1 36.4,-4.5 36.4,-5.0 36.1,
-5.6 36.0,-6.3 36.0,-7.1 36.9,-9.5 36.6,-9.4 38.0,-10.6 38.9,-9.5 40.8,
-9.8 43.3))"
)

crs(iberia_poly) <- "+proj=longlat"

# crop the extent
land_mask <- crop(land_mask, iberia_poly)
land_mask <- mask(land_mask, iberia_poly)
```

We plot it with `tidyterra`:

```{r}
library(tidyterra)
library(ggplot2)
ggplot() +
geom_spatraster(data = land_mask, aes(fill = land_mask_1985))
```
Now we project it. To define our projection we use a proj4 string, which provides information
on the type of projection, its parameters and the units of distance in which the new coordinates
will be expressed. we can use the website `projectionwizard.org` (https://link.springer.com/chapter/10.1007/978-3-319-51835-0_9) to find an appropriate equal
area projection for any region, and obtain its associated proj4 string.
In this case, we will use a Alberts Equal Area Conic projection centered on the Iberian peninsula. The proj4 string for this projection is:
```{r}
iberia_proj4 <- "+proj=aea +lon_0=-4.0429687 +lat_1=36.7790694 +lat_2=42.6258819 +lat_0=39.7024757 +datum=WGS84 +units=km +no_defs"
```
Note that we set the units to km so that have smaller numbers in the new axes.

For rasters (maps), we use the `terra` function `project` to change the CRS. We pass the raster object and the proj4 string as arguments:
```{r}
land_mask_proj <- terra::project(land_mask, y = iberia_proj4)
```

And replot it:
```{r}
ggplot() +
geom_spatraster(data = land_mask_proj, aes(fill = land_mask_1985))
```
Note that the coordinate system has now changed. We can get the true coordinates using the terra::plot function:
```{r}
terra::plot(land_mask_proj)

```

We now need to bring in some points. they come in as lat long, so we use the appropriate proj4 string
for raw coordinates.
```{r}
library(tidysdm)
library(sf)
data(lacerta)
lacerta <- st_as_sf(lacerta, coords = c("longitude", "latitude"))
st_crs(lacerta) <- "+proj=longlat"
#4326
```

Let's inspect the object:
```{r}
lacerta
```
We can see in the `geometry` column that are coordinates are in arc-degrees long and lat.

Now we project them to the same CRS as the raster, using the appropriate `sf` function
to project points:
```{r}
lacerta_proj <- st_transform(lacerta, iberia_proj4)
```

```{r}
lacerta_proj
```
Now the coordinates are in kilometers on the new axes.

Plot the projected points on top of the project map:
```{r}
ggplot() +
geom_spatraster(data = land_mask_proj, aes(fill = land_mask_1985)) +
geom_sf(data = lacerta_proj) + guides(fill="none")
```

The next step is to project the environemntal variables (layers of a raster). First we
extract the unprojected layers from `pastclim`
```{r}
download_dataset("WorldClim_2.1_10m")
climate_vars <- get_vars_for_dataset("WorldClim_2.1_10m")
climate_present <- pastclim::region_slice(
time_ce = 1985,
bio_variables = climate_vars,
data = "WorldClim_2.1_10m",
crop = iberia_poly
)
```
And now we project them
```{r}
climate_present_proj <- terra::project(climate_present, y = iberia_proj4)
```

Let's plot a layer with the points on top:
```{r}
ggplot() +
geom_spatraster(data = climate_present_proj, aes(fill = bio01)) +
geom_sf(data = lacerta_proj) + guides(fill="none") +
scale_fill_gradientn(colors = terrain.colors(7), na.value = "transparent")
```


Binary file modified data/lacerta.rda
Binary file not shown.
Binary file modified data/lacerta_ensemble.rda
Binary file not shown.
Binary file modified data/lacertidae_background.rda
Binary file not shown.
4 changes: 4 additions & 0 deletions inst/WORDLIST
Original file line number Diff line number Diff line change
@@ -1,8 +1,10 @@
Albers
BCI
Babak
BlockCV
Breugel
CMD
CRS
DALEX
Ecography
Ecol
Expand Down Expand Up @@ -79,6 +81,7 @@ lacertidae
lat
lh
lon
longlat
lonlat
lqpht
lubridate
Expand All @@ -99,6 +102,7 @@ pearson
pred
prepping
prob
proj
pseudoabsence
pseudoabsences
quasiquotation
Expand Down
Binary file modified inst/extdata/lacerta_climate_present_10m.rds
Binary file not shown.
Binary file modified inst/extdata/lacerta_land_mask.rds
Binary file not shown.
3 changes: 3 additions & 0 deletions man/thin_by_cell.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions man/thin_by_cell_time.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 9 additions & 1 deletion man/thin_by_dist.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 5 additions & 1 deletion man/thin_by_dist_time.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

18 changes: 18 additions & 0 deletions tests/testthat/test_dist_pres_vs_bg.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
test_that("NAs in dataframe", {
# create dataframe with NAs
test_dataset <- tibble(
class = factor(c("presence", "presence", "presence", "presence", "presence",
"absence", "absence", "absence", "absence", "absence")),
cld6190_ann = c(76, 76, 57, 57, 57, 46, 74, 74, NA, 50),
dtr6190_ann = c(104, 104, 114, 112, 113, 143, 93, 93, NA, 161),
frs6190_ann = c(2, 2, 1, 3, 3, 112, 0, 0, NA, 133),
h_dem = c(121, 121, 211, 363, 303, 1040, 134, 63, NA, 3624)
)

# compute differences between presence and background
expect_error(test_dataset %>% dist_pres_vs_bg(class),
"NAs in the dataframe")
})



Loading
Loading