You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
the Data Carpentry Geospatial Curriculum Advisory Committee (CAC) has adviced to reference coordinate systems by their EPSG codes instead of using proj4strings (see #368). Some of the lesson' data (downloaded using /episodes/setup.R) doesn't meet this advice.
Regarding this issue, I found, after run the code below, that:
Some raster files are missing EPSG codes.
Some rasters are compatible with EPSG codes. I compared their CRSs' text descriptions before and after re-projecting them using the EPSG code corresponding to their original CRS.
Most vector files' CRSs contain EPSG codes (the only exception is newPlots_latLon.shp but it easy to infer it). However, after applying the same procedure used for the raster data, I found differences in their CRSs' WKT before and after applying a projection. This implies the their CRS were build either using a PROJ string or an old EPSG database.
Some vector files have Z coordinates but they are all set to 0. This could make the sf package produce warnings or even throw errors in lessons' code.
If we assign EPSG codes (to both raster and vector files) and remove their Z coordinates, we can simplify the lesson's code by, for example, removing workarounds and validation code.
Regarding the lesson's data, which is currently hosted in FigShare, I think we could:
Host the lesson's data along the lesson's repo,
Or we could build an R package with the data and publish it to CRAN,
Or we could drop the current data and replace it using data already hosted in a CRAN package.
Any of these alternatives would simplify managing the lesson and also installing and running its code.
I'm opening this issue following the suggestion by @tobyhodges in order to close #368 and move on with the discussion regarding the CRSs of this lesson's data.
Bests,
Alber
###############################################################################
# Check the data for the Carpentries lesson r-raster-vector-geospatial
###############################################################################
library(dplyr); library(purrr); library(sf);
library(terra); library(tibble); library(utils);
#---- Setup ----
#NOTE: This variable should point to the episodes' directory.
episode_dir <- "~/Documents/github/datacarpentry/r-raster-vector-geospatial/episodes"
setwd(episode_dir)
# Download the lesson data using the script setup.R
source(file.path(episode_dir, "setup.R"))
# Find the directory with the data.
data_dir <- file.path(episode_dir, "data")
stopifnot("Data directory not found!" = dir.exists(data_dir))
#--- Utilitary ----
get_crs_wkt <- function(crs) { return(crs$wkt) }
get_epsg <- function(crs) { return(crs$epsg) }
has_z <- function(obj_sf) { return(!is.null(sf::st_z_range(obj_sf))) }
has_m <- function(obj_sf) { return(!is.null(sf::st_m_range(obj_sf))) }
get_input <- function(crs) { return(crs$input) }
#---- Raster data ----
raster_tb <-
data_dir %>%
list.files(pattern = "*.tif$", full.names = TRUE, recursive = TRUE) %>%
tibble::as_tibble() %>%
dplyr::rename(file_path = value) %>%
dplyr::mutate(r = purrr::map(file_path, terra::rast),
crs_terra = purrr::map_chr(r, terra::crs),
crs_sf = purrr::map(r, sf::st_crs),
epsg = purrr::map_int(crs_sf, get_epsg),
epsg = dplyr::if_else(is.na(epsg), NA, paste("EPSG:", epsg)))
#NOTE: Some (30) rasters have NA instead of a valid CRS.
raster_tb %>%
dplyr::count(epsg)
#NOTE: Re-projecting rasters using EPSG codes doesn't change their CRSs at all.
# So, the lesson data can be re-projected using EPSG codes without
# breaking the code in the lessons.
raster_tb %>%
dplyr::filter(!is.na(epsg)) %>%
dplyr::mutate(r_proj = purrr::map2(r, epsg, terra::project),
r_proj_crs = purrr::map_chr(r_proj, terra::crs),
crs_diff = purrr::map2_dbl(crs_terra, r_proj_crs,
utils::adist)) %>%
dplyr::count(crs_diff)
#---- Vector data ----
vector_tb <-
data_dir %>%
list.files(pattern = "*.shp$", full.names = TRUE, recursive = TRUE) %>%
tibble::as_tibble() %>%
dplyr::rename(file_path = value) %>%
dplyr::mutate(v = purrr::map(file_path, sf::read_sf),
v_crs = purrr::map(v, sf::st_crs),
v_wkt = purrr::map(v_crs, get_crs_wkt),
epsg = purrr::map_int(v_crs, get_epsg),
has_z = purrr::map_lgl(v, has_z),
has_m = purrr::map_lgl(v, has_m),
v_no_z = purrr::map(v, sf::st_zm),
crs_input = purrr::map_chr(v_crs, get_input))
#NOTE: There is a vector missing EPSG code.
vector_tb %>%
dplyr::filter(is.na(epsg)) %>%
dplyr::select(file_path, epsg) %>%
dplyr::mutate(file_path = basename(file_path)) %>%
print(n = Inf)
#NOTE: There some vectors with Z coordinates, but all of them are 0s.
vector_tb %>%
dplyr::filter(has_z) %>%
dplyr::mutate(file_path = basename(file_path),
z_range = purrr::map(v, sf::st_z_range)) %>%
dplyr::select(file_path, has_z, z_range) %>%
print(n = Inf) %>%
pull(z_range)
# Add missing EPSG by hand.
vector_tb <-
vector_tb %>%
dplyr::mutate(epsg = dplyr::if_else((crs_input == "WGS 84" & is.na(epsg)),
4326, epsg)) %>%
# Re-project using EPSGs.
dplyr::mutate(new_v = purrr::map2(v_no_z, epsg, sf::st_transform),
new_crs = purrr::map(new_v, sf::st_crs),
new_crs_wkt = purrr::map(new_crs, get_crs_wkt),
crs_diff = purrr::map2_dbl(v_wkt, new_crs_wkt,
utils::adist))
#NOTE: The CRSs' WKT change after projection using EPSG codes.
vector_tb %>%
dplyr::select(file_path, crs_diff)
The text was updated successfully, but these errors were encountered:
The example dataset is published on FigShare, where there is the option of creating a new version of the record if and when the file change. I think the record is owned by NEON, but I would be happy to try to coordinate with them to publish a new version.
How could the content be improved?
Good evening,
the Data Carpentry Geospatial Curriculum Advisory Committee (CAC) has adviced to reference coordinate systems by their EPSG codes instead of using proj4strings (see #368). Some of the lesson' data (downloaded using /episodes/setup.R) doesn't meet this advice.
Regarding this issue, I found, after run the code below, that:
newPlots_latLon.shp
but it easy to infer it). However, after applying the same procedure used for the raster data, I found differences in their CRSs' WKT before and after applying a projection. This implies the their CRS were build either using a PROJ string or an old EPSG database.sf
package produce warnings or even throw errors in lessons' code.If we assign EPSG codes (to both raster and vector files) and remove their Z coordinates, we can simplify the lesson's code by, for example, removing workarounds and validation code.
Regarding the lesson's data, which is currently hosted in FigShare, I think we could:
Any of these alternatives would simplify managing the lesson and also installing and running its code.
I'm opening this issue following the suggestion by @tobyhodges in order to close #368 and move on with the discussion regarding the CRSs of this lesson's data.
Bests,
Alber
The text was updated successfully, but these errors were encountered: