Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question on using the precip data #83

Open
ldecicco-USGS opened this issue Nov 1, 2023 · 10 comments
Open

Question on using the precip data #83

ldecicco-USGS opened this issue Nov 1, 2023 · 10 comments

Comments

@ldecicco-USGS
Copy link

I'm curious if you have any references on how to extract the data from this package for my specific needs (which I think are a little common). I'm hoping it's basically a matter of pointing me towards some other package that I should learn about.

Here's my ideal workflow:

  1. I have a USGS site.
  2. I get the basin shapefile (from NHDplusTools?)
  3. I get daily precipitation from climateR (getGridMET was what I started with)
  4. I run some code to get the sum of the precip within the basin per day (so basically a data frame with date/time, precip).

Any chance you have an example that does something like this? I like all the plotting examples, but I'm not sure how to get the numbers out. My ultimate goal is to look at precip and discharge at particular sites.

@Rapsodia86
Copy link

Will this help?

library(climateR)
library(terra)
library(AOI)
AOI = aoi_get(state = "NY")

test_data <- getGridMET(
  AOI = AOI,
  varname = "pr",
  startDate = "2023-09-29",
  endDate  = "2023-10-29")

aoi_proj <- project(vect(AOI),test_data$precipitation_amount)

pr_crop <-crop(test_data$precipitation_amount,aoi_proj)

pr_df1 <- t(terra::extract(pr_crop,aoi_proj,touches=F,sum, na.rm=TRUE))  #difference in masking approach. See below or in the help of the "terra" package.
pr_df2 <- t(terra::extract(pr_crop,aoi_proj,touches=T,sum, na.rm=TRUE))

#touches logical. If TRUE, values for all cells touched by lines or polygons are extracted, not just those on the line render path, or whose center point is within the polygon. Not relevant for points; and always considered TRUE when weights=TRUE or exact=TRUE

@ldecicco-USGS
Copy link
Author

AWESOME, thanks! Here's what I added on in case anyone else is trying to do the same thing:

library(AOI)
library(terra)
library(climateR)
library(dataRetrieval)
library(sf)
library(nhdplusTools)
library(tidyverse)

nldi_nwis <- list(featureSource = "nwissite",
                  featureID = "USGS-04087214")

site <- get_nldi_feature(nldi_nwis)

basin <- get_nldi_basin(nldi_feature = nldi_nwis)

AOI = aoi_get(x = basin)

test_data <- getGridMET(
  AOI = AOI,
  varname = "pr",
  startDate = "2023-09-29",
  endDate  = "2023-10-29")

aoi_proj <- project(vect(AOI),test_data$precipitation_amount)

pr_crop <- crop(test_data$precipitation_amount,aoi_proj)

pr_df1 <- t(terra::extract(pr_crop,aoi_proj,touches=F,sum, na.rm=TRUE))  #difference in masking approach. See below or in the help of the "terra" package.
pr_df1 <- pr_df1[-1, ]
pr_df1 <- data.frame(pr_df1)
pr_df1$Date <- as.Date(gsub("pr_", "", row.names(pr_df1)))
names(pr_df1) <- c("precip", "Date")

discharge <- readNWISdv("04087214", "00060",
                        startDate = "2023-09-29",
                        endDate  = "2023-10-29")


discharge2 <- discharge |> 
  rename(Discharge = X_00060_00003) |> 
  left_join(pr_df1, by = "Date") |> 
  pivot_longer(names_to = "param", 
               cols = c(Discharge, precip))

ggplot(data = discharge2,
       aes(x = Date, y = value)) +
  geom_point() +
  geom_line() +
  facet_grid(param ~ ., scales = "free_y")

image

@ldecicco-USGS
Copy link
Author

Is there a favorite sub-daily precip source? I can put together a quick example of sub-daily if that would be helpful (and if such precip data exists which I assume does...)

@mikejohnson51
Copy link
Owner

mikejohnson51 commented Nov 2, 2023

Hi @ldecicco-USGS and @Rapsodia86,

Thanks for the awesome examples! I'll add one using the zonal package we've built for some weather service work. On larger (or incremental) basins, or larger time extracts, its proven to be a bit more performant.

library(climateR)
library(dataRetrieval)
library(zonal) #mikejohnson51/zonal
library(tidyverse)
library(terra)

siteID <- '04087214'
startDate <- "2023-09-29"
endDate  <-"2023-10-29"

# Get basin with dataRetrieval
basin <- findNLDI(nwis = siteID, find = "basin")[['basin']] |>
  mutate(siteID = siteID)

# Get discharge with dataRetrieval
discharge <- readNWISdv(siteID, "00060", startDate = startDate, endDate  = endDate ) |>
  renameNWISColumns() |>
  select(Date, Flow)

# Get rainfall grids with climateR
pr <- getGridMET(AOI = basin, varname = "pr", startDate = startDate, endDate  = endDate) 

# summarize rainfall to basin with zonal
data <-execute_zonal(pr, geom = basin, ID = "siteID", fun = "sum", join = F) |>
  pivot_longer(-siteID) |>
  mutate(Date = as.Date(time(pr[[1]]))) |>
  select(Date, pr = value) |>
  left_join(discharge, by = "Date")

# Build plot
pivot_longer(data, names_to = "param", cols = c(Flow, pr)) |>
 ggplot(aes(x = Date, y = value)) +
  geom_point() +
  geom_line() +
  facet_grid(param ~ ., scales = "free_y")

Created on 2023-11-02 by the reprex package (v2.0.1)

So sub daily, NLDAS is often used and climateR does offer NLDAS support. However, I just found an error was returned from the server and will look into it! More to come on that (hopefully soon).

@mikejohnson51
Copy link
Owner

For now - it does appear the NLDAS server is down (https://hydro1.gesdisc.eosdis.nasa.gov/dods/NLDAS_FORA0125_H.002.das) when that is back up I will share an example!

@mikejohnson51
Copy link
Owner

mikejohnson51 commented Nov 8, 2023

Looks like its back up! Using the same basin as above we can get NLDAS forcing data. For this one you will need a NASA EarthData account and to populate climateR::writewriteNetrc() ...

After that, both @Rapsodia86's excellent terra solution, or, the zonal solution would work to summarize the data.

Hope it helps!

# Get rainfall grids with climateR
pr <- getNLDAS(AOI = basin, 
               model = "FORA0125_H.002",
               varname = "apcpsfc",
               startDate = startDate, endDate  = endDate) 

# summarize rainfall to basin with zonal
data <-execute_zonal(pr, geom = basin, ID = "siteID", fun = "sum", join = F) |>
  pivot_longer(-siteID) |>
  mutate(Date = time(pr[[1]]))

ggplot(data, aes(x = Date, y = value)) +
  geom_point() +
  geom_line() 

Created on 2023-11-07 by the reprex package (v2.0.1)

@ldecicco-USGS
Copy link
Author

I created an Earthdata account and ran the writeNetrc function. When I then ran:

pr <- getNLDAS(AOI = basin, 
               model = "FORA0125_H.002",
               varname = "apcpsfc",
               startDate = startDate, endDate  = endDate) 
Note:Caching=1
Error:curl error: SSL peer certificate or SSH remote key was not OK
curl error details: 
Warning:oc_open: Could not read url
Error in open.nc(glue("{url}?{T_name}")) : NetCDF: I/O failure

Does that error ring a bell?

@mikejohnson51
Copy link
Owner

Hi @ldecicco-USGS! That error seemed familiar and I checked an old chat history with @program-- .

The quick fix should be to set Sys.setenv(CURLOPT_SSL_VERIFYPEER = FALSE) following this.

However turning off SSL verification shouldn't be a permanent solution because of the security risks associated with it (that said I trust the NASA endpoint completely, and their cert seems ok). It might be worth checking with the USGS IT folks (assuming you're on a GFE) about the message.

Hope that helps and maybe @program-- can provide a little more nuance if I have missed any.

@program--
Copy link
Contributor

@ldecicco-USGS To add on to what @mikejohnson51 said, I'm not sure how USGS handles it, but I know other agencies have proxies that handle SSL termination when on GFE/using an agency VPN. That might also be a cause (again, under the assumption that you're on GFE or using a VPN).

@ldecicco-USGS
Copy link
Author

I've tried both on and off our VPN with no luck.

I added:

Sys.setenv(CURLOPT_SSL_VERIFYPEER = FALSE,
           CURLOPT_SSL_VERIFYHOST = FALSE,
           CURLOPT_SSL_VERIFYSTATUS = FALSE,
           CURLOPT_VERBOSE = TRUE)
pr2 <- getNLDAS(AOI = basin, 
               model = "FORA0125_H.002",
               varname = "apcpsfc",
               startDate = startDate, endDate  = endDate) 
Note:Caching=1
*   Trying 198.118.197.99:443...
* Connected to hydro1.gesdisc.eosdis.nasa.gov (198.118.197.99) port 443 (#0)
* ALPN: offers http/1.1
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
Error:curl error: SSL peer certificate or SSH remote key was not OK
curl error details: 
Warning:oc_open: Could not read url
Error in open.nc(glue("{url}?{T_name}")) : NetCDF: I/O failure

I think "unable to get local issuer certificate" might mean it's a me problem 🤷‍♀️.

I know our IT department sets this by default:

Initiating curl with CURL_SSL_BACKEND: openssl
curl::curl_version()$ssl_version
[1] "OpenSSL/3.1.2 (Schannel)"

I'll try to figure out if there's a simple setting I can adjust. Or possibly I need to copy our crt file somewhere else.

@dblodgett-usgs are you able to get this to work on a gov computer? And if so, did you do anything special?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants