ee_closest_distance_to_val #17

Open: zackarno wants to merge 7 commits into base: main

zackarno
Collaborator

I added a function to calculate the closest distance between any point geometry and a specified pixel value (image/imageCollection). There is an Rmd that I think explains it better.

  • To do this I had to update ee_timeseries with a new method so that it accepts ee$Image as well as ee$ImageCollection. I used this hacky workaround to keep changes to the code minimal and limit downstream breakage. This works for now, but it makes the syntax of the imageCol parameter a little murky, since it can now also be a single image.
  • I also updated map_date_to_bandname to accept an image.
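
One way to keep a function like this working for both input types is to normalise the input once and let the rest of the code handle only collections. The sketch below is an illustration only, not the code in this PR: `as_image_collection()` and its `wrap` argument are hypothetical names, and the class strings are assumed to match what rgee assigns to `ee$Image` and `ee$ImageCollection` objects. Mock objects stand in for real Earth Engine objects so that it runs without an rgee session.

```r
# Hypothetical helper (not part of exploreRGEE): turn either input type into
# a collection so downstream code only has to handle one case.
as_image_collection <- function(x, ...) UseMethod("as_image_collection")

# A collection is returned unchanged.
as_image_collection.ee.imagecollection.ImageCollection <- function(x, ...) x

# A single image is wrapped in a one-element collection. The default `wrap`
# needs a live rgee session; it is an argument so it can be swapped out.
as_image_collection.ee.image.Image <- function(
    x,
    wrap = function(img) rgee::ee$ImageCollection$fromImages(list(img)),
    ...) {
  wrap(x)
}

# Anything else fails early with a readable message.
as_image_collection.default <- function(x, ...) {
  stop("expected an ee$Image or ee$ImageCollection", call. = FALSE)
}

# Mock objects (assumed class names) so the sketch runs offline.
mock_img <- structure(list(id = "img_1"), class = "ee.image.Image")
mock_ic <- structure(list(images = list()),
                     class = "ee.imagecollection.ImageCollection")
mock_wrap <- function(img) {
  structure(list(images = list(img)),
            class = "ee.imagecollection.ImageCollection")
}

wrapped <- as_image_collection(mock_img, wrap = mock_wrap)
```

With a helper along these lines, the parameter could keep its current name while the function body only ever sees a collection.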

I think it might be worth rethinking the syntax of ee_timeseries. Perhaps we could just copy the syntax used in rgee::ee_extract, which might clear things up and improve compatibility. As for the name, I think it's fine: if you provide an image you just end up with 1 point in time. However, I could see a potential argument for changing it.

As for the function itself, I still consider it a bit experimental. I think it is working, but I need to test it more. What do you think about merging this type of function into main with this caveat?

@joshualerickson
Owner

Hey Zach, let me think on this one a little bit. I'm taking a look and getting an idea of what you're looking at doing (looks really cool). It reminds me of a 'get_*' function but with other applications, so I'm more inclined to keep the naming convention and args. I'll make some comments and get back to you soon. Thanks for doing this, looks sweet!

As to the ee_timeseries idea, I'm not sure that we need to change it too much. I could see removing the filter args so that the user has to pre-process. Ok, wait. I do like that idea... So if we take out the temporal args and replace with ee_extract() then it would essentially be a tidy timeseries. Something like,

aoi <- exploreRGEE::huc

imageCol = ee$ImageCollection("LANDSAT/LC08/C01/T1_SR")

filt_IC <- ee_year_month_filter(imageCol = imageCol,
                                startDate = '2018-01-01',
                                endDate = '2020-12-31',
                                months = c(6, 10),
                                stat = 'mean')

tidy_ts <- filt_IC %>%
  ee_timeseries(geom = aoi,
                fun = ee$Reducer$mean(),
                sf = FALSE,
                via = "getInfo",
                container = "rgee_backup",
                lazy = FALSE,
                quiet = FALSE)

The con would be losing the sugar. For example, the call below does the same thing but removes the need to 'pre-process'. With ... you can always pass arguments through to ee_extract().

aoi <- exploreRGEE::huc

imageCol = ee$ImageCollection("LANDSAT/LC08/C01/T1_SR")

tidy_ts <- ee_timeseries(imageCol,
                         scale = 250,
                         geom = aoi,
                         startDate = '2018-01-01',
                         endDate = '2020-12-31',
                         temporal = 'year_month',
                         temporal_stat = 'mean',
                         reducer = 'mean',
                         months = c(6, 10))

I'm not sure how I feel about this. We might consider something like ee_tidy_extract(), which would be a way to get a more inclusive function (not limited on structure) and it does what it says; tidies the extract function for you. I'm leaning more towards this: drop ee_timeseries() and create a new function ee_tidy_extract(). Let me know what you think and I'll create a branch to work on this.

I can work on this and review the PR. Might be a week or so but stay in touch. Thanks!

@zackarno
Collaborator Author

zackarno commented Apr 1, 2022

Cool, yeah, I think there is no harm in separating the filter/pre-process and the extract steps. We will then inevitably build a wrapper to combine them again, which will bring back the sugar but end up with nicer code, I think.

I think ee_tidy_extract was in the back of my mind somewhere when I made ee_extract_long which does a lot of the same. I was thinking long = tidy.
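
To make "long = tidy" concrete, here is a minimal base R sketch of the reshaping step, assuming the extraction returns a wide data frame with one row per feature and one column per image or band. The function name `tidy_extract_sketch()` and the column names in the mock data are made up for illustration; this is not the code behind ee_extract_long.

```r
# Mock of a wide extraction result (hypothetical column names): one row per
# feature, one value column per image.
wide <- data.frame(
  id = c("a", "b"),
  X2018_06_NDVI = c(0.41, 0.52),
  X2018_10_NDVI = c(0.33, 0.47),
  stringsAsFactors = FALSE
)

# Stack every value column into (id, parameter, value) rows.
tidy_extract_sketch <- function(wide, id_col = "id") {
  value_cols <- setdiff(names(wide), id_col)
  long <- data.frame(
    id = rep(wide[[id_col]], times = length(value_cols)),
    parameter = rep(value_cols, each = nrow(wide)),
    value = unlist(wide[value_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  names(long)[1] <- id_col
  long
}

long <- tidy_extract_sketch(wide)
```

Each feature then appears once per image, which is the shape that group-wise summaries and plotting functions generally expect.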

However, now that I think about it, I have two ideas that might be good. The first would be a pretty quick fix and basically builds on what you are proposing by splitting up ee_timeseries and creating ee_tidy_extract. The second is more ambitious, but could be really cool.

Idea 1:

  • Take preprocessing steps out and create ee_tidy_extract
  • Then wrap them up again with a new function, something like ee_extract_by or ee_tidy_extract_by, with the by argument options being "month", "year", "year_month"… This could potentially allow for filterDate arguments too, though I'm not sure whether that would be nice.

Idea 2:
If we want to go really tidy… just imagine this (I think the example/comments explain it best here):

imageCol |>
  # A) calendarYear + compositing steps
  group_by("year") |> # options for year, month, year_month
  summarise(
    .temporal_fun = mean # could also accept several functions, e.g. list(mean, median, min, max, ...)
  ) |>
  # B) extracting
  ee_tidy_extract(y = geom, .spatial_fun = mean) # .spatial_fun is used for polygons, e.g. if you want the mean or median over each shape

# for an image or imageCollection where you just want "all" values extracted it would be

image |> # or imageCol
  ee_tidy_extract(y = geom)

# so now the top example again without comments, just because it looks so nice
imageCol |>
  group_by("year") |>
  summarise(
    .temporal_fun = mean
  ) |>
  ee_tidy_extract(y = geom, .spatial_fun = mean)

We could potentially implement idea 1 first and then transition to idea 2 if we think it's good, or just skip to 2. It might be a lot of work. It is kind of interesting, because if we go with idea 1 -> idea 2 the workflow mirrors what happened with the survey package: it had the svyby() function, which was then revamped in dplyr/tidy style in the srvyr package, which switched it to the group_by flow. If we wanted to go that route, we could probably get some good ideas from looking at how the author did it in srvyr. However, since we are dealing with imageCollections rather than tabular data, it is also quite different.

@joshualerickson
Owner

I like option 2! So we just create methods that handle image/imageCollection classes for group_by() and summarise()? Then we can use group_by() for other aggregations of imageCols, etc. I like that 👍. We could also do a method for filter(), e.g. filter(startDate >= '2019-01-01', months %in% c(6,10)). Then you could do,

imageCol |>
  filter(startDate >= '2019-01-01', months %in% c(6, 10)) |>
  group_by('year_month') |>
  summarise(.temporal_stat = 'mean') |>
  ee_tidy_extract(y = aoi, reducer = ee$Reducer$mean())

I think it's worth working on this and establishing this type of style. I'll review your ee_closest_distance_to_val function and run some tests, but I'll leave ee_timeseries() for now. I think it could stay but eventually be labeled as deprecated. Nice work man! Let's start a branch to work on this, tidy_methods maybe? Have a good one!
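
A rough sketch of how such methods could be wired up with S3 follows. Everything in it is an assumption made for illustration: the class string is the one rgee is assumed to assign to image collections, the `ee_group` attribute is a made-up way to carry the grouping, and the generics are local stand-ins so the snippet runs without dplyr or a live Earth Engine session. A package would register its methods on dplyr's own group_by() and summarise() generics instead.

```r
# Stand-in generics so the sketch is self-contained.
group_by <- function(.data, ...) UseMethod("group_by")
summarise <- function(.data, ...) UseMethod("summarise")

# group_by(): validate the grouping and remember it for the next verb.
group_by.ee.imagecollection.ImageCollection <- function(.data, by, ...) {
  by <- match.arg(by, c("year", "month", "year_month"))
  attr(.data, "ee_group") <- by
  .data
}

# summarise(): read the stored grouping. Returning a plain "plan" list is a
# placeholder for the server-side compositing a real method would perform.
summarise.ee.imagecollection.ImageCollection <- function(.data,
                                                         .temporal_stat = "mean",
                                                         ...) {
  by <- attr(.data, "ee_group")
  if (is.null(by)) {
    stop("call group_by() before summarise()", call. = FALSE)
  }
  list(by = by, stat = .temporal_stat)
}

# Mock collection (a plain list carrying the assumed class).
mock_ic <- structure(list(), class = "ee.imagecollection.ImageCollection")

grouped <- group_by(mock_ic, "year_month")
plan <- summarise(grouped, .temporal_stat = "mean")
```

Real rgee objects are references to Python objects rather than plain lists, so where the grouping state should live (an attribute, a wrapper class, or something else) is left open here.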

@zackarno
Collaborator Author

zackarno commented Apr 5, 2022

Yes, methods for group_by, summarise, and filter - exactly! I am wondering if this is the point where we should create a new package, one with a narrow focus: bringing dplyr syntax to rgee. Maybe something like tidyRGEE? I still want a lot of the cool functionality that has been developed and that we are working on in exploreRGEE, so I think we should keep it going in parallel, adding processing/analysis functionality there for the time being. What do you think?

@joshualerickson
Owner

Ok, so a lightweight package {rgeeTidy} or {tidyRGEE} with S3 methods for tidying images and imageCollections that we can then use for {exploreRGEE} or {easyrgee}. Sounds good to me! I'll leave this PR open until we get that implemented and then we can just tie it into the code. Might make sense to chat again about direction as well. I'm thinking it's getting more relevant to separate {exploreRGEE} and focus on {easyrgee}, or start from scratch? Let me know what you think. Did you want to get this going? I've got some time and don't mind, but if you want to do it then I'd say go for it :). Have a good one!

@zackarno
Collaborator Author

zackarno commented Apr 6, 2022

Yeah, I think that makes sense. I'll email to see if we can arrange a chat!
