Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added gh_fill.R #35

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

gvanweerden
Copy link

Added a function to fill geohashes with the contained geohashes.

Initial add of function to fill geohash with contained geohashes
@@ -0,0 +1,28 @@
#' Fill geohash prefix with members
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package isn't currently using roxygen2, so you'll need to either write the .Rd directly, or convert the rest of the package to use roxygen2.



gh_fill <- function(geohashes, precision) {
if (uniqueN(nchar(geohashes)) > 1) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uniqueN is a data.table function, which we don't import. length(unique()) should be fine here.

unlist(strsplit("0123456789bcdefghjkmnpqrstuvwxyz", split = ""))

grid <-
do.call(data.table::CJ, append(list(geohashes), replicate(new_levels, base32, FALSE)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you do a performance comparison of CJ() vs. expand.grid() for typical inputs? E.g. 1e4 inputs and adding 1-3 characters of precision...

Depending on that we'll add the data.table dependency

if (uniqueN(nchar(geohashes)) > 1) {
stop("Input Geohashes must all have the same precision level.")
}
if (sum(grepl("['ailo]", geohashes)) > 0) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer any(grepl()).

also, elsewhere in the package, we allow upper-case input, we should strive to match case here.

#' @param geohashes Character vector of input geohashes. They must all be of same precision
#' @param precision Positive integer scalar controlling the 'zoom level' – how many characters should be used in the output.
#' @return Character vector of geohashes corresponding to the input.
#' @export
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add examples and tests, it will especially make it easier to review when I can see expected inputs/outputs


new_levels <- precision - nchar(geohashes[1])

base32 <-
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's probably better to run this at compile-time in the package environment, rather than every time gh_fill() is invoked

@MichaelChirico
Copy link
Owner

Thanks Garnet! Sorry for the long delay. I no longer work with geohashes very much so this package has lost priority (open to new maintainers!).

If I understand right, the function would indeed be useful. Needs some package "productionization" stuff, and I'd like to justify adding data.table as a dependency if you don't mind.

@MichaelChirico
Copy link
Owner

Friendly ping if you'd like this merged, I've got a new version ready to submit soon.

@gvanweerden
Copy link
Author

Hi Michael, I have added the changes that you referenced in the earlier messages. I hope it's been done in an okay manner.

I wrote a test to compare data.table::CJ() vs base::expand.grid. Often when I use this function it is to create vectors in the 10's of millions. I'm just not sure if the speed increase warrants a dependency on data.table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants