Hi Dirk, sorry if this is not exactly as much effort as you expect for an issue, I just wanted to flag something reported to collapse (#648), which is present in both my hash functions written in C and your hash functions, and that is the following:
library(collapse)
#> Warning: package 'collapse' was built under R version 4.3.3
#> collapse 2.0.17, see ?`collapse-package` or ?`collapse-documentation`
x = round(rnorm(100))
unique(x) # R
#> [1] 1 0 2 -1 -2
funique(x) # My hash function in C
#> [1] 1 0 0 2 -1 -2
funique(x, sort = TRUE) # Rcpp::sugar::sort_unique()
#> [1] -2 -1 0 0 1 2
# More explicit proof
collapse:::sortuniqueCpp(x)
#> [1] -2 -1 0 0 1 2
# The solution
y = x + 0L
funique(y)
#> [1] 1 0 2 -1 -2
collapse:::sortuniqueCpp(y)
#> [1] -2 -1 0 1 2
In words: R functions like round() can produce both negative and positive zeros (-0 and 0), whose hashes differ. A fairly efficient remedy is to add an integer zero (about 3% slower on my very efficient C hash). I'm considering rolling this out, but of course I cannot control your code, so I'm just passing it along as food for thought.
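To make the mechanism concrete, here is a minimal C sketch. It assumes the hash is computed from the raw bit pattern of the double, and `toy_hash` and `normalize_zero` are hypothetical names for illustration, not the actual code in collapse or Rcpp. It shows that -0.0 and 0.0 compare equal yet carry different bits, and that adding a positive zero collapses them under the default rounding mode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw IEEE 754 bit pattern of a double: this is what a bitwise hash sees. */
uint64_t double_bits(double x) {
    uint64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Hypothetical toy hash over the bit pattern (illustration only,
   not the hash used by collapse or Rcpp). */
uint32_t toy_hash(double x) {
    uint64_t b = double_bits(x);
    return (uint32_t)((b ^ (b >> 32)) * 2654435761u);
}

/* Adding +0.0 maps -0.0 to +0.0 under the default round-to-nearest mode
   and leaves every other finite value unchanged. Assumes the compiler
   keeps IEEE semantics (i.e. no -ffast-math). */
double normalize_zero(double x) {
    return x + 0.0;
}
```

With these helpers, `toy_hash(0.0)` and `toy_hash(-0.0)` differ even though `0.0 == -0.0` is true, while `toy_hash(normalize_zero(-0.0))` matches `toy_hash(0.0)`. This mirrors what `x + 0L` does on the R side, where the integer zero is coerced to a double before the addition.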
Thanks for the note. We will ponder this. Making a change may risk upsetting existing code, but documenting your workaround of adding an explicit zero seems like a good idea.
I'd be surprised if adding a zero breaks any code, except where it depends on this particular behavior, which is clearly not intended as far as R is concerned. But yeah, I haven't checked yet what base R does to avoid sign bits on zeros; probably something smarter but slower...
Created on 2024-10-31 with reprex v2.0.2