You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For completeness would it be worth including a vector based stack for comparison in the benchmarking vignette? The performance of one based on Martin Morgan's suggestion on stackoverflow is better than a list/environment based one.
library(dplyr)
#> #> Attaching package: 'dplyr'#> The following objects are masked from 'package:stats':#> #> filter, lag#> The following objects are masked from 'package:base':#> #> intersect, setdiff, setequal, union
library(forcats)
library(ggplot2)
library(collections)
#> #> Attaching package: 'collections'#> The following object is masked from 'package:utils':#> #> stack# vector based stack (based on https://stackoverflow.com/a/18678440)vec_stack<-function(type="double", length=1000L) {
v<- vector(type, length)
i<-1Llen<- length(v)
list(
push=function(elt) {
if (typeof(elt) !=type)
stop("types must match")
if (i==len) {
length(v) <<-1.6*lenlen<<- length(v)
}
v[[i]] <<-elti<<-i+1L
},
pop=function() {
i<<-i-1Lv[[i]]
},
clear=function() {
v<<- vector(type, length)
}
)
}
# list based stack from benchmark vignettelist_stack<-function() {
self<- environment()
q<-NULLn<-NULLpush<-function(item) {
if (is.null(item)) {
q[n+1] <<-list(item)
} else {
q[[n+1]] <<-item
}
n<<-n+1invisible(self)
}
pop<-function() {
if (n==0) stop("stack is empty")
v<-q[[n]]
q<<-q[-n]
n<<-n-1v
}
clear<-function() {
q<<-list()
n<<-0invisible(self)
}
clear()
self
}
# bench mark based on one in vignette (slightly extended n)bench_stack<-bench::press(
n= c(10, 50, 100, 200, 500, 1000),
bench::mark(
`base::list_stack_grow`= {
q<- list_stack()
x<- rnorm(n)
for (iin1:n) q$push(x[i])
for (iin1:n) q$pop()
},
`base::vec_stack_pre_allocate`= {
q<- vec_stack(length=n)
x<- rnorm(n)
for (iin1:n) q$push(x[i])
for (iin1:n) q$pop()
},
`base::vec_stack_grow`= {
q<- vec_stack(length=2L)
x<- rnorm(n)
for (iin1:n) q$push(x[i])
for (iin1:n) q$pop()
},
`collections::stack`= {
q<- stack()
x<- rnorm(n)
for (iin1:n) q$push(x[i])
for (iin1:n) q$pop()
},
check=FALSE
)
) |>
mutate(expression= fct_reorder(
as.character(expression), median, .fun=mean, .desc=TRUE))
#> Running with:#> n#> 1 10#> 2 50#> 3 100#> 4 200#> 5 500#> 6 1000# plotbench_stack %>%
ggplot(aes(x=n, y=median)) +
geom_line(aes(color=expression)) +
scale_colour_brewer(palette="Set2", direction=-1) +
ggtitle("push and pop n times") + ylab("time")
The vector based stack works so much better now by not making copies with each append (something in 3.4.x?)
If you really want to use it in practice, just make sure you trim the vector length when it is shortened enough, say the length is less than 0.5 * the preallocated length.
Feel free to submit the change for the vignette, I will render the site once the update is merged.
For completeness would it be worth including a vector based stack for comparison in the benchmarking vignette? The performance of one based on Martin Morgan's suggestion on stackoverflow is better than a list/environment based one.
Created on 2023-01-04 with reprex v2.0.2
The text was updated successfully, but these errors were encountered: