`refine` takes long time to run for "small" datasets #222

gaow · 2024-04-08T12:14:27Z

I have been using susie() with refine=TRUE for various analyses. I noticed that for smaller sample sizes it can take a very long time to run. For example, even with the simulated data shown in ?susieR::susie,

     library(susieR)
     set.seed(1)
     n = 1000
     p = 1000
     beta = rep(0,p)
     beta[1:4] = 1
     X = matrix(rnorm(n*p),nrow = n,ncol = p)
     X = scale(X,center = TRUE,scale = TRUE)
     y = drop(X %*% beta + rnorm(n))
     st = proc.time()
     res1 = susie(X,y,L = 10, refine=TRUE)
     proc.time() - st

It takes more than two minutes,

         user   system  elapsed
     2208.592 2871.796  130.205

but without refine it takes about two seconds. @zouyuxin, perhaps we should evaluate and improve the behavior of refine. Did you notice this when you developed that feature?


pcarbo · 2024-04-08T14:09:54Z

@gaow With refine = TRUE, susie is being called an additional 16 times, so this much longer running time isn't surprising. (However, it would be helpful if the refinement step provided more updates on its progress.)

One workaround would be to set max_iter to a smaller value.
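A rough sketch of how one might try this workaround, not a tested recommendation. It reruns the simulation from the report at half the size so it finishes quickly (set `n` and `p` to 1000 to match the original). The cap `max_iter = 20` is an arbitrary value chosen for illustration, and it is an assumption here that the cap also applies to the internal fits made during refinement, so check this against your installed susieR version.

```r
library(susieR)

# Same simulation as in the report, at half the size (assumption: chosen
# only to keep this sketch fast; use n = p = 1000 to reproduce the report).
set.seed(1)
n <- 500
p <- 500
beta <- rep(0, p)
beta[1:4] <- 1
X <- scale(matrix(rnorm(n * p), nrow = n, ncol = p), center = TRUE, scale = TRUE)
y <- drop(X %*% beta + rnorm(n))

# Baseline: no refinement.
t_plain <- system.time(fit_plain <- susie(X, y, L = 10))

# Refinement with the default iteration cap.
t_refine <- system.time(fit_refine <- susie(X, y, L = 10, refine = TRUE))

# Refinement with a lower cap; 20 is an arbitrary illustrative value.
t_capped <- system.time(
  fit_capped <- susie(X, y, L = 10, refine = TRUE, max_iter = 20)
)

# Wall-clock seconds for each variant.
elapsed <- c(plain  = t_plain[["elapsed"]],
             refine = t_refine[["elapsed"]],
             capped = t_capped[["elapsed"]])
print(elapsed)

# Compare the final objective (ELBO) to see whether the cap changed the fit.
print(c(refine = susie_get_objective(fit_refine),
        capped = susie_get_objective(fit_capped)))
```

If the two objective values agree, the lower cap probably did not cost anything on this dataset. If susie warns that it did not converge, the cap is too tight.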

gaow · 2024-04-08T14:24:08Z

Thanks @pcarbo

> One workaround would be to set max_iter to a smaller value.

You mean in the "refine" code? I think SuSiE usually converges in fewer than 20 iterations anyway. It's the 16 additional calls that seem excessive. In many other examples, especially those with larger sample sizes, it is called far fewer than 16 times. I wonder if there is a way to fundamentally improve it ...
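The iteration count can be checked directly on a fitted object rather than assumed. The sketch below is a minimal example on a half-size copy of the simulation from the report (the smaller size is an assumption made only to keep it quick). It reads the `niter` and `converged` fields that susie returns for a single fit without refinement.

```r
library(susieR)

# Half-size copy of the simulation from the report (assumption: the smaller
# size is only for speed; use n = p = 1000 to match the original).
set.seed(1)
n <- 500
p <- 500
beta <- rep(0, p)
beta[1:4] <- 1
X <- scale(matrix(rnorm(n * p), nrow = n, ncol = p), center = TRUE, scale = TRUE)
y <- drop(X %*% beta + rnorm(n))

# Single fit, no refinement, default max_iter.
fit <- susie(X, y, L = 10)

# Number of IBSS iterations actually used, and whether the fit converged.
cat("IBSS iterations:", fit$niter, "\n")
cat("converged:", fit$converged, "\n")
```

If `niter` is already well below the cap, lowering `max_iter` is unlikely to save much, and the number of refinement calls is the more likely cost.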

pcarbo · 2024-04-08T16:18:11Z

Yes, there is quite possibly room for improvement in the refinement step, but I don't have any clever ideas at the moment. Suggestions are welcome.

pcarbo added the enhancement (New feature or request) label on Apr 8, 2024
