[ENH] Assessing Performance #134
Problem

I noticed that julearn's performance sometimes seems poor. I am not sure whether this is a real trend or just the usual frustration with the speed of ML.

Solution

I am not sure whether this is actually a problem, but it would be nice to assess performance in general, to keep an eye on how much overhead we add on top of sklearn. Even if we do not change the speed, it is good to set realistic expectations for potential users.

Considerations

How does it change with more data or more transformers? It could be that each conversion from np.array to pd.DataFrame has a big impact. On the other hand, the implementation of confound removal could also be the reason for long computation times in real-world use.

Screenshot

I did one very simple observation with only one transformer. Even with 4x the data, julearn is still about 3x slower.

[image] <https://user-images.githubusercontent.com/44375312/117445735-5477ca00-af3b-11eb-8242-98078bcbb696.png>
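The DataFrame-conversion hypothesis is easy to probe in isolation. Below is a minimal timing sketch, not julearn's actual internals: the data sizes, the number of simulated round-trips, and the StandardScaler + SVC pipeline are placeholders chosen for illustration.

```python
import timeit

import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)

for n_samples in (500, 2000):  # 4x more data, as in the screenshot
    X = rng.normal(size=(n_samples, 20))
    y = rng.integers(0, 2, size=n_samples)
    pipe = make_pipeline(StandardScaler(), SVC())

    def fit_numpy():
        # plain sklearn fit, data stays in numpy throughout
        pipe.fit(X, y)

    def fit_with_roundtrips(n_steps=3):
        # simulate one np.array -> pd.DataFrame -> np.array round-trip
        # per pipeline step, roughly what a DataFrame-based wrapper adds
        Xc = X
        for _ in range(n_steps):
            Xc = pd.DataFrame(Xc).to_numpy()
        pipe.fit(Xc, y)

    t_np = timeit.timeit(fit_numpy, number=10)
    t_df = timeit.timeit(fit_with_roundtrips, number=10)
    print(f"n={n_samples}: numpy {t_np:.3f}s, with round-trips {t_df:.3f}s")
```

If the two timings stay close, the conversions are not the bottleneck and the confound-removal implementation becomes the more likely suspect.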
Comments

What about the same test using the numpy version of the julearn API?
For this simplistic example, there is no difference between the two julearn APIs.
What about adding some benchmark tests to check whether a PR seriously degrades performance? Could it be done in CI? Maybe @synchon can help with this one.
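One possible shape for such a regression test, assuming the pytest-benchmark plugin (the `benchmark` fixture and the flags mentioned below come from that plugin; the file name, data size, and model are placeholders):

```python
# contents of a hypothetical tests/test_benchmarks.py
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def test_fit_baseline(benchmark):
    # benchmark a plain sklearn fit; a matching julearn benchmark would let
    # CI track the overhead as a ratio rather than in absolute seconds
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 20))
    y = rng.integers(0, 2, size=1000)
    pipe = make_pipeline(StandardScaler(), SVC())
    benchmark(pipe.fit, X, y)
```

A reference run saved with `pytest --benchmark-autosave` could then be compared against in CI via `pytest --benchmark-compare --benchmark-compare-fail=mean:20%`, failing the job when a PR slows the benchmarked calls beyond the threshold.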
I'll take a look.
@synchon will take a look at it soon.