
Sudden drop in computing performance? #43

Open
arisbw opened this issue Sep 28, 2017 · 20 comments

arisbw commented Sep 28, 2017

Hi, I was running the code on my computer (Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, 12 GB RAM). I tried to train on 30k rows and 318 variables, but strangely there was a sudden drop in CPU activity and I stopped making any progress in fitting the models / moving on to the next fold (I use 5-fold cross validation, 8 models in the first layer and 1 model in the second layer, and all 4 of my threads). Basically, with this data, the drop happens after about 30 minutes of running StackNet. Here is a screenshot from my last run.

[screenshot: 2017-09-28]

I also tried playing with the threads parameter, but nothing changed.
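
For context, my parameters file follows the usual StackNet layout: one model per line, with an empty line separating the two levels (so 8 lines, a blank line, then 1 line). The sketch below only shows that shape; the model names and parameter values are placeholders, not my actual file:

RandomForestClassifier estimators:100 threads:4 seed:1 verbose:false
(... 7 more first-layer models, one per line ...)

LogisticRegression maxim_Iteration:100 seed:1 verbose:false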

arisbw changed the title from "Sudden drop in computing performance" to "Sudden drop in computing performance?" on Sep 28, 2017
kaz-Anova (Owner) commented

I have seen that before on Windows... Strangely, if you just press Enter in the console window it continues. Not sure why this happens... Can you try that and let me know?

arisbw commented Sep 28, 2017

Already tried it but nothing happened.

arisbw commented Sep 29, 2017

Strangely enough, this also occurred when I ran the code on Ubuntu 16.04.

kaz-Anova (Owner) commented

That is indeed strange. I have not seen that before... Does it always hang at the same place? Could you send me the file you run this on and the parameters file (as well as the command you run), and tell me where I should expect the pause, so I can try to replicate it?

goldentom42 commented

I'm also on Ubuntu 16.04 LTS; I can give it a try if you wish.

arisbw commented Sep 29, 2017

Thanks guys. Here are the files. You should expect the sudden drop in the 5th fold of the first layer (4th model).

goldentom42 commented

OK, got the files. Just started StackNet on Win 8.1 and will let you know.

arisbw commented Sep 29, 2017

If you happen to produce an output file, could you please send it back to me? Thanks.

goldentom42 commented

Yeah, sure ;-)
By the way, looking at the params file I saw that RandomForest had 5 threads while you have 4 logical cores. Did you try reducing that number?
I'm just wondering whether StackNet is waiting for that extra thread to complete.
This may sound completely stupid, but you never know...
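
If it helps, here is a quick standalone check (nothing StackNet-specific) of how many logical cores the JVM actually sees, so the threads values in the params file can be matched to it:

// CoreCount.java -- prints the number of logical processors visible to the JVM
public class CoreCount {
    public static void main(String[] args) {
        System.out.println("Logical cores visible to the JVM: "
                + Runtime.getRuntime().availableProcessors());
    }
}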

arisbw commented Sep 29, 2017

Ah, I see. You may be right, but then it should break at the first fold, right? I'm running it again now with modified thread params.

goldentom42 commented

Issue reproduced. Performance drops at 30% on the 5th fold with the first 4 models; 3 metrics are displayed, but not the 4th one.

goldentom42 commented

Reducing the number of threads does not change the problem. However, after reducing the number of estimators or iterations of the models, I managed to get StackNet through the whole process...
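
To be concrete, by "reducing estimators or iterations" I just mean lowering the corresponding counts in the params file to tiny values so the run finishes quickly, e.g. something along these lines (the rest of each line left as in your file; treat the parameter names here as assumptions about which models you use):

RandomForestClassifier estimators:4 threads:4 seed:1 verbose:false
softmaxnnclassifier maxim_Iteration:4 seed:1 verbose:false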

arisbw commented Sep 29, 2017

OK. Could you please share those modified params?

kaz-Anova (Owner) commented

I am also running it right now and will let you know the results. I have many cores available and did not encounter a problem at the 2nd fold (e.g. I am in the 4th now), which makes me think this is generally related to threading...

kaz-Anova (Owner) commented

It is reproduced. I don't know why, but it seems to be at the predict() of the Softmaxclassifier. The reason the CPU drops is not relevant in itself - it has to do with the fact that we are in scoring and threading is not used there. For some reason there must be a bug in the code causing an infinite loop somewhere.

It does not throw an error, though...
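
To illustrate the kind of failure I mean (a generic sketch only, not the actual StackNet predict() code): a scoring loop that exits purely on a floating-point tolerance can spin forever if the update stalls or oscillates, which would look exactly like this - one core busy, no exception, no progress. A hard iteration cap is the usual defensive fix:

// ConvergenceLoopSketch.java -- hypothetical pattern, not StackNet source.
// Without the iteration cap, a stalled update would loop forever.
public class ConvergenceLoopSketch {
    public static void main(String[] args) {
        final int MAX_ITERATIONS = 10000;   // cap that guarantees termination
        final double TOLERANCE = 1e-12;
        double previous = Double.MAX_VALUE;
        double current = 0.0;
        int iterations = 0;
        while (Math.abs(previous - current) > TOLERANCE && iterations < MAX_ITERATIONS) {
            previous = current;
            current = 0.5 * current + 0.5;  // stand-in update that converges to 1.0
            iterations++;
        }
        System.out.println("Stopped after " + iterations
                + " iterations at value " + current);
    }
}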

kaz-Anova (Owner) commented

I will try to find a workaround.

goldentom42 commented

@arisbw, the reduced estimator counts were really low, like 3 or 4... so it won't help you.
As @kaz-Anova said above, you may want to remove the softmaxclassifier for now.

arisbw commented Sep 29, 2017

OK, I'll make sure to remove softmaxclassifier for now. Thank you @goldentom42 @kaz-Anova

kaz-Anova (Owner) commented

@arisbw,

You don't need to remove it. If you just change the seed of the Softmax to 10, it works fine:

softmaxnnclassifier usescale:True seed:10 Type:SGD maxim_Iteration:35 C:0.0005 learn_rate:0.001 smooth:0.0001 h1:50 h2:40 connection_nonlinearity:Relu init_values:0.05 verbose:false

Honestly, I don't have a single clue why...
These are the results of all the models:

Average of all folds model 0 : 0.7867379723430724
Average of all folds model 1 : 0.7929649557885149
Average of all folds model 2 : 0.7866359111370649
Average of all folds model 3 : 0.7764824969782087
Average of all folds model 4 : 0.7805320681382869
Average of all folds model 5 : 0.7754102012289547
Average of all folds model 6 : 0.7642405924462954
Average of all folds model 7 : 0.7817682598159342

arisbw commented Sep 29, 2017

... Now this is even weirder than I could have imagined. Again, thanks @kaz-Anova!
