
Demo on comparison of performance of S-RerF against other classifiers on real EEG data for grasp detection #5

Open
wants to merge 1 commit into base: staging

Conversation

@sanika1201 commented Dec 9, 2019

Description
Goal: Compare performance of S-Rerf with different classifiers on grasp detection using real EEG data.

This demo is a Jupyter Notebook analyzing the performance of S-RerF against classifiers such as K-Nearest Neighbors, Random Forest, and Multi-Layer Perceptron on structured EEG data. To preserve the structure of the data, binning (based on the concept of a moving average filter) is applied before training. One challenge is that the data are highly imbalanced, so they are balanced before training. The metrics used for evaluation are precision curves, balanced accuracy, and mean test error.

Output: Precision, balanced accuracy, and mean test error plots comparing the performance of S-RerF with the other classifiers.
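The class balancing mentioned above can be done in several ways; the sketch below shows one common approach, random under-sampling of the majority class. This is a minimal illustration with hypothetical names, not code from the notebook, and the notebook's actual balancing method may differ:

```python
import numpy as np
import pandas as pd

def balance_by_undersampling(X, y, seed=0):
    """Randomly under-sample every class down to the size of the
    minority class, so the returned dataset is balanced."""
    rng = np.random.default_rng(seed)
    counts = y.value_counts()
    n_min = counts.min()
    keep = []
    for label in counts.index:
        idx = y.index[y == label].to_numpy()
        keep.extend(rng.choice(idx, size=n_min, replace=False))
    keep = sorted(keep)
    return X.loc[keep], y.loc[keep]

# toy imbalanced dataset: 8 negatives, 2 positives
X = pd.DataFrame({"f": range(10)})
y = pd.Series([0] * 8 + [1] * 2)
Xb, yb = balance_by_undersampling(X, y)
# yb now contains 2 rows of each class
```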

Code and Details of the demo:
https://nbviewer.jupyter.org/github/NeuroDataDesign/team-forbidden-forest/blob/master/Sanika/Final_PR_upload.ipynb

* update max_features to accept a fraction > 1.0

* put inequality in easier to read form.
@bdpedigo

@sanika1201 I don't understand why the one commit here is Jesse's; did you mean to PR the notebook somewhere? I know your situation is a bit special, however.

@bdpedigo

Some of your line lengths are way too long; my rule of thumb is <88 chars.

@bdpedigo

  • Remove old code that is commented out.
  • When you index by `[:, 32]` in cell 4, what is that doing?
  • I don't understand how you are doing the downsampling/resampling. Are you just grabbing time points at random?
  • The line `Y_train_downsampled = X_train_downsampled.iloc[:,32]` looks suspect to me; can you explain?
  • Again, can you explain `X_train_downsampled.drop(X_train_downsampled.columns[[32]], axis=1, inplace=True)` to me?
  • I'd just do all imports at the beginning of the notebook.
  • `raw,y_raw,raw_t,y_rar_t = None,None,None,None` followed by `print(raw)`?
  • Can you plot some of the data? Maybe a few positive and a few negative examples? It is hard for me to understand what is going on without it, and it might help you understand too. You may also want to plot before and after your train/test splitting and resampling, so you can make sure you are not messing anything up in that process.
  • It looks like the precision plot is still not making sense, if I am understanding correctly.
  • Can you remind me what the true class imbalance is?

I think my main feedback is that I want to better understand how you are splitting your data before debugging the downstream stuff too much; I am worried that may be part of the issue. To do that, I would like to see some sample time series from each class, before and after all of your preprocessing. Let me know if that does not make sense or you don't agree.

@sanika1201
Author

> @sanika1201 I don't understand why the one commit here is Jesse's, did you mean to PR the notebook somewhere? I know your situation is a bit special, however.

@bdpedigo, I meant to PR to NeuroDataDesign/SPORF; I don't know how the commit got included. Should I make a different PR?

@sanika1201
Author

> • remove old code that is commented out
> • when you index by [:, 32] in cell 4, what is that doing?
> • I don't understand how you are doing the downsampling/resampling whatever. are you just grabbing time points at random?
> • This line Y_train_downsampled = X_train_downsampled.iloc[:,32] looks suspect to me, can you explain?
> • Again, can you explain X_train_downsampled.drop(X_train_downsampled.columns[[32]],axis=1,inplace = True) to me?
> • I'd just do all imports at the beginning of the notebook
> • raw,y_raw,raw_t,y_rar_t = None,None,None,None print (raw)?
> • can you plot some of the data? Maybe a few each positive and negative examples? It is hard for me to understand what is going on without it, and that might help understand what is going on for you too. May also want to consider doing so before and after your train test splitting as well as resampling so that you can make sure you are not messing anything up in that process
> • looks like the precision plot is still not making sense if I am understanding correctly
> • can you remind me what is the true class imbalance?
>
> I think my main feedback is I want to better understand how you are splitting your data before debugging the downstream stuff too much. I am worried that may be part of the issue. I think to do that I would like to see some sample time series from each class, before and after all of your preprocessing. Let me know if that does not make sense or you don't agree

@bdpedigo I have made the changes we discussed and uploaded the latest code and plots to this PR.

@bdpedigo

> @sanika1201 I don't understand why the one commit here is Jesse's, did you mean to PR the notebook somewhere? I know your situation is a bit special, however.
>
> @bdpedigo , I meant to PR to NeuroDataDesign/SPORF, i dont know how the commit got included. Should make a different PR?

I'd rather you remove just that one commit; I don't like remaking PRs because you lose all of the comments.

@bdpedigo

The notebook itself should be part of this PR, just FYI.

@bdpedigo

I think we have talked about this already, but a moving average filter is not what I meant by binning at all.

Binning for a single channel:

  • Divide the single timeseries into n bins, each of width m.
  • Stack those individual bins into an n-by-m matrix X. Input X as the training data.

Binning for multichannel:

  • For each channel 1...C, form the data matrices X_1 ... X_C described above.
  • Concatenate the columns of X_1 ... X_C to make X_big, an n-by-(C x m) matrix.
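The binning recipe above can be sketched in a few lines of numpy. This is a minimal illustration of the described scheme, with hypothetical function names, not code from the notebook:

```python
import numpy as np

def bin_single_channel(ts, m):
    """Divide a 1-D timeseries into n = len(ts) // m bins of width m
    and stack them into an (n, m) matrix (trailing samples that do
    not fill a bin are dropped)."""
    n = len(ts) // m
    return ts[: n * m].reshape(n, m)

def bin_multichannel(channels, m):
    """channels: (C, T) array. Bin each channel as above and
    concatenate the (n, m) matrices column-wise into X_big,
    an (n, C*m) matrix."""
    return np.hstack([bin_single_channel(ch, m) for ch in channels])

# tiny example: C = 2 channels, T = 6 time points, bin width m = 3
X_big = bin_multichannel(np.arange(12).reshape(2, 6), m=3)
# X_big has shape (2, 6): n = 2 bins, C*m = 2*3 = 6 columns
```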

@bdpedigo

Does that make sense? I want to make sure I am being clear. I think we may be out of time to actually do this right now, but I still want to make sure it is clear for the future.

@bdpedigo

The plots look good though, and I think they make more sense than what you have shown in the past.

@sanika1201
Author

> I think we have talked about this already, but moving average filter is not what I meant by binning at all.
>
> Binning for a single channel:
>
> • divide single timeseries into n bins, each of width m.
> • stack those individual bins into a n by m matrix, X. Input X as the training data
>
> Binning for multichannel
>
> • For each channel 1...C, form X_1 ... X_C data matrices described above
> • concatenate columns of X_1 ... X_C to make X_big, a n by C x m matrix

Yes, I understand this, and it makes more sense. Due to memory limitations, I decided to down-sample each bin to a single representative value, the bin mean. I went through a few recommendations on Kaggle, and this was one of the suggestions that gave decent results on a neural network, so I went ahead with it.
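The mean-per-bin down-sampling described here can be sketched as follows. This is a hypothetical illustration of the idea, not the notebook's actual code:

```python
import numpy as np

def downsample_to_bin_means(ts, m):
    """Replace each bin of width m with its mean, turning a
    length-T timeseries into a length-(T // m) one; this is the
    moving-average-style down-sampling used to fit in memory."""
    n = len(ts) // m
    return ts[: n * m].reshape(n, m).mean(axis=1)

means = downsample_to_bin_means(np.array([1.0, 3.0, 2.0, 4.0]), m=2)
# means == [2.0, 3.0]
```

Compared with the full binning scheme above, this keeps one value per bin instead of m, trading temporal structure for a much smaller training matrix.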

@bdpedigo

I see. In that case, it feels like we are mostly limited by compute power at this point?

@sanika1201
Author

> I see. in that case feels like we are mostly limited by compute power at this point?

Yes. If we can get a little more compute power next semester, I will try to get better results on this with the improvements you mentioned above.

@bdpedigo

The plots are clear, and this should scale up nicely once we get you some actual compute resources; at that point I think we will be able to actually compare results. I don't have much more to recommend right now, so I think you are done. Nice work!

@sanika1201
Author

sanika1201 commented Dec 20, 2019

> plots are clear, and this should scale up nicely once we get you some actual compute resources, and at that point i think we will be able to actually compare results. I don't have much more to recommend right now so I think you are done. Nice work!

Thanks!

@sanika1201 sanika1201 closed this Dec 20, 2019
@sanika1201 sanika1201 reopened this Dec 20, 2019
@sanika1201
Author

> the notebook itself should be part of this PR, just FYI

@bdpedigo, I think the other commit got added to this pull request instead of my notebook. Should I just make another PR and link this PR there so that the comments are not lost?
