llama cannot handle repeated CVs #64

Open
berndbischl opened this issue Mar 11, 2014 · 6 comments

Comments

@berndbischl
Owner

No description provided.

@berndbischl
Owner Author

We could IN PRINCIPLE simulate repeated CVs with BatchExperiments.

This might be wasted effort, though: handling this in llama out of the box is still preferable, and "working around this" in this package might be ugly.

@larskotthoff
Collaborator

I'm not sure that either this package or LLAMA should provide a facility for this. Providing CV folds as part of the task spec may be useful, but providing detailed specs for a whole series of experiments is going a bit too far, I think.

@berndbischl
Owner Author

If the data set is very small (or imbalanced, or strange in some other way), 10-fold CV is not the best resampling strategy. In that case we have to select a better one, store it on the server, and handle it in the experiments.

Weren't you the one who suggested looking at the variance when we change the splitting?

@berndbischl
Owner Author

The problem might be that 10-fold CV is OK, but we as scientists might still worry, and we would only stop worrying once we SEE that the variance is not that big.

Like I said, BatchExperiments would allow us to do that in about 10 minutes of coding.

I will do this later (after submitting the paper), then we can study the effect.
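
To make the "see that the variance is not so big" idea concrete, here is a minimal sketch in plain Python. It does not use llama or BatchExperiments; the synthetic runtimes, the "single best algorithm" selector, and all function names are assumptions made up for illustration. It reruns 10-fold CV with differently shuffled splits and reports the spread of the resulting estimates.

```python
import random
import statistics


def kfold_assignment(n_instances, n_folds, rng):
    """Return a list giving each instance a fold index in 0..n_folds-1."""
    folds = [i % n_folds for i in range(n_instances)]
    rng.shuffle(folds)
    return folds


def cv_estimate(perf, folds, n_folds):
    """Cross-validated mean runtime of a 'single best algorithm' selector.

    perf[i][a] is the (synthetic) runtime of algorithm a on instance i.
    """
    n_algos = len(perf[0])
    total = 0.0
    for k in range(n_folds):
        train = [row for row, f in zip(perf, folds) if f != k]
        test = [row for row, f in zip(perf, folds) if f == k]
        # Pick the algorithm with the lowest total runtime on the training part.
        best = min(range(n_algos), key=lambda a: sum(row[a] for row in train))
        total += sum(row[best] for row in test)
    return total / len(perf)


def repeated_cv_spread(perf, n_folds=10, n_reps=20, seed=1):
    """Run n_reps differently shuffled CVs; return (estimates, mean, std dev)."""
    rng = random.Random(seed)
    estimates = [
        cv_estimate(perf, kfold_assignment(len(perf), n_folds, rng), n_folds)
        for _ in range(n_reps)
    ]
    return estimates, statistics.mean(estimates), statistics.stdev(estimates)


# Synthetic data (an assumption): 30 instances, 3 algorithms, random runtimes.
data_rng = random.Random(0)
perf = [[data_rng.expovariate(1.0) for _ in range(3)] for _ in range(30)]
estimates, mean_est, sd_est = repeated_cv_spread(perf)
print(f"mean={mean_est:.3f} sd={sd_est:.3f} over {len(estimates)} repetitions")
```

If the standard deviation across repetitions is small relative to the differences between the approaches being compared, a single 10-fold split is probably good enough.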

@larskotthoff
Collaborator

I completely agree, I just don't think that this should be part of the package itself.

@berndbischl
Owner Author

I think we should do this:
convert2llama will return a list of what it creates now, with one element per repetition of the CV.
We will then iterate over this list, maybe directly with BE (BatchExperiments), like I said; then it is parallelized automatically.
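
The "list with one element per repetition" layout could look roughly like the following plain-Python sketch. This is not the convert2llama or BatchExperiments API: `make_repeated_splits` and `run_one_repetition` are hypothetical names, and a thread pool merely stands in for whatever parallel backend would actually be used.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def make_repeated_splits(instance_ids, n_folds, n_reps, seed=1):
    """Return a list with one entry per repetition.

    Each entry is a list of n_folds test sets (lists of instance ids);
    the training set of a fold is everything outside its test set.
    """
    rng = random.Random(seed)
    splits = []
    for _ in range(n_reps):
        ids = list(instance_ids)
        rng.shuffle(ids)
        # Deal the shuffled ids round-robin into n_folds test sets.
        splits.append([ids[k::n_folds] for k in range(n_folds)])
    return splits


def run_one_repetition(test_sets):
    """Hypothetical stand-in for one full CV run; it only reports fold sizes."""
    return [len(test_set) for test_set in test_sets]


instance_ids = [f"inst{i}" for i in range(25)]
splits = make_repeated_splits(instance_ids, n_folds=10, n_reps=3)

# Repetitions are independent of each other, so they can be mapped in parallel.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_one_repetition, splits))

for rep, sizes in enumerate(results, start=1):
    print(f"repetition {rep}: test-set sizes {sizes}")
```

Because each repetition is self-contained, the code that handles a single CV does not need to change; only the outer loop knows about repetitions.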
