Corpus reader can create batch size of 0 even when training prefixes are present #188
Comments
persephone/persephone/tests/experiments/test_na.py, lines 97 to 122 at commit d73db45. Specifically on line 126.
Added some tests for
I see in the constructor there's the following:

```python
if not num_train:
    if not batch_size:
        batch_size = 64
    num_train = len(corpus.get_train_fns()[0])
```

This will find the training feature files. Do we want to warn if no feature files are found? It seems as though this class has a precondition that the feature files are already preprocessed, so a warning would make sense here.
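A minimal sketch of what such a warning could look like. The helper below is a hypothetical stand-in for `corpus.get_train_fns()` (the directory layout, `.npy` extension, and function signature are all assumptions, not the project's actual API); it only illustrates warning when training prefixes exist but no preprocessed features are found:

```python
import logging
from pathlib import Path
from typing import List

logger = logging.getLogger(__name__)

def get_train_fns(feat_dir: Path, prefixes: List[str]) -> List[Path]:
    """Hypothetical stand-in for corpus.get_train_fns(): collect the
    preprocessed feature files that actually exist for the prefixes."""
    fns = [feat_dir / (p + ".npy") for p in prefixes
           if (feat_dir / (p + ".npy")).exists()]
    if prefixes and not fns:
        # The class's precondition (features already preprocessed) is
        # likely violated, so warn instead of silently returning [].
        logger.warning(
            "No feature files found in %s for %d training prefixes; "
            "has preprocessing been run?", feat_dir, len(prefixes))
    return fns
```

Downstream code that computes `num_train = len(...)` would then at least leave a trace in the logs before producing a zero-sized training set.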
I just pushed a check in train_batch_gen() that throws an exception if the Corpus has no training utterances. I'm removing the bug status now, but I'll keep the issue open since there's an open question of whether to move some of that batch_size logic in the constructor to .train_batch_gen(), since it's only relevant there.

The reason I don't want to do this immediately is that it's probably a good idea to load the dev and test data in batches too, since conceivably they won't all fit in memory. Perhaps the simplest option is to give the batch_size argument a reasonable kwarg default (32 or something) and use it when generating train, dev, and test batches.

I can then get rid of most of the logic, including the exception about divisibility of num_train and batch_size that I added a while back. Who cares; the model should just feed the remainder in as a batch smaller than the others.
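The comment above can be sketched as follows. This is not the project's actual implementation: the generic signature and the use of `ValueError` (standing in for whatever exception persephone raises) are assumptions. It shows the fail-fast check on an empty corpus, a reasonable `batch_size` default, and the remainder fed in as a final smaller batch rather than requiring even divisibility:

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

def train_batch_gen(utterances: Sequence[T],
                    batch_size: int = 32) -> Iterator[List[T]]:
    """Sketch of the approach described above (names are assumptions):
    raise on an empty corpus, default batch_size sensibly, and yield
    the remainder as a final batch smaller than the others."""
    if not utterances:
        # Standing in for the project's actual exception type.
        raise ValueError("Corpus has no training utterances; "
                         "were the feature files preprocessed?")
    for start in range(0, len(utterances), batch_size):
        yield list(utterances[start:start + batch_size])
```

With 10 utterances and `batch_size=4` this yields batches of sizes 4, 4, and 2; the same generator could serve dev and test splits, which is why keeping the logic out of the constructor is attractive.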
There's a situation I ran into with the test_fast test in test_na.py where, if the batch size is not specified, the CorpusReader creates a batch size of 0. This seems to be a bug.
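The thread doesn't show how CorpusReader derives a batch size when none is given, so the snippet below is a hypothetical reconstruction of one way the symptom in the title ("batch size of 0 even when training prefixes are present") can arise: deriving the batch size by integer division, which collapses to 0 whenever the corpus is smaller than the divisor, as with a small test corpus like the one in test_fast:

```python
def derive_batch_size(num_train: int, num_batches: int = 16) -> int:
    """Hypothetical reconstruction (not persephone's code): split the
    training set into a fixed number of batches. Integer division
    collapses to 0 whenever num_train < num_batches, even though
    training prefixes are present."""
    return num_train // num_batches
```

For example, `derive_batch_size(10)` returns 0 while `derive_batch_size(64)` returns 4, which is why a guard such as `max(1, num_train // num_batches)` or an explicit default would avoid the zero-batch case.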