Error Arises during pretraining #1

Open
saugatabose28 opened this issue Jun 4, 2021 · 4 comments

Comments

@saugatabose28
Hi,

I have been trying to implement your text classification code, but I encountered an issue during the pretraining phase. The following error arose during execution:
"

> AttributeError                            Traceback (most recent call last)
> 
> <ipython-input-223-27df7ff3f6f4> in <module>()
>       6          optimizer=optimizer,
>       7          scheduler=scheduler,
> ----> 8          num_epochs=NUM_EPOCHS)
>       9 
> 
> 4 frames
> 
> <ipython-input-214-4ac07f19f9a1> in pretrain(model, optimizer, train_iter, valid_iter, scheduler, valid_period, num_epochs)
>      26                            attention_mask=mask)
>      27 
> ---> 28             loss = torch.nn.CrossEntropyLoss()(y_pred, target)
>      29 
>      30             loss.backward()
> 
> /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
>     887             result = self._slow_forward(*input, **kwargs)
>     888         else:
> --> 889             result = self.forward(*input, **kwargs)
>     890         for hook in itertools.chain(
>     891                 _global_forward_hooks.values(),
> 
> /usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
>    1046         assert self.weight is None or isinstance(self.weight, Tensor)
>    1047         return F.cross_entropy(input, target, weight=self.weight,
> -> 1048                                ignore_index=self.ignore_index, reduction=self.reduction)
>    1049 
>    1050 
> 
> /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
>    2691     if size_average is not None or reduce is not None:
>    2692         reduction = _Reduction.legacy_get_string(size_average, reduce)
> -> 2693     return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
>    2694 
>    2695 
> 
> /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in log_softmax(input, dim, _stacklevel, dtype)
>    1670         dim = _get_softmax_dim("log_softmax", input.dim(), _stacklevel)
>    1671     if dtype is None:
> -> 1672         ret = input.log_softmax(dim)
>    1673     else:
>    1674         ret = input.log_softmax(dim, dtype=dtype)
> 
> AttributeError: 'str' object has no attribute 'log_softmax'
>

"
Have you ever faced such an issue during execution?

@aramakus
Owner

aramakus commented Jun 5, 2021

Thanks for reporting the issue.
The issue is that this line in the ROBERTAClassifier declaration needs to be updated from

self.roberta = RobertaModel.from_pretrained('roberta-base')

to

self.roberta = RobertaModel.from_pretrained('roberta-base', return_dict=False)

When this notebook was created, the model returned a tuple by default. That default has since changed, so the tuple output now needs to be requested explicitly. I will update the notebook to reflect this. That should fix the issue you've got.
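For reference, here is a minimal sketch of how the corrected line fits into a ROBERTAClassifier module. The classification head (dropout rate, single linear layer, number of classes) is illustrative and assumed here, not necessarily the exact head used in the notebook:

```python
import torch
from transformers import RobertaModel

class ROBERTAClassifier(torch.nn.Module):
    def __init__(self, n_classes=2, dropout_rate=0.3):
        super().__init__()
        # return_dict=False makes the model return a plain tuple
        # (sequence_output, pooled_output) instead of a ModelOutput object,
        # which is what the pretrain/train loops in the notebook expect.
        self.roberta = RobertaModel.from_pretrained('roberta-base', return_dict=False)
        self.dropout = torch.nn.Dropout(dropout_rate)
        self.classifier = torch.nn.Linear(self.roberta.config.hidden_size, n_classes)

    def forward(self, input_ids, attention_mask):
        # Unpack the tuple output; pooled_output corresponds to the [CLS]-style summary.
        _, pooled_output = self.roberta(input_ids=input_ids, attention_mask=attention_mask)
        return self.classifier(self.dropout(pooled_output))
```

With return_dict=True (the newer default), the forward call would instead return a ModelOutput whose fields must be accessed by name, which is why the old tuple-unpacking code ends up passing a string into the loss function.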

@saugatabose28
Author

@aramakus Hi,
Sorry for the delayed response. Thank you very much for your kind support. I really appreciate that you took the time to track down the cause (most people forget their old work and don't feel like revisiting the same task, so thanks).
I am implementing RoBERTa in one of my research projects and would like to cite you in my paper. I would be glad if you could share any of your papers. Thanks.

@aramakus
Owner

aramakus commented Jun 8, 2021

Hi saugatabose28,
Sorry for the delayed reply. My research publications are mostly in modelling and data analysis for physics; NLP is something I enjoy and am passionate about. You are welcome to refer to my Medium post, of course. For academic references, you may find links in my Medium post to the original papers that discuss batch size and the learning rate scheduler. Good luck with your research!

@saugatabose28
Author

Thanks aramakus. One final thing I would like to draw your attention to: is RoBERTa a voracious RAM eater, and is it a slow learner? I am trying to run our model on one of the datasets, and RoBERTa has been training on the train data for over 48 hours (out of 12 epochs, it has completed 6 so far). Have you faced a similar issue?
