Error Arises during pretraining #1

Open
saugatabose28 opened this issue Jun 4, 2021 · 4 comments

Comments

@saugatabose28
Hi,

I have been trying to implement your text classification code, but I encountered an issue during the pretraining phase. The following error arose during execution:
"

> AttributeError                            Traceback (most recent call last)
> 
> <ipython-input-223-27df7ff3f6f4> in <module>()
>       6          optimizer=optimizer,
>       7          scheduler=scheduler,
> ----> 8          num_epochs=NUM_EPOCHS)
>       9 
> 
> 4 frames
> 
> <ipython-input-214-4ac07f19f9a1> in pretrain(model, optimizer, train_iter, valid_iter, scheduler, valid_period, num_epochs)
>      26                            attention_mask=mask)
>      27 
> ---> 28             loss = torch.nn.CrossEntropyLoss()(y_pred, target)
>      29 
>      30             loss.backward()
> 
> /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
>     887             result = self._slow_forward(*input, **kwargs)
>     888         else:
> --> 889             result = self.forward(*input, **kwargs)
>     890         for hook in itertools.chain(
>     891                 _global_forward_hooks.values(),
> 
> /usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
>    1046         assert self.weight is None or isinstance(self.weight, Tensor)
>    1047         return F.cross_entropy(input, target, weight=self.weight,
> -> 1048                                ignore_index=self.ignore_index, reduction=self.reduction)
>    1049 
>    1050 
> 
> /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
>    2691     if size_average is not None or reduce is not None:
>    2692         reduction = _Reduction.legacy_get_string(size_average, reduce)
> -> 2693     return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
>    2694 
>    2695 
> 
> /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in log_softmax(input, dim, _stacklevel, dtype)
>    1670         dim = _get_softmax_dim("log_softmax", input.dim(), _stacklevel)
>    1671     if dtype is None:
> -> 1672         ret = input.log_softmax(dim)
>    1673     else:
>    1674         ret = input.log_softmax(dim, dtype=dtype)
> 
> AttributeError: 'str' object has no attribute 'log_softmax'
>

"
Have you ever faced such an issue during execution?

@aramakus
Owner

aramakus commented Jun 5, 2021

Thanks for reporting the issue.
The issue is that this line in the ROBERTAClassifier declaration needs to be updated from

self.roberta = RobertaModel.from_pretrained('roberta-base')

to

self.roberta = RobertaModel.from_pretrained('roberta-base', return_dict=False)

When this notebook was created, the model returned a tuple by default. That default has since changed, so the tuple output now needs to be requested explicitly. I will update the notebook to reflect this. That should fix the issue you've got.
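For reference, here is a minimal sketch of how the corrected line fits into a ROBERTAClassifier module. The classification head (dropout rate, single linear layer, number of classes) is illustrative and assumed here, not necessarily the exact head used in the notebook:

```python
import torch
from transformers import RobertaModel

class ROBERTAClassifier(torch.nn.Module):
    def __init__(self, n_classes=2, dropout_rate=0.3):
        super().__init__()
        # return_dict=False makes the model return a plain tuple
        # (sequence_output, pooled_output) instead of a ModelOutput object,
        # which is what the pretrain/train loops in the notebook expect.
        self.roberta = RobertaModel.from_pretrained('roberta-base', return_dict=False)
        self.dropout = torch.nn.Dropout(dropout_rate)
        self.classifier = torch.nn.Linear(self.roberta.config.hidden_size, n_classes)

    def forward(self, input_ids, attention_mask):
        # Unpack the tuple output; pooled_output corresponds to the [CLS]-style summary.
        _, pooled_output = self.roberta(input_ids=input_ids, attention_mask=attention_mask)
        return self.classifier(self.dropout(pooled_output))
```

With return_dict=True (the newer default), the forward call would instead return a ModelOutput whose fields must be accessed by name, which is why the old tuple-unpacking code ends up passing a string into the loss function.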

@saugatabose28
Author

@aramakus Hi,
Sorry for the delayed response. Thank you very much for your kind support. I really appreciate that you took the time to track down the cause (most people forget their old work and don't feel like revisiting the same task, so thanks).
I am implementing RoBERTa in one of my research projects and would like to cite you in my paper. I would be glad if you could share any of your papers. Thanks.

@aramakus
Owner

aramakus commented Jun 8, 2021

Hi saugatabose28,
Sorry for the delayed reply. My research publications are mostly in modelling and data analysis for physics; NLP is something I enjoy and am passionate about. You are welcome to refer to my Medium post, of course. For academic references, you may find links in my Medium post to the original papers that discuss batch size and the learning rate scheduler. Good luck with your research!

@saugatabose28
Author

Thanks aramakus. One final thing I would like to draw your attention to: is RoBERTa a voracious RAM eater, and is it a slow learner? I am trying to run our model on one of the datasets, and RoBERTa has been training on the train data for over 48 hours (out of 12 epochs, it has completed 6 so far). Have you faced a similar issue?
