The "padding" is done while preprocessing the data. We explode the full ordered list of each user ratings into multiple subsequences. Let me illustrate:
We are using each movie on the sequence as a target with its subsequent past, padding with 0s if past is missing.
We reserve the latest window of the sequence 2, [2.3] ,4] as validation as is the latest known step of the user (closest to actual time).
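A minimal sketch of that preprocessing, assuming a history length of 2 (the function name `explode_user_sequence` and the toy ratings list are illustrative, not the repo's actual code):

```python
def explode_user_sequence(ratings, history_len=2):
    """Turn one user's ordered ratings into (history, target) pairs,
    left-padding the history with 0s when it is shorter than history_len."""
    examples = []
    for i, target in enumerate(ratings):
        past = ratings[max(0, i - history_len):i]
        past = [0] * (history_len - len(past)) + past  # pad missing past with 0s
        examples.append((past, target))
    return examples

pairs = explode_user_sequence([1, 2, 3, 4])
# pairs == [([0, 0], 1), ([0, 1], 2), ([1, 2], 3), ([2, 3], 4)]
train, valid = pairs[:-1], pairs[-1]  # the latest window [2, 3, 4] is held out
```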
This model has no hard application limit, since you are not losing any datapoints; once you use it for inference, you can feed it any sequence length, assuming your batch_size is 1.
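To see why batch_size=1 removes the length constraint, here is a generic sketch (the repo builds its transformer from a third-party library, and may use a different framework than the PyTorch stand-in below): self-attention itself accepts any sequence length, so a single unpadded sequence of any length runs fine; only batching sequences of different lengths together would force padding.

```python
import torch
import torch.nn as nn

# Illustrative stand-in for the repo's transformer module; the point is only
# that the encoder has no built-in fixed sequence length.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
embed = nn.Embedding(num_embeddings=1000, embedding_dim=32, padding_idx=0)

for seq_len in (3, 7, 50):                      # any length works...
    seq = torch.randint(1, 1000, (1, seq_len))  # ...with batch_size == 1
    out = encoder(embed(seq))
    print(out.shape)  # torch.Size([1, seq_len, 32])
```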
Thanks a lot! You used a third-party library in the model part to build the transformer module. But will the transformer automatically ignore the loss value caused by padding?
Obviously, only fixed-length sequences are used here, and there is no padding operation, which limits the model's scope of application.