I also have a question about weight sharing: in your example, the time LSTM's weights are not shared between layers (they are shared across time only, thanks to the clones), while the depth LSTM's weights are shared across both layers and time. This actually makes a lot of sense.

But it surprised me at first read, because the "tied N-LSTM" is, by definition, sharing weights along all dimensions. Either

- not cloning the depth LSTM's weights across time, or
- also sharing the time LSTM's weights across depth

would be more coherent... do you have any thoughts on this? (A sketch of the current scheme is below.)
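For concreteness, here is a minimal Torch/nn sketch of the asymmetry I mean. This is illustrative only, not the repo's actual code: `nn.Linear` stands in for an LSTM block, and the layer/time counts are arbitrary.

```lua
require 'nn'

local L, T = 3, 5  -- arbitrary layer and time-step counts

-- Time LSTM as implemented: one prototype per layer, cloned across
-- time with shared parameters (clone(...) with parameter names
-- shares storage with the prototype).
local time_protos, time_clones = {}, {}
for l = 1, L do
  time_protos[l] = nn.Linear(10, 10)  -- fresh weights for each layer
  time_clones[l] = {}
  for t = 1, T do
    time_clones[l][t] = time_protos[l]:clone(
      'weight', 'bias', 'gradWeight', 'gradBias')
  end
end

-- Depth LSTM as implemented ("tied"): a single prototype cloned
-- across both layers and time, so every copy shares one weight set.
local depth_proto = nn.Linear(10, 10)
local depth_clones = {}
for l = 1, L do
  depth_clones[l] = {}
  for t = 1, T do
    depth_clones[l][t] = depth_proto:clone(
      'weight', 'bias', 'gradWeight', 'gradBias')
  end
end
```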
Hi,
I'm just wondering why you use this form in https://github.com/coreylynch/grid-lstm/blob/master/model/GridLSTM.lua#L31:

```lua
local next_h = nn.CMulTable()({out_gate, nn.Tanh()(next_c)})
```

whereas in the paper it is:

```lua
local next_h = nn.Tanh()(nn.CMulTable()({out_gate, next_c}))
```
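For reference, and assuming I'm reading both correctly: the first snippet computes the standard LSTM output,

$$h = o \odot \tanh(c),$$

whereas the Grid LSTM paper (Kalchbrenner et al., 2015, Sec. 2.1) places the nonlinearity outside the gating,

$$\mathbf{h}' = \tanh\left(\mathbf{g}^{o} \odot \mathbf{m}'\right),$$

which is what the second snippet implements.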
Thank you for your response.