
LSTM hidden layer computation #4

Open
christopher5106 opened this issue Sep 21, 2016 · 2 comments

@christopher5106

Hi,
I'm just wondering why you use this form in https://github.com/coreylynch/grid-lstm/blob/master/model/GridLSTM.lua#L31

local next_h = nn.CMulTable()({out_gate, nn.Tanh()(next_c)})

whereas in the paper it is:

local next_h = nn.Tanh()(nn.CMulTable()({out_gate, next_c}))
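For concreteness, the two orderings really do compute different values. A minimal numeric sketch in plain Python (the scalar values for `out_gate` and `next_c` are assumed for illustration; the real code operates elementwise on tensors):

```python
import math

out_gate = 0.9   # assumed output-gate activation (sigmoid output, in (0, 1))
next_c = 2.0     # assumed new cell-state value (unbounded)

# Ordering in GridLSTM.lua (and the standard LSTM): squash the cell
# state with tanh first, then multiply by the output gate.
h_code = out_gate * math.tanh(next_c)

# Ordering asked about: gate the cell state first, then squash the product.
h_other = math.tanh(out_gate * next_c)

print(h_code, h_other)  # the two values diverge as |next_c| grows
```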

Thank you for your response

@christopher5106
Author

I also have a question about weight sharing: for the time LSTM in your example, weights are not shared between layers (they are shared only across time, thanks to the clones), while for the depth LSTM, weights are shared across both layers and time. This makes a lot of sense, in fact.

But it surprised me at first read, because the "tied N-LSTM" is, by definition, sharing weights along all dimensions.

Either

  1. NOT cloning the weights of the depth LSTM across time, or
  2. also sharing the weights of the time LSTM across depth

would be more coherent... do you have any thoughts on this?
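To make the asymmetry explicit, here is a small Python sketch of the sharing pattern described above, using plain `object()` instances as stand-ins for LSTM weight tensors (the names `time_grid` / `depth_grid` and the layer/step counts are illustrative, not from the repo):

```python
n_layers, n_steps = 3, 5

# Time LSTM: one weight object per layer, reused (cloned) across time steps.
time_weights = [object() for _ in range(n_layers)]
time_grid = [[time_weights[l] for _ in range(n_steps)] for l in range(n_layers)]

# Depth LSTM (tied): a single weight object, reused across layers AND time.
depth_weight = object()
depth_grid = [[depth_weight for _ in range(n_steps)] for _ in range(n_layers)]

# Count the distinct parameter sets in each (layer x time) grid.
print(len({id(w) for row in time_grid for w in row}))   # one set per layer
print(len({id(w) for row in depth_grid for w in row}))  # a single tied set
```

Counting distinct objects in each grid shows the time LSTM carries one parameter set per layer, while the depth LSTM carries a single set for the whole grid.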

Thanks,

@ytoon

ytoon commented Dec 16, 2016

I think the paper uses the same weights for both the time and depth dimensions of the LSTM. You can refer to section 4.3 of the paper.
