SquaredDifference values halved? #85
Comments
Good point. You're right, the layer implementation actually computes half of the squared difference (and the gradients accordingly). As to the reason: this is basically because we wanted to use this layer for regression similar to how some libraries do it (e.g. Caffe's EuclideanLossLayer). This makes the results match for someone coming from another library. Libraries such as Chainer use the 'correct' error though, so maybe we should switch to that. We should have a regression example. Any suggestions for the task?
This probably is the only way to make gradient checking work correctly, though. If we didn't add the constant 1/2, we'd have to multiply errors by 2 during backprop to make everything "mathematically correct". In practice we could of course just say that that constant gets absorbed into the learning rate, but it's technically correct to add the 1/2, IMO.
The backward pass implementation could simply multiply deltas by 2, so the gradient check would work fine. Edit: I meant to say, sure, we'll have to modify the backward pass, but this is not a reason not to compute the 'correct' squared difference.
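Both variants pass a finite-difference gradient check, as long as the forward and backward passes agree on the factor. A minimal sketch (function names are illustrative, not brainstorm's actual API):

```python
def half_sqdiff(y, t):
    # Current behavior: L = 0.5 * (y - t)^2, so dL/dy = (y - t)
    return 0.5 * (y - t) ** 2

def half_sqdiff_grad(y, t):
    return y - t

def full_sqdiff(y, t):
    # 'Correct' squared difference: L = (y - t)^2, so dL/dy = 2 * (y - t)
    return (y - t) ** 2

def full_sqdiff_grad(y, t):
    return 2.0 * (y - t)

def numeric_grad(f, y, t, eps=1e-6):
    # Central finite difference with respect to y
    return (f(y + eps, t) - f(y - eps, t)) / (2 * eps)

y, t = 3.0, 1.0
assert abs(numeric_grad(half_sqdiff, y, t) - half_sqdiff_grad(y, t)) < 1e-6
assert abs(numeric_grad(full_sqdiff, y, t) - full_sqdiff_grad(y, t)) < 1e-6
```

Either pair is self-consistent; the only observable difference is the factor of 2 in the reported loss and in the deltas, which (as noted above) otherwise just folds into the learning rate.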
Or just warn about this somewhere in the docs, though it seems less elegant.
I suggest time series prediction using LSTM :)
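A toy version of such a task is easy to generate. A minimal sketch, assuming a simple next-step sine-wave prediction setup (purely illustrative, not part of brainstorm):

```python
import numpy as np

# Next-step prediction: given a window of past values of sin(t),
# predict the value immediately after the window.
t = np.linspace(0, 20 * np.pi, 2000)
series = np.sin(t)

window = 50
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]  # target: the next value after each window

assert X.shape == (1950, 50)
assert y.shape == (1950,)
```

A recognizable public dataset would of course be preferable for a shipped example, but something like this works for a self-contained tutorial.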
Here's a plan for this issue. We'll change
I'm done with making the above change in a private branch. I've named the new layer
How about EuclideanRE (compatible with SoftmaxCE)?
I don't really like
I agree, Euclidean loss is not really a name commonly used in NN literature.
I now realize that
:D Good point. However, I think that I have already changed the older
I tried a bit to look for an LSTM regression dataset which would be as recognizable as MNIST/CIFAR but didn't really find one. Open to suggestions.
Hi everyone!
I'm using brainstorm for multivariable regression with an LSTM network and I've just noticed that the `SquaredDifference.outputs.default` values are exactly half of the real squared errors.
FYI: using Python 3.3.3 with the latest version of brainstorm from master.
By the way, a regression example would be welcome :)
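The halving reported above can be reproduced with a minimal numpy sketch of the described behavior (this is not actual brainstorm code):

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])  # network outputs
t = np.array([0.0, 0.0, 0.0])  # targets

real_squared_error = (y - t) ** 2       # [1., 4., 9.]
layer_output = 0.5 * (y - t) ** 2       # what the layer reports

assert np.allclose(layer_output, real_squared_error / 2)
```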