There is indeed a problem here; it clearly does not match Equation 3 in the article. A question: in the gradient_check method in bp.py, the network error is computed as network_error = lambda vec1, vec2: 0.5 * reduce(lambda a, b: a + b, map(lambda v: (v[0] - v[1]) * (v[0] - v[1]), zip(vec1, vec2))) — why this formula?
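For what it's worth, that lambda (with the asterisks that markdown swallowed restored) is just half the sum of squared errors, ½·Σᵢ(tᵢ−oᵢ)²; the ½ is conventional because it cancels the factor 2 produced when the square is differentiated. A quick sketch with made-up numbers (Python 3 here, where reduce lives in functools, unlike the tutorial's Python 2 code):

```python
from functools import reduce

# The bp.py error function, asterisks restored
network_error = lambda vec1, vec2: 0.5 * reduce(
    lambda a, b: a + b,
    map(lambda v: (v[0] - v[1]) * (v[0] - v[1]), zip(vec1, vec2)))

# Equivalent plain form: half the sum of squared differences
def half_sse(target, output):
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))

t = [1.0, 0.0, 1.0]   # hypothetical targets
y = [0.8, 0.1, 0.6]   # hypothetical network outputs
# 0.5 * ((1-0.8)^2 + (0-0.1)^2 + (1-0.6)^2) = 0.105
assert abs(network_error(t, y) - half_sse(t, y)) < 1e-12
```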
@cyrixlin May I ask whether this corresponds to Equation 4? If so, should self.W be changed to self.U?
First of all, many thanks to hanbingtao, the author of 零基础入门深度学习 (Deep Learning from Scratch), for the hard work put into such a good tutorial and code.
While studying, I found an obvious error in rnn.py, the code for Part 5 of the series (the recurrent neural network installment), and I verified it with a gradient-check program: there really is a bug. The problem and a fix are presented below for the author's reference.
The original method:
def calc_delta_k(self, k, activator):
    '''
    Compute the delta at time k from the delta at time k+1
    '''
    state = self.state_list[k+1].copy()
    element_wise_op(self.state_list[k+1],
                    activator.backward)
    self.delta_list[k] = np.dot(
        np.dot(self.delta_list[k+1].T, self.W),
        np.diag(state[:,0])).T
There are two obvious errors here: state is copied from self.state_list[k+1] when it should come from self.state_list[k], and element_wise_op is applied to self.state_list[k+1] itself (destructively overwriting the stored state) rather than to the copy.
Analysis: state should be taken as self.state_list[k].copy(), and element_wise_op should then be applied to that copy, yielding the array of activation-function derivatives at time k. Multiplying the product of the time-(k+1) error term and W by this array gives the time-k error term.
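As a sanity check on the algebra (with made-up numbers, independent of the tutorial's RecurrentLayer class), the corrected expression is the same as an element-wise product of the time-k activation derivative with Wᵀ·δ_{k+1}:

```python
import numpy as np

# Hypothetical values for a layer with 3 hidden units
delta_k1 = np.array([[0.1], [0.2], [0.3]])    # delta at time k+1, shape (3, 1)
W = np.array([[0.5, 0.1, 0.0],
              [0.2, 0.4, 0.1],
              [0.0, 0.3, 0.6]])               # recurrent weight matrix
fprime_k = np.array([[0.9], [0.8], [0.7]])    # activator.backward applied to the time-k state

# The corrected method's formulation
delta_a = np.dot(np.dot(delta_k1.T, W), np.diag(fprime_k[:, 0])).T

# Equivalent element-wise form: delta_k = f'(s_k) * (W^T . delta_{k+1})
delta_b = fprime_k * np.dot(W.T, delta_k1)

assert np.allclose(delta_a, delta_b)
```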
The fix:
def calc_delta_k(self, k, activator):
    '''
    Compute the delta at time k from the delta at time k+1
    '''
    state = self.state_list[k].copy()
    element_wise_op(state,
                    activator.backward)
    self.delta_list[k] = np.dot(
        np.dot(self.delta_list[k+1].T, self.W),
        np.diag(state[:,0])).T
Verification:
The test data were adjusted as follows (inputs changed to 4 dimensions, number of inputs changed to 3):
def data_set():
    x = [np.array([[1], [2], [3], [8]]),
         np.array([[2], [3], [4], [-9]]),
         np.array([[-1], [-2], [4], [3]])]
    d = np.array([[1], [2]])
    return x, d
The gradient-check routine was adjusted as follows (inputs 4-dimensional, 3 hidden units per layer, 3 inputs):
def gradient_check():
    '''
    Gradient check
    '''
    # Use an error function that sums all node outputs
    error_function = lambda o: o.sum()

    rl = RecurrentLayer(4, 3, IdentityActivator(), 1e-3)
    # Run the forward pass
    x, d = data_set()
    rl.forward(x[0])
    rl.forward(x[1])
    rl.forward(x[2])
    # Build the sensitivity map
    sensitivity_array = np.ones(rl.state_list[-1].shape,
                                dtype=np.float64)
    # Compute the gradient
    rl.backward(sensitivity_array, IdentityActivator())
    # Check the gradient
    epsilon = 10e-4
    for i in range(rl.W.shape[0]):
        for j in range(rl.W.shape[1]):
            rl.W[i,j] += epsilon
            rl.reset_state()
            rl.forward(x[0])
            rl.forward(x[1])
            rl.forward(x[2])
            err1 = error_function(rl.state_list[-1])
            rl.W[i,j] -= 2*epsilon
            rl.reset_state()
            rl.forward(x[0])
            rl.forward(x[1])
            rl.forward(x[2])
            err2 = error_function(rl.state_list[-1])
            expect_grad = (err1 - err2) / (2 * epsilon)
            rl.W[i,j] += epsilon
            print 'weights(%d,%d): expected - actural %f - %f' % (
                i, j, expect_grad, rl.gradient[i,j])
With the original calc_delta_k, the output is:
D:\python_2.7\python.exe D:/python_code/learn_dl-master/rnn.py
weights(0,0): expected - actural 0.000095 - 1.000000
weights(0,1): expected - actural 0.000372 - 1.000000
weights(0,2): expected - actural 0.000512 - 1.000000
weights(1,0): expected - actural 0.000095 - 1.000000
weights(1,1): expected - actural 0.000372 - 1.000000
weights(1,2): expected - actural 0.000512 - 1.000000
weights(2,0): expected - actural 0.000095 - 1.000000
weights(2,1): expected - actural 0.000372 - 1.000000
weights(2,2): expected - actural 0.000512 - 1.000000
Process finished with exit code 0
With the modified calc_delta_k, the output is:
D:\python_2.7\python.exe D:/python_code/learn_dl-master/rnn.py
weights(0,0): expected - actural -0.001360 - -0.001360
weights(0,1): expected - actural 0.000520 - 0.000520
weights(0,2): expected - actural 0.000452 - 0.000452
weights(1,0): expected - actural -0.001360 - -0.001360
weights(1,1): expected - actural 0.000520 - 0.000520
weights(1,2): expected - actural 0.000452 - 0.000452
weights(2,0): expected - actural -0.001360 - -0.001360
weights(2,1): expected - actural 0.000520 - 0.000520
weights(2,2): expected - actural 0.000452 - 0.000452
Process finished with exit code 0
This verifies that the original calc_delta_k function is incorrect and that the modified version is correct.
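To double-check the corrected recursion independently of the repo's classes, here is a self-contained sketch (my own toy tanh RNN, not the tutorial's RecurrentLayer) that builds the analytic gradient of E = sum(final state) from delta_k = f'(s_k) ⊙ (Wᵀ·delta_{k+1}) and compares it against central differences:

```python
import numpy as np

rng = np.random.RandomState(0)
n = 3                       # hidden size
T = 4                       # sequence length
W = rng.randn(n, n) * 0.1   # recurrent weights
U = rng.randn(n, n) * 0.1   # input weights
xs = [rng.randn(n, 1) for _ in range(T)]

def forward(W):
    # s_t = tanh(U x_t + W s_{t-1}), starting from s_0 = 0
    s = np.zeros((n, 1))
    states = [s]
    for x in xs:
        s = np.tanh(U.dot(x) + W.dot(s))
        states.append(s)
    return states

# Analytic gradient using the corrected delta recursion;
# for tanh, f'(net_k) = 1 - s_k**2, computable from the stored state
states = forward(W)
delta = 1 - states[-1] ** 2             # delta at the last time step
grad = delta.dot(states[-2].T)
for k in range(T - 1, 0, -1):
    delta = (1 - states[k] ** 2) * W.T.dot(delta)
    grad += delta.dot(states[k - 1].T)

# Numerical gradient via central differences
eps = 1e-5
num = np.zeros_like(W)
for i in range(n):
    for j in range(n):
        Wp = W.copy(); Wp[i, j] += eps
        Wm = W.copy(); Wm[i, j] -= eps
        num[i, j] = (forward(Wp)[-1].sum() - forward(Wm)[-1].sum()) / (2 * eps)

assert np.allclose(grad, num, atol=1e-6)
```

With the wrong time index (states[k+1] in place of states[k] in the derivative factor), the assertion fails, mirroring the mismatch shown in the first output above.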