
The implementation of calc_delta_k in rnn.py (recurrent neural network) is incorrect #29

Open
cyrixlin opened this issue Sep 20, 2018 · 2 comments
Comments

cyrixlin commented Sep 20, 2018

First of all, many thanks to hanbingtao, the author of 《零基础入门深度学习》, for the hard work put into providing such a good tutorial and accompanying code.

While working through the material, I found an obvious error in rnn.py, the code for Part 5 of the tutorial (recurrent neural networks), and I have confirmed it with a gradient check. The problem and a fix are given below for the author's reference.

The original method:

def calc_delta_k(self, k, activator):
    '''
    Compute the delta at time k from the delta at time k+1
    '''
    state = self.state_list[k+1].copy()
    element_wise_op(self.state_list[k+1],
                    activator.backward)
    self.delta_list[k] = np.dot(
        np.dot(self.delta_list[k+1].T, self.W),
        np.diag(state[:,0])).T

There are two obvious errors here:

  1. state should be taken from self.state_list[k].copy(), not from element k+1.
  2. After state is copied out, element_wise_op is never applied to it; the element-wise activator.backward operation should be performed on the state variable itself.

Analysis:
Take state = self.state_list[k].copy() and then apply element_wise_op to it, giving the array of activation-function derivatives at step k. Multiplying (the step-k+1 delta times W) by this derivative array yields the delta at step k. (Note that the original code instead mutates self.state_list[k+1] in place, while the unmodified copy is what ends up in the diagonal matrix, so the derivative is never actually applied.)
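For reference, the delta propagation this code is meant to implement is (a sketch in the tutorial's notation, assuming $f$ denotes the activation function and $\mathrm{net}_k$ the value it is applied to at step $k$):

$$\delta_k^T = \delta_{k+1}^T \, W \, \operatorname{diag}\!\left[f'(\mathrm{net}_k)\right]$$

The derivative is evaluated at step $k$, which is why the code must index the state list with k rather than k+1.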

The corrected version:
def calc_delta_k(self, k, activator):
    '''
    Compute the delta at time k from the delta at time k+1
    '''
    state = self.state_list[k].copy()
    element_wise_op(state,
                    activator.backward)
    self.delta_list[k] = np.dot(
        np.dot(self.delta_list[k+1].T, self.W),
        np.diag(state[:,0])).T
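As an aside, right-multiplying a row vector by a diagonal matrix only rescales its components, so the corrected update can also be written as an element-wise product (a hypothetical rewrite for illustration, not part of the proposed fix):

import numpy as np

# Hypothetical equivalent of the corrected update: with delta_next of
# shape (n, 1), W of shape (n, n), and fprime the element-wise
# activator.backward values at step k (shape (n, 1)),
#   np.dot(np.dot(delta_next.T, W), np.diag(fprime[:, 0])).T
# equals
#   np.dot(W.T, delta_next) * fprime
def calc_delta_k_elementwise(delta_next, W, fprime):
    return np.dot(W.T, delta_next) * fprime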

Verification:
The test data was adjusted as follows (the input vectors are now 4-dimensional, and there are now 3 of them):
def data_set():
    x = [np.array([[1], [2], [3], [8]]),
         np.array([[2], [3], [4], [-9]]),
         np.array([[-1], [-2], [4], [3]])]
    d = np.array([[1], [2]])
    return x, d

The gradient-check program was adjusted as follows (4-dimensional inputs, 3 hidden units per layer, 3 input samples):
def gradient_check():
    '''
    Gradient check
    '''
    # Use an error function that simply sums every output node
    error_function = lambda o: o.sum()

    rl = RecurrentLayer(4, 3, IdentityActivator(), 1e-3)

    # Run the forward pass
    x, d = data_set()
    rl.forward(x[0])
    rl.forward(x[1])
    rl.forward(x[2])

    # Build the sensitivity map
    sensitivity_array = np.ones(rl.state_list[-1].shape,
                                dtype=np.float64)
    # Compute the gradient by backpropagation
    rl.backward(sensitivity_array, IdentityActivator())

    # Check the gradient numerically
    epsilon = 10e-4
    for i in range(rl.W.shape[0]):
        for j in range(rl.W.shape[1]):
            rl.W[i,j] += epsilon
            rl.reset_state()
            rl.forward(x[0])
            rl.forward(x[1])
            rl.forward(x[2])
            err1 = error_function(rl.state_list[-1])
            rl.W[i,j] -= 2*epsilon
            rl.reset_state()
            rl.forward(x[0])
            rl.forward(x[1])
            rl.forward(x[2])
            err2 = error_function(rl.state_list[-1])
            expect_grad = (err1 - err2) / (2 * epsilon)
            rl.W[i,j] += epsilon
            print 'weights(%d,%d): expected - actual %f - %f' % (
                i, j, expect_grad, rl.gradient[i,j])
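Each iteration perturbs one weight and applies the central-difference approximation

$$\frac{\partial E}{\partial w_{ij}} \approx \frac{E(w_{ij} + \epsilon) - E(w_{ij} - \epsilon)}{2\epsilon}$$

so the "expected" column below is the numerical estimate and the "actual" column is the backpropagated gradient; the two should agree if calc_delta_k is correct.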

With the original calc_delta_k, the output is:
D:\python_2.7\python.exe D:/python_code/learn_dl-master/rnn.py
weights(0,0): expected - actual 0.000095 - 1.000000
weights(0,1): expected - actual 0.000372 - 1.000000
weights(0,2): expected - actual 0.000512 - 1.000000
weights(1,0): expected - actual 0.000095 - 1.000000
weights(1,1): expected - actual 0.000372 - 1.000000
weights(1,2): expected - actual 0.000512 - 1.000000
weights(2,0): expected - actual 0.000095 - 1.000000
weights(2,1): expected - actual 0.000372 - 1.000000
weights(2,2): expected - actual 0.000512 - 1.000000

Process finished with exit code 0

With the corrected calc_delta_k, the output is:
D:\python_2.7\python.exe D:/python_code/learn_dl-master/rnn.py
weights(0,0): expected - actual -0.001360 - -0.001360
weights(0,1): expected - actual 0.000520 - 0.000520
weights(0,2): expected - actual 0.000452 - 0.000452
weights(1,0): expected - actual -0.001360 - -0.001360
weights(1,1): expected - actual 0.000520 - 0.000520
weights(1,2): expected - actual 0.000452 - 0.000452
weights(2,0): expected - actual -0.001360 - -0.001360
weights(2,1): expected - actual 0.000520 - 0.000520
weights(2,2): expected - actual 0.000452 - 0.000452

Process finished with exit code 0

This confirms that the original calc_delta_k is incorrect and that the corrected version is right.

@GSD-Dreammark

There is indeed a problem here; it clearly does not match Equation 3 in the article. A question: in the gradient_check method in bp.py, why is the network error computed as network_error = lambda vec1, vec2: 0.5 * reduce(lambda a, b: a + b, map(lambda v: (v[0] - v[1]) * (v[0] - v[1]), zip(vec1, vec2)))?
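(For reference, a more readable equivalent of that one-liner, written as a sketch rather than the repository's actual code:

# Half sum of squared errors: E = 0.5 * sum_i (t_i - o_i)^2
def network_error(vec1, vec2):
    return 0.5 * sum((v0 - v1) ** 2 for v0, v1 in zip(vec1, vec2))

The 1/2 factor is the usual convention: it cancels the 2 that appears when the squared error is differentiated during backpropagation.)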


JnuSimba commented Jul 3, 2023

@cyrixlin May I ask whether this corresponds to Equation 4? If so, should self.W be changed to self.U?
