[Example: submitting a paper to read] Visualizing and understanding recurrent networks #1
Comments
[Example: claiming a paper] I'll take this paper.
Linusp added a commit that referenced this issue on Mar 12, 2017
I've read the instructions at https://github.com/swarma/papernotes. @Linusp, does your previous comment #1 (comment) mean you plan to read this paper and share your reading notes?
@pimgeek Yes. I'm revising the participation guide to make it friendlier. Could you help me test a feature? Please check whether you can click "Assignees" in the upper right corner and select yourself.
I can't; it is grayed out and cannot be selected.
@pimgeek OK, thanks. Let me try adding you to the project, then check again.
@pimgeek I've added you to the 「论文阅读小组」 (paper reading group) team and granted the team read/write access to this project. Could you try again?
It works now.
@pimgeek 👍
[Example: submitting paper notes]
Author
Key points
Model / Experiments / Conclusions
Dataset:
Model:
Experiments:
(Images can be uploaded by drag-and-drop)
Conclusions:
Author
Andrej Karpathy, Justin Johnson, Li Fei-Fei
Publication date
2015
Abstract
Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data. However, while LSTMs provide exceptional results in practice, the source of their performance and their limitations remain rather poorly understood. Using character-level language models as an interpretable testbed, we aim to bridge this gap by providing an analysis of their representations, predictions and error types. In particular, our experiments reveal the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets. Moreover, our comparative analysis with finite horizon n-gram models traces the source of the LSTM improvements to long-range structural dependencies. Finally, we provide analysis of the remaining errors and suggest areas for further study.
Why it is recommended
LSTM's strong performance is widely acknowledged, but its internal mechanisms remain under-studied. This paper trains RNN, LSTM, and GRU variants on character-level language modeling, compares their behavior on the same problem, analyzes the internal gating modules of LSTM/GRU in detail, and categorizes the errors LSTMs make. It is an excellent reference for understanding the inner workings of LSTM and GRU.
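
To make the paper's testbed concrete, below is a minimal character-level LSTM language model in PyTorch. This is an illustrative sketch, not the authors' implementation: the class name CharLSTM, the layer sizes, and the toy batch are assumptions chosen for demonstration. The per-step hidden states returned by the LSTM are what the paper probes to find interpretable cells (e.g., ones tracking quotes, brackets, or line length).

```python
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    """Minimal character-level LSTM language model (illustrative sketch)."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, state=None):
        # x: (batch, seq_len) tensor of character indices
        emb = self.embed(x)
        out, state = self.lstm(emb, state)
        # `out` holds the hidden state at every time step; inspecting
        # individual cell activations here is how the paper locates
        # interpretable cells.
        logits = self.head(out)  # (batch, seq_len, vocab_size)
        return logits, state

# Usage: train to predict the next character at every position.
vocab_size = 100
model = CharLSTM(vocab_size)
x = torch.randint(0, vocab_size, (8, 50))  # toy batch of character indices
logits, _ = model(x)
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for steps 0..T-2
    x[:, 1:].reshape(-1),                    # next-character targets
)
```

The same skeleton supports the paper's RNN/LSTM/GRU comparison: swapping nn.LSTM for nn.RNN or nn.GRU (whose recurrent state is a single tensor rather than an (h, c) pair) changes nothing else in the model.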