diff --git a/chapter09_natural-language-processing/tree-lstm.ipynb b/chapter09_natural-language-processing/tree-lstm.ipynb
index a78c821..8483eda 100644
--- a/chapter09_natural-language-processing/tree-lstm.ipynb
+++ b/chapter09_natural-language-processing/tree-lstm.ipynb
@@ -6,17 +6,50 @@
    "source": [
     "# Tree LSTM modeling for semantic relatedness\n",
     "\n",
+    "Just five years ago, many of the most successful models for supervised learning with text\n",
+    "ignored word order altogether,\n",
+    "representing documents or sentences\n",
+    "with the order-invariant *bag-of-words* representation.\n",
+    "Anyone thinking hard should have realized that these models couldn't dominate forever.\n",
+    "That's because word order actually does matter,\n",
+    "and by ignoring it, bag-of-words models leave information on the table.\n",
+    "\n",
+    "The recurrent neural networks that\n",
+    "[we introduced in chapter 5](../chapter05_recurrent-neural-networks/simple-rnn.ipynb)\n",
+    "model word order by passing over the words in sequence,\n",
+    "updating the model's representation of the sentence after each word.\n",
+    "And with LSTM recurrent cells and training on GPUs,\n",
+    "even a straightforward LSTM far outpaces classical approaches\n",
+    "on a number of tasks, including language modeling,\n",
+    "named entity recognition, and more.\n",
+    "\n",
+    "But while those models are impressive, they may still be leaving some knowledge on the table.\n",
+    "To begin with, we know a priori that sentences have grammatical structure,\n",
+    "and we already have tools that are very good at recovering parse trees that reflect that structure.\n",
+    "While it may be possible for an LSTM to learn this information implicitly,\n",
+    "it's often a good idea to build known information into the structure of a neural network.\n",
+    "Take, for example, convolutional neural networks.\n",
+    "They build in the prior knowledge that low-level features should be translation-invariant.\n",
+    "It's possible to come up with a fully connected net that does the same thing,\n",
+    "but it would require many more parameters and would be much more susceptible to overfitting.\n",
+    "In this case, we would like to build the grammatical tree structure of each sentence\n",
+    "into the architecture of an LSTM recurrent neural network.\n",
+    "This tutorial walks through *tree LSTMs*,\n",
+    "an approach that does precisely that.\n",
+    "The models here are based on the [tree-structured LSTM](https://nlp.stanford.edu/pubs/tai-socher-manning-acl2015.pdf)\n",
+    "of Kai Sheng Tai, Richard Socher, and Chris Manning,\n",
+    "and our implementation borrows from [this PyTorch example](https://github.com/dasguptar/treelstm.pytorch).\n",
+    "\n",
+    "\n",
     "### Sentences involving Compositional Knowledge\n",
     "This tutorial walks through training a child-sum Tree LSTM model for analyzing semantic relatedness of sentence pairs given their dependency parse trees.\n",
     "\n",
     "### Preliminaries\n",
-    "Requires the latest MXNet with the new `gluon` interface. One can either build from source or install the pre-release package through `pip install --pre mxnet`. Use of GPUs is preferred if one wants to run the complete training to match the state-of-the-art results.\n",
-    "\n",
-    "Besides, to show a progress meter, one should install the `tqdm` (\"progress\" in Arabic) through `pip install tqdm`. One should also install the HTTP library through `pip install requests`.\n",
-    "\n",
+    "Before getting going, you'll probably want to note a couple of preliminary details:\n",
     "\n",
-    "### Inspiration\n",
-    "This tutorial borrows heavily from [Pytorch](https://github.com/dasguptar/treelstm.pytorch) example."
+    "* A GPU is preferred if you want to run the complete training and match the state-of-the-art results.\n",
+    "* To show a progress meter, install `tqdm` (\"progress\" in Arabic) via `pip install tqdm`. You should also install the `requests` HTTP library via `pip install requests`.\n",
+    "\n"
    ]
   },
   {
@@ -654,7 +687,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.5.3"
+   "version": "3.4.3"
   }
  },
  "nbformat": 4,
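The intro added above describes building a sentence's parse tree into the LSTM architecture. As a point of reference for reviewing this change, the core of the child-sum Tree-LSTM from the Tai, Socher, and Manning paper can be sketched as a single node update: the hidden states of a node's children are summed to drive the input, output, and candidate gates, while each child gets its own forget gate over its cell state. The sketch below is a plain-NumPy illustration, not the notebook's gluon implementation; the class and method names (`ChildSumTreeLSTMCell`, `node_forward`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ChildSumTreeLSTMCell:
    """Illustrative NumPy sketch of one node update of the child-sum
    Tree-LSTM (Tai et al., 2015). Weight names follow the paper:
    W* act on the node's word vector x, U* on (summed) child hiddens."""

    def __init__(self, x_dim, h_dim, seed=0):
        rng = np.random.default_rng(seed)
        init = lambda rows, cols: 0.1 * rng.standard_normal((rows, cols))
        self.Wi, self.Ui, self.bi = init(h_dim, x_dim), init(h_dim, h_dim), np.zeros(h_dim)
        self.Wf, self.Uf, self.bf = init(h_dim, x_dim), init(h_dim, h_dim), np.zeros(h_dim)
        self.Wo, self.Uo, self.bo = init(h_dim, x_dim), init(h_dim, h_dim), np.zeros(h_dim)
        self.Wu, self.Uu, self.bu = init(h_dim, x_dim), init(h_dim, h_dim), np.zeros(h_dim)

    def node_forward(self, x, child_h, child_c):
        # child_h / child_c: lists of (h_dim,) child states; empty at leaves.
        # h_tilde is the sum of the children's hidden states.
        h_tilde = sum(child_h) if child_h else np.zeros_like(self.bi)
        i = sigmoid(self.Wi @ x + self.Ui @ h_tilde + self.bi)   # input gate
        o = sigmoid(self.Wo @ x + self.Uo @ h_tilde + self.bo)   # output gate
        u = np.tanh(self.Wu @ x + self.Uu @ h_tilde + self.bu)   # candidate
        c = i * u
        # One forget gate per child, computed from that child's own hidden state.
        for h_k, c_k in zip(child_h, child_c):
            f_k = sigmoid(self.Wf @ x + self.Uf @ h_k + self.bf)
            c = c + f_k * c_k
        h = o * np.tanh(c)
        return h, c
```

Because the update sums over however many children a node has, the same cell handles the variable branching of a dependency parse; a recursive pass from the leaves to the root yields one hidden state per sentence, and the notebook compares the root states of a sentence pair to score relatedness.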