Implementation of a question-answering bot that answers questions about a short story it is given.
Example:
Story: Betty went to the store. Don ran to the bedroom.
Question: Is Don in the store?
Answer: No
Explanation: The model understood that Don did not go to the store and answers accordingly. If the model is instead asked "Is Don in the bedroom?", the answer will be yes.
Dataset used: bAbI (Facebook Research)
- The model takes a discrete set of inputs x1, ..., xn (the sentences) that are to be stored in the memory and a query q (the question), and outputs an answer a (yes/no).
- Each of x, q, and a consists of symbols drawn from a dictionary of V words.
- The model writes all x to the memory up to a fixed buffer size, and then finds a continuous representation for x and q.
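The input preparation described in the list above can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing: the helper names (`tokenize`, `build_vocab`, `vectorize`), the whitespace tokenizer, the use of index 0 for padding, and the values of `memory_size` and `max_len` are all assumptions made for the example.

```python
# Hypothetical helpers; the tokenizer and padding scheme are assumptions.

def tokenize(text):
    """Lowercase the text and split it into words, keeping '.' and '?' as tokens."""
    return text.lower().replace(".", " .").replace("?", " ?").split()

def build_vocab(texts):
    """Map each distinct word to an integer index; 0 is reserved for padding."""
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i + 1 for i, w in enumerate(words)}

def vectorize(text, vocab, max_len):
    """Convert text to a fixed-length list of word indices, right-padded with 0."""
    ids = [vocab[w] for w in tokenize(text)][:max_len]
    return ids + [0] * (max_len - len(ids))

story = ["Betty went to the store.", "Don ran to the bedroom."]
question = "Is Don in the store?"

# The dictionary covers the story, the question, and the possible answers.
vocab = build_vocab(story + [question, "yes", "no"])

memory_size = 4  # fixed buffer size (assumed value)
max_len = 6      # maximum sentence length in tokens (assumed value)

# Keep at most `memory_size` of the most recent sentences, then pad empty slots.
memory = [vectorize(s, vocab, max_len) for s in story[-memory_size:]]
memory += [[0] * max_len for _ in range(memory_size - len(memory))]
query = vectorize(question, vocab, max_len)
```

Each row of `memory` and the `query` list are sequences of word indices; an embedding layer would then turn them into the continuous representations mentioned above.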
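The question is also converted into an embedding, giving a vector u. The inner product of u with each memory vector mi (the embedding of sentence xi) is passed through the softmax function to form the attention weights pi, one per sentence.
Each weight pi is then multiplied with the corresponding output vector ci (a second embedding of sentence xi), and the products are summed to give the output vector o.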
Finally, this vector is passed through the softmax function to produce a probability vector over all the words in the vocabulary. For this task, nearly all of the probability should fall on the words yes and no.
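The single hop described above can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the project's implementation: the matrix names `A`, `B`, `C`, and `W`, the bag-of-words sentence encoding (summing word embeddings), the final projection `W` applied to `o + u` (as in the commonly cited end-to-end memory network formulation), the random initial weights, and the example word indices are all chosen for the example.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)
V, d = 15, 8  # vocabulary size (including padding index 0) and embedding size

# Randomly initialised parameters; in a trained model these are learned.
A = rng.normal(scale=0.1, size=(V, d))  # sentence embedding giving m_i
C = rng.normal(scale=0.1, size=(V, d))  # sentence embedding giving c_i
B = rng.normal(scale=0.1, size=(V, d))  # question embedding giving u
W = rng.normal(scale=0.1, size=(d, V))  # assumed final projection to the vocabulary
A[0] = C[0] = B[0] = 0.0                # padding index contributes nothing

# Two sentences and one question as word indices (hypothetical values).
sentences = np.array([[4, 13, 12, 11, 10, 1],
                      [5, 9, 12, 11, 3, 1]])
question = np.array([7, 5, 6, 11, 10, 2])

m = A[sentences].sum(axis=1)  # memory vectors m_i, shape (n, d)
c = C[sentences].sum(axis=1)  # output vectors c_i, shape (n, d)
u = B[question].sum(axis=0)   # question vector u, shape (d,)

p = softmax(m @ u)            # attention weights p_i over the sentences
o = p @ c                     # output vector o = sum_i p_i * c_i
a_hat = softmax((o + u) @ W)  # probabilities over all V words
```

With untrained weights the probabilities in `a_hat` are close to uniform; training is what pushes the mass toward the indices of yes and no.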
The explanation above covers a single memory hop. The process is repeated multiple times, with the output of one hop becoming the input of the next, while the sentences (x1, x2, ..., xn) remain the same. For an illustration, refer to the right half of the diagram.
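The repeated hops can be sketched as a short loop. The following is a hedged illustration, not the project's code: the function name `memory_hops` is hypothetical, the update `u = u + o` between hops is an assumption taken from the commonly cited end-to-end memory network formulation, and reusing the same `m` and `c` for every hop is a simplification (an implementation may use separate embeddings per hop).

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def memory_hops(m, c, u, hops=3):
    """Run `hops` attention passes over fixed memories m and c, each of shape (n, d)."""
    attention = []
    for _ in range(hops):
        p = softmax(m @ u)  # attention over the sentences for the current query
        o = p @ c           # weighted sum of the output vectors
        u = u + o           # assumed update: the next hop's input is u + o
        attention.append(p)
    return u, attention

# Random stand-ins for the embedded sentences and question (illustration only).
rng = np.random.default_rng(1)
m = rng.normal(size=(2, 8))
c = rng.normal(size=(2, 8))
u0 = rng.normal(size=8)

u_final, attention = memory_hops(m, c, u0, hops=3)
```

The list `attention` holds one weight vector per hop, which makes it easy to inspect which sentence the model attends to at each step.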