
Implementation of an AI bot that can answer questions based on a story that is given to the bot.


mahajanhrishikesh/End-To-End-Memory-Network

End to End Memory Networks

This project implements the following paper:

End to End Memory Networks
It implements a conversational AI bot that answers questions about a story given to it.
Example:
    Story: Betty went to the store. Don ran to the bedroom.
    Question: Is Don in the store?
    Answer: No
Explanation:
    Here the model understood that Don did not go to the store and answers accordingly.
    If the model is then asked "Is Don in the bedroom?", the answer will be "Yes".
Dataset used: bAbI (Facebook Research)

How the bot works:

  • The model takes a discrete set of inputs x1, ..., xn (the sentences) that are to be stored in memory and a query q (the question), and outputs an answer a (Yes/No).
  • Each of the x, q and a contains symbols that come from a dictionary of V words.
  • The model writes all x to the memory, up to a fixed buffer size, and then finds a continuous representation for the x and q.
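The input format described above can be sketched as follows. This is a minimal illustration, not the repository's code: the helper names `tokenize`, `build_vocab` and `encode`, the choice to reserve index 0 for padding, and the fixed sentence length of 6 are all assumptions made for the example.

```python
import re

def tokenize(text):
    # Lowercase the text and split it into words and the marks "." and "?".
    return re.findall(r"[a-z]+|[?.]", text.lower())

def build_vocab(texts):
    # Map every distinct token to an integer id.
    # Index 0 is reserved for padding (an assumption of this sketch).
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i + 1 for i, w in enumerate(words)}

def encode(text, vocab, max_len):
    # Turn a sentence into a fixed-length list of token ids, padded with 0.
    ids = [vocab[w] for w in tokenize(text)]
    return ids[:max_len] + [0] * (max_len - len(ids))

story = ["Betty went to the store.", "Don ran to the bedroom."]
question = "Is Don in the store?"

# The dictionary covers the story, the question and the possible answers.
vocab = build_vocab(story + [question, "yes", "no"])

x = [encode(s, vocab, max_len=6) for s in story]   # memory inputs x1..xn
q = encode(question, vocab, max_len=6)             # query q
```

Here `x` holds one id sequence per sentence and `q` holds the id sequence of the question; both draw their symbols from the same dictionary.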

Model Architecture

Single memory hop:

Concentrate on the left part of the image. A set of sentences x1, x2, ..., xn, taken from the story provided to the network, is fed in. Each sentence is converted into two memory vector representations of the same size, mi and ci; in the paper these come from two separate embedding matrices.
(For a quick refresher in word embeddings click here)

The question is also converted into an embedding, giving a vector u. Each memory vector mi is multiplied with the transpose of u (an inner product), and the resulting scores are passed through the softmax function to form the vector pi.
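A small sketch of this step, using only the Python standard library. The bag-of-words embedding (summing the word vectors of a sentence) follows the simplest variant in the paper and is an assumption here; the tables `A` and `B`, the sizes `V` and `d`, and the token ids are made-up illustration values, not taken from the repository.

```python
import math
import random

def embed_bow(token_ids, table):
    # Sum the embedding rows of all tokens in a sentence; id 0 is padding.
    d = len(table[0])
    vec = [0.0] * d
    for t in token_ids:
        if t != 0:
            for k in range(d):
                vec[k] += table[t][k]
    return vec

def softmax(scores):
    # Subtract the maximum first for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(memories, u):
    # One inner product per memory vector, then a softmax over the scores.
    scores = [sum(a * b for a, b in zip(m_i, u)) for m_i in memories]
    return softmax(scores)

rng = random.Random(0)
V, d = 8, 4  # hypothetical vocabulary size and embedding dimension
A = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(V)]  # story table
B = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(V)]  # question table

sentences = [[1, 2, 3], [4, 5, 0]]   # two encoded sentences
question = [4, 6, 7]                 # encoded question

m = [embed_bow(s, A) for s in sentences]
u = embed_bow(question, B)
p = attention(m, u)                  # one weight per sentence, summing to 1
```

Each entry of `p` can be read as how relevant the corresponding sentence is to the question.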

The resulting pi values are then used as weights: each ci vector is multiplied by its pi, and the weighted vectors are summed to give the output vector o.
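As a worked example of computing o, the sketch below takes a weight vector and a list of output memory vectors and returns their weighted sum. The function name `weighted_sum` and the numbers are illustrative assumptions.

```python
def weighted_sum(p, c):
    # o[k] = sum over i of p[i] * c[i][k]
    d = len(c[0])
    return [sum(p_i * c_i[k] for p_i, c_i in zip(p, c)) for k in range(d)]

p = [0.25, 0.75]                 # attention weights over two sentences
c = [[1.0, 0.0], [0.0, 2.0]]     # output memory vectors c1 and c2
o = weighted_sum(p, c)           # [0.25, 1.5]
```

Because the second sentence carries most of the weight, o lies closer to c2 than to c1.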

Finally, this vector is passed through the softmax function to get a probability vector that assigns a probability to every word in the vocabulary. For this task, only the probabilities of the words "yes" and "no" should be high.
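The final step might look like the sketch below. It follows the paper's formulation, where o is added to u and projected by a weight matrix W before the softmax; whether this repository combines the vectors in exactly this way is not stated in the README, so treat that as an assumption. The tiny vocabulary, `W`, `o` and `u` are hand-picked illustration values.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(o, u, W, vocab):
    # Combine memory output and question, then score every vocabulary word.
    h = [o_k + u_k for o_k, u_k in zip(o, u)]
    logits = [sum(w * x for w, x in zip(row, h)) for row in W]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best], probs

vocab = ["yes", "no", "store", "bedroom"]
# One row of W per vocabulary word (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0], [-1.0, -1.0]]
o = [0.2, 1.5]
u = [0.1, 0.5]

answer, probs = predict(o, u, W, vocab)
```

With these values the rows for "store" and "bedroom" receive strongly negative scores, so almost all of the probability mass falls on "yes" and "no".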



Multiple Layers

The architecture described above covers a single memory hop. This process is repeated multiple times, with the output of one memory hop becoming the input of the next. The sentences (x1, x2, ..., xn) remain the same across hops. For an illustration, refer to the right half of the diagram.
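A minimal sketch of stacking hops, assuming the paper's update rule in which the input to the next hop is the sum of the current input u and the current output o. The function names `hop` and `run_hops`, and the toy memory vectors, are assumptions for illustration; in practice each hop may use its own embeddings of the same sentences.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hop(m, c, u):
    # One memory hop: attend over m with u, then take the weighted sum of c.
    p = softmax([sum(a * b for a, b in zip(m_i, u)) for m_i in m])
    return [sum(p_i * c_i[k] for p_i, c_i in zip(p, c)) for k in range(len(u))]

def run_hops(layers, u):
    # layers is a list of (m, c) pairs, one pair per hop.
    for m, c in layers:
        o = hop(m, c, u)
        u = [u_k + o_k for u_k, o_k in zip(u, o)]  # input to the next hop
    return u

m = [[1.0, 0.0], [0.0, 1.0]]
c = [[1.0, 0.0], [0.0, 1.0]]
u_final = run_hops([(m, c), (m, c)], [0.0, 0.0])   # [1.0, 1.0]
```

The vector returned after the last hop is what would then be passed to the answer layer.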
