Merge branch 'master' of https://github.com/linyh97/Nucleus
merge test_bert test case from minghao
Michaellee955 committed Dec 15, 2018
2 parents 66dbd82 + b7f7665 commit 4bfe8a5
Showing 52 changed files with 10,053 additions and 411 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -18,6 +18,8 @@ models/bert/model_data
models/bert/.idea
models/bert/sample_text.txt

draft.py

# C extensions
*.so

15 changes: 14 additions & 1 deletion README.md
@@ -12,6 +12,7 @@ In context-free mode, things become more interesting. At the very beginning we o

In context-free mode, you don't need to provide a context; we find one for you by using the Wikipedia API to search for the pages most likely to contain the answer. This mode calls multiple APIs, including the Wikipedia API, rake_nltk, etc.

If you have any questions during the installation or operation of Nucleus, please feel free to open an issue.


## Get Started
@@ -39,7 +40,9 @@ database_pwd = <your_database_password>

### Find model

Download the model via `https://1drv.ms/f/s!AtfKeiTxgnoqjt0M3lrLoowcsjbKcA`, name the whole dir as `model_data`, and put it to `<root>/models/bert`
Download the model via `https://1drv.ms/f/s!AtfKeiTxgnoqjt0M3lrLoowcsjbKcA`, name the whole directory `model_data`, and put it in `<root>/models/bert`. Please note that the r_net mode is now deprecated. You can still try it if you want to, or if you only have limited computational resources.

If you cannot download the model, please contact us at `[email protected]`

### Test cases

@@ -85,3 +88,13 @@ The basic workflow of our context-free mode is:
4. we split these pages into a list of paragraphs, each of which is about 700 characters long;
5. we pass the list of paragraphs as contexts, together with the question, to the BERT model, which returns an answer and a confidence score for each question-context pair;
6. we select the answer with the highest confidence, and return it to the user.
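
Steps 4 to 6 can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: `split_into_paragraphs` and `select_best_answer` are hypothetical helpers, splitting on word boundaries is an assumption, and the `(answer, score)` pair format is borrowed from how `application.py` unpacks the model's results.

```python
def split_into_paragraphs(text, min_len=700):
    """Group words into chunks of roughly min_len characters (hypothetical helper)."""
    paragraphs, words, length = [], [], 0
    for word in text.split():
        words.append(word)
        length += len(word) + 1  # +1 accounts for the joining space
        if length >= min_len:
            paragraphs.append(" ".join(words))
            words, length = [], 0
    if words:
        # Keep the trailing remainder as a final, shorter paragraph.
        paragraphs.append(" ".join(words))
    return paragraphs


def select_best_answer(results, min_score=0.0):
    """Pick the non-empty answer with the highest score (hypothetical helper).

    `results` is assumed to be a list of (answer, score) pairs, one per
    question-context pair. Returns None when nothing clears `min_score`.
    """
    best_answer, best_score = "", float("-inf")
    for answer, score in results:
        if answer and score > best_score:
            best_answer, best_score = answer, score
    if not best_answer or best_score < min_score:
        return None
    return best_answer, best_score
```

Returning `None` when no answer clears the threshold lets the caller decide how to report "no answer" instead of showing a low-confidence guess.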

## Reference

https://github.com/google-research/bert
https://github.com/HKUST-KnowComp/R-Net
https://github.com/tensorflow/tensorflow
https://github.com/pallets/flask
https://github.com/goldsmith/Wikipedia
https://github.com/capless/warrant
https://github.com/csurfer/rake-nltk
82 changes: 38 additions & 44 deletions application.py
@@ -1,22 +1,20 @@
from flask import Flask, render_template, redirect, url_for, flash, session, request
import logging
from logging.handlers import RotatingFileHandler
from flask import Flask, render_template, redirect, url_for, session, request
from warrant import Cognito
from config import cognito_userpool_id, cognito_app_client_id
from models.bert.inference_bert import Inference
import wikipedia
from rake_nltk import Rake
from database.db_update_class import db
from database.Database import Database
from datetime import datetime
from wikipedia.exceptions import PageError

inference = Inference()

app = Flask(__name__)
database = db()
database = Database()
KEYWORD_TOP_K = 5
MIN_ANSWER_SCORE = 3

MIN_ANSWER_SCORE = 0
keyword_topk = 5

@app.route('/login', methods=['GET', 'POST'])
def login():
@@ -35,7 +33,6 @@ def logout():
session.pop('username', None)
return redirect(url_for('login'))


@app.route('/signup', methods=['GET', "POST"])
def signup():
error = None
@@ -47,7 +44,7 @@ def signup():
cognito.add_base_attributes(email=request.form['email'])
try:
cognito.register(username=request.form['username'], password=request.form['password'])
user_id = database.add_user(request.form['username'],request.form['password'],request.form['email'])
_ = database.add_user(request.form['username'],request.form['password'],request.form['email'])
except Exception as e:
print(e)
error = str(e)
@@ -56,7 +53,6 @@ def signup():
return redirect(url_for('verification'))
return render_template('signup.html', error=error)


@app.route('/verification', methods=['GET', 'POST'])
def verification():
if 'username' in session:
@@ -75,30 +71,32 @@ def verification():
else:
return redirect(url_for('login'))


@app.route('/', methods=['GET', 'POST'])
def welcome():
if 'username' in session:
return render_template('welcome.html', username=session['username'])
else:
return redirect(url_for('login'))


@app.route('/with_context', methods=['GET', 'POST'])
def with_context():
if 'username' in session:
if request.method == 'POST':

question = request.form['question']
if not question.endswith('?'):
question += '?'

context = request.form['passage']
qas = {
'question': question,
'context_list': [context]
}

print(qas)

results = inference.response(qas=qas)
if not results:
return redirect(url_for('result_no_answer'))

answer, score = results[0]

if not answer or score < MIN_ANSWER_SCORE:
@@ -112,7 +110,6 @@ def with_context():
else:
return redirect(url_for('login'))


@app.route('/without_context', methods=['GET', 'POST'])
def without_context():
if 'username' in session:
@@ -137,31 +134,20 @@ def without_context():

if not context_list:
return redirect(url_for('result_no_answer'))

print("************** begin printing context list *******************")
for context in context_list:
print(context)

qas = {
'question': question,
'context_list': context_list
}

print(qas)

results = inference.response(qas=qas)

final_answer = ""
max_score = float("-inf")
final_context = ""

print("************* begin printing result *******************")

for i, result in enumerate(results):
answer, score = result
print(i)
print(score)
print(answer)
print("********************************************************")

if score > max_score and answer:
final_answer = answer
@@ -183,33 +169,44 @@ def without_context():
@app.route('/result/<question>/<answer>', methods=['GET', 'POST'])
def result(question="", answer=""):
if 'username' in session:
print("Question", question)
print("Answer", answer)
return render_template('result.html', username=session['username'], question=question, answer=answer)
return render_template('result_with_context.html', username=session['username'], question=question, answer=answer)
else:
return redirect(url_for('login'))

@app.route('/result_no_answer/', methods=['GET', 'POST'])
def result_no_answer(question=""):
def result_no_answer():
if 'username' in session:
print("Question", question)
return render_template('result_no_answer.html', username=session['username'])
else:
return redirect(url_for('login'))

@app.route('/history', methods=['GET', 'POST'])
@app.route('/history', methods=['GET'])
def history():
if 'username' in session:
if request.method == 'POST':
num = request.form['num']
requested_history = database.get_history_list(name=session['username'], limit=num)
# TODO: send history to frontend
return "TO DO"
else:
return render_template('history.html', username=session['username'])
requested_history = database.get_history_list(name=session['username'], limit=5)
if requested_history == -1:
requested_history = [("This will not be shown", "You do not have any question history now", "Go to ask Nucleus something!")]
return render_template('history.html', username=session['username'], requested_history=requested_history)
else:
return redirect(url_for('login'))

@app.route('/history/<cnt>', methods=['GET'])
def history_count(cnt=5):
if 'username' in session:
requested_history = database.get_history_list(name=session['username'], limit=cnt)
if requested_history == -1:
requested_history = [("This will not be shown", "You do not have any question history now", "Go to ask Nucleus something!")]
return render_template('history_count.html', username=session['username'], requested_history=requested_history)
else:
return redirect(url_for('login'))

@app.route('/history_redirect', methods=['GET', 'POST'])
def history_redirect():
if 'username' not in session:
return redirect(url_for('login'))
if request.method == 'POST':
return redirect(url_for('history_count', cnt=request.form['cnt']))
return render_template('history_redirect.html', username=session['username'])

@app.route('/feedback/<question>/<answer>', methods=['GET', 'POST'])
def feedback(question=None, answer=None):
@@ -222,7 +219,6 @@ def feedback(question=None, answer=None):
else:
return render_template('feedback.html', username=session['username'], question=question, answer=answer)


@app.route('/thankyou', methods=['GET'])
def thankyou():
if 'username' not in session:
@@ -238,7 +234,6 @@ def valid_login(username, password):
return False
return True


def get_context_list(context, min_len=700):
context_list = []
assert context
@@ -256,5 +251,4 @@ def get_context_list(context, min_len=700):
app.debug = True
app.secret_key = '\xe3-\xe1\xf7\xfb\x91\xb1\x8c\xae\xf2\xc1BH\xe0/K~~%>ac\t\x01'
app.run()



25 changes: 15 additions & 10 deletions coverage_report.txt
@@ -1,10 +1,15 @@
Name Stmts Miss Cover
-------------------------------------------------
config.py 6 0 100%
database/db_update_class.py 90 33 63%
models/r_net/func.py 153 42 73%
models/r_net/inference.py 184 5 97%
models/r_net/prepro.py 187 161 14%
test/test_database.py 235 5 98%
-------------------------------------------------
TOTAL 855 246 71%
Name Stmts Miss Cover
---------------------------------------------------
config.py 6 0 100%
database/db_update_class.py 103 35 66%
models/bert/inference_bert.py 518 114 78%
models/bert/modeling.py 301 39 87%
models/bert/optimization.py 68 56 18%
models/bert/tokenization.py 202 42 79%
models/r_net/func.py 153 42 73%
models/r_net/inference.py 183 5 97%
models/r_net/prepro.py 187 161 14%
test/test_bert.py 16 0 100%
test/test_database.py 248 5 98%
---------------------------------------------------
TOTAL 1985 499 75%
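
The Cover column is consistent with (Stmts - Miss) / Stmts, rounded to a whole percent. A quick check of the two TOTAL rows, as a sketch (the helper name `cover_percent` is hypothetical and not part of coverage.py):

```python
def cover_percent(stmts, miss):
    """Percentage of statements executed, rounded to a whole percent."""
    return round(100 * (stmts - miss) / stmts)


old_total = cover_percent(855, 246)    # TOTAL row before this commit
new_total = cover_percent(1985, 499)   # TOTAL row after this commit
print(old_total, new_total)
```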
101 changes: 101 additions & 0 deletions coverage_report_html/config_py.html
@@ -0,0 +1,101 @@



<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


<meta http-equiv="X-UA-Compatible" content="IE=emulateIE7" />
<title>Coverage for config.py: 100%</title>
<link rel="stylesheet" href="style.css" type="text/css">

<script type="text/javascript" src="jquery.min.js"></script>
<script type="text/javascript" src="jquery.hotkeys.js"></script>
<script type="text/javascript" src="jquery.isonscreen.js"></script>
<script type="text/javascript" src="coverage_html.js"></script>
<script type="text/javascript">
jQuery(document).ready(coverage.pyfile_ready);
</script>
</head>
<body class="pyfile">

<div id="header">
<div class="content">
<h1>Coverage for <b>config.py</b> :
<span class="pc_cov">100%</span>
</h1>

<img id="keyboard_icon" src="keybd_closed.png" alt="Show keyboard shortcuts" />

<h2 class="stats">
6 statements &nbsp;
<span class="run hide_run shortkey_r button_toggle_run">6 run</span>
<span class="mis shortkey_m button_toggle_mis">0 missing</span>
<span class="exc shortkey_x button_toggle_exc">0 excluded</span>


</h2>
</div>
</div>

<div class="help_panel">
<img id="panel_icon" src="keybd_open.png" alt="Hide keyboard shortcuts" />
<p class="legend">Hot-keys on this page</p>
<div>
<p class="keyhelp">
<span class="key">r</span>
<span class="key">m</span>
<span class="key">x</span>
<span class="key">p</span> &nbsp; toggle line displays
</p>
<p class="keyhelp">
<span class="key">j</span>
<span class="key">k</span> &nbsp; next/prev highlighted chunk
</p>
<p class="keyhelp">
<span class="key">0</span> &nbsp; (zero) top of page
</p>
<p class="keyhelp">
<span class="key">1</span> &nbsp; (one) first highlighted chunk
</p>
</div>
</div>

<div id="source">
<table>
<tr>
<td class="linenos">
<p id="n1" class="stm run hide_run"><a href="#n1">1</a></p>
<p id="n2" class="stm run hide_run"><a href="#n2">2</a></p>
<p id="n3" class="stm run hide_run"><a href="#n3">3</a></p>
<p id="n4" class="stm run hide_run"><a href="#n4">4</a></p>
<p id="n5" class="stm run hide_run"><a href="#n5">5</a></p>
<p id="n6" class="stm run hide_run"><a href="#n6">6</a></p>

</td>
<td class="text">
<p id="t1" class="stm run hide_run"><span class="nam">cognito_userpool_id</span> <span class="op">=</span> <span class="str">'us-east-1_sKC0FXdYE'</span><span class="strut">&nbsp;</span></p>
<p id="t2" class="stm run hide_run"><span class="nam">cognito_app_client_id</span> <span class="op">=</span> <span class="str">'6h52ib9acta6l7kpv7oja879eg'</span><span class="strut">&nbsp;</span></p>
<p id="t3" class="stm run hide_run"><span class="nam">database_user_name</span> <span class="op">=</span> <span class="str">'HooliASE'</span><span class="strut">&nbsp;</span></p>
<p id="t4" class="stm run hide_run"><span class="nam">database_endpoint</span> <span class="op">=</span> <span class="str">'minghaoli995.cfyz5fmpzjzj.ap-south-1.rds.amazonaws.com'</span><span class="strut">&nbsp;</span></p>
<p id="t5" class="stm run hide_run"><span class="nam">port</span> <span class="op">=</span> <span class="num">3306</span><span class="strut">&nbsp;</span></p>
<p id="t6" class="stm run hide_run"><span class="nam">database_pwd</span> <span class="op">=</span> <span class="str">'Sumcyq-9cogdy-vymfuw'</span><span class="strut">&nbsp;</span></p>

</td>
</tr>
</table>
</div>

<div id="footer">
<div class="content">
<p>
<a class="nav" href="index.html">&#xab; index</a> &nbsp; &nbsp; <a class="nav" href="https://coverage.readthedocs.io">coverage.py v4.5.2</a>,
created at 2018-12-14 15:28
</p>
</div>
</div>

</body>
</html>