The J! Archive is created by fans, for fans. The Jeopardy! game show and all elements thereof, including but not limited to copyright and trademark thereto, are the property of Jeopardy Productions, Inc. and are protected under law. This website is not affiliated with, sponsored by, or operated by Jeopardy Productions, Inc.
\ No newline at end of file
diff --git a/617_scores.html b/617_scores.html
new file mode 100644
index 0000000..1eea5ab
--- /dev/null
+++ b/617_scores.html
@@ -0,0 +1,514 @@
+J! Archive - Show #617, aired 1987-04-21
\ No newline at end of file
diff --git a/618.html b/618.html
new file mode 100644
index 0000000..8af2207
--- /dev/null
+++ b/618.html
@@ -0,0 +1,1336 @@
+J! Archive - Show #618, aired 1987-04-22
(Alex: And finally, a new category for us, [*], and I say new because all of the material in this category was written or co-written by a man who specializes in silly songs, Jeff Barry.)
\ No newline at end of file
diff --git a/618_scores.html b/618_scores.html
new file mode 100644
index 0000000..5f84c2b
--- /dev/null
+++ b/618_scores.html
@@ -0,0 +1,493 @@
+J! Archive - Show #618, aired 1987-04-22
\ No newline at end of file
diff --git a/619.html b/619.html
new file mode 100644
index 0000000..64a7cf3
--- /dev/null
+++ b/619.html
@@ -0,0 +1,1462 @@
+J! Archive - Show #619, aired 1987-04-23
\ No newline at end of file
diff --git a/619_scores.html b/619_scores.html
new file mode 100644
index 0000000..b5d2ff6
--- /dev/null
+++ b/619_scores.html
@@ -0,0 +1,542 @@
+J! Archive - Show #619, aired 1987-04-23
\ No newline at end of file
diff --git a/620.html b/620.html
new file mode 100644
index 0000000..64f5c3e
--- /dev/null
+++ b/620.html
@@ -0,0 +1,1462 @@
+J! Archive - Show #620, aired 1987-04-24
\ No newline at end of file
diff --git a/620_scores.html b/620_scores.html
new file mode 100644
index 0000000..90364bb
--- /dev/null
+++ b/620_scores.html
@@ -0,0 +1,542 @@
+J! Archive - Show #620, aired 1987-04-24
\ No newline at end of file
diff --git a/621.html b/621.html
new file mode 100644
index 0000000..0c378db
--- /dev/null
+++ b/621.html
@@ -0,0 +1,1300 @@
+J! Archive - Show #621, aired 1987-04-27
Group which topped the country charts with the following song about a truck driver:
[Truck noises] "Roll on, highway / Roll on along / Roll on, Daddy, 'til you get back home / Roll on family / Roll on crew / Roll on, Mama, like I asked you to do / And roll on eighteen wheeler, roll on / (Roll on!)..."
\ No newline at end of file
diff --git a/621_scores.html b/621_scores.html
new file mode 100644
index 0000000..7dec504
--- /dev/null
+++ b/621_scores.html
@@ -0,0 +1,479 @@
+J! Archive - Show #621, aired 1987-04-27
\ No newline at end of file
diff --git a/732.html b/732.html
new file mode 100644
index 0000000..00d8ac5
--- /dev/null
+++ b/732.html
@@ -0,0 +1,1497 @@
+J! Archive - Show #732, aired 1987-11-10
\ No newline at end of file
diff --git a/732_scores.html b/732_scores.html
new file mode 100644
index 0000000..530aed5
--- /dev/null
+++ b/732_scores.html
@@ -0,0 +1,555 @@
+J! Archive - Show #732, aired 1987-11-10
\ No newline at end of file
diff --git a/736.html b/736.html
new file mode 100644
index 0000000..3d22a1b
--- /dev/null
+++ b/736.html
@@ -0,0 +1,1480 @@
+J! Archive - Show #736, aired 1987-11-16
\ No newline at end of file
diff --git a/736_scores.html b/736_scores.html
new file mode 100644
index 0000000..273a30b
--- /dev/null
+++ b/736_scores.html
@@ -0,0 +1,549 @@
+J! Archive - Show #736, aired 1987-11-16
\ No newline at end of file
diff --git a/ b/
new file mode 100644
index 0000000..28e22b1
--- /dev/null
+++ b/
@@ -0,0 +1,400 @@
+from BeautifulSoup import BeautifulSoup as Soup
+from soupselect import select
+import re, csv
+from metacategories import *
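+def strip_tags(s):
+    return re.sub(r'<[^>]*>', '', s)
+def extract_clue_attribute(c, attr):
+    # str() of the result list looks like "[...]"; the [1:-1] slice drops the brackets
+    return strip_tags(str(select(c, attr))).strip()[1:-1]
+def mean(nums):
+    if len(nums):
+        # divide by a float so integer inputs are not truncated under Python 2
+        return sum(nums) / float(len(nums))
+    else:
+        return 0.0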
+def unescape(s):
+    # decode "&amp;" last so that "&amp;lt;" becomes "&lt;" rather than "<"
+    s = s.replace("&lt;", "<")
+    s = s.replace("&gt;", ">")
+    s = s.replace("&amp;", "&")
+    return s
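Note on `unescape`: the helper decodes the three HTML entities `&lt;`, `&gt;` and `&amp;`, and the order of the replacements matters. The sketch below is not part of the patch; it assumes the three-entity behaviour described here, and `unescape_wrong_order` is a hypothetical variant included only for contrast.

```python
def unescape(s):
    # Decode "&amp;" last: "&amp;lt;" must become the literal text "&lt;",
    # not a "<" character.
    s = s.replace("&lt;", "<")
    s = s.replace("&gt;", ">")
    s = s.replace("&amp;", "&")
    return s


def unescape_wrong_order(s):
    # Hypothetical variant, for contrast only: decoding "&amp;" first
    # exposes a fresh "&lt;" that the next replace decodes a second time.
    s = s.replace("&amp;", "&")
    s = s.replace("&lt;", "<")
    s = s.replace("&gt;", ">")
    return s
```

With the input `&amp;lt;b&amp;gt;`, the first function returns `&lt;b&gt;` while the contrast variant returns `<b>`, which is a double decode.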
+class Clue(object):
+    def __init__(self):
+        super(Clue, self).__init__()
+        self.wrong = []
+        self.right = []
+        self.dollars = None
+        self.question = None
+        self.answer = None
+        self.order = None
+        self.round = None # zero-indexed
+        self.category = None
+        self.daily_double = False
+    def __str__(self):
+        return """%s: %s (%s) // %s // %s
+        Correct: %s
+        Wrong: %s
+        """ % (self.category, self.answer, self.dollars, self.order, self.question, str(self.right), str(self.wrong))
+GAMES = ('617','618','619','620','621','732','736')
+collected_clues = {}
+collected_scores = {}
+# ------ grab clues for the game ------
+for game_number in GAMES:
+    collected_clues[game_number] = []
+    f = open('%s.html' % game_number, 'r')
+    soup = Soup(f.read())
+    f.close()
+    KEY = {
+        'clue_value': 'dollars',
+        'clue_order_number': 'order',
+        'clue_text': 'answer',
+    }
+    for (round_number, round) in enumerate(select(soup, 'table.round')):
+        categories = []
+        for category in select(round, 'td.category_name'):
+            categories.append(strip_tags(str(category)).strip())
+        for (i, clue) in enumerate(select(round, 'td.clue')):
+            # skip if empty
+            if len(strip_tags(str(clue)).strip())==0:
+                continue
+            c = Clue()
+            c.round = round_number
+            # clue cells run left to right, so the column index picks the category
+            c.category = categories[i % len(categories)]
+            for (html_class, attr_name) in KEY.items():
+                x = extract_clue_attribute(clue, 'td.%s' % html_class)
+                setattr(c, attr_name, x)
+            # the correct response and the responders live in the onmouseover handler's third argument
+            mouseover_html = re.sub(r'\'\)\s*$','',clue.findAll('div')[0]['onmouseover'].split('\', \'')[2])
+            div_soup = Soup(mouseover_html)
+            # get wrong answerer
+            for wrong in select(div_soup, 'td.wrong'):
+                wrong_answer = strip_tags(str(wrong)).strip()
+                if wrong_answer.lower()!='triple stumper':
+                    c.wrong.append(wrong_answer)
+            # get right answerer
+            for right in select(div_soup, 'td.right'):
+                c.right.append(strip_tags(str(right)).strip())
+            # get question
+            c.question = extract_clue_attribute(div_soup, 'em.correct_response')
+            # check for daily double
+            dd = select(clue, 'td.clue_value_daily_double')
+            dd_text = strip_tags(str(dd).strip())
+            if len(dd_text)>0:
+                if dd_text[1:4].upper()=='DD:':
+                    c.daily_double = True
+                    c.dollars = int(re.sub(r'[^\d]', '', dd_text))
+            collected_clues[game_number].append(c)
+# ------ grab scores for the game ------
+for game_number in GAMES:
+    collected_scores[game_number] = []
+    f = open('%s_scores.html' % game_number, 'r')
+    soup = Soup(f.read())
+    f.close()
+    players = []
+    for (round_number, table) in enumerate(select(soup, 'table.scores_table')):
+        collected_scores[game_number].append({})
+        if len(players)==0:
+            for player_nickname in select(table, 'td.score_player_nickname'):
+                players.append(strip_tags(str(player_nickname)).strip())
+        rows = select(table, 'tr')
+        for (i, row) in enumerate(rows):
+            # skip the header row
+            if i==0:
+                continue
+            cells = select(row, 'td')
+            for (j, cell) in enumerate(cells):
+                # columns 1-3 hold the three players' scores
+                if (j!=0) and (j!=4):
+                    if not collected_scores[game_number][round_number].has_key(players[j-1]):
+                        collected_scores[game_number][round_number][players[j-1]] = []
+                    # keep the minus sign so negative scores stay negative
+                    collected_scores[game_number][round_number][players[j-1]].append( re.sub(r'[^\d-]', '', strip_tags(str(cell)).strip()) )
+# --- check categories for inclusion in metacategories ---
+missing = {}
+for game_number in GAMES:
+    for clue in collected_clues[game_number]:
+        cat = clue.category
+        found_cat = False
+        for mc in METACATEGORIES:
+            if normalize_category_name(cat) in METACATEGORIES[mc]:
+                found_cat = True
+                # print 'found %s in %s' % (cat, mc)
+        if not found_cat:
+            missing[normalize_category_name(cat)] = True
+assert len(missing.keys())==0, "Some categories aren't matched properly in the METACATEGORIES dict"
+# --- figure out DD stats ---
+wager_sizes = {'RICHARD': [], 'OTHER': []}
+for game_number in GAMES:
+    for c in collected_clues[game_number]:
+        if c.daily_double:
+            player = ''
+            if len(c.right):
+                player = c.right[0]
+            else:
+                player = c.wrong[0]
+            score_after_dd = collected_scores[game_number][c.round][player][int(c.order) - 1]
+            score_before_dd = collected_scores[game_number][c.round][player][int(c.order) - 2]
+            score_change = int(score_after_dd) - int(score_before_dd)
+            wager_pct = abs(score_change) / (int(score_before_dd) * 1.0)
+            if player.upper()=='RICHARD':
+                wager_sizes['RICHARD'].append(wager_pct)
+            else:
+                wager_sizes['OTHER'].append(wager_pct)
+true_dd = {'RICHARD': 0, 'OTHER': 0}
+for player in wager_sizes:
+    for w in wager_sizes[player]:
+        if w==1.0:
+            true_dd[player] += 1
+print "avg Daily Double pct (Richard/Other): %0.2f%% / %0.2f%%" % ((mean(wager_sizes['RICHARD']) * 100), (mean(wager_sizes['OTHER']) * 100))
+print "True Daily Doubles (Richard/Other): (%d/%d) / (%d/%d)" % (true_dd['RICHARD'], len(wager_sizes['RICHARD']), true_dd['OTHER'], len(wager_sizes['OTHER']))
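Note on the Daily Double arithmetic: the wager is never read from the page directly. It is inferred as the absolute change in the player's score across the clue, divided by the score before the clue, and a ratio of exactly 1.0 counts as a "true" Daily Double. The sketch below is illustrative only and not part of the patch; `wager_fraction` is a hypothetical helper, and returning `None` for a non-positive starting score is an assumption (the script itself would divide by zero in that case).

```python
def wager_fraction(score_before, score_after):
    """Fraction of the pre-clue score that was wagered on a Daily Double."""
    if score_before <= 0:
        # Assumption: the ratio is undefined when the player had nothing
        # (or less) to wager from, so report None instead of dividing.
        return None
    # The sign of the change only says whether the response was right or
    # wrong; the size of the wager is its absolute value.
    return abs(score_after - score_before) / float(score_before)


def is_true_daily_double(score_before, score_after):
    """True when the whole pre-clue score was wagered."""
    return wager_fraction(score_before, score_after) == 1.0
```

For example, going from $2,000 to $4,000 and going from $2,000 to $0 both give a fraction of 1.0, while going from $3,000 to $3,500 gives one sixth.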
+# --- figure out in-control stats ---
+selections_in_control = 0
+total_selections = 0
+categories_sought = {}
+categories_avoided = {}
+categories_available = {}
+for game_number in GAMES:
+    in_control = ''
+    sorted_clues = [[], []] # by-round ordered list of clues
+    category_counts = [{}, {}] # keeps track of which cats are still available as round progresses
+    for clue in collected_clues[game_number]:
+        sorted_clues[int(clue.round)].append(clue)
+        category_counts[int(clue.round)][normalize_category_name(clue.category)] = 5
+    for i in range(0, len(sorted_clues)):
+        sorted_clues[i].sort(key=lambda c: int(c.order))
+    for (clue_round, clue_round_contents) in enumerate(sorted_clues):
+        for clue in clue_round_contents:
+            selected = normalize_category_name(clue.category)
+            # a clue has been picked by Richard. What now?
+            if in_control.upper().strip()=='RICHARD':
+                # 1. note that he's made an in-control pick
+                selections_in_control += 1
+                # 2. note that he sought this category
+                if not categories_sought.has_key(selected):
+                    categories_sought[selected] = 0
+                categories_sought[selected] += 1
+                # 3. note the other available categories that he avoided, and which were available
+                for cat in category_counts[clue_round]:
+                    if category_counts[clue_round][cat]>0: # are there still clues left in this category?
+                        # increment count of which categories were available at the time, including the selected one
+                        if not categories_available.has_key(cat):
+                            categories_available[cat] = 0
+                        categories_available[cat] += 1
+                        # every available category gets a sought tally (possibly 0) so later lookups cannot fail
+                        if not categories_sought.has_key(cat):
+                            categories_sought[cat] = 0
+                        # increment count of all avoided categories -- every one BUT the selected one
+                        if cat!=selected:
+                            if not categories_avoided.has_key(cat):
+                                categories_avoided[cat] = 0
+                            categories_avoided[cat] += 1
+            # did anyone answer correctly? if so, they're now in control
+            if len(clue.right)>0:
+                in_control = clue.right[0]
+            # the availability of this category -- whether chosen by Richard or not -- needs to be decremented
+            category_counts[clue_round][selected] -= 1
+            # the total number of selections ought to go up
+            total_selections += 1
+print "Richard in control %d of %d selections (%0.2f%%)" % (selections_in_control, total_selections, (((selections_in_control) / (total_selections / 1.0)) * 100))
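Note on the in-control count: a selection is credited to whoever gave the most recent correct response, and control changes hands only after the current clue has been counted. The sketch below shows that rule on toy data. It is not part of the patch; `count_in_control` and the `(category, correct_responder)` tuples are assumptions made for illustration.

```python
def count_in_control(clues, player):
    """Count selections made while `player` held control.

    `clues` is a list of (category, correct_responder) tuples in selection
    order; correct_responder is None when nobody answered correctly.
    Returns (selections_in_control, total_selections).
    """
    in_control = None
    picks = 0
    for _category, responder in clues:
        # The clue was chosen by whoever held control before it was played.
        if in_control == player:
            picks += 1
        # A correct response hands control to the responder; otherwise
        # control stays where it was.
        if responder is not None:
            in_control = responder
    return picks, len(clues)


toy_game = [
    ('OPERA', 'RICHARD'),    # nobody in control yet, Richard answers
    ('OPERA', None),         # Richard's pick, triple stumper
    ('RIVERS', 'SUE'),       # Richard's pick, Sue takes control
    ('RIVERS', 'RICHARD'),   # Sue's pick, Richard takes control back
    ('POETS', 'RICHARD'),    # Richard's pick
]
```

On `toy_game`, `count_in_control(toy_game, 'RICHARD')` credits Richard with three of the five selections.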
+mc_sought = {}
+mc_avoided = {}
+mc_availability = {}
+for mc in METACATEGORIES:
+    mc_sought[mc] = 0
+    mc_avoided[mc] = 0
+    mc_availability[mc] = 0
+# for sanity-checking that more clues were passed up than taken
+total_sought = 0
+total_avoided = 0
+total_available = 0
+# tabulate metacategories sought
+for c in categories_sought:
+    for mc in METACATEGORIES:
+        if c in METACATEGORIES[mc]:
+            mc_sought[mc] += categories_sought[c]
+            total_sought += categories_sought[c]
+# tabulate metacategories avoided
+for c in categories_avoided:
+    for mc in METACATEGORIES:
+        if c in METACATEGORIES[mc]:
+            mc_avoided[mc] += categories_avoided[c]
+            total_avoided += categories_avoided[c]
+# tabulate metacategory availability
+for c in categories_available:
+    for mc in METACATEGORIES:
+        if c in METACATEGORIES[mc]:
+            mc_availability[mc] += categories_available[c]
+            total_available += categories_available[c]
+# find categories always selected/avoided
+always_selected = []
+always_avoided = []
+for (c, count) in categories_available.items():
+    # a category that was never picked may have no entry in categories_sought
+    sought = categories_sought.get(c, 0)
+    if count==sought:
+        always_selected.append("%s (%d times)" % (c, count))
+    if count>0 and sought==0:
+        always_avoided.append("%s (%d times)" % (c, count))
+# normalization to reflect metacategory size -- necessary?
+# for mc in mc_avoided:
+# mc_avoided[mc] = mc_avoided[mc] / (len(METACATEGORIES[mc]) * 1.0)
+# for mc in mc_sought:
+# mc_sought[mc] = mc_sought[mc] / (len(METACATEGORIES[mc]) * 1.0)
+print "Total sought: %d" % total_sought
+print "Total avoided: %d" % total_avoided
+print 'Metacategories sought:'
+for (c, count) in mc_sought.items():
+    print '%s: %d (%.2f%%)' % (c, count, (100 * count / (1.0 * total_sought)))
+print 'Metacategories avoided:'
+for (c, count) in mc_avoided.items():
+    print '%s: %d (%.2f%%)' % (c, count, (100 * count / (1.0 * total_avoided)))
+print 'Metacategory selection pct (how often category chosen when available):'
+for (c, count) in mc_sought.items():
+    # use float division (int(x*1.0) truncated the percentage under Python 2)
+    # and report 0% for a metacategory that was never available
+    if mc_availability[c]:
+        pct = 100.0 * count / mc_availability[c]
+    else:
+        pct = 0.0
+    print '%s: %d/%d (%.2f%%)' % (c, count, mc_availability[c], pct)
+print 'Categories always selected: %s' % (len(always_selected) and ', '.join(always_selected) or 'None!')
+print 'Categories always avoided: %s' % (len(always_avoided) and (', '.join(always_avoided)) or 'None!')
+f = open('stats.csv', 'w')
+writer = csv.writer(f)
+# --- figure out score versus average opponent over course of game ----
+richard_game_progress = {}
+opponent_game_progress = {}
+for game_number in GAMES:
+    richard_game_progress[game_number] = []
+    opponent_game_progress[game_number] = {}
+    for round_number in (0,1):
+        for player in collected_scores[game_number][round_number]:
+            if not opponent_game_progress[game_number].has_key(player) and player.upper().strip()!='RICHARD':
+                opponent_game_progress[game_number][player] = []
+            if player.strip().upper()=='RICHARD':
+                for score in collected_scores[game_number][round_number][player]:
+                    richard_game_progress[game_number].append(score)
+            else:
+                for score in collected_scores[game_number][round_number][player]:
+                    opponent_game_progress[game_number][player].append(score)
+max_score_length = 0
+header_row = []
+for game in GAMES:
+    header_row.append('Richard %s' % game)
+    for player in opponent_game_progress[game]:
+        header_row.append('%s %s' % (player, game))
+    max_score_length = max(max_score_length, len(richard_game_progress[game]))
+for i in range(0, max_score_length):
+    row = []
+    for game in GAMES:
+ if i...')
+select(soup, 'div')
+- returns a list of div elements
+select(soup, 'div#main ul a')
+- returns a list of links inside a ul inside div#main
+import re
+tag_re = re.compile('^[a-z0-9]+$')
+attribselect_re = re.compile(
+    r'^(?P<tag>\w+)?\[(?P<attribute>\w+)(?P<operator>[=~\|\^\$\*]?)' +
+    r'=?"?(?P<value>[^\]"]*)"?\]$'
+)
+# /^(\w+)\[(\w+)([=~\|\^\$\*]?)=?"?([^\]"]*)"?\]$/
+#   \---/  \---/\-------------/    \-------/
+#     |      |         |               |
+#     |      |         |           The value
+#     |      |    ~,|,^,$,* or =
+#     |   Attribute
+#    Tag
+def attribute_checker(operator, attribute, value=''):
+    """
+    Takes an operator, attribute and optional value; returns a function that
+    will return True for elements that match that combination.
+    """
+    return {
+        '=': lambda el: el.get(attribute) == value,
+        # attribute includes value as one of a set of space separated tokens
+        '~': lambda el: value in el.get(attribute, '').split(),
+        # attribute starts with value
+        '^': lambda el: el.get(attribute, '').startswith(value),
+        # attribute ends with value
+        '$': lambda el: el.get(attribute, '').endswith(value),
+        # attribute contains value
+        '*': lambda el: value in el.get(attribute, ''),
+        # attribute is either exactly value or starts with value-
+        '|': lambda el: el.get(attribute, '') == value \
+            or el.get(attribute, '').startswith('%s-' % value),
+    }.get(operator, lambda el: el.has_key(attribute))
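Note on attribute selectors: a token such as `a[href^="http"]` is split by the regular expression into tag, attribute, operator and value, and `attribute_checker` then turns the last three into a predicate. The sketch below exercises both steps without BeautifulSoup. It is not part of the patch: plain dicts stand in for elements (an assumption that works because the predicates only call `.get`), `parse_attribute_token` is a hypothetical helper, and the fallback uses `attribute in el` in place of `has_key` so the snippet also runs on Python 3.

```python
import re

# Attribute-selector pattern with its four named groups.
attribselect_re = re.compile(
    r'^(?P<tag>\w+)?\[(?P<attribute>\w+)(?P<operator>[=~\|\^\$\*]?)'
    r'=?"?(?P<value>[^\]"]*)"?\]$'
)


def parse_attribute_token(token):
    """Return (tag, attribute, operator, value), or None for other tokens."""
    m = attribselect_re.match(token)
    return m.groups() if m else None


def attribute_checker(operator, attribute, value=''):
    """Build a predicate for one operator/attribute/value combination."""
    return {
        '=': lambda el: el.get(attribute) == value,
        # value is one of the space separated tokens
        '~': lambda el: value in el.get(attribute, '').split(),
        # attribute starts with value
        '^': lambda el: el.get(attribute, '').startswith(value),
        # attribute ends with value
        '$': lambda el: el.get(attribute, '').endswith(value),
        # attribute contains value
        '*': lambda el: value in el.get(attribute, ''),
        # attribute is exactly value or starts with "value-"
        '|': lambda el: el.get(attribute, '') == value
             or el.get(attribute, '').startswith('%s-' % value),
    }.get(operator, lambda el: attribute in el)  # bare [attr]: presence test


# A dict standing in for an element's attributes.
link = {'href': 'http://example.com/a.pdf', 'class': 'ext big', 'lang': 'en-US'}
tag, attribute, operator, value = parse_attribute_token('a[href^="http"]')
starts_with_http = attribute_checker(operator, attribute, value)(link)
```

A bare `[disabled]` token parses with an empty operator, which falls through to the presence test in the final `.get` default.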
+def select(soup, selector):
+    """
+    soup should be a BeautifulSoup instance; selector is a CSS selector
+    specifying the elements you want to retrieve.
+    """
+    tokens = selector.split()
+    current_context = [soup]
+    for token in tokens:
+        m = attribselect_re.match(token)
+        if m:
+            # Attribute selector
+            tag, attribute, operator, value = m.groups()
+            if not tag:
+                tag = True
+            checker = attribute_checker(operator, attribute, value)
+            found = []
+            for context in current_context:
+                found.extend([el for el in context.findAll(tag) if checker(el)])
+            current_context = found
+            continue
+        if '#' in token:
+            # ID selector
+            tag, id = token.split('#', 1)
+            if not tag:
+                tag = True
+            el = current_context[0].find(tag, {'id': id})
+            if not el:
+                return [] # No match
+            current_context = [el]
+            continue
+        if '.' in token:
+            # Class selector
+            tag, klass = token.split('.', 1)
+            if not tag:
+                tag = True
+            found = []
+            for context in current_context:
+                found.extend(
+                    context.findAll(tag,
+                        {'class': lambda attr: attr and klass in attr.split()}
+                    )
+                )
+            current_context = found
+            continue
+        if token == '*':
+            # Star selector
+            found = []
+            for context in current_context:
+                found.extend(context.findAll(True))
+            current_context = found
+            continue
+        # Here we should just have a regular tag
+        if not tag_re.match(token):
+            return []
+        found = []
+        for context in current_context:
+            found.extend(context.findAll(token))
+        current_context = found
+    return current_context
+def monkeypatch(BeautifulSoupClass=None):
+    """
+    If you don't explicitly state the class to patch, defaults to the most
+    common import location for BeautifulSoup.
+    """
+    if not BeautifulSoupClass:
+        from BeautifulSoup import BeautifulSoup as BeautifulSoupClass
+    BeautifulSoupClass.findSelect = select
+def unmonkeypatch(BeautifulSoupClass=None):
+    if not BeautifulSoupClass:
+        from BeautifulSoup import BeautifulSoup as BeautifulSoupClass
+    delattr(BeautifulSoupClass, 'findSelect')