Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sea Turtles - Celina B./Angelica R. #57

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
110 changes: 106 additions & 4 deletions adagrams/game.py
Original file line number Diff line number Diff line change
@@ -1,11 +1,113 @@
from os import truncate

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep an eye out for unexpected imports. I don't see this being used, so it's likely VSCode "helpfully" pulled it in at some point.

import random

LETTER_POOL = {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also prefer pulling a big data structure like this out of any particular function into a module global. Putting this in the function directly would just make the function harder to read. We just need to be especially careful that we're not doing anything that would cause this to be modified, which would affect future uses of this data.

'A': 9,
'B': 2,
'C': 2,
'D': 4,
'E': 12,
'F': 2,
'G': 3,
'H': 2,
'I': 9,
'J': 1,
'K': 1,
'L': 4,
'M': 2,
'N': 6,
'O': 8,
'P': 2,
'Q': 1,
'R': 6,
'S': 4,
'T': 6,
'U': 4,
'V': 2,
'W': 2,
'X': 1,
'Y': 2,
'Z': 1
}

LETTER_SCORES = {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this representation of the scores is minimally repetitive, notice that when we look up scores in this structure later we're not able to use the optimal strategy of looking up directly by a piece of data we have (a letter) to get the value we don't (the score). Check my comments in score_word for some additional thoughts on this.

1: ["A", "E", "I", "O", "U", "L", "N", "R", "S", "T"],
2: ["D", "G"],
3: ["B", "C", "M", "P"],
4: ["F", "H", "V", "W", "Y"],
5: ["K"],
8: ["J", "X"],
10: ["Q", "Z"]
}

def make_letter_pool_list():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice helper to convert from the letter frequency data to a list where the letters present represent the actual "tiles".

letter_pool_list = []

for letter, count in LETTER_POOL.items():
letter_pool_list += [letter]*count

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice use of the the repeater operator!

This would also work as

        letter_pool_list += letter*count

The += operator more or less maps to list extend(), which takes an iterable, not necessarily another list. So the string repeater ("A"*3"AAA") itself produces an iterable (a string) where each character would be iterated over and then added to the pool list.

return letter_pool_list

def draw_letters():
pass
letter_bank = random.sample(make_letter_pool_list(), 10)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really concise approach, but how would you do this random sampling if we didn't have this function? And what sort of performance can we expect from this approach? Thinking through the approach of functions from built-in modules is great practice, and it's possible you may get something like this in an interview. A solution like this is likely to have an interviewer ask you to supply your own implementation, so take every opportunity to practice now!

return letter_bank




def uses_available_letters(word, letter_bank):
pass
# compare the letters in word with the letters in letter_bank
#for letter in word if in letter_bank
# return true if all letters in word are in letter_bank
# else return fals
temp_letter_bank = list(letter_bank)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (shallow) copy is really important, since your validation method would otherwise be destructive to the letter bank (through the use of remove).

for letter in word.upper():
if letter in temp_letter_bank:
temp_letter_bank.remove(letter)
else:
return False
Comment on lines +63 to +67

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like that this implementation exists as soon as it finds an invalid letter.

This approach does contain a nested loop. The outer loop is the iteration over the letters in the word, and the inner loop is iterating over the letters in the bank copy (either by the in operator on line 64, or because removing from a list is linear on line 65). It's not triply-nested, since the iteration in the in operator is done before the iteration in remove (even though they sort of look nested).

This isn't strictly speaking O(n^2), since the length of the word and the letter bank size can differ. So rather than O(n^2) = O(nn), we might describe this as O(nm), where each letter stands for a separate count. We could still think of this as O(n^2) in the worst case where n and m are equal (or approx. equal).

As a result, we might look for alternate approaches. We can't avoid iterating over the letters in the word, but maybe we could think about ways of moving those linear operations which make the inner loop OUT of the loop, and see if there are constant time operations we could then perform for each letter.

What data structures have constant in (membership) checks and could be somehow used to store the letter bank (which might have duplicates)?

return True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
The placement of the return True is really important, since we can't decide that the word is valid until we've checked every letter.

# for letter in letter_bank:
# if letter in word:
# continue
# else:
# return False
# return True



def score_word(word):
pass
if len(word) >= 7:
word_score = 8
else:
word_score = 0
Comment on lines +79 to +82

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice way to handle the length bonus.

for letter_to_score in word.upper():
for score, letter_list in LETTER_SCORES.items():
if letter_to_score in letter_list:
word_score += score
Comment on lines +84 to +86

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where there's a bit of a mismatch between the data we are trying to look up (a letter) and the key in our structure. This approach essentially requires that we (in the worst case) iterate over the entire letter score structure to find each letter. Now given that our code always expects the LETTER_SCORES to be constant, we could write off the whole thing as a constant term (26 letters to check max, and therefore this whole thing is something like O(26n) → O(n)) and not worry about it. But it's also worth taking a step back.

Consider what would happen to our analysis if this score_word were instead a helper function score_word_helper(word, letter_scores) and our score_word looked more like

def score_word(word):
    return score_word_helper(word, LETTER_SCORES)

Now our score_word_helper has two parameters that could vary in length, so now we have a nested loop with two sets of data, and our efficiency is revealed to be O(n*m) or again, something more like O(n^2) in the worst case.

So if we are committed to the idea that our function will only ever have to work with the scores in LETTER_SCORES, we could get away with calling this O(n), but if we step back and recognize that those scores are just another piece of data, we have to admit that this looks more like O(n^2).

Think about how we could improve this by rearranging how we store LETTER_SCORES.

return word_score



def get_highest_word_score(word_list):
pass
highest_word = ("word", 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's true that eventually we want to return a tuple with the winning word and score, starting off with the data structured that way makes things a little more difficult to work with. We need to remember in the code below that position 0 holds the word and position 1 holds the score, which could lead to errors in the code. Instead, if we use two separate variables, say highest_word and highest_score, we are less likely to mix them up. Then at the end, we can return highest_word, highest_score to make our desired tuple.

Also, I would tend to initialize the winning word to None rather than something which might be mistaken for an actual result.

for word in word_list:
score = score_word(word)
if score > highest_word[1]:
highest_word = (word, score)
Comment on lines +93 to +96

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When your implementation has distinct areas of code, it can be helpful to provide a comment on each section describing roughly what it does. This first part looks like it's locating the max score of any word (which word is incidental, but it will be whatever the first word in the list with this score was).

for word in word_list:
score = score_word(word)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we're double calculating the score for each word (we already did it once when looking for the max score). The scoring function isn't free, so we might build a helper structure once at the start of this function that we can use to store the score for each word once, then just look it up any other times we need it. What structure would let as store that information?

if score == highest_word[1]:
if len(word) == len(highest_word[0]) and word != highest_word[0]:
# without the second conditional, the above if statement will
# break when the iterator reaches the same word. we want
# to break the tie by returning the earlier word in the list
# only when we compare different words, hence the != conditional.
break
Copy link

@anselrognlie anselrognlie Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking here would stop looking for any other shorter words. Maybe continue is what we want? It would skip over the current word, but would keep looking at later words in the list which could still be more favorable.

But if the only thing in the condition body is a continue do we even need this condition check?

It's a little difficult to get an intuitive sense of what this is doing, but a good takeaway from this is that if the comment explaining why this is like this is 4 times longer than the original line, maybe another pass is in order. 😄

elif len(word) == 10:
highest_word = (word, score)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be safe to immediately break after this. As soon as we find a word whose length is 10, nothing else can beat it.

elif len(highest_word[0]) == 10:
break
Comment on lines +108 to +109

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to do this check earlier in the if/elif chain. These are checked in order, and once we find a 10 letter word, we don't need to do any more checks.

elif len(word) < len(highest_word[0]):
highest_word = (word, score)

return highest_word