-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Api key framework #100
base: master
Are you sure you want to change the base?
Api key framework #100
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,124 @@ | ||
from binascii import hexlify | ||
from os import urandom | ||
|
||
import pymsql | ||
|
||
|
||
"""Constants | ||
Changes here should be reflected in the corresponding keys.sql file if | ||
necessary | ||
""" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this isn't actually a docstring /shrug |
||
KEY_LENGTH = 32 | ||
|
||
|
||
def _generate_key(length=KEY_LENGTH): | ||
"""Return a random `length` string to be used as a key | ||
""" | ||
return hexlify(urandom(length)).decode() | ||
|
||
|
||
def get_key(cursor, user): | ||
"""Get the key corresponding to the user in the db | ||
|
||
Args: | ||
cursor (pymysql.cursors.Cursor): database cursor | ||
user (str): user to get the key for | ||
|
||
Returns: | ||
(str) the user's key, or None if there is no matching row | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this docstring format is different than what we use elsewhere in ocflib/ocfweb |
||
""" | ||
cursor.execute( | ||
'SELECT `key`' | ||
'FROM `keys`' | ||
'WHERE `user` LIKE %s', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. LIKE should probably be =, we don't want to do wildcard matching here or to be case insensitive There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. agreed, should be exact for this |
||
(user,), | ||
) | ||
query_result = cursor.fetchone() | ||
|
||
try: | ||
return query_result['key'] | ||
except (KeyError, TypeError): | ||
# No matching row | ||
return None | ||
|
||
def get_user(cursor, key): | ||
"""Get the user corresponding to the key in the db | ||
|
||
Args: | ||
cursor (pymysql.cursors.Cursor): database cursor | ||
key (str): key that corresponds to the user | ||
|
||
Returns: | ||
(str) user that owns the key or None if there is no matching row | ||
""" | ||
cursor.execute( | ||
'SELECT `user`' | ||
'FROM `keys`' | ||
'WHERE `key` LIKE %s', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. key should probably be hashed in the db? imo it'd be preferable to require api calls to provide both the user and the api key (you can combine them together if you want so that you use one Authorization header or whatever) and have the api keys hashed in the db. then we're not storing plaintext secrets in a database which can easily be abused in the event of an unnoticed leak. (you'll need to store the users in addition to the keys if you want to salt the hashes) same comments here wrt docstring and LIKE There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. after some online research and thought also agree salt+hash is good |
||
(key,), | ||
) | ||
query_result = cursor.fetchone() | ||
|
||
|
||
try: | ||
return query_result['user'] | ||
except (KeyError, TypeError): | ||
# No matching row | ||
return None | ||
|
||
def key_exists(cursor, key): | ||
"""Checks if the key currently exists in the db | ||
|
||
Args: | ||
cursor (pymysql.cursors.Cursor): database cursor | ||
key (str): key to check for | ||
|
||
Returns: | ||
(bool) whether or not the key exists | ||
""" | ||
cursor.execute( | ||
'SELECT 1' | ||
'FROM `keys`' | ||
'WHERE `key` LIKE %s', | ||
(key,), | ||
) | ||
query_result = cursor.fetchone() | ||
|
||
return query_result is not None | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. don't think you need this function? |
||
|
||
|
||
def add_key(cursor, user): | ||
"""Add a key for a user into the database | ||
|
||
Args: | ||
cursor (pymysql.cursors.Cursor): database cursor | ||
user (str): user to add the key for | ||
|
||
Returns: | ||
None | ||
""" | ||
key = _generate_key() | ||
|
||
# Re-generate key in case of collision | ||
while key_exists(key): | ||
key = _generate_key() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if you just make the key long, you'll never have a collision. i think this is slightly overengineered. |
||
|
||
cursor.execute( | ||
'INSERT INTO `keys`' | ||
'(`key`, `user`)' | ||
'VALUES (%s, %s, %s)', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. two columns, three values? |
||
(key, user) | ||
) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we won't be able to launch this unless there's also a way to expire/revoke compromised or unused keys There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. definitely, this is just part 1 of a longer process |
||
|
||
|
||
def get_connection(user, password, db='keys', **kwargs): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. db name should be prefixed with "ocf", maybe called "ocfapikeys" or something |
||
"""Return a connection to MySQL.""" | ||
return pymysql.connect( | ||
user=user, | ||
password=password, | ||
db=db, | ||
host='mysql.ocf.berkeley.edu', | ||
cursorclass=pymysql.cursors.DictCursor, | ||
charset='utf8mb4', | ||
**dict({'autocommit': True}, **kwargs) | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
-- length of key defined by KEY_LENGTH in keys.py | ||
CREATE TABLE IF NOT EXISTS `keys` ( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this definitely needs more fields, e.g. an ID, creation/expiration time, revocation status, and if possible, what endpoints this grants access to, etc. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
we have 0 endpoints currently (except like tv/hours i guess lmao) so we'd need to figure out a good format for storing this |
||
`key` varchar(32) NOT NULL, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. something is weird with your lengths. the keys you generate are 64 characters long since you hex'd them (even though you "only" have 32 bytes of random), either you'll need to make this column binary 32 bytes and convert back and forth, or make this longer. |
||
`user` varchar(255) NOT NULL | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is user meant to be unique? or are you allowing users to have multiple api keys? |
||
PRIMARY KEY(`key`) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i'd suggest also storing some data like time generated |
||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why put the api key code in ocflib? should this be in ocfweb instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i thought the nice thing about ocfweb was that it wasn't tied to any databases and just used ocflib for all of its state-needing things, which does all of the database work? unless i am mistaken
idea was to have ocflib handle stateful operations and ocfweb just calls out to it when necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it would be a terrible idea to have the code that interacts with the database itself, e.g. key creation, be in ocflib and have ocfweb call out to it.
There's not really much state being carried here at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you could argue this, but it was never designed this way. I think the "database in a library" is a really bad pattern for a few reasons:
What is the benefit of adding another layer here? I think it just makes your lives harder in addition to the points above.
State isn't why I think we should avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, it seems I was mistaken on the point regarding ocfweb not being tied to any databases. For some reason I thought that this was intentionally done.
In that case, I believe the other points you made make sense, and it would be much easier to test this if it was just on ocfweb instead of ocflib.
If we were to place this in ocfweb instead, would we want to actually connect ocfweb to a mysql database in
settings.py
? And then for non-prod runs we would have it connect to like a sqlite instance instead of mysql so we don't have the problem of ocfweb being harder to start up. Just trying to gauge if that would be a good approach or notThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, yeah. We don't have prior art for that yet, but I think that's the right approach. If we're going to keep writing raw SQL (rather than using sqlalchemy or the django ORM), we'd want to use mysql in dev too though. Maybe make a shared dev database and put creds in the ocfweb config file (settings.py would read this, we don't want to embed creds in settings.py itself).