Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

derive_named_parameters() method that works with latest SQLite #2354

Closed
simonw opened this issue Jun 12, 2024 · 5 comments
Closed

derive_named_parameters() method that works with latest SQLite #2354

simonw opened this issue Jun 12, 2024 · 5 comments

Comments

@simonw
Copy link
Owner

simonw commented Jun 12, 2024

Related:

@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

Code to fix, because the opcode trick no longer works:

@documented
async def derive_named_parameters(db: "Database", sql: str) -> List[str]:
"""
Given a SQL statement, return a list of named parameters that are used in the statement
e.g. for ``select * from foo where id=:id`` this would return ``["id"]``
"""
explain = "explain {}".format(sql.strip().rstrip(";"))
possible_params = _re_named_parameter.findall(sql)
try:
results = await db.execute(explain, {p: None for p in possible_params})
return [row["p4"].lstrip(":") for row in results if row["opcode"] == "Variable"]
except sqlite3.DatabaseError:
return possible_params

@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

Got ChatGPT Code Interpreter to have a go at this for me: https://chatgpt.com/share/f2ce4904-184c-4825-847d-30467c3a8236 - it came up with a pattern that first strips all comments, single-quoted and double-quoted strings and then extracts parameters from what's left.

@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

This new version of the function passes all of the existing tests:

@documented
async def derive_named_parameters(db: "Database", sql: str) -> List[str]:
    """
    Given a SQL statement, return a list of named parameters that are used in the statement

    e.g. for ``select * from foo where id=:id`` this would return ``["id"]``
    """
    # Remove single-line comments
    sql = re.sub(r"--.*", "", sql)
    # Remove multi-line comments
    sql = re.sub(r"/\*.*?\*/", "", sql, flags=re.DOTALL)
    # Remove single-quoted strings
    sql = re.sub(r"'(?:''|[^'])*'", "", sql)
    # Remove double-quoted strings
    sql = re.sub(r'"(?:\"\"|[^"])*"', "", sql)
    # Extract parameters from what is left
    return re.findall(r":(\w+)", sql)

But... it doesn't need to take the db argument any more and it doesn't need to be async - but it is a documented function (glad we are not at the 1.0 final release).

I'm going to add a new, non-async function and switch to that, but I'll leave an async undocumented version in there so plugins that use it don't break.

simonw added a commit that referenced this issue Jun 12, 2024
simonw added a commit to asg017/datasette that referenced this issue Jun 12, 2024
@simonw simonw closed this as completed Jun 12, 2024
@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

New implementation:

_single_line_comment_re = re.compile(r"--.*")
_multi_line_comment_re = re.compile(r"/\*.*?\*/", re.DOTALL)
_single_quote_re = re.compile(r"'(?:''|[^'])*'")
_double_quote_re = re.compile(r'"(?:\"\"|[^"])*"')
_named_param_re = re.compile(r":(\w+)")
@documented
def named_parameters(sql: str) -> List[str]:
"""
Given a SQL statement, return a list of named parameters that are used in the statement
e.g. for ``select * from foo where id=:id`` this would return ``["id"]``
"""
# Remove single-line comments
sql = _single_line_comment_re.sub("", sql)
# Remove multi-line comments
sql = _multi_line_comment_re.sub("", sql)
# Remove single-quoted strings
sql = _single_quote_re.sub("", sql)
# Remove double-quoted strings
sql = _double_quote_re.sub("", sql)
# Extract parameters from what is left
return _named_param_re.findall(sql)
async def derive_named_parameters(db: "Database", sql: str) -> List[str]:
"""
This undocumented but stable method exists for backwards compatibility
with plugins that were using it before it switched to named_parameters()
"""
return named_parameters(sql)

@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

I don't think those comments are needed with the clear names for the _re compiled regexes.

simonw added a commit that referenced this issue Jun 12, 2024
simonw added a commit that referenced this issue Jun 21, 2024
@simonw simonw mentioned this issue Aug 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant