Fix check bypass via square brackets Issue #857 #860
base: master
Hi @tsigouris007, thank you for your PR!
Could you explain what the bug was?
Hi,
Hi @tsigouris007, #857 describes the issue. I was looking for an explanation of what in the code is causing it.
I haven't quite had the time to find the exact location that breaks this. For example I got
I am taking a wild guess here, but I think the problem is most probably in how the checked file lines are fetched (which is why I thought of "escaping" the square brackets earlier).

    if _is_filtered_out(
        required_filter_parameters=['line'],
        filename=filename,
        line=line,
        context=code_snippet,
    ):

If you remove it or add a
I can take a closer look but it will take me at least a couple of days.
Understanding exactly what is causing the bug will help us find a good solution going forward, rather than a workaround that could introduce another set of issues. That's why I'm asking. Take your time, and please let me know what you find!
It's understandable.
Hi @lorenzodb1,

    return re.compile(r'([^\v=!:]*)\s*(:=?|[!=]{1,3})\s*([\w.-]+[\[\(][^\v]*[\]\)])')

So the choice is to either change this regex a bit or escape the square brackets earlier.

    def is_indirect_reference(line: str) -> bool:
        """
        Filters secrets that take the form of:
            secret = get_secret_key()
        or
            secret = request.headers['apikey']
        """
        # Constrain line length as the heuristic's intention is to target lines that resemble
        # function calls. The constraint avoids catastrophic backtracking failures of the regex.
        if len(line) > 1000:
            return False
        # Raw strings avoid the invalid-escape warning that '\[' and '\]' would trigger.
        line = line.replace('[', r'\[').replace(']', r'\]')
        return bool(_get_indirect_reference_regex().search(line))

This produces the correct results. After playing around with the regex, I came up with the following, which seems to work (tested on a huge repo with ~1.2k secrets):

    def _get_indirect_reference_regex() -> Pattern:
        # Regex details:
        #   ([^\v=!:]*)         -> Something before the assignment or comparison
        #   \s*                 -> Some optional whitespaces
        #   (:=?|[!=]{1,3})     -> Assignment or comparison: :=, =, ==, ===, !=, !==
        #   \s*                 -> Some optional whitespaces
        #   (
        #       [\w.-]+         -> Some alphanumeric character, dot or -
        #       [\[\(]          -> Start of indirect reference: [ or (
        #       [^\v]*          -> Something except line breaks
        #       [\]\)]          -> End of indirect reference: ] or )
        #   )
        # return re.compile(r'([^\v=!:]*)\s*(:=?|[!=]{1,3})\s*([\w.-]+[\[\(][^\v]*[\]\)])')  # Left the old one for reference
        return re.compile(r'([^\v=!:"<%>]*)\s*(:=?|[!=]{1,3})\s*([\w.-]+[\[\(][^\v]*[\]\)])')

How do you want to proceed with this?
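To make the first option concrete, here is a minimal, self-contained sketch of how the pattern quoted in this thread behaves with and without the bracket escaping. It is not the project's code: the `escape_brackets` flag and the `os.environ` sample line are hypothetical, added only for illustration, and the sample is not the line from #857.

```python
import re

# Pattern as quoted in the thread (the version before the change).
INDIRECT_REFERENCE = re.compile(
    r'([^\v=!:]*)\s*(:=?|[!=]{1,3})\s*([\w.-]+[\[\(][^\v]*[\]\)])'
)


def is_indirect_reference(line: str, escape_brackets: bool = False) -> bool:
    """Return True when the line resembles `name = call()` or `name = obj['key']`."""
    if len(line) > 1000:
        return False
    if escape_brackets:
        # Hypothetical switch: applies the escaping proposed in the thread.
        line = line.replace('[', r'\[').replace(']', r'\]')
    return bool(INDIRECT_REFERENCE.search(line))


samples = [
    "secret = get_secret_key()",
    "secret = request.headers['apikey']",
    "secret = 'hunter2'",
    # Hypothetical line with a subscript before a hard-coded fallback value.
    "secret = os.environ['SECRET'] or 'hunter2'",
]

# Map each line to (result without escaping, result with escaping).
results = {
    line: (is_indirect_reference(line), is_indirect_reference(line, True))
    for line in samples
}

for line, (plain, escaped) in results.items():
    print(f"{line!r}: plain={plain}, escaped={escaped}")
```

In this sketch, escaping stops the pattern from treating any `[...]` subscript as an indirect reference, because the character after the identifier becomes a backslash rather than `[`. Lines that use parentheses are unaffected.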
Hi @tsigouris007, sorry for the delay in replying. I personally prefer the second option. Thanks for investigating this and coming up with a solution!
f5be7c0 to 23e5097
Hi @lorenzodb1,
detect_secrets/filters/heuristic.py
Outdated
@@ -197,7 +197,7 @@ def _get_indirect_reference_regex() -> Pattern:
     #       [^\v]*          -> Something except line breaks
     #       [\]\)]          -> End of indirect reference: ] or )
     #   )
-    return re.compile(r'([^\v=!:]*)\s*(:=?|[!=]{1,3})\s*([\w.-]+[\[\(][^\v]*[\]\)])')
+    return re.compile(r'([^\v=!:"<%>]*)\s*(:=?|[!=]{1,3})\s*([\w.-]+[\[\(][^\v]*[\]\)])')
@tsigouris007 could you add a test for this change?
Sorry for the delay.
I added two test cases for this change.
Are we good to go?
Isn't that what you were trying to fix with this PR?
I was only catching the first case, since that was my initial issue as described above.
So I beefed up the regex, and now it catches both cases (I split it across multiple lines for readability; it's not that far from the original one).
I also added a few more test cases that should be caught following this logic and added a brief explanation for each case.
Let me know if this works for you now.
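The final multi-line regex is not shown in this thread. As an illustration only, one common way to split a pattern across lines in Python is `re.VERBOSE`, sketched here with the single-line pattern from the earlier diff. The names `SINGLE_LINE`, `MULTI_LINE`, and `groups` are hypothetical, and the PR may well have used a different approach, such as string concatenation.

```python
import re

# Single-line pattern taken from the diff earlier in this conversation.
SINGLE_LINE = re.compile(
    r'([^\v=!:"<%>]*)\s*(:=?|[!=]{1,3})\s*([\w.-]+[\[\(][^\v]*[\]\)])'
)

# Same pattern in verbose mode: whitespace outside character classes is
# ignored and everything after an unescaped # on a line is a comment.
MULTI_LINE = re.compile(
    r'''
    ([^\v=!:"<%>]*)      # text before the assignment or comparison
    \s*                  # optional whitespace
    (:=?|[!=]{1,3})      # operator, e.g. :=, =, ==, ===, !=, !==
    \s*                  # optional whitespace
    (
        [\w.-]+          # word characters, dot or hyphen
        [\[\(]           # opening bracket or parenthesis
        [^\v]*           # any run of characters other than \v
        [\]\)]           # closing bracket or parenthesis
    )
    ''',
    re.VERBOSE,
)


def groups(pattern, line):
    """Return the captured groups of the first match, or None if no match."""
    match = pattern.search(line)
    return match.groups() if match else None


samples = [
    "secret = get_secret_key()",
    "secret = request.headers['apikey']",
    "secret = 'hunter2'",
]

# Both spellings should capture exactly the same groups on every sample.
equivalent = all(groups(SINGLE_LINE, s) == groups(MULTI_LINE, s) for s in samples)
print(equivalent)
```

Verbose mode keeps the explanatory comments next to the parts of the pattern they describe, so they are less likely to drift out of sync than a separate comment block above the `re.compile` call.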
b7c79e0 to c4210a1
Hi @lorenzodb1,
Docs have been added / updated
Fixes behavior on missed secrets that have square brackets before their default value.
Example:
Issue: #857
Escapes the square brackets and catches secrets properly.
No breaking changes.