-
-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Security/EscapeOutput: use PHPCSUtils, allow for modern PHP and slew of bug fixes #2326
Merged
dingo-d
merged 44 commits into
develop
from
feature/escapeoutput-review-phpcsutils-modern-php
Jul 28, 2023
Merged
Security/EscapeOutput: use PHPCSUtils, allow for modern PHP and slew of bug fixes #2326
dingo-d
merged 44 commits into
develop
from
feature/escapeoutput-review-phpcsutils-modern-php
Jul 28, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
... to allow for adding additional test case files.
... to prevent it interfering with other tests.
The `Tokens::$magicConstants` array was introduced to PHPCS in version 3.5.6, so we can now defer to the upstream array.
* Use token constants instead of type strings. * Have the token constant as both the key and the value.
Operator tokens in PHP are not insecure, it is the **_operands_** which are potentially insecure. As things were, we've seen all sorts of messages in bugs reported, where the sniff would report on `&&` or `===` tokens, which is nonsensical. The "safe components" list already contained a number of operators, most notably the arithmetic operators (though the PHP 5.6+ `**` (`T_POW`) operator was missing). I'm proposing to expand this list with a much larger list of operators using the predefined token arrays from PHPCS itself + one of the PHPCSUtils token arrays + the `T_BOOLEAN_NOT` (`!`) operator. With this expansion, we shouldn't be seeing anymore issues reported for the sniff flagging operators, which should make debugging the issues reported more straight forward. The only operators I have not (yet) included in the list are the assignment operators as assignments within an output statement should be exceedingly rare anyway. Note: in all honesty, the sniff should take operator precedence into account properly and, for instance, disregard a `$var` or function call after a boolean operator as that is automatically cast to a boolean. Similarly, the operands before and after an arithmetic operator will always be cast to numeric. Having said that, that's a rabbit hole for a future release as there's enough to fix in the sniff without going down that path. Includes unit tests, all of which would throw an error on the operator prior to this change.
Split off the code which does the actual output escaping check into a separate `check_code_is_escaped()` method. No functional changes.
Split off the code which checks for function calls into a separate `process_tstring_token()` method. No functional changes. :point_right: This commit will be easier to review while ignoring whitespace changes.
As things were, the sniff would look for `T_STRING` tokens with certain contents, but never do a proper check if those were actually function calls. This would lead to false positives, but also potentially to an `Internal.Exception` for the `$end_of_statement` not being defined. This commit is a first step towards fixing this. Includes tests for select situations fixed by this particular commit.
19 tasks
... which is needed for the switch to another base class is the next commit.
…on calls As things were, the sniff would look for `T_STRING` tokens with certain contents, but never do a proper check if those were actually function calls to the WP/PHP native _global_ function. This would lead to false positives for calls to methods, function declarations etc. By changing the parent class to the `AbstractFunctionRestrictionsSniff`, we can re-use the "if this a global function call" logic already in place in that abstract, while still maintaining the handling of the other tokens in the pre-existing manner. There are three caveats to this: 1. The `AbstractFunctionRestrictionsSniff` contains an `exclude` property, which allows to silence notices about certain function for a complete group. As this is a security sniff, we don't want end-users to be able to silence notices about groups of functions in that way. To that end, the ability for end-users to set the `exclude` property for this sniff is being disabled. 2. The `AbstractFunctionRestrictionsSniff` class will call the `process_matched_token()` method once for each match in each function group. As the "unsafe printing functions" are also included in the "printing functions", this would lead to the `process_matched_token()` method being called twice for those methods, leading to duplicate error notices. I'm preventing this by removing the "unsafe printing functions" from the "printing functions" list prior to passing it off to the parent class. 3. Lastly, the `AbstractFunctionRestrictionsSniff` class expects the list of target functions to be available when the sniff is instantiated. However, in test situations, that list can fluctuate, as the "printing functions" supports the ability to add "custom printing functions", which is done on the fly in the test case file using `// phpcs:set ...`. To get round that, I'm re-registering the "function groups" every time a `T_STRING` (potential function call) triggers the sniff, but only when the tests are being run. While it is possible for end-users to use the `// phpcs:set` syntax, it is little known and is not mentioned anywhere in the documentation as a feature for end-users, so just re-registering the function groups for test runs would seem sufficient. Includes tests for situations fixed by this particular commit. Includes a test using the PHP 8.0+ nullsafe object operator. Note: the `PrintingFunctionsTrait::is_printing_function()` has now become unused and _could_ be removed. For consistency with other traits/helper classes, I'm electing to leave it be (for now) though.
While the initial function name check was previously already made case-insensitive in 2273 (commit 53309ac) and remained case-insensitive with the switch to the `AbstractFunctionRestrictionsSniff` base class, the _special casing_ function name checks were **not** done in a case-insensitive manner. This could lead to false positives as well as false negatives. Fixed now. Includes tests proving the bug.
The `'raw'` key in the parameter arrays returned from the `PassedParameters` class contains - as per the name - the _raw_ contents of the parameter. Since PHPCSUtils 1.0.0-alpha4, the return array also contain a `'clean'` index, which contains the contents of the parameter cleaned of comments. By switching to using that key, a potential false positive gets fixed. Includes unit test demonstrating the issue and safeguarding the fix.
Minor tweaks to the code in the `process_token()` method for improved comprehensibility of the code. Rename some variables, add some documentation, split up long conditions. Includes preventing the code from changing the `$stackPtr` as that is just really confusing.
`exit`/`die` are language constructs and can be used as part of an expression. Additionally, an `exit`/`die` can be stand-alone or can pass a status code/message. If a status code/message is passed, we want it to be escaped for output, but a status code/message can only be passed within parentheses. As things were, the sniff would not ignore `exit`/`die` statements without status and when used within an expression, would walk the tokens after the `exit`/`die`, leading to false positives. Fixed now by using custom logic for `T_EXIT` tokens. Includes parse error/live coding protection, where the sniff should stay silent. As the method will now use a `switch` control structure, move the special casing for `T_STRING` also into the `switch`. Includes tests for situations fixed by this particular commit. :point_right: This commit will be easier to review while ignoring whitespace changes.
…print statements `print` is a language construct and can be used as part of an expression. The sniff did not take this into account correctly. This would lead to the sniff walking too far and examining code which is not part of the `print` statement. This would lead to both false positives as well as false negatives. This commit improves that situation, but doesn't completely fix it. First off, the sniff now uses the `BCFile::findEndOfStatement()` method to more accurately retrieve the end token for the expression. This should improve the handling of the most common cases. This part includes live coding/parse error protection for unfinished `print` statements at the end of a file. However, the `[BC]File::findEndOfStatement()` method does not take ternaries into account when determining the end of the expression. So, secondly, code has been added to special case `print` statements in the "then" part of a ternary and determine the end of such an expression correctly. Includes tests for situations fixed by this particular commit. Additional notes: * I can still come up with more (exotic) test cases with `print` in ternaries which would still not be handled correctly. I'm leaving those as "future scope". Example: ```php ( $fop === 1 ) ? true && print $foo : print $bar; // The first print will now examine everything to the end of the statement, not to the end of the expression, which means that `$bar` will not be flagged as the sniff bow out before it reached $bar. ``` Fixes 2209
…ements Bow out earlier in case of a parse error for echo statements and short open echo tag statements. While the sniff would not throw false positives/negatives as the `for()` loop, which does the token walking in the `check_code_is_escaped()` method, would not get started, as `$i < false` as stop condition would always return `true`, we can (and should) bow out earlier for parse errors/live coding. Fixed now. Includes switching the order of the token searches for maximum efficiency. Includes tests safeguarding the tweak.
The refactor in the previous few commits "broke" the preliminary check for a non-parenthesized ternary ìn a `print` statement. There were no tests covering this. I'm fixing this now by moving that particular code block out of the `switch` and executing it for both `print`, `echo` and `<?=` (though not for `exit`/`die` statements as those will always have parentheses, otherwise there will be nothing to check). I'm also extracting the actual logic for finding the ternary to a separate method, as this logic will be expanded in follow-up commits. Includes adding tests safeguarding the logic.
…naries As things were, the check for statements with ternaries without wrapping parentheses, did not take into account that the whole statement could be nested within parentheses if the statement is a `print`. This commit updates the code block handling the skipping of the condition for a ternary without wrapping parentheses to allow for the statement being _nested in_ parentheses. Includes adding tests safeguarding the fix.
…ts with ternaries While the previous commit already prevented a ternary being regarded as the "applicable" ternary when it wasn't at the right nesting level, that still disregarded the fact that ternaries can be chained and that an "applicable" ternary token could be found _after_ a ternary which should be disregarded. This led to false positives. This commit changes the logic in the `find_ternary()` method to properly search the token group for an applicable ternary token, which fixes that issue. Includes tests proving the bug and safeguarding the fix.
As things were, the prelim check for statements with ternaries without wrapping parentheses, did not take into account that there could be a parentheses at the start and at the end of the statement, which didn't belong together. In that case, the preliminary ternary check should still be executed. Fixed now. Includes adding tests safeguarding the fix. Fixes 677 This also fixes the third case reported in 1507
As things were, when an open parenthesis was found in the loop, the sniff would attempt to find a relevant ternary operator and if found, skip over the condition. However, this check did not take nested parentheses into account correctly, which meant that if the complete ternary was wrapped in parentheses, but the condition was _not_ wrapped in parentheses and the "then" or the "else" part _would_ contain parentheses, the ternary was incorrectly disregarded. Fixed now by switching the logic used within the loop over to use the new `find_ternary()` method, which should handle the determination of whether the ternary is applicable correctly. Note: it is important for the `$start` token passed to the `find_ternary()` method to be the token _after_ the open parenthesis, otherwise, the method will not look _within_ the parentheses. Includes adding tests safeguarding the fix. Fixes 1219
While the use of short ternaries is/will be discouraged in WP, this doesn't mean it won't be used in non-WP Core code. When a short ternary is used, the sniff should **not** ignore the code before the ternary operator, but should check it for being correctly escaped as the value will be used if it is "truthy". Fixed by changing the `find_ternary()` method to a `find_long_ternary()` method, which disregards short ternaries. Includes tests proving the bug and safeguarding the fix. Fixes 1617
Minor tweaks to the code in the `process_matched_token()` method for improved comprehensibility of the code. Rename some variables, simplify/split up long conditions. Includes preventing the code from changing the `$stackPtr` as that is just really confusing.
Bow out earlier. There is one function call related check which may bow out without checking parameters, so move that check up for efficiency.
If the parameter which needs to be examined doesn't exist, we can skip the complete function call.
If a non-function call `T_STRING` token would be named the same as any of the escaping/formatting/array walking functions the sniff looks for, the sniff would walk to the next parenthesis opener (if found) and skip everything in between. Fixed now. Includes a test proving the bug and safeguarding the fix. Note: the test shows the sniff now throwing two errors, one for the `$var` and one for the `unrelatedFunction()` function call. The sniff does not throw an error for the (unescaped) constant being used. This may need to be looked into separately.
…arameters correctly As things were, the sniff would examine all parameters from a function call as one long code block. This would easily lead to false positives (and possibly also false negatives) largely related to the sniff not correctly starting/stopping the watching for each new parameter and getting confused over ternaries (again). This commit make two closely related changes to fix this. 1. It changes the behaviour of the sniff to examine each function call parameter individually. 2. It moves the "should a ternary condition at the start of the code snippet be skipped ?" check, which was previously only used for `echo`/`print`/`<?=` statements/expressions, to the `check_code_is_escaped()` method, which means it will now be applied to each individual parameter. This gives much higher accuracy when examining parameters. As a side-effect of this change, function calls to printing functions using named parameters will not yield false positives on the parameter label. Includes tests proving the bug(s) and safeguarding the fix. Fixes the issue reported in WordPress/WordPress-Coding-Standards 677#issuecomment-470407780
….1 octal literals No changes needed. Just making sure.
1. Adjusted the way the correct parameters are retrieved from a call to `trigger_error()`/`user_error()` to use the new PHPCSUtils 1.0.0-alpha4 `PassedParameters::getParameterFromStack()` method. 2. The parameter names used are in line with the names as per the PHP 8.0 release. PHP itself renamed a lot of parameters in PHP 8.0. As named parameters did not exist before PHP 8.0, the parameter name as per PHP 8.0 (or above) are the only relevant names. Includes additional unit tests.
1. Adjusted the way the correct parameters are retrieved from a call to `_deprecated_file()` to use the new PHPCSUtils 1.0.0-alpha4 `PassedParameters::getParameterFromStack()` method. 2. Verified the parameter names used are in line with the name as per the WP 6.3 release. WP has been renaming parameters and is probably not done yet, but it doesn't look like those changes (so far) made it into changelog entries.... For the purposes of this exercise, I've taken the _current_ parameter name as the "truth" as support for named parameters hasn't officially been announced yet, so any renames _after_ this moment are the only ones relevant. Includes additional unit tests.
PHP 8.0 introduced match expressions, which can be used inline, which means they can be used in an output generating statement. If the match expression is wrapped inside an escaping function, it is already handled correctly. This commit is the first step in adding additional support for `match` expressions to the sniff and adds logic to disregard a complete match expression if it is preceded by a "safe cast". Includes tests.
PHP 8.0 introduced match expressions, which can be used inline, which means they can be used in an output generating statement. This second commit adds logic to correctly check _inside_ a match expression. In a `match`, there can be multiple matching conditions before the double arrow and one statement to be returned after the double arrow. So basically, we should ignore everything _before_ the double arrow as it is effectively a comparison. And we should verify that everything after the double arrow - up to the first comma at the same "level" as the arrow - is escaped correctly. Now as the conditions, when there is more than one, are also comma-separated, the parsing of a `match` expression is non-trivial. With this in mind, I'm introducing a helper method to do the parsing. This helper method will pass each individually identified "return value" part to the `check_code_is_escaped()` method for stand-alone analysis. Once the method is finished analyzing the `match` expression, it will hand back off to the original call to `check_code_is_escaped()` and continue analyzing the code after the `match` expression. Note: I'm considering adding a utility method to PHPCSUtils for the parsing of `match` expressions. If and when that method is available, we should be able to simplify the code I've now added to the sniff. Includes tests.
The result of `Name::class` or the PHP 8.0+ `$obj::class` will always be a plain string, along the same lines as the use of `__CLASS__`, so this should be allowed and not need escaping. This commit adds support for that code pattern to the sniff, which fixes some false positives. Includes tests. Fixes 1989
… PHP 8.0+ throw expressions Any exception which isn't caught runs the risk of being displayed and should therefore be output escaped, along the same lines as is already required for parameters passed to `trigger_error()` function calls. This commit add the `T_THROW` token to the tokens the sniff listens for. Parameters passed to the exception creation function call/class instantiation will be examined for being correctly escaped. Notes: * As custom exceptions may expect different parameters from the PHP native parameters, *all* parameters will be examined, not just the `$message` parameter. * When a `throw` statement is wrapped within a `try - catch` control structure, it is presumed that the exception will be caught and the `throw` statement will be ignored. * The current logic supports a _lot_ of different ways of exception creation, but does not (yet) support variable variables for the exception name and other exotics like that. Includes tests. Fixes 884
When the sniff would encountered one of the keywords the sniff listens to as an inline expression in the statement which was being examined, the sniff would flag the keyword as needing escaping. This is wrong as that expression should be examined separately by the sniff (but wouldn't be as the sniff skips to the end of the examined code). This commit fixes that by adjusting the logic in the loop in the `check_code_is_escaped()` method to check for keywords the sniff listens too and then recursing into the sniff again to examine that expression separately, after which the loop for the original statement will continue from the end of the examined expression. Includes tests. Fixes 1861
…functions As things were, the sniff will flag "unsafe printing functions" when encountered, but if the `UnsafePrintingFunction` error code is silenced, the sniff would examine all parameters passed to the function for output escaping, while only the first parameter will actually be used in the output. This updates the sniff to only examine the first (`$text`) parameter for function calls to any of the unsafe printing functions, which should prevent some (rare) false positives. Includes named parameter support for these functions. Note: while the target parameter position and name in this case is the same for both functions, I have set it up to be flexible, in case additional functions with different target parameters would be added to the list. Includes tests.
In particular when examining the parameters passed to a function call, the sniff may encounter parameters which are arrays. As things were, the sniff would skip over the array opener and then continue like before. This is buggy, in the same way as examining all function parameters as one long code block was buggy and would lead to errors throw on the incorrect token and/or false positives when the array item contains a ternary. To fix this, I'm special casing arrays, parsing the array items and then recursively calling the `check_code_is_escaped()` method to examine each array item individually. After this, the original call to `check_code_is_escaped()` will continue from the array closer onwards. This gives much higher accuracy when examining array items and fixes the bug. Includes tests proving the bug(s) and safeguarding the fix.
At various points within the `check_code_is_escaped()` loop, the sniff would check whether the current token is a token which can be ignored. This commit just groups those simple checks, which are not dependent on other logic in the loop, together and places them at the top of the loop for the highest efficiency.
… flexible heredocs As things were, when the sniff would encounter a heredoc, it would throw an error on the heredoc opener (`<<<EOD`) token and be done with it. However, heredocs _can_, but don't necessarily always contain interpolated variables and expressions and when the heredoc doesn't contain any interpolation, it should be treated the same as any other plain text string. This commit fixes the sniff to handle heredocs properly and throw errors only on heredoc lines containing interpolation. Includes tests.
…ions Similar to multiple other fixes made before, the parameters passed to a formatting function should be examined individually and not as one big code block. Fixed now. Includes tests.
As per the reported issue, `get_search_query()` is unsafe when the `$escaped` parameter is set, but not set to `true`. This commit special cases the function and checks the parameter. If the parameter is passed, but not set to `true`, a custom error message will be thrown. Includes tests. Fixes 1354
jrfnl
force-pushed
the
feature/escapeoutput-review-phpcsutils-modern-php
branch
from
July 28, 2023 01:44
bfc0f03
to
bb0b7d6
Compare
GaryJones
approved these changes
Jul 28, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing! This is intense and fantastic! ✅
dingo-d
approved these changes
Jul 28, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I realize this is a huge PR, so please take your time reviewing it.
Note: this PR will be easiest to review by checking each commit individually.
Also note: there is always more which can be done, so, while not perfect, this PR will make significant (and HUGE) improvements to the escape output sniff and sets the sniff up to make it more straight forward to make further improvements in the future.
Security/EscapeOutput: rename test case file
... to allow for adding additional test case files.
Security/EscapeOutput: move parse error related test to separate file
... to prevent it interfering with other tests.
Security/EscapeOutput: use Tokens::$magicConstants
The
Tokens::$magicConstants
array was introduced to PHPCS in version 3.5.6, so we can now defer to the upstream array.Security/EscapeOutput: use PHPCSUtils
Security/EscapeOutput: normalize the $safe_components array
Security/EscapeOutput: bug fix - expand the "safe components" list
Operator tokens in PHP are not insecure, it is the operands which are potentially insecure.
As things were, we've seen all sorts of messages in bugs reported, where the sniff would report on
&&
or===
tokens, which is nonsensical.The "safe components" list already contained a number of operators, most notably the arithmetic operators (though the PHP 5.6+
**
(T_POW
) operator was missing).I'm proposing to expand this list with a much larger list of operators using the predefined token arrays from PHPCS itself + one of the PHPCSUtils token arrays + the
T_BOOLEAN_NOT
(!
) operator.With this expansion, we shouldn't be seeing any more issues reported for the sniff flagging operators, which should make debugging the issues reported more straight forward.
The only operators I have not (yet) included in the list are the assignment operators as assignments within an output statement should be exceedingly rare anyway.
Includes unit tests, all of which would throw an error on the operator prior to this change.
Security/EscapeOutput: split process_token() method [1]
Split off the code which does the actual output escaping check into a separate
check_code_is_escaped()
method.No functional changes.
Security/EscapeOutput: split process_token() method [2]
Split off the code which checks for function calls into a separate
process_tstring_token()
method.No functional changes.
👉 This commit will be easier to review while ignoring whitespace changes.
Security/EscapeOutput: bug fix - false positives on non-function calls
As things were, the sniff would look for
T_STRING
tokens with certain contents, but never do a proper check if those were actually function calls.This would lead to false positives, but also potentially to an
Internal.Exception
for the$end_of_statement
not being defined.This commit is a first step towards fixing this.
Includes tests for select situations fixed by this particular commit.
Helpers/PrintingFunctionsTrait: add get_printing_functions() method
... which is needed for the switch to another base class is the next commit.
Security/EscapeOutput: bug fix - false positives on non-global function calls
As things were, the sniff would look for
T_STRING
tokens with certain contents, but never do a proper check if those were actually function calls to the WP/PHP native global function.This would lead to false positives for calls to methods, function declarations etc.
By changing the parent class to the
AbstractFunctionRestrictionsSniff
, we can re-use the "if this a global function call" logic already in place in that abstract, while still maintaining the handling of the other tokens in the pre-existing manner.There are three caveats to this:
AbstractFunctionRestrictionsSniff
contains anexclude
property, which allows to silence notices about certain function for a complete group.As this is a security sniff, we don't want end-users to be able to silence notices about groups of functions in that way.
To that end, the ability for end-users to set the
exclude
property for this sniff is being disabled.AbstractFunctionRestrictionsSniff
class will call theprocess_matched_token()
method once for each match in each function group.As the "unsafe printing functions" are also included in the "printing functions", this would lead to the
process_matched_token()
method being called twice for those methods, leading to duplicate error notices.I'm preventing this by removing the "unsafe printing functions" from the "printing functions" list prior to passing it off to the parent class.
AbstractFunctionRestrictionsSniff
class expects the list of target functions to be available when the sniff is instantiated. However, in test situations, that list can fluctuate, as the "printing functions" supports the ability to add "custom printing functions", which is done on the fly in the test case file using// phpcs:set ...
.To get round that, I'm re-registering the "function groups" every time a
T_STRING
(potential function call) triggers the sniff, but only when the tests are being run.While it is possible for end-users to use the
// phpcs:set
syntax, it is little known and is not mentioned anywhere in the documentation as a feature for end-users, so just re-registering the function groups for test runs would seem sufficient.Includes tests for situations fixed by this particular commit.
Includes a test using the PHP 8.0+ nullsafe object operator.
Note: the
PrintingFunctionsTrait::is_printing_function()
method has now become unused and could be removed.For consistency with other traits/helper classes, I'm electing to leave it be (for now) though.
Security/EscapeOutput: bug fix - function names are case-insensitive
While the initial function name check was previously already made case-insensitive in #2273 (commit 53309ac) and remained case-insensitive with the switch to the
AbstractFunctionRestrictionsSniff
base class, the special casing function name checks were not done in a case-insensitive manner.This could lead to false positives as well as false negatives.
Fixed now.
Includes tests proving the bug.
Security/EscapeOutput: prevent false positive due to comments in param
The
'raw'
key in the parameter arrays returned from thePassedParameters
class contains - as per the name - the raw contents of the parameter.Since PHPCSUtils 1.0.0-alpha4, the return array also contain a
'clean'
index, which contains the contents of the parameter cleaned of comments.By switching to using that key, a potential false positive gets fixed.
Includes unit test demonstrating the issue and safeguarding the fix.
Security/EscapeOutput: code readability improvements [1]
Minor tweaks to the code in the
process_token()
method for improved comprehensibility of the code.Rename some variables, add some documentation, split up long conditions.
Includes preventing the code from changing the
$stackPtr
as that is just really confusing.Security/EscapeOutput: bug fix - only examine exit/die with parentheses
exit
/die
are language constructs and can be used as part of an expression.Additionally, an
exit
/die
can be stand-alone or can pass a status code/message. If a status code/message is passed, we want it to be escaped for output, but a status code/message can only be passed within parentheses.As things were, the sniff would not ignore
exit
/die
statements without status and when used within an expression, would walk the tokens after theexit
/die
, leading to false positives.Fixed now by using custom logic for
T_EXIT
tokens.Includes parse error/live coding protection, where the sniff should stay silent.
As the method will now use a
switch
control structure, move the special casing forT_STRING
also into theswitch
.Includes tests for situations fixed by this particular commit.
👉 This commit will be easier to review while ignoring whitespace changes.
Security/EscapeOutput: bug fix - improve start/end determination for print statements
print
is a language construct and can be used as part of an expression.The sniff did not take this into account correctly. This would lead to the sniff walking too far and examining code which is not part of the
print
statement. This would lead to both false positives as well as false negatives.This commit improves that situation, but doesn't completely fix it.
First off, the sniff now uses the
BCFile::findEndOfStatement()
method to more accurately retrieve the end token for the expression. This should improve the handling of the most common cases.This part includes live coding/parse error protection for unfinished
print
statements at the end of a file.However, the
[BC]File::findEndOfStatement()
method does not take ternaries into account when determining the end of the expression.So, secondly, code has been added to special case
print
statements in the "then" part of a ternary and determine the end of such an expression correctly.Includes tests for situations fixed by this particular commit.
Additional notes:
print
in ternaries which would still not be handled correctly.I'm leaving those as "future scope".
Example:
Fixes #2209
Security/EscapeOutput: efficiency tweak for parse errors in echo statements
Bow out earlier in case of a parse error for echo statements and short open echo tag statements.
While the sniff would not throw false positives/negatives as the
for()
loop, which does the token walking in thecheck_code_is_escaped()
method, would not get started, as$i < false
as stop condition would always returntrue
, we can (and should) bow out earlier for parse errors/live coding.Fixed now.
Includes switching the order of the token searches for maximum efficiency.
Includes tests safeguarding the tweak.
Security/EscapeOutput: fix up after refactor
The refactor in the previous few commits "broke" the preliminary check for a non-parenthesized ternary ìn a
print
statement.There were no tests covering this.
I'm fixing this now by moving that particular code block out of the
switch
and executing it for bothprint
,echo
and<?=
(though not forexit
/die
statements as those will always have parentheses, otherwise there will be nothing to check).I'm also extracting the actual logic for finding the ternary to a separate method, as this logic will be expanded in follow-up commits.
Includes adding tests safeguarding the logic.
Security/EscapeOutput: bug fix - allow for nested statements with ternaries
As things were, the check for statements with ternaries without wrapping parentheses, did not take into account that the whole statement could be nested within parentheses if the statement is a
print
.This commit updates the code block handling the skipping of the condition for a ternary without wrapping parentheses to allow for the statement being nested in parentheses.
Includes adding tests safeguarding the fix.
Security/EscapeOutput: bug fix - [improved] allow for nested statements with ternaries
While the previous commit already prevented a ternary being regarded as the "applicable" ternary when it wasn't at the right nesting level, that still disregarded the fact that ternaries can be chained and that an "applicable" ternary token could be found after a ternary which should be disregarded.
This led to false positives.
This commit changes the logic in the
find_ternary()
method to properly search the token group for an applicable ternary token, which fixes that issue.Includes tests proving the bug and safeguarding the fix.
Security/EscapeOutput: bug fix - allow non-matching parentheses
As things were, the prelim check for statements with ternaries without wrapping parentheses, did not take into account that there could be a parentheses at the start and at the end of the statement, which didn't belong together.
In that case, the preliminary ternary check should still be executed.
Fixed now.
Includes adding tests safeguarding the fix.
Fixes #677
This also fixes the third case reported in #1507
Security/EscapeOutput: bug fix - fix ternary finding within the loop
As things were, when an open parenthesis was found in the loop, the sniff would attempt to find a relevant ternary operator and if found, skip over the condition.
However, this check did not take nested parentheses into account correctly, which meant that if the complete ternary was wrapped in parentheses, but the condition was not wrapped in parentheses and the "then" or the "else" part would contain parentheses, the ternary was incorrectly disregarded.
Fixed now by switching the logic used within the loop over to use the new
find_ternary()
method, which should handle the determination of whether the ternary is applicable correctly.Note: it is important for the
$start
token passed to thefind_ternary()
method to be the token after the open parenthesis, otherwise, the method will not look within the parentheses.Includes adding tests safeguarding the fix.
Fixes #1219
Security/EscapeOutput: bug fix - false negatives for short ternary
While the use of short ternaries is/will be discouraged in WP, this doesn't mean it won't be used in non-WP Core code.
When a short ternary is used, the sniff should not ignore the code before the ternary operator, but should check it for being correctly escaped as the value will be used if it is "truthy".
Fixed by changing the
find_ternary()
method to afind_long_ternary()
method, which disregards short ternaries.Includes tests proving the bug and safeguarding the fix.
Fixes #1617
Security/EscapeOutput: code readability improvements [2]
Minor tweaks to the code in the
process_matched_token()
method for improved comprehensibility of the code.Rename some variables, simplify/split up long conditions.
Includes preventing the code from changing the
$stackPtr
as that is just really confusing.Security/EscapeOutput: efficiency tweak [1]
Bow out earlier.
There is one function call related check which may bow out without checking parameters, so move that check up for efficiency.
Security/EscapeOutput: efficiency tweak [2]
If the parameter which needs to be examined doesn't exist, we can skip the complete function call.
Security/EscapeOutput: bug fix - sniff would skip too much
If a non-function call
T_STRING
token would be named the same as any of the escaping/formatting/array walking functions the sniff looks for, the sniff would walk to the next parenthesis opener (if found) and skip everything in between.Fixed now.
Includes a test proving the bug and safeguarding the fix.
Note: the test shows the sniff now throwing two errors, one for the
$var
and one for theunrelatedFunction()
function call.The sniff does not throw an error for the (unescaped) constant being used. This may need to be looked into separately.
Security/EscapeOutput: bug fix - sniff did not handle function call parameters correctly
As things were, the sniff would examine all parameters from a function call as one long code block.
This would easily lead to false positives (and possibly also false negatives) largely related to the sniff not correctly starting/stopping the watching for each new parameter and getting confused over ternaries (again).
This commit make two closely related changes to fix this.
echo
/print
/<?=
statements/expressions, to thecheck_code_is_escaped()
method, which means it will now be applied to each individual parameter.This gives much higher accuracy when examining parameters.
As a side-effect of this change, function calls to printing functions using named parameters will not yield false positives on the parameter label.
Includes tests proving the bug(s) and safeguarding the fix.
Fixes the issue reported in #677 (comment)
Security/EscapeOutput: add tests for PHP 7.4 numeric literals + PHP 8.1 octal literals
No changes needed. Just making sure.
Security/EscapeOutput: add support for PHP 8.0+ named parameters [1]
trigger_error()
/user_error()
to use the new PHPCSUtils 1.0.0-alpha4PassedParameters::getParameterFromStack()
method.PHP itself renamed a lot of parameters in PHP 8.0. As named parameters did not exist before PHP 8.0, the parameter name as per PHP 8.0 (or above) are the only relevant names.
Includes additional unit tests.
ecurity/EscapeOutput: add support for PHP 8.0+ named parameters [2]
_deprecated_file()
to use the new PHPCSUtils 1.0.0-alpha4PassedParameters::getParameterFromStack()
method.WP has been renaming parameters and is probably not done yet, but it doesn't look like those changes (so far) made it into changelog entries....
For the purposes of this exercise, I've taken the current parameter name as the "truth" as support for named parameters hasn't officially been announced yet, so any renames after this moment are the only ones relevant.
Includes additional unit tests.
Security/EscapeOutput: add support for PHP 8.0+ match expressions [1]
PHP 8.0 introduced match expressions, which can be used inline, which means they can be used in an output generating statement.
If the match expression is wrapped inside an escaping function, it is already handled correctly.
This commit is the first step in adding additional support for
match
expressions to the sniff and adds logic to disregard a complete match expression if it is preceded by a "safe cast".Includes tests.
Security/EscapeOutput: add support for PHP 8.0+ match expressions [2]
PHP 8.0 introduced match expressions, which can be used inline, which means they can be used in an output generating statement.
This second commit adds logic to correctly check inside a match expression.
In a
match
, there can be multiple matching conditions before the double arrow and one statement to be returned after the double arrow.So basically, we should ignore everything before the double arrow as it is effectively a comparison.
And we should verify that everything after the double arrow - up to the first comma at the same "level" as the arrow - is escaped correctly.
Now as the conditions, when there is more than one, are also comma-separated, the parsing of a
match
expression is non-trivial.With this in mind, I'm introducing a helper method to do the parsing. This helper method will pass each individually identified "return value" part to the
check_code_is_escaped()
method for stand-alone analysis.Once the method is finished analyzing the
match
expression, it will hand back off to the original call tocheck_code_is_escaped()
and continue analyzing the code after thematch
expression.Note: I'm considering adding a utility method to PHPCSUtils for the parsing of
match
expressions. If and when that method is available, we should be able to simplify the code I've now added to the sniff.Includes tests.
Security/EscapeOutput: add support for *::class and PHP 8.0+ $obj::class
The result of
Name::class
or the PHP 8.0+$obj::class
will always be a plain string, along the same lines as the use of__CLASS__
, so this should be allowed and not need escaping.This commit adds support for that code pattern to the sniff, which fixes some false positives.
Includes tests.
Fixes #1989
Security/EscapeOutput: add support for examining throw statements and PHP 8.0+ throw expressions
Any exception which isn't caught runs the risk of being displayed and should therefore be output escaped, along the same lines as is already required for parameters passed to
trigger_error()
function calls.This commit add the
T_THROW
token to the tokens the sniff listens for. Parameters passed to the exception creation function call/class instantiation will be examined for being correctly escaped.Notes:
$message
parameter.throw
statement is wrapped within atry - catch
control structure, it is presumed that the exception will be caught and thethrow
statement will be ignored.Includes tests.
Fixes #884
Security/EscapeOutput: improve handling of inline expressions
When the sniff would encountered one of the keywords the sniff listens to as an inline expression in the statement which was being examined, the sniff would flag the keyword as needing escaping.
This is wrong as that expression should be examined separately by the sniff (but wouldn't be as the sniff skips to the end of the examined code).
This commit fixes that by adjusting the logic in the loop in the
check_code_is_escaped()
method to check for keywords the sniff listens too and then recursing into the sniff again to examine that expression separately, after which the loop for the original statement will continue from the end of the examined expression.Includes tests.
Fixes #1861
Security/EscapeOutput: improve handling of params in unsafe printing functions
As things were, the sniff will flag "unsafe printing functions" when encountered, but if the
UnsafePrintingFunction
error code is silenced, the sniff would examine all parameters passed to the function for output escaping, while only the first parameter will actually be used in the output.This updates the sniff to only examine the first (
$text
) parameter for function calls to any of the unsafe printing functions, which should prevent some (rare) false positives.Includes named parameter support for these functions.
Note: while the target parameter position and name in this case is the same for both functions, I have set it up to be flexible, in case additional functions with different target parameters would be added to the list.
Includes tests.
Security/EscapeOutput: very minor code readability improvements [3]
Security/EscapeOutput: bug fix - sniff did not handle arrays correctly
In particular when examining the parameters passed to a function call, the sniff may encounter parameters which are arrays.
As things were, the sniff would skip over the array opener and then continue like before.
This is buggy, in the same way as examining all function parameters as one long code block was buggy and would lead to errors throw on the incorrect token and/or false positives when the array item contains a ternary.
To fix this, I'm special casing arrays, parsing the array items and then recursively calling the
check_code_is_escaped()
method to examine each array item individually. After this, the original call tocheck_code_is_escaped()
will continue from the array closer onwards.This gives much higher accuracy when examining array items and fixes the bug.
Includes tests proving the bug(s) and safeguarding the fix.
Security/EscapeOutput: group common token check together
At various points within the
check_code_is_escaped()
loop, the sniff would check whether the current token is a token which can be ignored.This commit just groups those simple checks, which are not dependent on other logic in the loop, together and places them at the top of the loop for the highest efficiency.
Security/EscapeOutput: improve handling of heredocs, include PHP 7.3+ flexible heredocs
As things were, when the sniff would encounter a heredoc, it would throw an error on the heredoc opener (
<<<EOD
) token and be done with it.However, heredocs can, but don't necessarily always contain interpolated variables and expressions and when the heredoc doesn't contain any interpolation, it should be treated the same as any other plain text string.
This commit fixes the sniff to handle heredocs properly and throw errors only on heredoc lines containing interpolation.
Includes tests.
Security/EscapeOutput: improve handling of params in formatting functions
Similar to multiple other fixes made before, the parameters passed to a formatting function should be examined individually and not as one big code block.
Fixed now.
Includes tests.
Security/EscapeOutput: special case get_search_query( false )
As per the reported issue,
get_search_query()
is unsafe when the$escaped
parameter is set, but not set totrue
.This commit special cases the function and checks the parameter. If the parameter is passed, but not set to
true
, a custom error message will be thrown.Includes tests.
Fixes #1354
Security/EscapeOutput: various minor docs fixes