Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add taint source on plugin-added taints #10206

Open
wants to merge 10 commits into
base: 5.x
Choose a base branch
from

Conversation

Patrick-Remy
Copy link
Contributor

@Patrick-Remy Patrick-Remy commented Sep 15, 2023

Fixes #10057

Problem

Taints not found in psalm v5 using $codebase->addTaintSource() in plugin

With the change of immutability (d0be59e#diff-122c0df94b9a0557e4fd09b8d8c7324f4d6d8fff34f344e1e49318c8e7a0a242), addTaintSource() doesn't manipulate the $expr_type anymore and instead returns the adjusted $expr_type. But this breaks custom-added taint sources via plugin as shown in the docs currently. From plugin perspective you cannot anymore adjust the expression type inside the plugin.

See also my repro here: https://github.com/Patrick-Remy/repro-psalm-5-taint-analysis-issue/blob/master/README.md

<?php

namespace Some\Ns;

use PhpParser;
use Psalm\CodeLocation;
use Psalm\Context;
use Psalm\FileManipulation;
use Psalm\Plugin\EventHandler\AfterExpressionAnalysisInterface;
use Psalm\Plugin\EventHandler\Event\AfterExpressionAnalysisEvent;
use Psalm\Type\TaintKindGroup;

class BadSqlTainter implements AfterExpressionAnalysisInterface
{
    /**
     * Called after an expression has been checked
     *
     * @param  PhpParser\Node\Expr  $expr
     * @param  Context              $context
     * @param  FileManipulation[]   $file_replacements
     *
     * @return void
     */
    public static function afterExpressionAnalysis(AfterExpressionAnalysisEvent $event): ?bool {
        $expr = $event->getExpr();
        $statements_source = $event->getStatementsSource();
        $codebase = $event->getCodebase();
        if ($expr instanceof PhpParser\Node\Expr\Variable
            && $expr->name === 'bad_data'
        ) {
            $expr_type = $statements_source->getNodeTypeProvider()->getType($expr);

            // should be a globally unique id
            // you can use its line number/start offset
            $expr_identifier = '$bad_data'
                . '-' . $statements_source->getFileName()
                . ':' . $expr->getAttribute('startFilePos');

            if ($expr_type) {
                $codebase->addTaintSource(
                    $expr_type,
                    $expr_identifier,
                    TaintKindGroup::ALL_INPUT,
                    new CodeLocation($statements_source, $expr)
                );

                // Custom addition here to check if variable is really tainted
                echo "\n\n\nTAINTED\n\n\n";
            }
        }

        return null;
    }

This is probably an intended behaviour, but there needs to be another way for adding taints by plugin.

Taints not found in psalm v4 + v5 using AddTaintsInterface

I found the undocumented plugin interface AddTaintsInterface that makes it possible to return an array of taints that should be added to analyzed expression.
Unfortunately this also lead to no findings in both psalm v4 and psalm v5.

<?php

namespace Some\Ns;

use PhpParser;
use Psalm\Plugin\EventHandler\AddTaintsInterface;
use Psalm\Plugin\EventHandler\Event\AddRemoveTaintsEvent;
use Psalm\Type\TaintKindGroup;

class BadSqlTainter implements AddTaintsInterface
{

    /**
     * Called to see what taints should be added
     *
     * @return list<string>
     */
    public static function addTaints(AddRemoveTaintsEvent $event): array
    {
        $expr = $event->getExpr();
        if ($expr instanceof PhpParser\Node\Expr\Variable
            && $expr->name === 'bad_data'
        ) {
            echo "\n\n\nTAINTED\n\n\n";
            return TaintKindGroup::ALL_INPUT;
        }

        return [];
    }
}

Analysis

After deep-diving into the taint analysis, I found the reason why addTaints only sometimes could have an effect.
Other than $codebase->addSource() the callees most time do not add TaintSource via data_flow_graph->addSource(). Seems like only those Analyzers that analyze types that can be annotated with @psalm-taint-source like MethodCallReturnTypeFetcher already added those.

Implemented solution

As anywhere where AddRemoveTaintsEvent is dispatched, a plugin could add new taints (also at Taint sinks, which seemed a bit strange to me), a new TaintSource has to be added, so that the already added data flow paths get connected.
I also added a new plugin example and test for addTaints and updated the documentaiton.

Code review request

I would kindly ask for review, as I just have gone to all results of the AddRemoveTaintsEvent and added the required missing steps.
Specially in which scenarios is it required to replace the typeNode and add new parents, like $codebase->addTaintSource() does? For now I have skipped that, as my newly added test case works wihtout modifying the typeNode.

@Patrick-Remy Patrick-Remy force-pushed the fix/custom-taint-sources-not-detected branch 2 times, most recently from b7910ca to 01a9575 Compare September 19, 2023 18:36
@Patrick-Remy
Copy link
Contributor Author

I have just refactored my implementation and more important added a new test case that covers the issue.

As now all checks are passing (ignoring the test-with-real-projects), could you please review it, so that this fix can be integrated into 5.x? As unfortunately this currently blocks our upgrade to psalm v5.

@orklah
Copy link
Collaborator

orklah commented Sep 28, 2023

Hi, thanks for this work and sorry for not answering earlier.

Does that mean that any existing plugin that may have tried to add taints or sink was broken?

I'm trying to establish the consequences for other existing plugins that may exist to see if this is a BC break or not

@Patrick-Remy
Copy link
Contributor Author

As far as I know, it may probably still work when adding taints via either addTaintSource() or AddTaintsInterface if there is already some taint source at a node (due to psalm) and you are adding another to the exact same node. But I haven't tested that so far. But as there weren't any tests related to plugin-added taints since now, it could also be completely broken.

There should probably be a full suite of tests related to plugin-added taints. If you wish I can add some more.

@orklah
Copy link
Collaborator

orklah commented Sep 29, 2023

More tests are always good :)

Should we deprecated addTaintSource for removal in Psalm 6?

@Patrick-Remy Patrick-Remy force-pushed the fix/custom-taint-sources-not-detected branch from 4640aa6 to bb10f13 Compare January 23, 2024 21:14
@Patrick-Remy
Copy link
Contributor Author

I was adding some more tests, and fastly failed with some odd behaviour where I am not sure, if it is due to my changes or if it was already previously.

Using the TaintBadDataPlugin

    public static function addTaints(AddRemoveTaintsEvent $event): array
    {
        $expr = $event->getExpr();

        if ($expr instanceof Variable && $expr->name === 'bad_data') {
            return TaintKindGroup::ALL_INPUT;
        }

        return [];
    }

The following taint is not recognized

<?php // --taint-analysis

$foo = $bad_data;
echo $foo;

whereas this is still recognized:

<?php // --taint-analysis

echo $bad_data;

Anyway I wanted to fix it, but I am still stucking on how inside the AssignmentAnalyzer. Any hints to me?

@Patrick-Remy
Copy link
Contributor Author

Any ideas @orklah ?

@Patrick-Remy Patrick-Remy force-pushed the fix/custom-taint-sources-not-detected branch 2 times, most recently from bb8a5d4 to bd424dc Compare July 1, 2024 16:12
@Patrick-Remy Patrick-Remy force-pushed the fix/custom-taint-sources-not-detected branch from bd424dc to ba334db Compare December 31, 2024 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adding custom taint sources via plugin ignored
2 participants