phpgrep pattern language

This file serves as a main documentation source for the pattern language used inside phpgrep.

We'll refer to it as PPL (phpgrep pattern language) for brevity.

Overview

Syntax-wise, PPL is 100% compatible with PHP.

In fact, it only changes the semantics of some syntax constructs without adding any new syntax forms to the PHP language. This means that phpgrep patterns can be parsed by any parser that can handle PHP.

The patterns describe the program parts (syntax trees) that they need to match. In places where whitespace doesn't matter in PHP, it has no special meaning in PPL either.

PHP variables

The PHP variable syntax, $<id>, matches any kind of node (an expression or a statement) exactly once.

If the same <id> is used multiple times, all occurrences must match the same AST.

$x = $y; // Matches any assignment
$x = $x; // Matches only self-assignments

The special variable $_ matches any node without binding a name. Use it for the less important parts of a pattern: unlike a repeated named variable, repeated uses of $_ are not required to match the same AST.

$_ = $_; // Matches any assignment (because $_ is special)

Matcher expressions

Expressions of the form ${"<matcher>"} or ${'<matcher>'} are called matcher expressions. The <matcher> determines what will be matched.

It does not matter whether you use ' or "; both behave identically.

matcher_expr = "$" "{" quote matcher quote "}"
quote = "\"" | "'"
matcher = named_matcher | matcher_class
named_matcher = <name> ":" matcher_class
matcher_class = <see the table of supported classes below>

| Class | Description |
|---|---|
| * | Any node, 0-N times |
| + | Any node, 1-N times |
| int | Integer literal |
| float | Float literal |
| num | Integer or float literal |
| str | String literal |
| char | A string literal of length=1, like 'a' |
| const | Constant, like true or a class constant like T::FOO |
| var | Variable |
| func | Anonymous function/closure expression |
| expr | Any expression |

Some examples of complete matcher expressions:

  • ${'*'} - matches any number of nodes
  • ${"+"} - matches one or more nodes
  • ${'str'} - matches any kind of string literal
  • ${"x:int"} - x-named matcher that matches any integer
  • $${"var"} - matches any "variable variable", like $$x and $$php
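
The grammar above is small enough to parse with a single regular expression. The following is a minimal sketch, assuming that <name> consists of word characters (the grammar does not say) and that an unnamed matcher defaults to the name "_". The function name parse_matcher_expr is hypothetical, and the sketch handles only the matcher expression itself, not surrounding syntax such as the leading $ in $${"var"}.

```python
import re

# Classes taken from the table of supported classes.
MATCHER_CLASSES = {
    "*", "+", "int", "float", "num", "str",
    "char", "const", "var", "func", "expr",
}

# matcher_expr = "$" "{" quote matcher quote "}"
# Assumption: <name> is one or more word characters.
MATCHER_EXPR = re.compile(
    r"""^\$\{(?P<quote>["'])(?:(?P<name>\w+):)?(?P<cls>[^:"']+)(?P=quote)\}$"""
)


def parse_matcher_expr(text):
    """Parse a matcher expression into a (name, matcher_class) pair."""
    m = MATCHER_EXPR.match(text)
    if m is None:
        raise ValueError(f"not a matcher expression: {text!r}")
    name = m.group("name") or "_"  # anonymous matchers get the "_" name
    cls = m.group("cls")
    if cls not in MATCHER_CLASSES:
        raise ValueError(f"unknown matcher class: {cls!r}")
    return name, cls


print(parse_matcher_expr("${'*'}"))      # ('_', '*')
print(parse_matcher_expr('${"x:int"}'))  # ('x', 'int')
print(parse_matcher_expr('${"var"}'))    # ('_', 'var')
```

The backreference (?P=quote) requires the closing quote to be the same character as the opening one, so a mixed form such as ${"int'} is rejected in this sketch.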

Interesting details:

  • Anonymous matchers get the name "_", so ${"var"} is actually ${"_:var"}
  • Semantically, $x is ${"x:node"} (but PPL doesn't define node)