-
Notifications
You must be signed in to change notification settings - Fork 0
/
pattern_matching.txt
38 lines (27 loc) · 1.8 KB
/
pattern_matching.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
Pattern matching
================
[[Parent]]: understanding.txt
_Pattern matching_ is the process of testing whether a given string belongs to a given set of strings, called a _pattern_. Remark uses two traditional forms of pattern matching, called globs and regular expressions.
Globs
-----
A _glob_ defines a pattern by a string in which:
* a character matches itself, except that
* `*` matches anything any number of times,
* `?` matches anything at most one time,
* `[seq]` matches any character in string `seq`, and
* `[!seq]` matches any character not in string `seq`.
For example, `?at.png` matches `at.png`, `bat.png`, and `cat.png`. Globs are commonly used in file systems. They capture a reasonable amount of patterns, while still being intuitive.
Regular expressions
-------------------
A _regular expression_, or a regex, defines a pattern by a string
constructed using the following kind of rules:
* a character matches itself, except that
* `.` matches any single character except the newline,
* `E?` matches `E` at most one time,
* `E*` matches `E` any number of times,
* `E+` matches `E` at least one time,
* `AB` matches `A` and `B` in a sequence,
* `A|B` matches either `A` or `B`,
* `(E)` matches `E`,
where `E`, `A`, and `B` are regular expressions. The backslash `\` is used to escape the meta-characters. This list is incomplete; the regular expressions are given using the [Python's regular expression syntax][PythonRegex]. For example, `(ab)*.txt` matches `.txt`, `ab.txt`, `abab.txt`, and so on. In Remark, the regex is automatically appended `\Z` at the end so that it must match the whole string. Regular expressions are strictly more powerful than globs, but they are also less intuitive.
[PythonRegex]: http://docs.python.org/library/re.html