This page contains brief descriptions of all PEGTL rule and combinator classes.
The information about how much input is consumed by the rules only applies when the rules succeed. Otherwise there are two failure modes with different requirements.
- Local failure is when a rule returns false and the rule must generally rewind the input to where its match attempt started.
- Global failure is when a rule throws an exception (usually of type tao::parse_error, and usually via the control class' raise() function).
Since an exception, by default, aborts a parsing run -- hence the term "global failure" -- there are no assumptions or requirements for the throwing rule to rewind the input.
On the other hand, a local failure will frequently lead to back-tracking, i.e. an attempt to match a different rule at the same position in the input. Rules that were previously attempted at that position must therefore rewind to where they started in preparation for the next attempt.
Note that in some cases it is not necessary to actually rewind on local failure, see the description of the rewind_mode in the section on how to implement custom rules.
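The two failure modes can be illustrated with a small self-contained sketch. It does not use the PEGTL; `toy_input`, `consume`, `match_ab` and `must_ab` are hypothetical names made up for this illustration, and `std::runtime_error` merely stands in for the library's parse error type.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical stand-in for a parser input: the data plus a read position.
struct toy_input {
    std::string_view data;
    std::size_t pos = 0;
};

// Consumes one character if it equals c; consumes nothing otherwise.
inline bool consume(toy_input& in, const char c) {
    if (in.pos < in.data.size() && in.data[in.pos] == c) {
        ++in.pos;
        return true;
    }
    return false;
}

// Local failure: returns false and rewinds to where the attempt started.
inline bool match_ab(toy_input& in) {
    const std::size_t start = in.pos;
    if (consume(in, 'a') && consume(in, 'b')) {
        return true;
    }
    in.pos = start;  // rewind so that another rule can be tried at this position
    return false;
}

// Global failure: throws, and makes no attempt to rewind the input.
inline bool must_ab(toy_input& in) {
    if (!consume(in, 'a') || !consume(in, 'b')) {
        throw std::runtime_error("expected \"ab\"");
    }
    return true;
}

inline void demo_failure_modes() {
    toy_input in{ "ac" };
    assert(!match_ab(in) && in.pos == 0);  // local failure left the input untouched
}
```

In this sketch a caller can safely try another alternative after `match_ab` returns false, whereas after `must_ab` throws the position is unspecified from the caller's point of view.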
Some rule classes are said to be equivalent to a combination of other rules. Here, equivalence is with respect to which inputs are matched, but not (necessarily) how the rule is implemented.
For rules other than must<> that contain "must" in their name, rule equivalence shows which rule will be used to call the control class' raise() function when certain sub-rules fail to match.
The "meta data and implementation mapping" section of each rule's description shows both how the rule is implemented and what the meta data looks like.
When the list of sub-rules is empty then the definition of subs_t is omitted from the description.
The documentation will use (template parameter) packs when zero-or-more or one-or-more of a (template) parameter are allowed.
For example seq< R... > accepts zero-or-more template parameters.
In the zero case, i.e. seq<>, we describe R as "empty".
When at least one parameter is given, i.e. seq< A > or seq< A, B, C >, R is "non-empty".
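How an empty pack behaves can be sketched with C++17 fold expressions. The following is a toy model, not the PEGTL implementation; `toy_input`, `toy_one`, `toy_seq` and `toy_sor` are hypothetical names. It relies on the language rule that an empty fold over `&&` is true and an empty fold over `||` is false, which mirrors the documented behaviour that a sequence with an empty pack succeeds and an ordered choice with an empty pack fails.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct toy_input {
    std::string_view data;
    std::size_t pos = 0;
};

// Matches and consumes exactly the character C.
template< char C >
struct toy_one {
    static bool match(toy_input& in) {
        if (in.pos < in.data.size() && in.data[in.pos] == C) {
            ++in.pos;
            return true;
        }
        return false;
    }
};

// Sequence: with an empty pack R the fold over && is true.
template< typename... R >
struct toy_seq {
    static constexpr std::size_t size = sizeof...(R);
    static bool match(toy_input& in) {
        const std::size_t start = in.pos;
        if ((R::match(in) && ...)) {
            return true;
        }
        in.pos = start;  // rewind on local failure
        return false;
    }
};

// Ordered choice: with an empty pack R the fold over || is false.
template< typename... R >
struct toy_sor {
    static bool match(toy_input& in) {
        (void)in;  // unused when R is empty
        return (R::match(in) || ...);
    }
};

static_assert(toy_seq<>::size == 0, "R is empty");
static_assert(toy_seq< toy_one< 'a' > >::size == 1, "R is non-empty");

inline void demo_packs() {
    toy_input in{ "abc" };
    assert(toy_seq<>::match(in) && in.pos == 0);  // empty sequence succeeds, consumes nothing
    assert(!toy_sor<>::match(in));                // empty choice fails
}
```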
- Meta Rules
- Combinators
- Convenience
- Action Rules
- Atomic Rules
- ASCII Rules
- Unicode Rules
- Binary Rules
- Full Index
These rules are in namespace tao::pegtl.

- Equivalent to seq< R... >, but:
- Uses the given class template A for actions.
- While matching R..., does not enable or disable actions.
- Meta data and implementation mapping:
  - action< A >::rule_t is internal::success
  - action< A, R >::rule_t is internal::action< A, R >
  - action< A, R >::subs_t is type_list< R >
  - action< A, R... >::rule_t is internal::action< A, internal::seq< R... > >
  - action< A, R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Uses the given class template C as control class.
- Meta data and implementation mapping:
  - control< C >::rule_t is internal::success
  - control< C, R >::rule_t is internal::control< C, R >
  - control< C, R >::subs_t is type_list< R >
  - control< C, R... >::rule_t is internal::control< C, internal::seq< R... > >
  - control< C, R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Disables all actions.
- Meta data and implementation mapping:
  - disable<>::rule_t is internal::success
  - disable< R >::rule_t is internal::disable< R >
  - disable< R >::subs_t is type_list< R >
  - disable< R... >::rule_t is internal::disable< internal::seq< R... > >
  - disable< R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to success, but:
- Calls the input's discard() member function.
- Must not be used where backtracking to before the discard might occur and/or nested within a rule for which an action with input can be called.
- See Incremental Input for details.
- Meta data and implementation mapping:
  - discard::rule_t is internal::discard

- Equivalent to seq< R... >, but:
- Enables all actions (if any).
- Meta data and implementation mapping:
  - enable<>::rule_t is internal::success
  - enable< R >::rule_t is internal::enable< R >
  - enable< R >::subs_t is type_list< R >
  - enable< R... >::rule_t is internal::enable< internal::seq< R... > >
  - enable< R... >::subs_t is type_list< internal::seq< R... > >

- Succeeds if at least Num further input bytes are available.
- With Incremental Input reads the bytes into the buffer.
- Meta data and implementation mapping:
  - require< 0 >::rule_t is internal::success
  - require< N >::rule_t is internal::require< N >

- Equivalent to seq< R... >, but:
- Replaces all state arguments with a new instance s of type S.
- s is constructed with the input and all previous states as arguments.
- If seq< R... > succeeds then s.success() is called with the input after the match and all previous states as arguments.
- Meta data and implementation mapping:
  - state< S >::rule_t is internal::success
  - state< S, R >::rule_t is internal::state< S, R >
  - state< S, R >::subs_t is type_list< R >
  - state< S, R... >::rule_t is internal::state< S, internal::seq< R... > >
  - state< S, R... >::subs_t is type_list< internal::seq< R... > >
Combinators (or combinator rules) are rules that combine (other) rules into new ones.
These are the classical PEG combinator rules and are defined in namespace tao::pegtl.

- PEG and-predicate &e
- Succeeds if and only if seq< R... > would succeed.
- Consumes nothing, i.e. rewinds after matching.
- Disables all actions.
- Meta data and implementation mapping:
  - at<>::rule_t is internal::success
  - at< R >::rule_t is internal::at< R >
  - at< R >::subs_t is type_list< R >
  - at< R... >::rule_t is internal::at< internal::seq< R... > >
  - at< R... >::subs_t is type_list< internal::seq< R... > >

- PEG not-predicate !e
- Succeeds if and only if seq< R... > would not succeed.
- Consumes nothing, i.e. rewinds after matching.
- Disables all actions.
- Meta data and implementation mapping:
  - not_at<>::rule_t is internal::failure
  - not_at< R >::rule_t is internal::not_at< R >
  - not_at< R >::subs_t is type_list< R >
  - not_at< R... >::rule_t is internal::not_at< internal::seq< R... > >
  - not_at< R... >::subs_t is type_list< internal::seq< R... > >

- PEG optional e?
- Optional seq< R... >, i.e. attempt to match seq< R... > and signal success regardless of the result.
- Equivalent to sor< seq< R... >, success >.
- Meta data and implementation mapping:
  - opt<>::rule_t is internal::success
  - opt< R >::rule_t is internal::opt< R >
  - opt< R >::subs_t is type_list< R >
  - opt< R... >::rule_t is internal::opt< internal::seq< R... > >
  - opt< R... >::subs_t is type_list< internal::seq< R... > >

- PEG one-or-more e+
- Matches seq< R... > as often as possible and succeeds if it matches at least once.
- Equivalent to rep_min< 1, R... >.
- R must be a non-empty rule pack.
- Meta data and implementation mapping:
  - plus< R >::rule_t is internal::plus< R >
  - plus< R >::subs_t is type_list< R >
  - plus< R... >::rule_t is internal::plus< internal::seq< R... > >
  - plus< R... >::subs_t is type_list< internal::seq< R... > >

- PEG sequence e1 e2
- Sequence or conjunction of rules.
- Matches the given rules R... in the given order.
- Fails and stops matching when one of the given rules fails.
- Consumes everything that the rules R... consumed.
- Succeeds if R is an empty rule pack.
- Meta data and implementation mapping:
  - seq<>::rule_t is internal::success
  - seq< R >::rule_t is internal::seq< R >
  - seq< R >::subs_t is type_list< R >
  - seq< R... >::rule_t is internal::seq< R... >
  - seq< R... >::subs_t is type_list< R... >

- PEG ordered choice e1 / e2
- Choice or disjunction of rules.
- Matches the given rules R... in the given order.
- Succeeds and stops matching when one of the given rules succeeds.
- Consumes whatever the first rule that succeeded consumed.
- Fails if R is an empty rule pack.
- Meta data and implementation mapping:
  - sor<>::rule_t is internal::failure
  - sor< R >::rule_t is internal::sor< R >
  - sor< R >::subs_t is type_list< R >
  - sor< R... >::rule_t is internal::sor< R... >
  - sor< R... >::subs_t is type_list< R... >

- PEG zero-or-more e*
- Matches seq< R... > as often as possible and always succeeds.
- R must be a non-empty rule pack.
- Meta data and implementation mapping:
  - star< R >::rule_t is internal::star< R >
  - star< R >::subs_t is type_list< R >
  - star< R... >::rule_t is internal::star< internal::seq< R... > >
  - star< R... >::subs_t is type_list< internal::seq< R... > >
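The semantics of the predicates and repetition combinators described above can be modelled in a few lines. This is a toy sketch, not the PEGTL implementation; all `toy_` names are hypothetical, each combinator takes a single sub-rule for brevity, and actions are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct toy_input {
    std::string_view data;
    std::size_t pos = 0;
};

template< char C >
struct toy_one {
    static bool match(toy_input& in) {
        if (in.pos < in.data.size() && in.data[in.pos] == C) {
            ++in.pos;
            return true;
        }
        return false;
    }
};

// And-predicate: reports whether R would match, but always rewinds.
template< typename R >
struct toy_at {
    static bool match(toy_input& in) {
        const std::size_t start = in.pos;
        const bool result = R::match(in);
        in.pos = start;
        return result;
    }
};

// Not-predicate: the negation of the and-predicate.
template< typename R >
struct toy_not_at {
    static bool match(toy_input& in) {
        return !toy_at< R >::match(in);
    }
};

// Optional: attempts R and succeeds regardless of the result.
template< typename R >
struct toy_opt {
    static bool match(toy_input& in) {
        (void)R::match(in);
        return true;
    }
};

// Zero-or-more: R must consume input on success, otherwise this loops forever.
template< typename R >
struct toy_star {
    static bool match(toy_input& in) {
        while (R::match(in)) {
        }
        return true;
    }
};

// One-or-more: one mandatory match followed by zero-or-more.
template< typename R >
struct toy_plus {
    static bool match(toy_input& in) {
        return R::match(in) && toy_star< R >::match(in);
    }
};

inline void demo_combinators() {
    toy_input in{ "aaab" };
    assert(toy_at< toy_one< 'a' > >::match(in) && in.pos == 0);     // looks ahead only
    assert(toy_star< toy_one< 'a' > >::match(in) && in.pos == 3);   // consumes "aaa"
    assert(!toy_plus< toy_one< 'a' > >::match(in) && in.pos == 3);  // no further 'a'
}
```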
The PEGTL offers a variety of convenience rules which help writing concise grammars as well as offering performance benefits over the equivalent implementation with classical PEG combinators.
These rules are in namespace tao::pegtl.

- Attempts to match R and depending on the result proceeds with either must< S... > or failure.
- Equivalent to seq< R, must< S... > >.
- Equivalent to if_then_else< R, must< S... >, failure >.
- Meta data and implementation mapping:
  - if_must< R >::rule_t is internal::if_must< false, R >
  - if_must< R >::subs_t is type_list< R >
  - if_must< R, S... >::rule_t is internal::if_must< false, R, S... >
  - if_must< R, S... >::subs_t is type_list< R, internal::must< S... > >

Note that the false template parameter to internal::if_must corresponds to the failure in the equivalent description using if_then_else.

- Attempts to match R and depending on the result proceeds with either must< S > or must< T >.
- Equivalent to if_then_else< R, must< S >, must< T > >.
- Meta data and implementation mapping:
  - if_must_else< R, S, T >::rule_t is internal::if_then_else< R, internal::must< S >, internal::must< T > >
  - if_must_else< R, S, T >::subs_t is type_list< R, internal::must< S >, internal::must< T > >

- Equivalent to sor< seq< R, S >, seq< not_at< R >, T > >.
- Meta data and implementation mapping:
  - if_then_else< R, S, T >::rule_t is internal::if_then_else< R, S, T >
  - if_then_else< R, S, T >::subs_t is type_list< R, S, T >

- Matches a non-empty list of R separated by S.
- Equivalent to seq< R, star< S, R > >.
- Meta data and implementation mapping:
  - list< R, S >::rule_t is internal::seq< R, internal::star< S, R > >
  - list< R, S >::subs_t is type_list< R, internal::star< S, R > >

- Matches a non-empty list of R separated by S where each S can be padded by P.
- Equivalent to seq< R, star< pad< S, P >, R > >.
- Meta data and implementation mapping:
  - list< R, S, P >::rule_t is internal::seq< R, internal::star< internal::pad< S, P >, R > >
  - list< R, S, P >::subs_t is type_list< R, internal::star< internal::pad< S, P >, R > >

- Matches a non-empty list of R separated by S.
- Similar to list< R, S >, but if there is an S it must be followed by an R.
- Equivalent to seq< R, star< if_must< S, R > > >.
- Meta data and implementation mapping:
  - list_must< R, S >::rule_t is internal::seq< R, internal::star< S, internal::must< R > > >
  - list_must< R, S >::subs_t is type_list< R, internal::star< S, internal::must< R > > >

- Matches a non-empty list of R separated by S where each S can be padded by P.
- Similar to list< R, S, P >, but if there is an S it must be followed by an R.
- Equivalent to seq< R, star< if_must< pad< S, P >, R > > >.
- Meta data and implementation mapping:
  - list_must< R, S, P >::rule_t is internal::seq< R, internal::star< internal::pad< S, P >, internal::must< R > > >
  - list_must< R, S, P >::subs_t is type_list< R, internal::star< internal::pad< S, P >, internal::must< R > > >

- Matches a non-empty list of R separated by S with optional trailing S.
- Equivalent to seq< list< R, S >, opt< S > >.
- Equivalent to seq< R, star_partial< S, R > >.
- Meta data and implementation mapping:
  - list_tail< R, S >::rule_t is internal::seq< R, internal::star_partial< S, R > >
  - list_tail< R, S >::subs_t is type_list< R, internal::star_partial< S, R > >

- Matches a non-empty list of R separated by S with optional trailing S and padding P inside the list.
- Equivalent to seq< list< R, S, P >, opt< star< P >, S > >.
- Equivalent to seq< R, star_partial< padl< S, P >, padl< R, P > > >.
- Meta data and implementation mapping:
  - list_tail< R, S, P >::rule_t is internal::seq< R, internal::star_partial< internal::padl< S, P >, internal::padl< R, P > > >
  - list_tail< R, S, P >::subs_t is type_list< R, internal::star_partial< internal::padl< S, P >, internal::padl< R, P > > >
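The difference between a plain list and a list with optional trailing separator can be made concrete with a toy model. The sketch below does not use the PEGTL; `toy_list` models the equivalence seq< R, star< S, R > > and `toy_list_tail` models seq< R, star_partial< S, R > >. All `toy_` names are hypothetical, and padding is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct toy_input {
    std::string_view data;
    std::size_t pos = 0;
};

template< char C >
struct toy_one {
    static bool match(toy_input& in) {
        if (in.pos < in.data.size() && in.data[in.pos] == C) {
            ++in.pos;
            return true;
        }
        return false;
    }
};

// R followed by zero-or-more complete (S, R) pairs.
// An S that is not followed by an R is rewound and left unconsumed.
template< typename R, typename S >
struct toy_list {
    static bool match(toy_input& in) {
        if (!R::match(in)) {
            return false;
        }
        for (;;) {
            const std::size_t before = in.pos;
            if (!(S::match(in) && R::match(in))) {
                in.pos = before;
                return true;
            }
        }
    }
};

// Like toy_list, but the final, partial (S without R) iteration is not rewound,
// which is what allows a trailing separator to be consumed.
template< typename R, typename S >
struct toy_list_tail {
    static bool match(toy_input& in) {
        if (!R::match(in)) {
            return false;
        }
        while (S::match(in) && R::match(in)) {
        }
        return true;
    }
};

inline void demo_lists() {
    using item = toy_one< 'a' >;
    using comma = toy_one< ',' >;
    toy_input plain{ "a,a," };
    toy_input tail{ "a,a," };
    assert((toy_list< item, comma >::match(plain)) && plain.pos == 3);    // trailing ',' left over
    assert((toy_list_tail< item, comma >::match(tail)) && tail.pos == 4);  // trailing ',' consumed
}
```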
- Succeeds if M matches, and S does not match all of the input that M matched.
- Equivalent to rematch< M, not_at< S, eof > >.
- Meta data and implementation mapping:
  - minus< M, S >::rule_t is internal::rematch< M, internal::not_at< S, internal::eof > >
  - minus< M, S >::subs_t is type_list< M, internal::not_at< S, internal::eof > >

- Equivalent to seq< R... >, but:
- Converts local failure of R... into global failure.
- Calls raise< R > for the R that failed.
- Equivalent to seq< sor< R, raise< R > >... >.
- Meta data and implementation mapping:
  - must<>::rule_t is internal::success
  - must< R >::rule_t is internal::must< R >
  - must< R >::subs_t is type_list< R >
  - must< R... >::rule_t is internal::seq< internal::must< R >... >::rule_t
  - must< R... >::subs_t is type_list< internal::must< R... > >

Note that must uses a different pattern to handle multiple sub-rules compared to the other seq-equivalent rules (which use rule< seq< R... > > rather than seq< rule< R >... >).

- Equivalent to opt< if_must< R, S... > >.
- Equivalent to if_then_else< R, must< S... >, success >.
- Meta data and implementation mapping:
  - opt_must< R >::rule_t is internal::if_must< true, R >
  - opt_must< R >::subs_t is type_list< R >
  - opt_must< R, S... >::rule_t is internal::if_must< true, R, S... >
  - opt_must< R, S... >::subs_t is type_list< R, internal::must< S... > >

Note that the true template parameter to internal::if_must corresponds to the success in the equivalent description using if_then_else.

- Matches an R that can be padded by arbitrarily many S on the left and T on the right.
- Equivalent to seq< star< S >, R, star< T > >.
- Meta data and implementation mapping:
  - pad< R, S, T >::rule_t is internal::seq< internal::star< S >, R, internal::star< T > >
  - pad< R, S, T >::subs_t is type_list< internal::star< S >, R, internal::star< T > >

- Matches an optional R that can be padded by arbitrarily many P, or just arbitrarily many P.
- Equivalent to seq< star< P >, opt< R, star< P > > >.
- Meta data and implementation mapping:
  - pad_opt< R, P >::rule_t is internal::seq< internal::star< P >, internal::opt< R, internal::star< P > > >
  - pad_opt< R, P >::subs_t is type_list< internal::star< P >, internal::opt< R, internal::star< P > > >

- Similar to opt< R... > with one important difference:
- After a partial match of R..., the input is not rewound.
- Attempts to match the given rules R... in the given order.
- Succeeds and stops matching when one of the given rules fails; also succeeds when all of the given rules succeed.
- Consumes everything that the successful rules of R... consumed.
- R must be a non-empty rule pack.
- Equivalent to opt< R > when R... is a single rule.
- Meta data and implementation mapping:
  - partial< R... >::rule_t is internal::partial< R... >
  - partial< R... >::subs_t is type_list< R... >

- Succeeds if R matches, and each S matches the input that R matched.
- Ignores all S for the grammar analysis.
- Meta data and implementation mapping:
  - rematch< R, S... >::rule_t is internal::rematch< R, S... >
  - rematch< R, S... >::subs_t is type_list< R, S... >

Note that the S do not need to match all of the input matched by R (which is why minus uses eof in its implementation).
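A typical use of the minus idea is "any identifier that is not a keyword". The sketch below models it without the PEGTL: the first rule is matched, the second rule is then run against only the matched span, and the whole thing fails when the second rule consumes that span completely. `toy_input`, `toy_rule`, `toy_minus`, `lower_word` and `keyword_if` are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct toy_input {
    std::string_view data;
    std::size_t pos = 0;
};

using toy_rule = bool (*)(toy_input&);

// Succeeds if m matches and s does NOT match all of the input that m matched.
inline bool toy_minus(toy_input& in, const toy_rule m, const toy_rule s) {
    const std::size_t start = in.pos;
    if (!m(in)) {
        in.pos = start;
        return false;
    }
    // Re-match s on a temporary input that contains only what m consumed.
    toy_input sub{ in.data.substr(start, in.pos - start) };
    const bool s_matched_all = s(sub) && (sub.pos == sub.data.size());
    if (s_matched_all) {
        in.pos = start;  // local failure, so rewind
        return false;
    }
    return true;
}

// One or more lower-case ASCII letters.
inline bool lower_word(toy_input& in) {
    const std::size_t start = in.pos;
    while (in.pos < in.data.size() && in.data[in.pos] >= 'a' && in.data[in.pos] <= 'z') {
        ++in.pos;
    }
    return in.pos > start;
}

// The literal string "if".
inline bool keyword_if(toy_input& in) {
    if (in.data.substr(in.pos, 2) == "if") {
        in.pos += 2;
        return true;
    }
    return false;
}

inline void demo_minus() {
    toy_input kw{ "if" };
    toy_input id{ "iffy" };
    assert(!toy_minus(kw, lower_word, keyword_if) && kw.pos == 0);  // exactly the keyword
    assert(toy_minus(id, lower_word, keyword_if) && id.pos == 4);   // keyword is only a prefix
}
```

Note that "iffy" is accepted even though `keyword_if` matches its first two characters, because the second rule does not reach the end of the matched span.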
- Matches seq< R... > for Num times without checking for further matches.
- Equivalent to seq< seq< R... >, ..., seq< R... > > where seq< R... > is repeated Num times.
- Meta data and implementation mapping:
  - rep< 0, R... >::rule_t is internal::success
  - rep< N >::rule_t is internal::success
  - rep< N, R >::rule_t is internal::rep< N, R >
  - rep< N, R >::subs_t is type_list< R >
  - rep< N, R... >::rule_t is internal::rep< N, internal::seq< R... > >
  - rep< N, R... >::subs_t is type_list< internal::seq< R... > >

- Matches seq< R... > for at most Max times and verifies that it doesn't match more often.
- Equivalent to rep_min_max< 0, Max, R... >.
- Meta data and implementation mapping:
  - rep_max< 0, R >::rule_t is internal::not_at< R >
  - rep_max< 0, R >::subs_t is type_list< R >
  - rep_max< 0, R... >::rule_t is internal::not_at< internal::seq< R... > >
  - rep_max< 0, R... >::subs_t is type_list< internal::seq< R... > >
  - rep_max< Max >::rule_t is internal::failure
  - rep_max< Max, R >::rule_t is internal::rep_min_max< 0, Max, R >
  - rep_max< Max, R >::subs_t is type_list< R >
  - rep_max< Max, R... >::rule_t is internal::rep_min_max< 0, Max, internal::seq< R... > >
  - rep_max< Max, R... >::subs_t is type_list< internal::seq< R... > >

- Matches seq< R... > as often as possible and succeeds if it matches at least Min times.
- Equivalent to seq< rep< Min, R... >, star< R... > >.
- R must be a non-empty rule pack.
- Meta data and implementation mapping:
  - rep_min< Min, R... >::rule_t is internal::seq< internal::rep< Min, R... >, internal::star< R... > >
  - rep_min< Min, R... >::subs_t is type_list< internal::rep< Min, R... >, internal::star< R... > >

- Matches seq< R... > for Min to Max times and verifies that it doesn't match more often.
- Equivalent to seq< rep< Min, R... >, rep_opt< Max - Min, R... >, not_at< R... > >.
- Meta data and implementation mapping:
  - rep_min_max< 0, 0, R >::rule_t is internal::not_at< R >
  - rep_min_max< 0, 0, R >::subs_t is type_list< R >
  - rep_min_max< 0, 0, R... >::rule_t is internal::not_at< internal::seq< R... > >
  - rep_min_max< 0, 0, R... >::subs_t is type_list< internal::seq< R... > >
  - rep_min_max< Min, Max >::rule_t is internal::failure
  - rep_min_max< Min, Max, R >::rule_t is internal::rep_min_max< Min, Max, R >
  - rep_min_max< Min, Max, R >::subs_t is type_list< R >
  - rep_min_max< Min, Max, R... >::rule_t is internal::rep_min_max< Min, Max, internal::seq< R... > >
  - rep_min_max< Min, Max, R... >::subs_t is type_list< internal::seq< R... > >
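The "verifies that it doesn't match more often" part is what distinguishes the bounded repetitions from a simple counted loop. A minimal toy model of the Min-to-Max repetition, assuming a single sub-rule and using hypothetical `toy_` names (this is not the PEGTL implementation), could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct toy_input {
    std::string_view data;
    std::size_t pos = 0;
};

template< char C >
struct toy_one {
    static bool match(toy_input& in) {
        if (in.pos < in.data.size() && in.data[in.pos] == C) {
            ++in.pos;
            return true;
        }
        return false;
    }
};

// Matches R between Min and Max times and fails if R could match once more.
template< std::size_t Min, std::size_t Max, typename R >
struct toy_rep_min_max {
    static_assert(Min <= Max, "Min must not exceed Max");

    static bool match(toy_input& in) {
        const std::size_t start = in.pos;
        std::size_t count = 0;
        while (count < Max && R::match(in)) {
            ++count;
        }
        if (count >= Min) {
            // Look ahead for one further match without consuming it.
            const std::size_t here = in.pos;
            const bool more = R::match(in);
            in.pos = here;
            if (!more) {
                return true;
            }
        }
        in.pos = start;  // too few or too many matches: local failure
        return false;
    }
};

inline void demo_rep_min_max() {
    using rule = toy_rep_min_max< 2, 3, toy_one< 'a' > >;
    toy_input ok{ "aab" };
    toy_input too_many{ "aaaa" };
    assert(rule::match(ok) && ok.pos == 2);
    assert(!rule::match(too_many) && too_many.pos == 0);
}
```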
- Matches seq< R... > for zero to Num times without checking for further matches.
- Equivalent to rep< Num, opt< R... > >.
- Meta data and implementation mapping:
  - rep_opt< 0, R... >::rule_t is internal::success
  - rep_opt< Num >::rule_t is internal::success
  - rep_opt< Num, R... >::rule_t is internal::seq< internal::rep< Num, R... >, internal::star< R... > >
  - rep_opt< Num, R... >::subs_t is type_list< internal::rep< Num, R... >, internal::star< R... > >

- Equivalent to star< if_must< R, S... > >.
- Meta data and implementation mapping:
  - star_must< R >::rule_t is internal::star< internal::if_must< false, R > >
  - star_must< R >::subs_t is type_list< internal::if_must< false, R > >
  - star_must< R, S... >::rule_t is internal::star< internal::if_must< false, R, S... > >
  - star_must< R, S... >::subs_t is type_list< internal::if_must< false, R, S... > >

- Similar to star< R... > with one important difference:
- After a partial match of R..., the final iteration does not rewind the input.
- R must be a non-empty rule pack.
- Meta data and implementation mapping:
  - star_partial< R... >::rule_t is internal::star_partial< R... >
  - star_partial< R... >::subs_t is type_list< R... >

- Similar to star< R... > with one important difference:
- A partial match of R... lets star_strict fail locally.
- R must be a non-empty rule pack.
- Meta data and implementation mapping:
  - star_strict< R... >::rule_t is internal::star_strict< R... >
  - star_strict< R... >::subs_t is type_list< R... >

- Similar to opt< R... > with one important difference:
- A partial match of R... lets strict fail locally.
- Equivalent to sor< not_at< R1 >, seq< R... > > if R1 is the first rule of R... .
- R must be a non-empty rule pack.
- Meta data and implementation mapping:
  - strict< R... >::rule_t is internal::strict< R... >
  - strict< R... >::subs_t is type_list< R... >
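The three "optional" flavours differ only in what happens after a partial match, i.e. when the first rules of the pack match and a later one does not. The following toy model, with hypothetical `toy_` names and no relation to the actual PEGTL implementation, puts them side by side. It assumes that every sub-rule rewinds itself on failure.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct toy_input {
    std::string_view data;
    std::size_t pos = 0;
};

template< char C >
struct toy_one {
    static bool match(toy_input& in) {
        if (in.pos < in.data.size() && in.data[in.pos] == C) {
            ++in.pos;
            return true;
        }
        return false;
    }
};

// Partial match: rewind everything, then succeed.
template< typename... R >
struct toy_opt {
    static bool match(toy_input& in) {
        const std::size_t start = in.pos;
        if (!(R::match(in) && ...)) {
            in.pos = start;
        }
        return true;
    }
};

// Partial match: keep what the successful rules consumed, then succeed.
template< typename... R >
struct toy_partial {
    static bool match(toy_input& in) {
        const bool all = (R::match(in) && ...);  // stops at the first failing rule
        (void)all;
        return true;
    }
};

// Partial match: rewind and fail locally. Succeeds without consuming
// anything when already the first rule does not match.
template< typename R1, typename... R >
struct toy_strict {
    static bool match(toy_input& in) {
        const std::size_t start = in.pos;
        if (!R1::match(in)) {
            return true;
        }
        if ((R::match(in) && ...)) {
            return true;
        }
        in.pos = start;
        return false;
    }
};

inline void demo_partial_matches() {
    using A = toy_one< 'a' >;
    using B = toy_one< 'b' >;
    toy_input x{ "ac" };
    toy_input y{ "ac" };
    toy_input z{ "ac" };
    assert((toy_opt< A, B >::match(x)) && x.pos == 0);
    assert((toy_partial< A, B >::match(y)) && y.pos == 1);
    assert((!toy_strict< A, B >::match(z)) && z.pos == 0);
}
```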
- Equivalent to seq< R... >, but:
- Catches exceptions of any type via catch( ... ) and:
- Throws a new exception with the caught one as nested exception.
- Throws via Control< R >::raise_nested() when R... is a single rule.
- Throws via Control< internal::seq< R... > >::raise_nested() when R... is more than one rule.
- Meta data and implementation mapping:
  - try_catch_any_raise_nested<>::rule_t is internal::success
  - try_catch_any_raise_nested< R >::rule_t is internal::try_catch_raise_nested< void, R >
  - try_catch_any_raise_nested< R >::subs_t is type_list< R >
  - try_catch_any_raise_nested< R... >::rule_t is internal::try_catch_raise_nested< void, internal::seq< R... > >
  - try_catch_any_raise_nested< R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Catches exceptions of any type via catch( ... ), and:
- Converts the global failure (exception) into a local failure (return value false).
- Meta data and implementation mapping:
  - try_catch_any_return_false< E >::rule_t is internal::success
  - try_catch_any_return_false< E, R >::rule_t is internal::try_catch_return_false< void, R >
  - try_catch_any_return_false< E, R >::subs_t is type_list< R >
  - try_catch_any_return_false< E, R... >::rule_t is internal::try_catch_return_false< void, internal::seq< R... > >
  - try_catch_any_return_false< E, R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Catches exceptions of type tao::pegtl::parse_error_base (or derived), and:
- Throws a new exception with the caught one as nested exception.
- Throws via Control< R >::raise_nested() when R... is a single rule.
- Throws via Control< internal::seq< R... > >::raise_nested() when R... is more than one rule.
- Meta data and implementation mapping:
  - try_catch_raise_nested<>::rule_t is internal::success
  - try_catch_raise_nested< R >::rule_t is internal::try_catch_raise_nested< parse_error_base, R >
  - try_catch_raise_nested< R >::subs_t is type_list< R >
  - try_catch_raise_nested< R... >::rule_t is internal::try_catch_raise_nested< parse_error_base, internal::seq< R... > >
  - try_catch_raise_nested< R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Catches exceptions of type tao::pegtl::parse_error_base (or derived), and:
- Converts the global failure (exception) into a local failure (return value false).
- Meta data and implementation mapping:
  - try_catch_return_false<>::rule_t is internal::success
  - try_catch_return_false< R >::rule_t is internal::try_catch_return_false< parse_error_base, R >
  - try_catch_return_false< R >::subs_t is type_list< R >
  - try_catch_return_false< R... >::rule_t is internal::try_catch_return_false< parse_error_base, internal::seq< R... > >
  - try_catch_return_false< R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Catches exceptions of type std::exception (or derived), and:
- Throws a new exception with the caught one as nested exception.
- Throws via Control< R >::raise_nested() when R... is a single rule.
- Throws via Control< internal::seq< R... > >::raise_nested() when R... is more than one rule.
- Meta data and implementation mapping:
  - try_catch_std_raise_nested<>::rule_t is internal::success
  - try_catch_std_raise_nested< R >::rule_t is internal::try_catch_raise_nested< std::exception, R >
  - try_catch_std_raise_nested< R >::subs_t is type_list< R >
  - try_catch_std_raise_nested< R... >::rule_t is internal::try_catch_raise_nested< std::exception, internal::seq< R... > >
  - try_catch_std_raise_nested< R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Catches exceptions of type std::exception (or derived), and:
- Converts the global failure (exception) into a local failure (return value false).
- Meta data and implementation mapping:
  - try_catch_std_return_false< E >::rule_t is internal::success
  - try_catch_std_return_false< E, R >::rule_t is internal::try_catch_return_false< std::exception, R >
  - try_catch_std_return_false< E, R >::subs_t is type_list< R >
  - try_catch_std_return_false< E, R... >::rule_t is internal::try_catch_return_false< std::exception, internal::seq< R... > >
  - try_catch_std_return_false< E, R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Catches exceptions of type E (or derived), and:
- Throws a new exception with the caught one as nested exception.
- Throws via Control< R >::raise_nested() when R... is a single rule.
- Throws via Control< internal::seq< R... > >::raise_nested() when R... is more than one rule.
- Meta data and implementation mapping:
  - try_catch_type_raise_nested< E >::rule_t is internal::success
  - try_catch_type_raise_nested< E, R >::rule_t is internal::try_catch_raise_nested< E, R >
  - try_catch_type_raise_nested< E, R >::subs_t is type_list< R >
  - try_catch_type_raise_nested< E, R... >::rule_t is internal::try_catch_raise_nested< E, internal::seq< R... > >
  - try_catch_type_raise_nested< E, R... >::subs_t is type_list< internal::seq< R... > >

- Equivalent to seq< R... >, but:
- Catches exceptions of type E (or derived), and:
- Converts the global failure (exception) into a local failure (return value false).
- Meta data and implementation mapping:
  - try_catch_type_return_false< E >::rule_t is internal::success
  - try_catch_type_return_false< E, R >::rule_t is internal::try_catch_return_false< E, R >
  - try_catch_type_return_false< E, R >::subs_t is type_list< R >
  - try_catch_type_return_false< E, R... >::rule_t is internal::try_catch_return_false< E, internal::seq< R... > >
  - try_catch_type_return_false< E, R... >::subs_t is type_list< internal::seq< R... > >

- Consumes all input until R matches.
- Equivalent to until< R, any >.
- Meta data and implementation mapping:
  - until< R >::rule_t is internal::until< R >
  - until< R >::subs_t is type_list< R >

- Matches seq< S... > as long as at< R > does not match and succeeds when R matches.
- Equivalent to seq< star< not_at< R >, S... >, R >.
- Does not apply if S is an empty rule pack, see the previous entry for the semantics of until< R >.
- Meta data and implementation mapping:
  - until< R, S >::rule_t is internal::until< R, S >
  - until< R, S >::subs_t is type_list< R, S >
  - until< R, S... >::rule_t is internal::until< R, internal::seq< S... > >
  - until< R, S... >::subs_t is type_list< R, internal::seq< S... > >
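The equivalence seq< star< not_at< R >, S... >, R > boils down to a simple loop: try the terminator first, and only if it does not match consume one more unit of S. A toy sketch with a single S rule, using hypothetical `toy_` names rather than the PEGTL itself:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct toy_input {
    std::string_view data;
    std::size_t pos = 0;
};

template< char C >
struct toy_one {
    static bool match(toy_input& in) {
        if (in.pos < in.data.size() && in.data[in.pos] == C) {
            ++in.pos;
            return true;
        }
        return false;
    }
};

// Matches and consumes any single byte; fails at the end of the input.
struct toy_any {
    static bool match(toy_input& in) {
        if (in.pos < in.data.size()) {
            ++in.pos;
            return true;
        }
        return false;
    }
};

// Repeats S until R matches; R itself is consumed as well.
// Fails, and rewinds, when neither R nor S matches at some position.
// S must consume input on success, otherwise this loops forever.
template< typename R, typename S >
struct toy_until {
    static bool match(toy_input& in) {
        const std::size_t start = in.pos;
        while (!R::match(in)) {
            if (!S::match(in)) {
                in.pos = start;
                return false;
            }
        }
        return true;
    }
};

inline void demo_until() {
    toy_input found{ "ab;cd" };
    toy_input missing{ "abc" };
    assert((toy_until< toy_one< ';' >, toy_any >::match(found)) && found.pos == 3);
    assert((!toy_until< toy_one< ';' >, toy_any >::match(missing)) && missing.pos == 0);
}
```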
These rules are in namespace tao::pegtl.
These rules replicate the intrusive way actions were called from within the grammar in the PEGTL 0.x with the apply<> and if_apply<> rules.
The actions for these rules are classes (rather than class templates as required for parse() and the action<>-rule).
These rules respect the current apply_mode, but do not use the control class to invoke the actions.

- Calls A::apply() for all A, in order, with an empty input and all states as arguments.
- If any A::apply() has a boolean return type and returns false, no further A::apply() calls are made and the result is equivalent to failure, otherwise:
- Equivalent to success wrt. parsing.
- Meta data and implementation mapping:
  - apply< A... >::rule_t is internal::apply< A... >

- Calls A::apply0() for all A, in order, with all states as arguments.
- If any A::apply0() has a boolean return type and returns false, no further A::apply0() calls are made and the result is equivalent to failure, otherwise:
- Equivalent to success wrt. parsing.
- Meta data and implementation mapping:
  - apply0< A... >::rule_t is internal::apply0< A... >

- Equivalent to seq< R, apply< A... > > wrt. parsing, but also:
- If R matches, calls A::apply(), for all A, in order, with the input matched by R and all states as arguments.
- If any A::apply() has a boolean return type and returns false, no further A::apply() calls are made.
- Meta data and implementation mapping:
  - if_apply< R, A... >::rule_t is internal::if_apply< R, A... >
  - if_apply< R, A... >::subs_t is type_list< R >
These rules are in namespace tao::pegtl.
Atomic rules do not rely on other rules.

- Succeeds at "beginning-of-file", i.e. when the input's byte() member function returns zero.
- Does not consume input.
- Does not work with inputs that don't have a byte() member function.
- Meta data and implementation mapping:
  - bof::rule_t is internal::bof

- Succeeds at "beginning-of-line", i.e. when the input's column() member function returns one.
- Does not consume input.
- Does not work with inputs that don't have a column() member function.
- Meta data and implementation mapping:
  - bol::rule_t is internal::bol

- Succeeds when the input contains at least Num further bytes.
- Consumes these Num bytes from the input.
- Meta data and implementation mapping:
  - bytes< 0 >::rule_t is internal::success
  - bytes< Num >::rule_t is internal::bytes< Num >

- Succeeds at "end-of-file", i.e. when the input is empty or all input has been consumed.
- Does not consume input.
- Meta data and implementation mapping:
  - eof::rule_t is internal::eof

- Depends on the Eol template parameter of the input, by default:
- Matches and consumes a Unix or MS-DOS line ending, that is:
- Equivalent to sor< one< '\n' >, string< '\r', '\n' > >.
- Meta data and implementation mapping:
  - eol::rule_t is internal::eol

- Equivalent to sor< eof, eol >.
- Meta data and implementation mapping:
  - eolf::rule_t is internal::eolf

- Matches and consumes the entire input in one go, but:
- Limited by the buffer size when using an Incremental Input.
- Equivalent to until< eof, any >.
- Meta data and implementation mapping:
  - everything::rule_t is internal::everything< std::size_t >

- Dummy rule that never succeeds.
- Does not consume input.
- Meta data and implementation mapping:
  - failure::rule_t is internal::failure

- Generates a global failure.
- Calls the control-class' Control< T >::raise() static member function.
- T can be a rule, but it does not have to be a rule.
- Does not consume input.
- Meta data and implementation mapping:
  - raise< T >::rule_t is internal::raise< T >

- Generates a global failure with the message given by C... .
- Calls the control-class' Control< raise_message< C... > >::raise() static member function.
- Does not consume input.
- Meta data and implementation mapping:
  - raise_message< C... >::rule_t is internal::raise< raise_message< C... > >

- Dummy rule that always succeeds.
- Does not consume input.
- Meta data and implementation mapping:
  - success::rule_t is internal::success

- Macro where TAO_PEGTL_RAISE_MESSAGE( "foo" ) yields raise_message< 'f', 'o', 'o' >.
- The argument must be a string literal.
- Works for strings up to 512 bytes of length (excluding trailing '\0').
These rules are in the inline namespace tao::pegtl::ascii.
The ASCII rules operate on single bytes, without restricting the range of values to 7 bits.
They are compatible with input with the 8th bit set in the sense that nothing breaks in their presence.
Rules like ascii::any or ascii::not_one< 'a' > will match all possible byte values, and all possible byte values excluding 'a', respectively.
However, the character class rules like ascii::alpha only match the corresponding ASCII characters.
(It is possible to match UTF-8 multi-byte characters with the ASCII rules; for example the Euro sign code point U+20AC, which is encoded by the UTF-8 sequence E2 82 AC, can be matched by either tao::pegtl::ascii::string< 0xe2, 0x82, 0xac > or tao::pegtl::utf8::one< 0x20ac >.)
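The relationship between the code point U+20AC and the byte sequence E2 82 AC can be checked with a few lines of standard C++. The sketch below is independent of the PEGTL; `decode_utf8_3` and `starts_with_euro_bytes` are hypothetical helper names, and the decoder handles only the three-byte form without any validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Decodes a three-byte UTF-8 sequence of the form 1110xxxx 10xxxxxx 10xxxxxx.
// No validation is performed; this is for illustration only.
constexpr char32_t decode_utf8_3(const unsigned char b0, const unsigned char b1, const unsigned char b2) {
    return (static_cast< char32_t >(b0 & 0x0F) << 12)
         | (static_cast< char32_t >(b1 & 0x3F) << 6)
         | static_cast< char32_t >(b2 & 0x3F);
}

static_assert(decode_utf8_3(0xE2, 0x82, 0xAC) == 0x20AC, "E2 82 AC encodes U+20AC");

// Byte-wise comparison, in the spirit of matching with single-byte rules.
inline bool starts_with_euro_bytes(const std::string_view in) {
    constexpr unsigned char euro[] = { 0xE2, 0x82, 0xAC };
    if (in.size() < 3) {
        return false;
    }
    for (std::size_t i = 0; i < 3; ++i) {
        if (static_cast< unsigned char >(in[i]) != euro[i]) {
            return false;
        }
    }
    return true;
}

inline void demo_euro() {
    assert(starts_with_euro_bytes("\xE2\x82\xAC 5"));
    assert(!starts_with_euro_bytes("EUR"));
}
```

Matching byte by byte and matching by decoded code point therefore accept the same input here; they differ in how much input a single rule consumes and in what the rule names express.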
ASCII rules do not usually rely on other rules.
- Matches and consumes a single ASCII alphabetic or numeric character.
- Equivalent to ranges< 'a', 'z', 'A', 'Z', '0', '9' >.
- Meta data and implementation mapping:
  - ascii::alnum::rule_t is internal::ranges< internal::peek_char, 'a', 'z', 'A', 'Z', '0', '9' >
- Matches and consumes a single ASCII alphabetic character.
- Equivalent to ranges< 'a', 'z', 'A', 'Z' >.
- Meta data and implementation mapping:
  - ascii::alpha::rule_t is internal::ranges< internal::peek_char, 'a', 'z', 'A', 'Z' >
- Matches and consumes any single byte, including all ASCII characters.
- Equivalent to bytes< 1 >.
- Meta data and implementation mapping:
  - ascii::any::rule_t is internal::any< internal::peek_char >
- Matches and consumes a single ASCII horizontal space or horizontal tabulator character.
- Equivalent to one< ' ', '\t' >.
- Meta data and implementation mapping:
  - ascii::blank::rule_t is internal::one< internal::result_on_found::success, internal::peek_char, ' ', '\t' >
- Matches and consumes a single ASCII decimal digit character.
- Equivalent to range< '0', '9' >.
- Meta data and implementation mapping:
  - ascii::digit::rule_t is internal::range< internal::result_on_found::success, internal::peek_char, '0', '9' >
- Matches and consumes three dots.
- Equivalent to three< '.' >.
- Meta data and implementation mapping:
  - ascii::ellipsis::rule_t is internal::string< '.', '.', '.' >
- Equivalent to rep< 42, one< C... > >.
- Meta data and implementation mapping:
  - ascii::forty_two< C... >::rule_t is internal::rep< 42, internal::one< internal::result_on_found::success, internal::peek_char, C... > >
- Matches and consumes a single ASCII character permissible as first character of a C identifier.
- Equivalent to ranges< 'a', 'z', 'A', 'Z', '_' >.
- Meta data and implementation mapping:
  - ascii::identifier_first::rule_t is internal::ranges< internal::peek_char, 'a', 'z', 'A', 'Z', '_' >
- Matches and consumes a single ASCII character permissible as subsequent character of a C identifier.
- Equivalent to ranges< 'a', 'z', 'A', 'Z', '0', '9', '_' >.
- Meta data and implementation mapping:
  - ascii::identifier_other::rule_t is internal::ranges< internal::peek_char, 'a', 'z', 'A', 'Z', '0', '9', '_' >
- Matches and consumes an ASCII identifier as defined for the C programming language.
- Equivalent to seq< identifier_first, star< identifier_other > >.
- Meta data and implementation mapping:
  - ascii::identifier::rule_t is internal::seq< identifier_first, internal::star< identifier_other > >
- Matches and consumes the given ASCII string C... with case insensitive matching.
- Similar to string< C... >, but:
- For ASCII letters a-z and A-Z the match is case insensitive.
- Meta data and implementation mapping:
  - ascii::istring<>::rule_t is internal::success
  - ascii::istring< C... >::rule_t is internal::istring< C... >
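The case-insensitive comparison described above applies only to the ASCII letters, and every other byte must match exactly. A minimal sketch of that behaviour in plain C++ follows; it is not PEGTL's implementation, and ascii_to_lower and starts_with_istring are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Fold only the ASCII letters A-Z to lower case; all other bytes are unchanged.
char ascii_to_lower( const char c )
{
   return ( c >= 'A' && c <= 'Z' ) ? static_cast< char >( c - 'A' + 'a' ) : c;
}

// Hypothetical helper: true when `input` starts with `word`, comparing
// ASCII letters case-insensitively and all other bytes exactly.
bool starts_with_istring( const std::string& input, const std::string& word )
{
   if( input.size() < word.size() ) {
      return false;
   }
   for( std::size_t i = 0; i < word.size(); ++i ) {
      if( ascii_to_lower( input[ i ] ) != ascii_to_lower( word[ i ] ) ) {
         return false;
      }
   }
   return true;  // an empty word always matches, like istring<>
}
```

Note that '[' and '{' differ only in the bit that distinguishes upper-case from lower-case letters, yet they are not letters, so the sketch does not treat them as equal.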
- Matches and consumes a non-empty string not followed by an identifier character.
- Equivalent to seq< string< C... >, not_at< identifier_other > >.
- C must be a non-empty character pack.
- Meta data and implementation mapping:
  - ascii::keyword< C... >::rule_t is internal::seq< internal::string< C... >, internal::not_at< internal::ranges< internal::peek_char, 'a', 'z', 'A', 'Z', '0', '9', '_' > > >
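The equivalence above means that a keyword matches only when the string is present and the byte after it, if any, could not continue an identifier. The following plain C++ sketch mirrors that logic without using PEGTL; is_identifier_other and match_keyword are hypothetical names, and the run-time std::string parameter stands in for the compile-time character pack.

```cpp
#include <cstddef>
#include <string>

// Mirrors identifier_other, i.e. ranges< 'a', 'z', 'A', 'Z', '0', '9', '_' >.
bool is_identifier_other( const char c )
{
   return ( c >= 'a' && c <= 'z' ) || ( c >= 'A' && c <= 'Z' ) || ( c >= '0' && c <= '9' ) || ( c == '_' );
}

// Hypothetical helper: number of bytes consumed, or 0 on local failure.
// An empty word is rejected here because C must be a non-empty pack.
std::size_t match_keyword( const std::string& input, const std::string& word )
{
   if( word.empty() || ( input.compare( 0, word.size(), word ) != 0 ) ) {
      return 0;  // the string part did not match
   }
   if( ( input.size() > word.size() ) && is_identifier_other( input[ word.size() ] ) ) {
      return 0;  // the look-ahead found an identifier character
   }
   return word.size();  // the look-ahead itself consumes nothing
}
```

With this sketch the word "for" is consumed from "for (" and from "for" at the end of the input, but not from "foreach".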
- Matches and consumes a single ASCII lower-case alphabetic character.
- Equivalent to range< 'a', 'z' >.
- Meta data and implementation mapping:
  - ascii::lower::rule_t is internal::range< internal::result_on_found::success, internal::peek_char, 'a', 'z' >
- Succeeds when the input is not empty, and:
- C is an empty character pack, or C... does not contain the next input byte.
- Consumes one byte when it succeeds.
- Meta data and implementation mapping:
  - ascii::not_one<>::rule_t is internal::any< internal::peek_char >
  - ascii::not_one< C... >::rule_t is internal::one< result_on_found::failure, internal::peek_char, C... >
- Succeeds when the input is not empty, and:
- The next input byte is not in the closed range C ... D.
- Consumes one byte when it succeeds.
- Meta data and implementation mapping:
  - ascii::not_range< C, C >::rule_t is internal::one< result_on_found::failure, internal::peek_char, C >
  - ascii::not_range< C, D >::rule_t is internal::range< result_on_found::failure, internal::peek_char, C, D >
- Matches and consumes an ASCII nul character.
- Equivalent to one< '\0' >.
- Meta data and implementation mapping:
  - ascii::nul::rule_t is internal::one< result_on_found::success, internal::peek_char, 0 >
- Matches and consumes a single ASCII octal digit character.
- Equivalent to range< '0', '7' >.
- Meta data and implementation mapping:
  - ascii::odigit::rule_t is internal::range< internal::result_on_found::success, internal::peek_char, '0', '7' >
- Succeeds when the input is not empty, and:
- C... contains the next input byte.
- Consumes one byte when it succeeds.
- Fails if C is an empty character pack.
- Meta data and implementation mapping:
  - ascii::one<>::rule_t is internal::failure
  - ascii::one< C... >::rule_t is internal::one< result_on_found::success, internal::peek_char, C... >
- Matches and consumes any single ASCII character traditionally defined as printable.
- Equivalent to range< 32, 126 >.
- Meta data and implementation mapping:
  - ascii::print::rule_t is internal::range< internal::result_on_found::success, internal::peek_char, 32, 126 >
- Succeeds when the input is not empty, and:
- The next input byte is in the closed range C ... D.
- Consumes one byte when it succeeds.
- Meta data and implementation mapping:
  - ascii::range< C, C >::rule_t is internal::one< result_on_found::success, internal::peek_char, C >
  - ascii::range< C, D >::rule_t is internal::range< result_on_found::success, internal::peek_char, C, D >
- Without a trailing single parameter E, equivalent to sor< range< C1, D1 >, range< C2, D2 >, ... >.
- With a trailing single parameter E, equivalent to sor< range< C1, D1 >, range< C2, D2 >, ..., one< E > >.
- Meta data and implementation mapping:
  - ascii::ranges<>::rule_t is internal::failure
  - ascii::ranges< E >::rule_t is internal::one< result_on_found::success, internal::peek_char, E >
  - ascii::ranges< C, D >::rule_t is internal::range< result_on_found::success, internal::peek_char, C, D >
  - ascii::ranges< C... >::rule_t is internal::ranges< internal::peek_char, C... >
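The parameter list of ranges is read as consecutive pairs of closed range bounds, and an odd number of parameters leaves one final value that is matched on its own. A minimal sketch of that reading in plain C++ follows; in_ranges is a hypothetical helper, and the run-time vector stands in for the compile-time character pack.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: would a rule like ranges< spec... > accept the byte b?
bool in_ranges( const char b, const std::vector< char >& spec )
{
   std::size_t i = 0;
   for( ; i + 1 < spec.size(); i += 2 ) {
      if( ( spec[ i ] <= b ) && ( b <= spec[ i + 1 ] ) ) {
         return true;  // inside one of the closed ranges, like range< C, D >
      }
   }
   // An odd number of parameters leaves a single value, matched like one< E >.
   return ( i < spec.size() ) && ( b == spec[ i ] );
}
```

For example, in_ranges( c, { 'a', 'z', 'A', 'Z', '_' } ) accepts the same bytes as identifier_first, and an empty specification accepts nothing.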
- Matches and consumes any single true ASCII character that fits into 7 bits.
- Equivalent to range< 0, 127 >.
- Meta data and implementation mapping:
  - ascii::seven::rule_t is internal::range< internal::result_on_found::success, internal::peek_char, 0, 127 >
- Equivalent to if_must< string< '#', '!' >, until< eolf > >.
- Meta data and implementation mapping:
  - ascii::shebang::rule_t is internal::if_must< false, internal::string< '#', '!' >, internal::until< internal::eolf > >
  - ascii::shebang::subs_t is type_list< internal::string< '#', '!' >, internal::until< internal::eolf > >
- Matches and consumes a single space, line-feed, carriage-return, horizontal-tab, vertical-tab or form-feed.
- Equivalent to one< ' ', '\n', '\r', '\t', '\v', '\f' >.
- Matches and consumes a string, a sequence of bytes or single-byte characters.
- Equivalent to seq< one< C >... >.
- Meta data and implementation mapping:
  - ascii::string<>::rule_t is internal::success
  - ascii::string< C... >::rule_t is internal::string< C... >
- Macro where TAO_PEGTL_ISTRING( "foo" ) yields istring< 'f', 'o', 'o' >.
- The argument must be a string literal.
- Works for strings up to 512 bytes of length (excluding trailing '\0').
- Strings may contain embedded '\0'.
- Macro where TAO_PEGTL_KEYWORD( "foo" ) yields keyword< 'f', 'o', 'o' >.
- The argument must be a string literal.
- Works for keywords up to 512 bytes of length (excluding trailing '\0').
- Strings may contain embedded '\0'.
- Macro where TAO_PEGTL_STRING( "foo" ) yields string< 'f', 'o', 'o' >.
- The argument must be a string literal.
- Works for strings up to 512 bytes of length (excluding trailing '\0').
- Strings may contain embedded '\0'.
- Succeeds when the input contains at least three bytes, and:
- These three input bytes are all C.
- Consumes three bytes when it succeeds.
- Meta data and implementation mapping:
  - ascii::three< C >::rule_t is internal::string< C, C, C >
- Succeeds when the input contains at least two bytes, and:
- These two input bytes are both C.
- Consumes two bytes when it succeeds.
- Meta data and implementation mapping:
  - ascii::two< C >::rule_t is internal::string< C, C >
- Matches and consumes a single ASCII upper-case alphabetic character.
- Equivalent to range< 'A', 'Z' >.
- Meta data and implementation mapping:
  - ascii::upper::rule_t is internal::range< internal::result_on_found::success, internal::peek_char, 'A', 'Z' >
- Matches and consumes a single ASCII hexadecimal digit character.
- Equivalent to ranges< '0', '9', 'a', 'f', 'A', 'F' >.
- Meta data and implementation mapping:
  - ascii::xdigit::rule_t is internal::ranges< internal::peek_char, '0', '9', 'a', 'f', 'A', 'F' >
These rules are available in multiple versions,
- in namespace tao::pegtl::utf8 for UTF-8 encoded inputs,
- in namespace tao::pegtl::utf16_be for big-endian UTF-16 encoded inputs,
- in namespace tao::pegtl::utf16_le for little-endian UTF-16 encoded inputs,
- in namespace tao::pegtl::utf32_be for big-endian UTF-32 encoded inputs,
- in namespace tao::pegtl::utf32_le for little-endian UTF-32 encoded inputs.
For convenience, they also appear in multiple namespace aliases,
- namespace alias tao::pegtl::utf16 for native-endian UTF-16 encoded inputs,
- namespace alias tao::pegtl::utf32 for native-endian UTF-32 encoded inputs.
The following limitations apply to the UTF-16 and UTF-32 rules:
- Unaligned input leads to unaligned memory access.
- The line and column numbers are not counted correctly.
- They are not automatically included with tao/pegtl.hpp.
The UTF-8 rules are included with include/tao/pegtl.hpp while the UTF-16 and UTF-32 rules require manual inclusion of the following files.
tao/pegtl/contrib/utf16.hpp
tao/pegtl/contrib/utf32.hpp
While unaligned accesses are no problem on x86 compatible processors, on other architectures they might be very slow or even crash the application.
In the following descriptions a Unicode code point is considered valid when it is in the range 0 to 0x10ffff.
The parameter N stands for the size of the encoding of the next Unicode code point in the input, i.e.
- for UTF-8 the rules are multi-byte-sequence-aware and N is either 1, 2, 3 or 4,
- for UTF-16 the rules are surrogate-pair-aware and N is either 2 or 4, and
- for UTF-32 everything is simple and N is always 4.
It is an error when a code unit in the range 0xd800 to 0xdfff is encountered outside of a valid UTF-16 surrogate pair (this changed in version 2.6.0).
Unicode rules do not rely on other rules.
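For UTF-16 the value of N and the surrogate error condition described above can be illustrated with a short standalone sketch. This is plain C++ and not PEGTL's implementation; utf16_encoded_size and utf16_combine are hypothetical helpers, and passing 0 as the second code unit when the input has ended is an assumption made only for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: N in bytes for the code point starting at `first`,
// or 0 when a surrogate appears outside of a valid surrogate pair.
std::size_t utf16_encoded_size( const std::uint16_t first, const std::uint16_t second )
{
   const bool first_is_surrogate = ( first >= 0xd800 ) && ( first <= 0xdfff );
   if( !first_is_surrogate ) {
      return 2;  // a single code unit encodes the code point
   }
   const bool first_is_high = ( first <= 0xdbff );
   const bool second_is_low = ( second >= 0xdc00 ) && ( second <= 0xdfff );
   return ( first_is_high && second_is_low ) ? 4 : 0;
}

// Combine a valid high/low surrogate pair into the code point it encodes.
std::uint32_t utf16_combine( const std::uint16_t high, const std::uint16_t low )
{
   return 0x10000u + ( ( static_cast< std::uint32_t >( high ) - 0xd800u ) << 10 ) + ( static_cast< std::uint32_t >( low ) - 0xdc00u );
}
```

The code units 0xd83d 0xde00 form a valid pair with N equal to 4 and combine to the code point U+1F600, whereas a low surrogate in first position is reported as an error.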
- Succeeds when the input is not empty, and:
- The next N bytes encode a valid Unicode code point.
- Consumes the N bytes when it succeeds.
- Equivalent to one< 0xfeff >.
- Succeeds when the input is not empty, and:
- The next N bytes encode a valid Unicode code point, and:
- C is an empty character pack, or none of the given code points C... equals the input code point.
- Consumes the N bytes when it succeeds.
- Succeeds when the input is not empty, and:
- The next N bytes encode a valid Unicode code point, and:
- The input code point B satisfies B < C || D < B.
- Consumes the N bytes when it succeeds.
- Succeeds when the input is not empty, and:
- The next N bytes encode a valid Unicode code point, and:
- C is a non-empty character pack and one of the given code points C... equals the input code point.
- Consumes the N bytes when it succeeds.
- Succeeds when the input is not empty, and:
- The next N bytes encode a valid Unicode code point, and:
- The input code point B satisfies C <= B && B <= D.
- Consumes the N bytes when it succeeds.
- Without a trailing single parameter E, equivalent to sor< range< C1, D1 >, range< C2, D2 >, ... >.
- With a trailing single parameter E, equivalent to sor< range< C1, D1 >, range< C2, D2 >, ..., one< E > >.
- Equivalent to seq< one< C >... >.
The following rules depend on the International Components for Unicode (ICU) that provide the means to match characters with specific Unicode character properties.
Because of this external dependency the rules are not automatically included in tao/pegtl.hpp.
The ICU-based rules are again available in multiple versions,
- in namespace tao::pegtl::utf8::icu for UTF-8 encoded inputs,
- in namespace tao::pegtl::utf16_be::icu for big-endian UTF-16 encoded inputs,
- in namespace tao::pegtl::utf16_le::icu for little-endian UTF-16 encoded inputs,
- in namespace tao::pegtl::utf32_be::icu for big-endian UTF-32 encoded inputs, and
- in namespace tao::pegtl::utf32_le::icu for little-endian UTF-32 encoded inputs.
And, for convenience, they again appear in multiple namespace aliases,
- namespace alias tao::pegtl::utf16::icu for native-endian UTF-16 encoded inputs,
- namespace alias tao::pegtl::utf32::icu for native-endian UTF-32 encoded inputs.
To use these rules it is necessary to provide an include path to the ICU library, to link the application against libicu, and to manually include one or more of the following header files:
tao/pegtl/contrib/icu/utf8.hpp
tao/pegtl/contrib/icu/utf16.hpp
tao/pegtl/contrib/icu/utf32.hpp
The convenience ICU rules are supplied for all properties found in ICU version 3.4. Users of later versions can use the basic rules manually or create their own convenience rules derived from the basic rules for additional enumeration values found in those later versions of the ICU library.
Each of the above namespaces provides two basic rules for matching binary properties and property value matching for enum properties.
- P is a binary property defined by ICU, see UProperty.
- V is a boolean value.
- Succeeds when the input is not empty, and:
- The next N bytes encode a valid Unicode code point, and:
- The code point's property P, i.e. u_hasBinaryProperty( cp, P ), equals V.
- Consumes the N bytes when it succeeds.
- Identical to binary_property< P, true >.
- P is an enumerated property defined by ICU, see UProperty.
- V is an integer value.
- Succeeds when the input is not empty, and:
- The next N bytes encode a valid Unicode code point, and:
- The code point's property P, i.e. u_getIntPropertyValue( cp, P ), equals V.
- Consumes the N bytes when it succeeds.
Convenience wrappers for binary properties.
- Equivalent to binary_property< UCHAR_ALPHABETIC >.
- Equivalent to binary_property< UCHAR_ASCII_HEX_DIGIT >.
- Equivalent to binary_property< UCHAR_BIDI_CONTROL >.
- Equivalent to binary_property< UCHAR_BIDI_MIRRORED >.
- Equivalent to binary_property< UCHAR_CASE_SENSITIVE >.
- Equivalent to binary_property< UCHAR_DASH >.
- Equivalent to binary_property< UCHAR_DEFAULT_IGNORABLE_CODE_POINT >.
- Equivalent to binary_property< UCHAR_DEPRECATED >.
- Equivalent to binary_property< UCHAR_DIACRITIC >.
- Equivalent to binary_property< UCHAR_EXTENDER >.
- Equivalent to binary_property< UCHAR_FULL_COMPOSITION_EXCLUSION >.
- Equivalent to binary_property< UCHAR_GRAPHEME_BASE >.
- Equivalent to binary_property< UCHAR_GRAPHEME_EXTEND >.
- Equivalent to binary_property< UCHAR_GRAPHEME_LINK >.
- Equivalent to binary_property< UCHAR_HEX_DIGIT >.
- Equivalent to binary_property< UCHAR_HYPHEN >.
- Equivalent to binary_property< UCHAR_ID_CONTINUE >.
- Equivalent to binary_property< UCHAR_ID_START >.
- Equivalent to binary_property< UCHAR_IDEOGRAPHIC >.
- Equivalent to binary_property< UCHAR_IDS_BINARY_OPERATOR >.
- Equivalent to binary_property< UCHAR_IDS_TRINARY_OPERATOR >.
- Equivalent to binary_property< UCHAR_JOIN_CONTROL >.
- Equivalent to binary_property< UCHAR_LOGICAL_ORDER_EXCEPTION >.
- Equivalent to binary_property< UCHAR_LOWERCASE >.
- Equivalent to binary_property< UCHAR_MATH >.
- Equivalent to binary_property< UCHAR_NFC_INERT >.
- Equivalent to binary_property< UCHAR_NFD_INERT >.
- Equivalent to binary_property< UCHAR_NFKC_INERT >.
- Equivalent to binary_property< UCHAR_NFKD_INERT >.
- Equivalent to binary_property< UCHAR_NONCHARACTER_CODE_POINT >.
- Equivalent to binary_property< UCHAR_PATTERN_SYNTAX >.
- Equivalent to binary_property< UCHAR_PATTERN_WHITE_SPACE >.
- Equivalent to binary_property< UCHAR_POSIX_ALNUM >.
- Equivalent to binary_property< UCHAR_POSIX_BLANK >.
- Equivalent to binary_property< UCHAR_POSIX_GRAPH >.
- Equivalent to binary_property< UCHAR_POSIX_PRINT >.
- Equivalent to binary_property< UCHAR_POSIX_XDIGIT >.
- Equivalent to binary_property< UCHAR_QUOTATION_MARK >.
- Equivalent to binary_property< UCHAR_RADICAL >.
- Equivalent to binary_property< UCHAR_S_TERM >.
- Equivalent to binary_property< UCHAR_SEGMENT_STARTER >.
- Equivalent to binary_property< UCHAR_SOFT_DOTTED >.
- Equivalent to binary_property< UCHAR_TERMINAL_PUNCTUATION >.
- Equivalent to binary_property< UCHAR_UNIFIED_IDEOGRAPH >.
- Equivalent to binary_property< UCHAR_UPPERCASE >.
- Equivalent to binary_property< UCHAR_VARIATION_SELECTOR >.
- Equivalent to binary_property< UCHAR_WHITE_SPACE >.
- Equivalent to binary_property< UCHAR_XID_CONTINUE >.
- Equivalent to binary_property< UCHAR_XID_START >.
Convenience wrappers for enumerated properties.
- V is of type UCharDirection.
- Equivalent to property_value< UCHAR_BIDI_CLASS, V >.
- V is of type UBlockCode.
- Equivalent to property_value< UCHAR_BLOCK, V >.
- V is of type UDecompositionType.
- Equivalent to property_value< UCHAR_DECOMPOSITION_TYPE, V >.
- V is of type UEastAsianWidth.
- Equivalent to property_value< UCHAR_EAST_ASIAN_WIDTH, V >.
- V is of type UCharCategory.
- Equivalent to property_value< UCHAR_GENERAL_CATEGORY, V >.
- V is of type UGraphemeClusterBreak.
- Equivalent to property_value< UCHAR_GRAPHEME_CLUSTER_BREAK, V >.
- V is of type UHangulSyllableType.
- Equivalent to property_value< UCHAR_HANGUL_SYLLABLE_TYPE, V >.
- V is of type UJoiningGroup.
- Equivalent to property_value< UCHAR_JOINING_GROUP, V >.
- V is of type UJoiningType.
- Equivalent to property_value< UCHAR_JOINING_TYPE, V >.
- V is of type ULineBreak.
- Equivalent to property_value< UCHAR_LINE_BREAK, V >.
- V is of type UNumericType.
- Equivalent to property_value< UCHAR_NUMERIC_TYPE, V >.
- V is of type USentenceBreak.
- Equivalent to property_value< UCHAR_SENTENCE_BREAK, V >.
- V is of type UWordBreakValues.
- Equivalent to property_value< UCHAR_WORD_BREAK, V >.
Convenience wrappers for enumerated properties that return a value instead of an actual enum.
- V is of type std::uint8_t.
- Equivalent to property_value< UCHAR_CANONICAL_COMBINING_CLASS, V >.
- V is of type std::uint8_t.
- Equivalent to property_value< UCHAR_LEAD_CANONICAL_COMBINING_CLASS, V >.
- V is of type std::uint8_t.
- Equivalent to property_value< UCHAR_TRAIL_CANONICAL_COMBINING_CLASS, V >.
These rules are available in multiple versions,
- in namespace tao::pegtl::uint8 for 8-bit integer values,
- in namespace tao::pegtl::uint16_be for big-endian 16-bit integer values,
- in namespace tao::pegtl::uint16_le for little-endian 16-bit integer values,
- in namespace tao::pegtl::uint32_be for big-endian 32-bit integer values,
- in namespace tao::pegtl::uint32_le for little-endian 32-bit integer values,
- in namespace tao::pegtl::uint64_be for big-endian 64-bit integer values, and
- in namespace tao::pegtl::uint64_le for little-endian 64-bit integer values.
The binary rules need to be manually included from their corresponding headers in the contrib section.
These rules read one or more bytes from the input to form (and match) an 8, 16, 32 or 64-bit value, respectively, and corresponding template parameters are given as either std::uint8_t, std::uint16_t, std::uint32_t or std::uint64_t.
In the following descriptions, the parameter N is the size of a single value in bytes, i.e. either 1, 2, 4 or 8. The term input value indicates a correspondingly sized integer value read from successive bytes of the input.
Binary rules do not rely on other rules.
- Succeeds when the input contains at least N bytes.
- Consumes N bytes when it succeeds.
- Succeeds when the input contains at least N bytes, and:
- C is an empty character pack, or none of the given values C... equals the (endian adjusted) input value masked with M.
- Consumes N bytes when it succeeds.
- Succeeds when the input contains at least N bytes, and:
- The (endian adjusted) input value B satisfies ( B & M ) < C || D < ( B & M ).
- Consumes N bytes when it succeeds.
- Succeeds when the input contains at least N bytes, and:
- C is a non-empty character pack and one of the given values C... equals the (endian adjusted) input value masked with M.
- Consumes N bytes when it succeeds.
- Succeeds when the input contains at least N bytes, and:
- The (endian adjusted) input value B satisfies C <= ( B & M ) && ( B & M ) <= D.
- Consumes N bytes when it succeeds.
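The masked range condition above can be spelled out for 16-bit values as follows. This is a minimal sketch in plain C++ rather than PEGTL code; read_u16_be, read_u16_le and in_mask_range are hypothetical helpers, and the two-byte readers only illustrate what "endian adjusted" means for N equal to 2.

```cpp
#include <cstdint>

// Form a 16-bit input value from two successive input bytes, big-endian.
std::uint16_t read_u16_be( const std::uint8_t first, const std::uint8_t second )
{
   return static_cast< std::uint16_t >( ( first << 8 ) | second );
}

// The same two bytes interpreted as a little-endian value.
std::uint16_t read_u16_le( const std::uint8_t first, const std::uint8_t second )
{
   return static_cast< std::uint16_t >( ( second << 8 ) | first );
}

// Hypothetical helper mirroring the stated condition C <= ( B & M ) && ( B & M ) <= D.
bool in_mask_range( const std::uint16_t b, const std::uint16_t m, const std::uint16_t c, const std::uint16_t d )
{
   const std::uint16_t masked = static_cast< std::uint16_t >( b & m );
   return ( c <= masked ) && ( masked <= d );
}
```

With the input bytes 0x12 0x34 read as big-endian, the mask 0xff00 keeps 0x1200, which lies in the closed range 0x1000 to 0x1fff; the mask 0x00ff keeps 0x34, which lies outside 0x40 to 0x50.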
- Without a trailing single parameter E, equivalent to sor< mask_range< M, C1, D1 >, mask_range< M, C2, D2 >, ... >.
- With a trailing single parameter E, equivalent to sor< mask_range< M, C1, D1 >, mask_range< M, C2, D2 >, ..., mask_one< M, E > >.
- Equivalent to seq< mask_one< M, C >... >.
- Succeeds when the input contains at least N bytes, and:
- C is an empty character pack, or none of the given values C... equals the (endian adjusted) input value.
- Consumes N bytes when it succeeds.
- Succeeds when the input contains at least N bytes, and:
- The (endian adjusted) input value B satisfies B < C || D < B.
- Consumes N bytes when it succeeds.
- Succeeds when the input contains at least N bytes, and:
- C is a non-empty character pack and one of the given values C... equals the (endian adjusted) input value.
- Consumes N bytes when it succeeds.
- Succeeds when the input contains at least N bytes, and:
- The (endian adjusted) input value B satisfies C <= B && B <= D.
- Consumes N bytes when it succeeds.
- Without a trailing single parameter E, equivalent to sor< range< C1, D1 >, range< C2, D2 >, ... >.
- With a trailing single parameter E, equivalent to sor< range< C1, D1 >, range< C2, D2 >, ..., one< E > >.
- Equivalent to seq< one< C >... >.
- action< A, R... > (meta rules)
- alnum (ascii rules)
- alpha (ascii rules)
- alphabetic (icu rules)
- any (ascii rules)
- any (unicode rules)
- any (binary rules)
- apply< A... > (action rules)
- apply0< A... > (action rules)
- ascii_hex_digit (icu rules)
- at< R... > (combinators)
- bidi_class< V > (icu rules)
- bidi_control (icu rules)
- bidi_mirrored (icu rules)
- binary_property< P > (icu rules)
- binary_property< P, V > (icu rules)
- blank (ascii rules)
- block< V > (icu rules)
- bof (atomic rules)
- bol (atomic rules)
- bom (unicode rules)
- bytes< Num > (atomic rules)
- canonical_combining_class< V > (icu rules)
- case_sensitive (icu rules)
- control< C, R... > (meta rules)
- dash (icu rules)
- decomposition_type< V > (icu rules)
- default_ignorable_code_point (icu rules)
- deprecated (icu rules)
- diacritic (icu rules)
- digit (ascii rules)
- disable< R... > (meta rules)
- discard (meta rules)
- east_asian_width< V > (icu rules)
- enable< R... > (meta rules)
- eof (atomic rules)
- eol (atomic rules)
- eolf (atomic rules)
- everything (atomic rules)
- extender (icu rules)
- failure (atomic rules)
- forty_two< C... > (ascii rules)
- full_composition_exclusion (icu rules)
- general_category< V > (icu rules)
- grapheme_base (icu rules)
- grapheme_cluster_break< V > (icu rules)
- grapheme_extend (icu rules)
- grapheme_link (icu rules)
- hangul_syllable_type< V > (icu rules)
- hex_digit (icu rules)
- hyphen (icu rules)
- id_continue (icu rules)
- id_start (icu rules)
- identifier_first (ascii rules)
- identifier_other (ascii rules)
- identifier (ascii rules)
- ideographic (icu rules)
- ids_binary_operator (icu rules)
- ids_trinary_operator (icu rules)
- if_apply< R, A... > (action rules)
- if_must< R, S... > (convenience)
- if_must_else< R, S, T > (convenience)
- if_then_else< R, S, T > (convenience)
- istring< C... > (ascii rules)
- join_control (icu rules)
- joining_group< V > (icu rules)
- joining_type< V > (icu rules)
- keyword< C... > (ascii rules)
- lead_canonical_combining_class< V > (icu rules)
- line_break< V > (icu rules)
- list< R, S > (convenience)
- list< R, S, P > (convenience)
- list_must< R, S > (convenience)
- list_must< R, S, P > (convenience)
- list_tail< R, S > (convenience)
- list_tail< R, S, P > (convenience)
- logical_order_exception (icu rules)
- lower (ascii rules)
- lowercase (icu rules)
- mask_not_one< M, C... > (binary rules)
- mask_not_range< M, C, D > (binary rules)
- mask_one< M, C... > (binary rules)
- mask_range< M, C, D > (binary rules)
- mask_ranges< M, C1, D1, C2, D2, ... > (binary rules)
- mask_ranges< M, C1, D1, C2, D2, ..., E > (binary rules)
- mask_string< M, C... > (binary rules)
- math (icu rules)
- minus< M, S > (convenience)
- must< R... > (convenience)
- nfc_inert (icu rules)
- nfd_inert (icu rules)
- nfkc_inert (icu rules)
- nfkd_inert (icu rules)
- noncharacter_code_point (icu rules)
- not_at< R... > (combinators)
- not_one< C... > (ascii rules)
- not_one< C... > (unicode rules)
- not_one< C... > (binary rules)
- not_range< C, D > (ascii rules)
- not_range< C, D > (unicode rules)
- not_range< C, D > (binary rules)
- nul (ascii rules)
- numeric_type< V > (icu rules)
- one< C... > (ascii rules)
- one< C... > (unicode rules)
- one< C... > (binary rules)
- opt< R... > (combinators)
- opt_must< R, S... > (convenience)
- pad< R, S, T = S > (convenience)
- pad_opt< R, P > (convenience)
- partial< R... > (convenience)
- pattern_syntax (icu rules)
- pattern_white_space (icu rules)
- plus< R... > (combinators)
- posix_alnum (icu rules)
- posix_blank (icu rules)
- posix_graph (icu rules)
- posix_print (icu rules)
- posix_xdigit (icu rules)
- print (ascii rules)
- property_value< P, V > (icu rules)
- quotation_mark (icu rules)
- radical (icu rules)
- raise< T > (atomic rules)
- raise_message< C... > (atomic rules)
- range< C, D > (ascii rules)
- range< C, D > (unicode rules)
- range< C, D > (binary rules)
- ranges< C1, D1, C2, D2, ... > (ascii rules)
- ranges< C1, D1, C2, D2, ... > (unicode rules)
- ranges< C1, D1, C2, D2, ... > (binary rules)
- ranges< C1, D1, C2, D2, ..., E > (ascii rules)
- ranges< C1, D1, C2, D2, ..., E > (unicode rules)
- ranges< C1, D1, C2, D2, ..., E > (binary rules)
- rematch< R, S... > (convenience)
- rep< Num, R... > (convenience)
- rep_max< Max, R... > (convenience)
- rep_min< Min, R... > (convenience)
- rep_min_max< Min, Max, R... > (convenience)
- rep_opt< Num, R... > (convenience)
- require< Num > (meta rules)
- s_term (icu rules)
- segment_starter (icu rules)
- sentence_break< V > (icu rules)
- seq< R... > (combinators)
- seven (ascii rules)
- shebang (ascii rules)
- soft_dotted (icu rules)
- sor< R... > (combinators)
- space (ascii rules)
- star< R... > (combinators)
- star_must< R, S... > (convenience)
- star_partial< R... > (convenience)
- star_strict< R... > (convenience)
- state< S, R... > (meta rules)
- strict< R... > (convenience)
- string< C... > (ascii rules)
- string< C... > (unicode rules)
- string< C... > (binary rules)
- success (atomic rules)
- TAO_PEGTL_ISTRING( "..." ) (ascii rules)
- TAO_PEGTL_KEYWORD( "..." ) (ascii rules)
- TAO_PEGTL_RAISE_MESSAGE( "..." ) (atomic rules)
- TAO_PEGTL_STRING( "..." ) (ascii rules)
- terminal_punctuation (icu rules)
- three< C > (ascii rules)
- trail_canonical_combining_class< V > (icu rules)
- try_catch_any_raise_nested< R... > (convenience)
- try_catch_any_return_false< R... > (convenience)
- try_catch_raise_nested< R... > (convenience)
- try_catch_return_false< R... > (convenience)
- try_catch_std_raise_nested< R... > (convenience)
- try_catch_std_return_false< R... > (convenience)
- try_catch_type_raise_nested< E, R... > (convenience)
- try_catch_type_return_false< E, R... > (convenience)
- two< C > (ascii rules)
- unified_ideograph (icu rules)
- until< R > (convenience)
- until< R, S... > (convenience)
- upper (ascii rules)
- uppercase (icu rules)
- variation_selector (icu rules)
- white_space (icu rules)
- word_break< V > (icu rules)
- xdigit (ascii rules)
- xid_continue (icu rules)
- xid_start (icu rules)
This document is part of the PEGTL.
Copyright (c) 2014-2023 Dr. Colin Hirsch and Daniel Frey
Distributed under the Boost Software License, Version 1.0
See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt