List of parsing ambiguities and how to handle them
The general strategy is to fully disambiguate at the tokenizer level, then parse the token stream into an AST with an LALR(1) grammar using Bison, and finally perform AST->ASR semantic analysis.
Templates and comparison
Example: func<4 > 2> and func<4 > 2 > 1 > 3 > 3 > 8 > 9 > 8 > 7 > 8>.
Solution: a comparison operator must have spaces on both sides: a > b is a comparison operator. Otherwise it is the beginning or end of template parameters.
Another example: a::b<c>d; in https://stackoverflow.com/questions/1444961/is-there-a-good-python-library-that-can-parse-c/1447051#1447051.
Solution: this is a template, thus a declaration of d of type a::b<c>. We can even require a space between a type and the variable: a::b<c> d, although in this case it might not be needed.
Another example:
template <bool T1, int T2> class B;
void f(int a = B < c, 5>);
Solution: here B < c is a comparison operator, so the above results in a syntax error. One must write it as void f(int a = B<c, 5>); and then B<c, 5> will be parsed as template parameters.
Multiplication vs pointer
Example: A * p, A *p, A* p, A*p.
Solution: * is tokenized as multiplication if it either has no spaces around it or has spaces on both sides, so a*b and a * b are both multiplications. Otherwise it is a pointer. So above, A * p and A*p are both multiplications of A and p, but A *p and A* p are both declarations of a pointer p of type A.
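The whitespace rules for *, < and > described so far can be sketched as a small classifier. This is only an illustration of the rule, not the project's tokenizer: the enum Kind, the function classify, and the choice to treat the start and end of the input as whitespace are all assumptions made for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical token kinds; the names are illustrative only.
enum class Kind { Multiply, Pointer, Comparison, TemplateBracket };

// True if the character is whitespace (cast avoids UB for negative chars).
static bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Classify the '*', '<' or '>' at position i using only adjacent whitespace.
// Assumption: the boundaries of the input count as whitespace.
Kind classify(const std::string &src, std::size_t i) {
    assert(i < src.size());
    const char c = src[i];
    assert(c == '*' || c == '<' || c == '>');

    const bool space_before = (i == 0) || is_space(src[i - 1]);
    const bool space_after = (i + 1 == src.size()) || is_space(src[i + 1]);

    if (c == '*') {
        // Spaces on both sides or on neither side: multiplication.
        // Space on exactly one side: pointer declarator.
        return space_before == space_after ? Kind::Multiply : Kind::Pointer;
    }
    // '<' or '>': a comparison requires spaces on both sides;
    // anything else opens or closes template parameters.
    return (space_before && space_after) ? Kind::Comparison
                                         : Kind::TemplateBracket;
}
```

For example, classify("A *p", 2) would yield Kind::Pointer, while classify("func<4 > 2>", 7) would yield Kind::Comparison. Multi-character operators such as >>, <= and -> are deliberately left out of this sketch.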
Function declaration vs object declaration
Example: int f(A);. This can be either a function declaration int f(A /*a*/); or a declaration of an integer f initialized to A.
Solution: to force a variable declaration, either add extra parentheses, int f((A));, or use an initialization form such as int f{A}; or int f = A;
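The three variable-declaration spellings from the solution are already valid standard C++, so they can be compared side by side. In this sketch A is assumed to be a value rather than a type, and the constant and function names are hypothetical.

```cpp
// A is a hypothetical value (not a type), so each f below is an int object.
constexpr int A = 7;

int with_extra_parens() {
    int f((A));  // extra parentheses force the initializer reading
    return f;
}

int with_braces() {
    int f{A};    // brace initialization cannot be a function declaration
    return f;
}

int with_equals() {
    int f = A;   // copy initialization
    return f;
}
```

All three functions declare a local int named f and return its value, so none of these spellings needs type information to be recognized as an object declaration.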
Constructor vs object
Example: T ( A );. This can be either a constructor T with an unnamed parameter of type A, or a variable of type T named A (the parentheses are redundant).
Solution: this will be interpreted as a constructor. To have it interpreted as a variable declaration, remove the redundant parentheses and write T A;.
Function callbacks
Function callback syntax can be arbitrarily complicated.
Example:
void (*set_new_handler(void (*)(void)))(void);
Solution: this is hard to read even for a human, so we can simply require that such a declaration be split, such as:
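The split version is not included in the text, so the following is only one possible way to do it, assuming a type alias for the callback. The names handler_t, set_handler_raw and set_handler_split are hypothetical; the static_assert checks that the split declaration has the same type as the original one.

```cpp
#include <type_traits>

// Original form: a function that takes a handler and returns a handler.
void (*set_handler_raw(void (*)(void)))(void);

// Split form: name the callback type first, then declare the function.
using handler_t = void (*)();
handler_t set_handler_split(handler_t);

// Both declarations denote the same function type.
static_assert(std::is_same<decltype(set_handler_raw),
                           decltype(set_handler_split)>::value,
              "split declaration must match the original");

namespace {
handler_t current_handler = nullptr;  // illustrative storage for the demo
}

// Install a new handler and return the previously installed one.
handler_t set_handler_split(handler_t next) {
    handler_t previous = current_handler;
    current_handler = next;
    return previous;
}
```

With the alias in place, each declaration contains at most one level of function-pointer syntax, which is much easier to tokenize and to read.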
Links
Last version of the Bison parser in GCC for C:
https://github.com/gcc-mirror/gcc/blob/29231b752cbc105c3158b4b45b97f8374f87cbac/gcc/c-parse.in
Last version of the Bison parser in GCC for C++:
https://github.com/gcc-mirror/gcc/blob/a47a68100f94e7c0679ef8ec478a523bbbaced7b/gcc/cp/parse.y
Related discussions:
https://stackoverflow.com/questions/6319086/are-gcc-and-clang-parsers-really-handwritten
https://news.ycombinator.com/item?id=34410776