Parser
The parser is the second most important part of our compiler. It takes the token stream from the scanner and, following the rules specified in the LL table (see LL_table.png), transforms it into an Abstract Syntax Tree (AST), which in our case is a Binary Search Tree (BST). In `parser.c` this is implemented as a set of functions, each acting as one LL rule. If the token stream has invalid syntax, the parser stops and returns `2`, which indicates an error in syntax analysis; otherwise it performs semantic analysis. It can also call the `precedent_analyse` function from `precedent.h`.
Because some expressions contain arithmetic and logical operators, we have to check the priority of those operators and determine the correct order in which to perform the operations. The priority of the operators is specified by the precedence table (see TODO).
Precedence syntax analysis is not part of the `parser` files but is implemented in the standalone `precedent` files.
At the start it takes a token from the parser and then, one by one, gets the next tokens and determines what to do next.
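The table lookup at the heart of precedence analysis can be sketched as follows. This is only an illustration: the terminal set, the `Terminal` enum, the `PREC_TABLE` contents and the `prec_relation` function are assumptions made for this example and are not taken from the `precedent` files, whose real table covers more operators.

```c
#include <assert.h>

/* Hypothetical, reduced terminal set: +, *, parentheses, operand, end marker. */
typedef enum { T_PLUS, T_MUL, T_LPAR, T_RPAR, T_ID, T_END, T_COUNT } Terminal;

/*
 * Rows: topmost terminal on the stack. Columns: incoming token.
 * '<' shift, '>' reduce, '=' shift a matching pair,
 * 'A' accept, 'X' syntax error.
 */
static const char PREC_TABLE[T_COUNT][T_COUNT] = {
    /*            +    *    (    )    id   $   */
    /* +  */   { '>', '<', '<', '>', '<', '>' },
    /* *  */   { '>', '>', '<', '>', '<', '>' },
    /* (  */   { '<', '<', '<', '=', '<', 'X' },
    /* )  */   { '>', '>', 'X', '>', 'X', '>' },
    /* id */   { '>', '>', 'X', '>', 'X', '>' },
    /* $  */   { '<', '<', '<', 'X', '<', 'A' },
};

/* Returns the relation that decides the next step of the analysis. */
char prec_relation(Terminal stack_top, Terminal input) {
    assert(stack_top < T_COUNT && input < T_COUNT);
    return PREC_TABLE[stack_top][input];
}
```

With `+` on the stack and `*` on input the lookup yields `<` (shift), so the multiplication is reduced first; with the two swapped it yields `>` (reduce).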
- Parser - Main function of the compiler; prepares the symtable and stack for further use, then calls `program`
- Program - Strips comments and EOLs, then calls `prolog`, `eolM` and `functionsBlock`
- Prolog - Checks whether the file starts with `package main`
- EolM - Checks that the prolog ends with an EOL, then calls `eolR`
- EolR - Strips EOLs
- FunctionsBlock - Checks that the token is the KEYWORD `func` and calls `function` and `functionNext`; after that, checks that the `main` function was found exactly once
- Function - Checks that the next token is an IDENTIFIER; if so, saves its name to the functions symtable. Then it looks for `(` and calls `arguments`. If `arguments` was successful and the current token is `)`, it calls `functionReturn` and `commandBlock`
- FunctionNext - Checks whether there are any other functions; if so, calls `function` and `functionNext`
- Arguments - Checks whether the token is an IDENTIFIER; if so, calls `type` and `argumentNext`
- Type - Checks whether the token equals `int`, `float64`, `string` or `bool`
- ArgumentNext - If the token is a comma, performs the same action as the `arguments` function
- FunctionReturn - If the token is `(`, calls `functionReturnType`; otherwise returns `0`, because the return `()` can be omitted
- FunctionReturnType - If the token is a KEYWORD, calls `type` and `functionReturnTypeNext` and then looks for `)`; if the token equals `)`, continues with the program
- FunctionReturnTypeNext - Checks that the token is a COMMA and that the following token is a KEYWORD, then calls `type` and `functionReturnTypeNext`
- CommandBlock - Checks that the token is `{`, then gets the next token, which should be an EOL; if so, strips the remaining EOLs. If the next token is not `}`, calls `commands` and then checks that the token equals `}`. Finally, looks for one required EOL and strips the remaining EOLs
- Commands - Calls `command`, checks that the command ends with an EOL, strips the remaining EOLs and then checks whether the token equals `}`; if so, returns the result, otherwise calls `commands`
- Command - Contains a switch statement that determines the next action. If the token is an IDENTIFIER, calls `statement`. If the token is a KEYWORD, goes to another switch statement that determines the action based on the passed KEYWORD:
  - IF - calls the `commandBlock` and `ifElse` functions
  - FOR - calls `forDefine` and then checks that the token equals `;`
  - RETURN - calls `returnCommand`
  - default - invalid KEYWORD passed
- Statement - Contains another switch statement that distinguishes between these token types:
  - `(` - calls the `arguments` function and checks that the token equals `)`
  - `=` - calls the `assignment` function
  - `:=` - calls the `assignment` function
  - `+=`, `-=`, `*=`, `/=` - calls the `unary` function
  - default - calls `multipleID`, then checks that the token equals `=` and calls `assignment`
- MultipleID - Checks that the token is a COMMA, increases the number of IDs, then checks that the next token is an IDENTIFIER and calls `multipleID`
- Assignment - Contains a switch that distinguishes between these token types:
  - IDENTIFIER - checks that the next token is `(`, calls `arguments` and checks that the next token is `)`
  - default - calls the `expressionNext` function
- Unary - Only checks that the token is of a unary type
- ExpressionNext - Checks whether the token is a COMMA, then checks that the number of IDs minus 1 is greater than 0, then calls the `expressionNext` function
- IfElse - If the token is the KEYWORD ELSE, calls the `ifElseExpanded` function
- IfElseExpanded - If the token is the KEYWORD IF, calls the `commandBlock` and `ifElse` functions; otherwise calls `commandBlock` only
- ForDefine - Checks that the token's type is IDENTIFIER; if so, checks whether the next token equals `:=`
- ForAssign - Checks whether the token is an IDENTIFIER and the next token is `=`
- ReturnCommand - Checks that the token is the KEYWORD RETURN and calls `returnStatement`
- ReturnStatement - todo
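To illustrate how a few of the rules above map onto functions, here is a minimal recursive-descent sketch of `prolog`, `type`, `arguments` and `argumentNext`. It is not the code from `parser.c`: the `Token` and `TokenStream` types, the `peek`/`advance` helpers and the in-memory token array are assumptions made for this example, and the sketch assumes that an empty argument list is allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { TOK_KEYWORD, TOK_IDENTIFIER, TOK_COMMA, TOK_RPAR, TOK_EOF } TokenType;
typedef struct { TokenType type; const char *text; } Token;

/* Hypothetical cursor over a token array terminated by TOK_EOF. */
typedef struct { const Token *tokens; size_t pos; } TokenStream;

enum { OK = 0, SYNTAX_ERROR = 2 };

static const Token *peek(const TokenStream *ts) { return &ts->tokens[ts->pos]; }

static void advance(TokenStream *ts) {
    if (peek(ts)->type != TOK_EOF) ts->pos++;  /* never step past the end marker */
}

static int is_keyword(const Token *t, const char *word) {
    return t->type == TOK_KEYWORD && strcmp(t->text, word) == 0;
}

/* Prolog: the file must start with `package main`. */
int prolog(TokenStream *ts) {
    assert(ts != NULL);
    if (!is_keyword(peek(ts), "package")) return SYNTAX_ERROR;
    advance(ts);
    if (strcmp(peek(ts)->text, "main") != 0) return SYNTAX_ERROR;
    advance(ts);
    return OK;
}

/* Type: one of int, float64, string, bool. */
int type(TokenStream *ts) {
    static const char *const names[] = { "int", "float64", "string", "bool" };
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (is_keyword(peek(ts), names[i])) { advance(ts); return OK; }
    }
    return SYNTAX_ERROR;
}

int argumentNext(TokenStream *ts);

/* Arguments: IDENTIFIER type argumentNext, or nothing (assumed). */
int arguments(TokenStream *ts) {
    if (peek(ts)->type != TOK_IDENTIFIER) return OK;
    advance(ts);
    int rc = type(ts);
    if (rc != OK) return rc;
    return argumentNext(ts);
}

/* ArgumentNext: after a comma, another argument is required. */
int argumentNext(TokenStream *ts) {
    if (peek(ts)->type != TOK_COMMA) return OK;
    advance(ts);
    if (peek(ts)->type != TOK_IDENTIFIER) return SYNTAX_ERROR;
    return arguments(ts);
}
```

Each rule returns `0` on success or `2` on a syntax error and leaves the cursor on the first token it did not consume, so the caller (here it would be `function`) can check for the closing `)` itself.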
The only function that the rest of the program should call is `program()`, which calls the other functions as it progresses through the token stream (see Functions 1.). If no syntax error is found while parsing the tokens, the parser passes the stack of symtables to the semantic analyzer, which performs its own checks on it.
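The hand-off between the two stages might look roughly like the sketch below. All names here (`SymtableStack`, `run_analysis`, the stub stages) are hypothetical stand-ins for illustration; the real symtable stack and analyzer interfaces are not shown in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the stack of symtables built during parsing. */
typedef struct { int function_count; } SymtableStack;

typedef int (*SyntaxStage)(SymtableStack *out);
typedef int (*SemanticStage)(const SymtableStack *in);

/* Runs syntax analysis first; semantic checks run only if it returned 0. */
int run_analysis(SyntaxStage syntax, SemanticStage semantic) {
    assert(syntax != NULL && semantic != NULL);
    SymtableStack stack = { 0 };
    int rc = syntax(&stack);
    if (rc != 0) return rc;       /* e.g. 2 for a syntax error */
    return semantic(&stack);      /* 0 or one of the semantic error codes */
}

/* Stub stages used only to demonstrate the control flow. */
int stub_parse_ok(SymtableStack *out) { out->function_count = 1; return 0; }
int stub_parse_syntax_error(SymtableStack *out) { (void)out; return 2; }
int stub_semantic(const SymtableStack *in) {
    return in->function_count == 1 ? 0 : 3;
}
```

Passing `stub_parse_syntax_error` makes `run_analysis` return `2` without ever invoking the semantic stage, which mirrors the behaviour described above.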
The parser can return the following codes:
- `0` - if everything was successful
- `1` - if lexical error
- `2` - if syntax error
- `3` - if semantic error in program (undefined function, variable, ...)
- `4` - if semantic error in type assignment to new variable
- `5` - if semantic error in type compatibility (arithmetics, ...)
- `6` - if semantic error in program (invalid number of params or return values)
- `7` - if other semantic error
- `9` - if zero division
- `99` - if internal error
- `-1` - if internal warning
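One way to keep these codes readable in C is a named enumeration. The numeric values below come from the list above; the identifier names (`RC_OK`, `RC_SYNTAX_ERROR`, ...) and the two helper functions are assumptions made for this sketch and may differ from what the project actually uses.

```c
#include <assert.h>

/* Values follow the list above; the names are hypothetical. */
typedef enum {
    RC_INTERNAL_WARNING = -1,
    RC_OK               = 0,
    RC_LEXICAL_ERROR    = 1,
    RC_SYNTAX_ERROR     = 2,
    RC_SEM_UNDEFINED    = 3,  /* undefined function, variable, ... */
    RC_SEM_TYPE_ASSIGN  = 4,  /* type assignment to new variable */
    RC_SEM_TYPE_COMPAT  = 5,  /* type compatibility (arithmetics, ...) */
    RC_SEM_PARAM_COUNT  = 6,  /* invalid number of params or return values */
    RC_SEM_OTHER        = 7,
    RC_ZERO_DIVISION    = 9,
    RC_INTERNAL_ERROR   = 99
} ReturnCode;

/* True for the codes the list describes as semantic errors (3 to 7). */
int is_semantic_error(int code) {
    return code >= RC_SEM_UNDEFINED && code <= RC_SEM_OTHER;
}

/* Short human-readable label for diagnostics. */
const char *return_code_name(int code) {
    switch (code) {
        case RC_OK:               return "success";
        case RC_LEXICAL_ERROR:    return "lexical error";
        case RC_SYNTAX_ERROR:     return "syntax error";
        case RC_ZERO_DIVISION:    return "zero division";
        case RC_INTERNAL_ERROR:   return "internal error";
        case RC_INTERNAL_WARNING: return "internal warning";
        default:
            return is_semantic_error(code) ? "semantic error" : "unknown code";
    }
}
```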
- Dominik Horky
- @horsecz ([email protected])
- Roman Janiczek
- @theleteron ([email protected])
- Lukas Hais
- @crackonosh ([email protected])
- Jan Pospisil
- @zelick0 ([email protected])