Skip to content

Commit

Permalink
improve λ-match parse errors
Browse files Browse the repository at this point in the history
  • Loading branch information
VictorTaelin committed Sep 29, 2024
1 parent bb25ba5 commit 65cbdaf
Show file tree
Hide file tree
Showing 2 changed files with 175 additions and 7 deletions.
8 changes: 6 additions & 2 deletions book/main.kind
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
main
: U32
= 3
: ∀(n: Nat)
Nat
= λ{
#s: λpred #z{}
#z: #z{}
}
174 changes: 169 additions & 5 deletions src/Kind/Parse.hs
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ withSrc parser = do

parseTrivia :: Parser ()
parseTrivia = P.skipMany (parseSpace <|> parseComment) where
parseSpace = do
parseSpace = P.try $ do
P.space
return ()
parseComment = P.try $ do
Expand Down Expand Up @@ -233,10 +233,11 @@ parseSwi = withSrc $ do

parseMat = withSrc $ do
P.try $ P.string "λ{"
cse <- P.many $ P.try $ do
parseTrivia
P.char '#'
cnam <- parseName
cse <- P.many $ do
cnam <- P.try $ do
parseTrivia
P.char '#'
parseName
parseTrivia
P.char ':'
cbod <- parseTerm
Expand Down Expand Up @@ -371,3 +372,166 @@ expandUses uses name =
case filter (\(short, _) -> short `isPrefixOf` name) uses of
(short, long):_ -> long ++ drop (length short) name
[] -> name






-- PROBLEM: when parsing a file like:

-- main
-- : ∀(n: Nat)
-- Nat
-- = λ{
-- #s: λpred #z
-- #z: #z{}
-- }

-- we get the following error:

-- PARSE_ERROR
-- - expected: space, "//", "}"
-- - detected:
-- 5 | #s: λpred #z

-- /Users/v/vic/dev/kind2hs/book/main.kind 5:3

-- which is blaming the first # character of the '#s: λpred #z' line. the problem is, the error is actually that we're missing a {} after #z on the end of the line.

-- seems like this happens because, for some reason, the parser backtracks all the way to #s: ... once the 'λpred #z' part fails to parse. but that should NEVER happen.

-- instead, the desired behavior is that, when an "uniquely defining" part of a parser parses, then we never backtrack. so, for example, when we parse:

-- #s:

-- inside the λ-match, we should parse this case's body (another term), and never backtrack further.

-- similarly, when we parse the header of a lambda (λpred), we should parse this lambda's body (another term), and never backtrack.

-- similarly, when we parse '#name', we know we're in a constructor, and we should never backtrack.

-- in other words, backtracking should be used only to disambiguate parses that could be ambiguous.

-- for example, parseSwi and parseMat share the same beginning:

-- λ{

-- but parseSwi parses a '0:', while parseMat parses a '#name:'

-- I'm not familiar with parsec, so, I wrote the parser above, which works fine. but this backtracking behavior is going too far.

-- how can I improve the parser to result in the desired behavior?

-- remember: all parses should start by "checking" if we start with the proper header-part (for example, on swi, the header is 'λ{' then '0:'). and, once the "header" is parsed, then, all parses after that are mandatory. the backtracking occurs only on the "header-part" of each parser. and this applies to ALL parsers.


-- To achieve the desired behavior of limiting backtracking to only the "header" part of each parser, you can use the `P.try` combinator selectively. Here's how you can modify the parsers:

-- 1. Modify the `parseTerm` function to use `P.try` only for the initial part of each parser:

-- ```haskell
-- parseTerm = do
-- parseTrivia
-- P.choice
-- [ P.try parseAll
-- , P.try parseSwi
-- , P.try parseMat
-- , P.try parseLam
-- , P.try parseOp2
-- , P.try parseApp
-- , P.try parseAnn
-- , P.try parseSlf
-- , P.try parseIns
-- , P.try parseDat
-- , P.try parseCon
-- , P.try parseUse
-- , P.try parseLet
-- , P.try parseSet
-- , P.try parseNum
-- , P.try parseTxt
-- , P.try parseHol
-- , P.try parseMet
-- , parseRef -- No need for P.try here as it's the last choice
-- ]
-- ```

-- 2. Modify individual parsers to use `P.try` only for the header part. For example:

-- ```haskell
-- parseSwi = withSrc $ do
-- P.try $ do
-- P.string "λ{"
-- parseTrivia
-- P.string "0:"
-- zero <- parseTerm
-- parseTrivia
-- P.string "_:"
-- succ <- parseTerm
-- parseTrivia
-- P.char '}'
-- return $ Swi zero succ

-- parseMat = withSrc $ do
-- P.try $ P.string "λ{"
-- cse <- P.many $ P.try $ do
-- parseTrivia
-- P.char '#'
-- cnam <- parseName
-- parseTrivia
-- P.char ':'
-- cbod <- parseTerm
-- return (cnam, cbod)
-- parseTrivia
-- P.char '}'
-- return $ Mat cse

-- parseCon = withSrc $ do
-- P.try $ P.char '#'
-- nam <- parseName
-- parseTrivia
-- P.char '{'
-- arg <- P.many $ P.try $ parseTerm
-- parseTrivia
-- P.char '}'
-- return $ Con nam arg
-- ```

-- 3. For parsers that don't need backtracking at all, remove `P.try` completely:

-- ```haskell
-- parseRef = withSrc $ do
-- name <- parseName
-- (_, uses) <- P.getState
-- let name' = expandUses uses name
-- return $ case name' of
-- "U32" -> U32
-- _ -> Ref name'
-- ```

-- These changes will ensure that backtracking only occurs at the beginning of each parser, allowing for better error reporting and preventing excessive backtracking.

























0 comments on commit 65cbdaf

Please sign in to comment.