Improve String parser to allow all unicode values #195
Replies: 2 comments
-
Problem
The issue is to decide which escape sequences are valid. For now, the Aiken UPLC parser does not support any:
Plutus Spec
The Plutus Core Spec says in Appendix A.1:
However, although some escape sequences are standardized for particular languages, such as C, there is, as far as I know, no language-independent set of "standard escape sequences".
PlutusTx
PlutusTx
Which implements the Haskell Report grammar rules:
I'm not sure exactly what those support; it seems to be:
Which includes quite a lot of uncommon ones and use
Aiken
It may also make sense for the Aiken UPLC compiler to support the same escape sequences as the Aiken language. For now, Aiken seems to support a few single-character escape sequences in its escape lexer, but no Unicode ones:
Also not sure why it supports the weird
Conclusion
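For illustration only, the kind of escape handling discussed above could look like the following Rust sketch. The set of escapes it accepts (`\n`, `\t`, `\r`, `\"`, `\\`, and `\u{...}` in the style of Rust's own literals) is an assumption picked for the example, not what Aiken, PlutusTx, or the Plutus Core spec define, and `unescape` is a hypothetical helper name.

```rust
/// Hypothetical helper: decode escape sequences in the body of a string
/// literal (the text between the quotes). The supported set is an
/// assumption made for this sketch.
fn unescape(body: &str) -> Result<String, String> {
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            // Any other Unicode scalar value is taken literally.
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('r') => out.push('\r'),
            Some('"') => out.push('"'),
            Some('\\') => out.push('\\'),
            Some('u') => {
                if chars.next() != Some('{') {
                    return Err("expected '{' after \\u".to_string());
                }
                // Collect hex digits up to the closing brace.
                let mut hex = String::new();
                loop {
                    match chars.next() {
                        Some('}') => break,
                        Some(h) if h.is_ascii_hexdigit() => hex.push(h),
                        _ => return Err("malformed \\u{...} escape".to_string()),
                    }
                }
                let code = u32::from_str_radix(&hex, 16)
                    .map_err(|e| format!("bad code point: {e}"))?;
                // char::from_u32 rejects surrogates and values above U+10FFFF.
                let ch = char::from_u32(code)
                    .ok_or_else(|| format!("U+{code:X} is not a Unicode scalar value"))?;
                out.push(ch);
            }
            Some(other) => return Err(format!("unknown escape \\{other}")),
            None => return Err("dangling backslash".to_string()),
        }
    }
    Ok(out)
}
```

Rejecting unknown escapes instead of passing them through is a design choice of this sketch: it keeps the accepted set explicit, so it can be widened later without changing the meaning of strings that already parse.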
-
Cool, that makes sense. Thanks for writing this.
-
The Plutus Core spec says that strings are allowed to be any Unicode string. The parser currently doesn't support that. For example, my proptest quickly found this innocuous string that broke the parser:
Specifically, the quotes in the middle mess it up.
Probably will never come up, but it's good to uphold contracts even if they are edge cases.
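A minimal sketch of the contract in question, assuming a printer that escapes only `"` and `\` and a parser that treats a backslash as "take the next character literally". Both `escape` and `parse_string_literal` are hypothetical names for this example and are not Aiken's actual functions. The point is that an embedded quote no longer ends the literal early, and every other Unicode character passes through untouched.

```rust
/// Hypothetical printer side: wrap `s` in quotes, escaping only the two
/// characters that would otherwise be ambiguous inside a literal.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        if c == '"' || c == '\\' {
            out.push('\\');
        }
        out.push(c);
    }
    out.push('"');
    out
}

/// Hypothetical parser side: read one quoted literal from the start of
/// `input`. Returns the decoded string and the remaining unparsed input.
/// In this sketch any escaped character stands for itself, so `\n` would
/// decode to `n`; a real parser would map named escapes here.
fn parse_string_literal(input: &str) -> Option<(String, &str)> {
    let rest = input.strip_prefix('"')?;
    let mut out = String::new();
    let mut chars = rest.char_indices();
    while let Some((i, c)) = chars.next() {
        match c {
            // An unescaped quote closes the literal ('"' is one byte long).
            '"' => return Some((out, &rest[i + 1..])),
            // A backslash makes the next character literal, so \" does not close.
            '\\' => out.push(chars.next()?.1),
            _ => out.push(c),
        }
    }
    None // ran out of input before the closing quote
}
```

The property a proptest-style check would then assert is the round trip: for every string `s`, `parse_string_literal(&escape(s))` should return `s` with nothing left over.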