Add bin and oct integer literals & Allow underscores in them #99

Merged
merged 11 commits into from
Feb 14, 2024

Conversation

Gusarich
Member

Ohm doesn't have a built-in binDigit rule, so I had to define it myself in the grammar.

@anton-trunov added the "kind: language feature" label Nov 30, 2023
@anton-trunov added this to the v1.2.0 milestone Nov 30, 2023
@anton-trunov linked an issue Nov 30, 2023 that may be closed by this pull request
@anton-trunov
Member

I think we should also resolve issue #103 in this PR

@anton-trunov self-assigned this Nov 30, 2023
@Gusarich changed the title from "Add binary integer literal (ex: 0b10101)" to "Add bin and oct integer literals & Allow underscores in them" Nov 30, 2023
@Gusarich
Member Author

@anton-trunov Done! I've also added oct literals (0o123)

@anton-trunov
Member

@Gusarich Awesome stuff! I've noticed you have positive tests, which is great. Let's also add a few negative tests to check that literals like 0b_00101010, _42, 0b123, etc. are not accepted by the parser.

@Gusarich
Member Author

> @Gusarich Awesome stuff! I've noticed you have positive tests, which is great, let's also add a few negative tests as well, like 0b_00101010, _42, 0b123, etc. do not get accepted by the parser.

The parser treats _42 as a valid identifier, so I'm not sure how we can test this case.

@anton-trunov
Member

Ah, sorry, it was a typo: 0_42 should not be valid.

@Gusarich
Member Author

> Ah, sorry, it was a typo: 0_42 should not be valid.

Currently the parser allows trailing zeroes. Should I also modify this behaviour?

Comment on lines 177 to 194
integerLiteralDec = digit+ ("_" | digit)*
integerLiteralHex = "0x" hexDigit+ ("_" | hexDigit)*
                  | "0X" hexDigit+ ("_" | hexDigit)*
integerLiteralBin = "0b" binDigit+ ("_" | binDigit)*
                  | "0B" binDigit+ ("_" | binDigit)*
integerLiteralOct = "0o" octDigit+ ("_" | octDigit)*
                  | "0O" octDigit+ ("_" | octDigit)*
Member

I'd model literals after JS/TS, because it's kind of a go-to language for TON, so people could copy literals from Tact code into TS tests and the like.

There is one caveat though: JS's legacy octal literals, which start with 0. For backwards-compatibility reasons, it might be a bit too late to disallow integer literals with leading zeros. So this might result in some suboptimal UX when copying things from Tact to TS tests. (We could introduce a warning about this, though.)

These are JS's restrictions regarding underscores in numeric literals:

  1. JS does not allow _ at the end of literals.
  2. As mentioned earlier, we should also forbid _ after leading zeros.
  3. JS does not allow multiple _ in a row (so 4_2 is a valid literal, but 4__2 is not).

Of course, some other languages, like OCaml, are very liberal in this respect: e.g. OCaml accepts 0004____2_ as a valid integer literal. However, I think going with JS's grammar (except for octals) is something we should do.
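
For illustration, the rules above could be sketched as a small TypeScript checker. This is only a sketch under assumptions, not the grammar from this PR: the helper name `isValidIntegerLiteral` and the regular expressions are hypothetical, and it assumes that decimal literals with leading zeros stay accepted (for backwards compatibility) but may not contain underscores.

```typescript
// Hypothetical helper, not part of the Tact code base.
// A single optional "_" is allowed only between two digits, which rules out
// leading, trailing, and repeated underscores.

// Decimal: either a non-zero digit followed by digits with optional single
// separators, or a leading-zero literal with no separators at all (assumption).
const DEC = /^(?:[1-9](?:_?[0-9])*|0[0-9]*)$/;
const HEX = /^0[xX][0-9a-fA-F](?:_?[0-9a-fA-F])*$/;
const BIN = /^0[bB][01](?:_?[01])*$/;
const OCT = /^0[oO][0-7](?:_?[0-7])*$/;

function isValidIntegerLiteral(src: string): boolean {
  return [DEC, HEX, BIN, OCT].some((re) => re.test(src));
}

// Positive cases
console.log(isValidIntegerLiteral("4_2"));         // true
console.log(isValidIntegerLiteral("0b1010_1010")); // true
console.log(isValidIntegerLiteral("0o123"));       // true

// Negative cases mentioned in this thread
console.log(isValidIntegerLiteral("4__2"));        // false
console.log(isValidIntegerLiteral("0_42"));        // false
console.log(isValidIntegerLiteral("0b_00101010")); // false
console.log(isValidIntegerLiteral("0b123"));       // false
```

The same list of accepted and rejected strings could double as a table of positive and negative parser tests.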

@@ -547,7 +547,7 @@ semantics.addOperation<ASTNode>('resolve_expression', {

     // Literals
     integerLiteral(n) {
-        return createNode({ kind: 'number', value: BigInt(n.sourceString), ref: createRef(this) }); // Parses dec-based integer and hex-based integers
+        return createNode({ kind: 'number', value: BigInt(n.sourceString.replaceAll('_', '')), ref: createRef(this) }); // Parses dec, hex, and bin numbers
Member

Hmm, looks like we translate all integer literals into FunC's decimal literals (because we strip this info here). Let's not change it now, but it might be nice to have a direct correspondence (modulo binary and octal literals, which FunC does not support yet).
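
As a minimal sketch of why the base information is lost: JavaScript's `BigInt()` accepts strings with `0x`, `0o`, and `0b` prefixes but rejects underscores, so the separators have to be stripped first, and the resulting value no longer remembers which base it was written in. The helper name `parseIntegerLiteral` below is hypothetical and only mirrors the expression used in the diff.

```typescript
// Hypothetical helper mirroring the conversion in the diff above.
function parseIntegerLiteral(src: string): bigint {
  // BigInt() throws a SyntaxError on "_", so remove the separators first.
  return BigInt(src.replace(/_/g, ""));
}

// All of these become plain values; printing them yields decimal text.
console.log(parseIntegerLiteral("0b1010_1010").toString()); // "170"
console.log(parseIntegerLiteral("0o123").toString());       // "83"
console.log(parseIntegerLiteral("0x2A").toString());        // "42"
console.log(parseIntegerLiteral("1_000").toString());       // "1000"
```

Keeping a direct correspondence with the target language would require storing the original base (or source text) alongside the value.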

@anton-trunov
Member

> Currently parser allows trailing zeroes.. should I also modify this behaviour?

You probably meant leading zeros. I tried to address that here: #99 (comment)

@anton-trunov
Member

Looks like there are some merge conflicts from your previous PR :)

@Gusarich force-pushed the binary-number-literal branch from 8b3ccec to b0a5e29 Compare February 12, 2024 06:07
@Gusarich force-pushed the binary-number-literal branch from b0a5e29 to 556b184 Compare February 12, 2024 08:42
Development

Successfully merging this pull request may close these issues.

Add binary and octal integer literals
2 participants