Fix for string values #58
@@ -542,11 +542,12 @@ func (t *tokenizer) readString() (string, error) {
 		if err != nil {
 			return "", err
 		}

-		switch c {
-		case -1, '\n':
-			// -1 denotes EOF, and new lines are not allowed in short string
+		if c == -1 || c == '\n' || isProhibitedControlChar(c) {
 			return "", t.invalidChar(c)
+		}
+
+		switch c {
 		case '"':
 			return ret.String(), nil

@@ -582,20 +583,25 @@ func (t *tokenizer) readLongString() (string, error) {
 		if err != nil {
 			return "", err
 		}

-		switch c {
-		case -1:
-			// -1 denotes EOF
+		if c == -1 || isProhibitedControlChar(c) {
 			return "", t.invalidChar(c)
+		}
+
+		switch c {
 		case '\'':
+			startPosition := t.pos
 			ok, err := t.skipEndOfLongString(t.skipCommentsHandler)
 			if err != nil {
 				return "", err
 			}
 			if ok {
 				return ret.String(), nil
 			}
+
+			if startPosition == t.pos {
+				// No character has been consumed. It is single '.
 				ret.WriteByte(byte(c))
+			}
 		case '\\':
 			c, err = t.peek()
 			if err != nil {

@@ -1263,3 +1269,25 @@ func (t *tokenizer) unread(c int) {
 	t.pos--
 	t.buffer = append(t.buffer, c)
 }
+
+func isProhibitedControlChar(c int) bool {
+	// Values between 0 to 31 are non-displayable ASCII characters; except for new line and white space characters.
+	if c < 0x00 || c > 0x1F {

It took a bit of investigation to understand why a negative

    // Values lower than this are non-displayable ASCII characters
    if c > 0x1F {
        return false;
    }

I found a couple things easier treating EOF as a character instead of an error, and it's an internal-only API so it felt like a good trade-off. It's very possible that's just my old C habits speaking though. :) Feel free to refactor it if you think it improves readability.

👍 for having this treat -1 as an invalid char. In both places we're calling it we're checking for -1 immediately after, which you could then remove.

Modified and created #60 to refactor

+		return false
+	}
+	if isStringWhitespace(c) || isNewLineChar(c) {
+		return false
+	}
+	return true
+}
+
+func isStringWhitespace(c int) bool {
+	return c == 0x09 || //horizontal tab
+		c == 0x0B || //vertical tab
+		c == 0x0C // form feed
+}
+
+func isNewLineChar(c int) bool {
+	return c == 0x0A || //new line
+		c == 0x0D //carriage return
+}
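
To make the review suggestion about folding the EOF check into the helper concrete, here is a minimal sketch. It is not the code merged in this PR or in #60: `isInvalidStringChar` and the `eof` constant are hypothetical names, and the sketch only assumes, as the diff's comments state, that the tokenizer reports end of input as -1.

```go
package main

import "fmt"

// eof is an assumption taken from the diff's comments: the tokenizer
// represents end of input as the value -1.
const eof = -1

// isStringWhitespace mirrors the helper added in this PR.
func isStringWhitespace(c int) bool {
	return c == 0x09 || // horizontal tab
		c == 0x0B || // vertical tab
		c == 0x0C // form feed
}

// isNewLineChar mirrors the helper added in this PR.
func isNewLineChar(c int) bool {
	return c == 0x0A || // new line
		c == 0x0D // carriage return
}

// isInvalidStringChar is a hypothetical variant of isProhibitedControlChar
// that also rejects EOF, so callers would not need a separate c == -1 test.
func isInvalidStringChar(c int) bool {
	if c == eof {
		return true
	}
	if c < 0x00 || c > 0x1F {
		// Outside the ASCII control range: allowed.
		return false
	}
	// Inside the control range, only whitespace and new lines are allowed.
	return !isStringWhitespace(c) && !isNewLineChar(c)
}

func main() {
	for _, c := range []int{eof, 0x00, 0x09, 0x0A, 0x1F, 'a'} {
		fmt.Printf("%d invalid=%v\n", c, isInvalidStringChar(c))
	}
}
```

With a helper shaped like this, `readLongString` could call it alone, while `readString` would still need its own `c == '\n'` test, because new lines pass this check but are not allowed in short strings.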

It looks like `skipEndOfLongString` reports whether it consumed the string ending, so I think you can replace this position check with an `else {...}` attached to the `if ok {...}`.

It'll also return false if it skipped over a `''' /* pause in the string */ '''` sequence, in which case you don't want to keep the '. :( This seems correct to me in the short term; `skipEndOfLongString` should probably be refactored to return a tri-state in the longer term.

Yes, I almost refactored `skipEndOfLongString` but then I decided to KISS. #61 to change `skipEndOfLongString`