-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add yaml parsing & emitting support #707
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have found several places where something can be improved. It would be great if you took these comments into attention.
@Tsygankov-Slava Thank you for your review! All the comments were taken into account and the corresponding parts of the code were corrected. Some other small improvements were done. For now, the problem of emitting number strings such as "10", "5.34" without quotes remains due to poor documentation of yaml-cpp library. I will let you know when a solution is found. |
@Tsygankov-Slava @unserialize The problem of emitting numbers and numbers-as-strings has been solved. Converting from mixed to a YAML document has been rewritten without use of yaml-cpp as it does not provide any way to emit strings with quotes and numbers without. Also now booleans are parsed&emitted correctly (as booleans, not strings or integers). More C++ tests has been written and a complex PHP test has been implemented, too. For the PHP test to work, a 'yaml.so' extension has been added to the In the PHP test, |
Merged VKCOM:master into my master branch so that there are no conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked your PR and found a few points that you might want to fix.
I'll try to review it next week |
builtin-functions/_functions.txt
Outdated
@@ -1652,3 +1652,8 @@ class DateTimeImmutable implements DateTimeInterface { | |||
} | |||
|
|||
function getenv(string $varname = '', bool $local_only = false): mixed; | |||
|
|||
function yaml_emit_file ($filename ::: string, $data ::: mixed) ::: bool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(string $filename, mixed $data)
and others, no :::, just type hints. Same for return — just ':' after ')'
@@ -49,6 +49,11 @@ prepend(KPHP_RUNTIME_PDO_PGSQL_SOURCES pdo/pgsql/ | |||
pgsql_pdo_emulated_statement.cpp) | |||
endif() | |||
|
|||
if (YAML) | |||
prepend(KPHP_RUNTIME_YAML_SOURCES / |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing identation in if
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I basically copied this if
from above if
s, there is also no identation
should I fix it there too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it honestly looks worse with indentation inside the if
-clause
I think that's why there is no indentation in above if
s
runtime/streams.h
Outdated
@@ -6,7 +6,7 @@ | |||
|
|||
#include "runtime/kphp_core.h" | |||
|
|||
using Stream =mixed; | |||
using Stream = mixed; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very strange change in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just noticed a missing space there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted this change
runtime/yaml.cpp
Outdated
|
||
#include "runtime/optional.h" | ||
#include "runtime/streams.h" | ||
#include "yaml.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"runtime/yaml.h"
runtime/yaml.cpp
Outdated
if (node.IsScalar()) { | ||
const string string_data(node.as<std::string>().c_str()); | ||
// check whether the primitive is put in quotes in the source YAML | ||
const bool string_data_has_quotes = (source[node.Mark().pos] == '"' && source[node.Mark().pos + string_data.size() + 1] == '"'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about single quotes? I suppose, they also mean a string in yaml
runtime/yaml.cpp
Outdated
/* | ||
* print tabs in quantity of nesting_level (used to print nested YAML entries) | ||
*/ | ||
static string yaml_print_tabs(const uint8_t nesting_level) noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you use uint8_t for a number? There is no reason not to use just int
unless memory alignment is a case you care for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, I thought it is unrealistic to exceed 255 levels of nesting in a YAML document, so I decided to limit memory for variable a little bit
and uint8_t
is safer in terms of sign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to int
runtime/yaml.cpp
Outdated
string escaped_data; | ||
for (size_t i = 0; i < data.size(); ++i) { | ||
if (data[i] == 10) { // line feed - code 10 | ||
escaped_data.push_back(92); // backslash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'\' instead of 92 (and below)?
runtime/yaml.cpp
Outdated
static string yaml_escape(const string &data) noexcept { | ||
string escaped_data; | ||
for (size_t i = 0; i < data.size(); ++i) { | ||
if (data[i] == 10) { // line feed - code 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- extract data[i] to a separate (char) variable: you use in dozen of times
- why do you compare char with a numeric code, why not `data[i] == '\n' and get rid of comments everywher?
runtime/yaml.cpp
Outdated
} else if (data.is_bool()) { | ||
buffer = (data.as_bool()) ? string("true") : string("false"); | ||
} | ||
string_data.append(buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What for do you need buffer
, why not to append into string_data
directly?
runtime/yaml.cpp
Outdated
mixed f$yaml_parse_file(const string &filename, int pos) { | ||
if (filename.empty()) { | ||
php_warning("Filename cannot be empty"); | ||
return {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In PHP, false
in returned on failure
Is this issue (from PR description) still actual? I can't see such a behaviour looking at source code |
No, it is outdated, now everything in double quotes is emitted as string |
KPHP YAML
Solution for Issue #639
Implementation of 4 functions from PHP YAML using yaml-cpp:
C++ tests are written and implemented.
There is one major issue: strings like "10", "98.13' are emitted without quotes, so then they are parsed as integers and doubles respectively.More informative README can be found here.