Don't understand grammar #1033
-
Describe the bug
On the webpage editor at https://pest.rs/ I enter this:
And as number:
And the decoded us_number is "312234" instead of "3122345132".
This happens due to the rule
If I move the
Why?
Additional context
I want to write a parser for US and DE float numbers; here are two tests to better explain my goal:
Replies: 2 comments 1 reply
-
Maybe someone else has the same problem as I did, so here is how I solved it: after days of trying to get this grammar into PEST/PEG/Chomsky form, I ended up using a regexp, since that makes it easier for me to create meaningful error messages for my users. I would have liked to use PEST, but I couldn't solve simple things that the ^ and $ in the regexp do now. I'm not saying PEST or the others are bad, just that they are too complicated for my use case.

use regex::Regex;

// PestEmission, EdgeDefined and EdgeUndefined are defined elsewhere in my project (not shown).

#[allow(unused)]
#[derive(Debug, Clone, Copy)]
pub enum NumberFormat {
    DE,
    US,
}

impl NumberFormat {
    /// Normalizes a DE- or US-formatted number string and parses it as f64.
    fn parse_number(&self, num_str: &str) -> Result<f64, std::num::ParseFloatError> {
        match self {
            NumberFormat::DE => {
                //println!("DE");
                // DE: '.' groups thousands, ',' is the decimal separator.
                let normalized = num_str.replace('.', "").replace(',', ".");
                normalized.parse::<f64>()
            }
            NumberFormat::US => {
                //println!("US");
                // US: ',' groups thousands, '.' is the decimal separator.
                let normalized = num_str.replace(',', "");
                normalized.parse::<f64>()
            }
        }
    }
}

/// Parses one line of the form "ID" "ID" or "ID" NUM "ID".
/// Returns Ok(None) for blank lines.
pub fn parse_line(
    line_number: usize,
    line: &str,
    number_format: NumberFormat,
) -> Result<Option<PestEmission>, String> {
    // ^ and $ force the whole line to match; the number in the middle is optional.
    let optional_regexp = Regex::new(
        r#"^\s*"(?P<source>[^"]+)"\s*(?P<value>[\d.,]+)?\s*(?:"(?P<target>[^"]+)")\s*$"#,
    )
    .unwrap();

    let Some(captures) = optional_regexp.captures(line) else {
        if line.trim().is_empty() {
            return Ok(None);
        }
        return Err(format!(
            "Line \"{}\" does not match expected format, which must be: [\"ID\" \"ID\"] | [\"ID\" NUM \"ID\"]",
            line_number
        ));
    };

    // No number between the two IDs: emit an edge without a value.
    let Some(value) = captures.name("value") else {
        return Ok(Some(PestEmission::EdgeUndefined(EdgeUndefined {
            line: line_number,
            source: captures["source"].to_string(),
            target: captures["target"].to_string(),
        })));
    };

    let number = match number_format.parse_number(value.as_str()) {
        Ok(number) => number,
        Err(e) => {
            return Err(format!(
                "The number \"{}\" on line \"{}\" does not match expected format: {}",
                value.as_str(), line_number, e // FIXME error contains label
            ));
        }
    };

    Ok(Some(PestEmission::EdgeDefined(EdgeDefined {
        line: line_number,
        source: captures["source"].to_string(),
        target: captures["target"].to_string(),
        value: number,
    })))
}
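The normalization step in parse_number above can be tried in isolation. The following is only a sketch: it rewrites the method as a free function so that it runs without the regex crate, and the sample inputs are made up for illustration, not taken from the original tests.

```rust
#[derive(Debug, Clone, Copy)]
enum NumberFormat {
    DE,
    US,
}

// Same normalization idea as `parse_number` in the post, as a free function.
fn parse_number(format: NumberFormat, num_str: &str) -> Result<f64, std::num::ParseFloatError> {
    let normalized = match format {
        // DE: drop the '.' group separators, then turn the decimal ',' into '.'.
        NumberFormat::DE => num_str.replace('.', "").replace(',', "."),
        // US: drop the ',' group separators; '.' is already the decimal point.
        NumberFormat::US => num_str.replace(',', ""),
    };
    normalized.parse::<f64>()
}

fn main() {
    // Illustrative inputs (assumptions, not from the original thread).
    println!("{:?}", parse_number(NumberFormat::DE, "1.234,56"));
    println!("{:?}", parse_number(NumberFormat::US, "1,234.56"));
    // Grouping is not validated: stray group separators are silently dropped.
    println!("{:?}", parse_number(NumberFormat::US, "1,2,3"));
    // More than one decimal separator is rejected by the f64 parser.
    println!("{:?}", parse_number(NumberFormat::DE, "1,2,3").is_err());
}
```

One consequence of this design is that the position of the group separators is never checked, so an input like "1,2,3" is accepted as 123 in the US format. If that matters, the regexp for the number would have to be stricter than [\d.,]+.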
Hello, as per the pest book, the pipe | operator is an ordered choice. Removing the silencing from your rules helps to see what's happening:
The L ~ D13 sequence parses correctly, so the entire ordered choice collapses into its result, in this case 6 digits in a row.
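To illustrate the ordered-choice behaviour described above, here is a minimal sketch in plain Rust that does not use pest at all. The two alternatives (up_to_six_digits and all_digits) are hypothetical stand-ins, not the rules from the original grammar. They only show that the first alternative that succeeds wins, even when a later one would consume more input.

```rust
// A matcher returns how many bytes it consumed from the start of `input`,
// or `None` if it does not match there.
type Matcher = fn(&str) -> Option<usize>;

// Hypothetical alternative A: one to six ASCII digits.
fn up_to_six_digits(input: &str) -> Option<usize> {
    let n = input.bytes().take(6).take_while(|b| b.is_ascii_digit()).count();
    if n > 0 { Some(n) } else { None }
}

// Hypothetical alternative B: one or more ASCII digits.
fn all_digits(input: &str) -> Option<usize> {
    let n = input.bytes().take_while(|b| b.is_ascii_digit()).count();
    if n > 0 { Some(n) } else { None }
}

// Ordered choice: return the match of the first alternative that succeeds.
// Later alternatives are never tried once one has matched.
fn ordered_choice<'a>(input: &'a str, alternatives: &[Matcher]) -> Option<&'a str> {
    alternatives
        .iter()
        .find_map(|alt| alt(input))
        .map(|n| &input[..n])
}

// Ordered choice where each alternative must also reach the end of the input,
// which is the role `$` plays in a regexp.
fn ordered_choice_to_end<'a>(input: &'a str, alternatives: &[Matcher]) -> Option<&'a str> {
    alternatives
        .iter()
        .find_map(|alt| alt(input).filter(|&n| n == input.len()))
        .map(|n| &input[..n])
}

fn main() {
    let input = "3122345132";
    let a_then_b: [Matcher; 2] = [up_to_six_digits, all_digits];
    let b_then_a: [Matcher; 2] = [all_digits, up_to_six_digits];

    // A succeeds after six digits, so B is never tried.
    println!("{:?}", ordered_choice(input, &a_then_b));
    // With the order swapped, B consumes the whole number.
    println!("{:?}", ordered_choice(input, &b_then_a));
    // Requiring end of input inside each alternative makes A fail here, so B is tried.
    println!("{:?}", ordered_choice_to_end(input, &a_then_b));
}
```

In a pest grammar the same two remedies would be to put the more specific alternative first, or to end each alternative with the built-in EOI rule so that a partial match counts as a failure for that alternative.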