The validator should require files to more strictly adhere to CSV #1924
Labels
bug
status: Needs triage
Describe the bug
The validator accepts files that contain unescaped quotes in string values instead of failing them with the csv_parsing_failed error.
By default, the univocity parser treats a value that contains an unescaped quote as though the entire value were unquoted and reads on to the next delimiter. Many other CSV libraries reject such input.
To make the univocity parser stricter about this, set the parser's unescaped quote handling to UnescapedQuoteHandling.RAISE_ERROR.
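To make "unescaped quote" concrete, here is a minimal, standard-library-only sketch of the strict rule for a single comma-delimited record. It is not the validator's code and does not use univocity; the class name StrictQuoteCheck and the method firstUnescapedQuote are hypothetical names chosen for illustration. It assumes the record fits on one line, so quoted fields that span several lines are out of scope.

```java
// Hypothetical helper, for illustration only: finds the first quote that a
// strict CSV reader would reject in a single comma-delimited record.
final class StrictQuoteCheck {

    // Returns the index of the first unescaped quote, or -1 if none is found.
    // Assumption: the whole record is on one line (no multi-line quoted fields).
    static int firstUnescapedQuote(String record) {
        boolean atFieldStart = true; // next character begins a new field
        boolean inQuotes = false;    // currently inside a quoted field
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    continue; // ordinary character inside a quoted field
                }
                boolean hasNext = i + 1 < record.length();
                if (hasNext && record.charAt(i + 1) == '"') {
                    i++; // doubled quote ("") is an escaped quote
                    continue;
                }
                if (!hasNext || record.charAt(i + 1) == ',') {
                    inQuotes = false; // proper closing quote
                    continue;
                }
                return i; // quote followed by more data: unescaped
            }
            if (c == ',') {
                atFieldStart = true;
            } else if (c == '"') {
                if (!atFieldStart) {
                    return i; // quote in the middle of an unquoted field
                }
                inQuotes = true;
                atFieldStart = false;
            } else {
                atFieldStart = false;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Properly escaped value: no offending quote.
        System.out.println(firstUnescapedQuote("1,\"Main \"\"A\"\" St\",x"));
        // Quote inside an unquoted value: a strict parser would reject this.
        System.out.println(firstUnescapedQuote("1,Main \"A\" St,x"));
    }
}
```

A lenient parser would keep reading the second record up to the next delimiter, which is the behavior described above; a strict parser would fail at the reported index instead.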
Steps/Code to Reproduce
Validate any of the feeds listed under "Files used".
Expected Results
The reports should contain a csv_parsing_failed error for stops.txt (and probably for other files).
Actual Results
The reports do not show a csv_parsing_failed error.
Screenshots
No response
Files used
Here are some existing feeds with unescaped quotes.
mdb-2000-202411140002.zip
mdb-1271-202406071530.zip
mdb-1185-202406071652.zip
mdb-902-202402080014.zip
Validator version
6.0
Operating system
Windows 11
Java version
17.0.7
Additional notes
No response