CSV files should use CRLF as row separator (or at least it should be possible to do so no matter what OS you are using!) #1722
https://miller.readthedocs.io/en/6.12.0/new-in-miller-6/#line-endings
i'm not sure if "line ending" refers to the row separator in this paragraph, but if the row separator is not set to CRLF by default for all CSV files (as recommended), just using the same separator as detected from or specified for the input - no matter what OS you are using - seems to me a reasonable and far better default.
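To illustrate what "detected from the input" could mean, here is a minimal sketch in Python. It is not how Miller works internally; the function name `detect_row_separator` is hypothetical and the code assumes a standard double-quote quoting character. It returns the first line ending found outside a quoted field, so LF line breaks inside an embedded JSON field are not mistaken for row separators.

```python
from typing import Optional


def detect_row_separator(data: str, quotechar: str = '"') -> Optional[str]:
    """Return the first row separator found outside quoted fields, or None."""
    in_quotes = False
    for i, ch in enumerate(data):
        if ch == quotechar:
            # A doubled quote ("") toggles twice, so the state stays correct.
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "\r" and data[i + 1:i + 2] == "\n":
                return "\r\n"
            if ch == "\n":
                return "\n"
    return None
```

The detected value could then be reused as the row separator when writing the output, independent of the OS the tool runs on.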
@DJCrashdummy my apologies for the negative experience you're having. Here is a summary of events, and a list of imperfections: In Miller <= 5:
In Miller 6:
A suggested workaround: Using In summary:
first and foremost @johnkerl: there is nothing to apologize for! - i have to thank you for developing & maintaining Miller as FOSS! ok, wow... i didn't know that version 6 was a complete rewrite. kudos! 👍
yes, saw that... that's what i'm referring to in my first comment, that the
that's IMHO odd... it handles both line endings without any problem and can even use both for output (depending on the OS), but doesn't let you choose which one it should be. 🤨
if you do so, just a tiny heads up as i also tested older versions of Miller: be aware, that at least mlr 3.4.0 changed all line endings (even the ones within the field with pretty-print JSON!) to CRLF, which made the output unusable.
if the necessary amount of work is similar, perhaps the better approach.
that's IMHO the clearest description of the whole topic, but as i never used Miller before, i wasn't interested in "what's new" in the first place and had no clue about the
well... that was still confusing somehow, as most of the separator section in the man-page seems to more or less contradict it, and nowhere is it mentioned that the CSV row separator is not alterable at all.
unfortunately not. piping the output through FWIW: interestingly csvkit - written in python - has the same issue (producing CSV files with only LF as row separator)... may it be that there is an underlying multi-lang-lib with the same issue?
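For anyone who needs CRLF row separators in the meantime, a possible post-processing step is sketched below. This is an assumption-laden sketch using only Python's standard `csv` module, not a Miller feature; the function name `rewrite_row_separator` is hypothetical. Because the data is parsed as CSV instead of being treated as plain text, only the row separators are rewritten and LF line breaks inside quoted fields (such as pretty-printed JSON) stay untouched.

```python
import csv
import io


def rewrite_row_separator(text: str, row_sep: str = "\r\n") -> str:
    """Re-emit CSV text with row_sep between rows, keeping in-field newlines."""
    # newline="" disables newline translation, so embedded LFs survive as-is.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator=row_sep)
    writer.writerows(reader)
    return out.getvalue()
```

Note that the round trip goes through the `csv` module's default quoting rules, so fields that were quoted unnecessarily in the input may come out unquoted; the field contents themselves are unchanged.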
as we are talking about room for improvement for the docs, especially the
this is somehow confusing... as it refers to default "line endings" again. and i'm still not sure if this is true for all file formats or just CSVs. if it is basically true for all formats, it should be rather on top of the in general, i would reorganize the
everything else about specific formats (also and the last 2 points ( BTW: speaking about the table...
i guess it should be rather... 😜
i use Miller for batch manipulating CSV files with an embedded pretty-print JSON (with LF line breaks in it) in one column.
everything worked fine, but i noticed one difference to the original: the row separator got changed from CRLF to LF without invoking it whatsoever!
then i spent quite some time because at the man-page there is a whole section about (changing) separators, but no matter what i tried (adding --rs crlf, --ors crlf, --rs '\r\n' or --ors '\r\n'), the output stayed the same.
after further research via internet i found the following on the website:
https://miller.readthedocs.io/en/6.12.0/reference-main-separators/#which-separators-apply-to-which-file-formats
there are a few things which are at least quite unfortunate here:
… -I) or write the output into a file and share it with other people (running other OSes).
… man-pages separator section similar to the JSON notes, that at least --ors is ignored and you don't have to bother with that.
at least when --ors (perhaps even --rs) is used, there should be a warning that it is ignored and \n | \r\n depending on your OS is used.
for context, the link to the whole case: https://stackoverflow.com/a/79189576/2351568