Line endings are forcibly converted to CRLF #120

Open
ghost opened this issue Dec 3, 2021 · 10 comments
Labels
bug Something isn't working

Comments


ghost commented Dec 3, 2021

Steps to reproduce

  1. Paste something on https://bpa.st/ via the website
  2. Get the raw link
  3. View it with something that shows line endings, e.g.: curl -s https://bpa.st/raw/M5HQ | cat -v

Expected behavior

Line endings should ideally be LF or left untouched. CRLF is unsuitable for Unix systems in most cases, especially for source code.

Actual behavior

Pastes get CRLF line endings even when they weren't explicitly pasted as such:

❯ curl -s 'https://bpa.st/raw/M5HQ' | cat -v
foo^M
bar%
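
For anyone who wants to check this without `cat -v`, here is a minimal Python sketch that counts each kind of line ending in a byte string. The function name `count_line_endings` is made up for illustration, and the sample bytes simply mirror the output above rather than a live fetch from bpa.st.

```python
def count_line_endings(data: bytes) -> dict[str, int]:
    """Count CRLF, bare LF, and bare CR sequences in raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        # Subtract the CRLF pairs so only lone LF / lone CR are counted.
        "lf": data.count(b"\n") - crlf,
        "cr": data.count(b"\r") - crlf,
    }


# Assumed sample: the bytes that `cat -v` rendered as "foo^M" / "bar".
sample = b"foo\r\nbar"
print(count_line_endings(sample))
```

Any non-zero `crlf` count on a paste that was typed on Linux would point to a conversion somewhere along the way.
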

gvegidy commented Dec 28, 2021

I also noticed this problem and found it quite annoying. It means, for example, that you can't execute a downloaded bash script without converting it first.

But I don't see an obvious solution to the problem. There are different line endings and picking the right one must be done during download, because the uploader could use a different OS than the downloader.

What I propose is this:

  • offer 3 kinds of raw download links, one for each line ending: Unix, Windows and Mac OS
  • when showing the raw download link in the browser, show a dropdown offering Unix, Windows and Mac OS line endings
  • when you change the value in the dropdown, the raw link is changed with JavaScript
  • the OS given in the browser's User-Agent string is used to select the default value of the dropdown
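
The server-side half of the proposal above could be sketched roughly as follows. This is only an illustration under assumptions: `convert_line_endings`, `default_style`, and the style names are hypothetical and not part of pinnwand, "mac" is taken to mean the bare-CR convention of classic Mac OS (modern macOS uses LF), and the User-Agent check is a deliberately crude heuristic.

```python
# Hypothetical style names mapped to the line terminator they would produce.
LINE_ENDINGS = {"unix": "\n", "windows": "\r\n", "mac": "\r"}


def convert_line_endings(text: str, style: str) -> str:
    """Return text with every line ending rewritten to the requested style."""
    # Normalize to LF first so mixed input is not converted twice.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", LINE_ENDINGS[style])


def default_style(user_agent: str) -> str:
    """Guess a default dropdown value from a User-Agent string (crude)."""
    return "windows" if "Windows" in user_agent else "unix"


paste = "foo\r\nbar\n"
print(repr(convert_line_endings(paste, "unix")))
print(default_style("Mozilla/5.0 (X11; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0"))
```

Normalizing before converting means the stored paste can keep whatever endings it arrived with, and the choice is made only at download time.
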

Author

ghost commented Dec 28, 2021

If the language is properly selected, the site should be able to choose the correct line ending automatically (e.g. LF for bash scripts, CRLF for batch).
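
A lexer-based choice like this could look something like the sketch below. The table, the function name `line_ending_for`, and the lexer names used as keys are all assumptions for illustration; the real names would have to match whatever the site's lexer list uses.

```python
from typing import Optional

# Hypothetical table: only languages tied to a single platform get an entry.
LEXER_LINE_ENDINGS = {
    "bash": "\n",     # shell scripts: LF
    "batch": "\r\n",  # Windows batch files: CRLF
}


def line_ending_for(lexer: str) -> Optional[str]:
    """Return the forced line ending for a lexer, or None to leave the paste untouched."""
    return LEXER_LINE_ENDINGS.get(lexer)


print(repr(line_ending_for("bash")))
print(repr(line_ending_for("text")))
```

Returning `None` for unknown lexers keeps the "untouched" behavior as the fallback for everything not explicitly listed.
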


gvegidy commented Dec 28, 2021

That might work for bash scripts or batch files as they are only designed for one platform.

But what about text files or python scripts? They should have the proper line endings on all platforms.

Author

ghost commented Dec 28, 2021

Text files are debatable, but Python scripts should always be LF, as (at least on Linux) they will refuse to run if they use CRLF line endings.

The only real reason to use CRLF for code (except things like batch that require it) is compatibility with Notepad, which I wouldn't consider that important as nobody should really be using Notepad to edit code.


gvegidy commented Dec 28, 2021

> but Python scripts should always be LF, as (at least on Linux) they will refuse to run if they use CRLF line endings.

Yes, that is exactly the issue. But Python is used quite commonly on Windows and it should have CRLF there.

Author

ghost commented Dec 28, 2021

Again, there is no reason (other than Notepad compatibility) for Python scripts to have CRLF on Windows even if it works. They will run fine with LF too.

Owner

supakeen commented Dec 28, 2021

Huh, interesting issue. As far as I know I'm not explicitly converting line endings, and the raw files should be left just as the name implies: raw. See here for where it gets nabbed from the HTTP request:

https://github.com/supakeen/pinnwand/blob/master/pinnwand/handler/website.py#L121

and here for where it goes into the database:

https://github.com/supakeen/pinnwand/blob/master/pinnwand/database.py#L135

If anything something would be implicitly converting it along the way (perhaps at the rendering stage, perhaps at the input stage). Does either of you have an idea where that would be happening before I take a closer look?

As for the CRLF vs LF discussion itself, that's a hard decision to make. I'd be against making it lexer-specific, both because you can't say a language is used on one platform only and because it's a big list to maintain, which would need to be kept in sync with upstream Pygments lexer support :(

Perhaps a separate download or raw view? Is it still common to use text editors on Windows that can't deal with a missing carriage return?

supakeen added the bug label Dec 28, 2021

Author

ghost commented Dec 28, 2021

HTTP generally uses CRLF, so maybe it's that? I haven't looked too deep into it though.


gvegidy commented Dec 28, 2021

> is it common to use text editors on Windows that can't deal with a missing carriage return still?

The "notepad" delivered with Windows 10 recently learned to deal with LF. But notepad that comes with older Windows versions doesn't.

But it is not just editors, you might want to run a batch file or powershell script downloaded from pinnwand without having to run a converter first. And they need CRLF on Windows to properly work.


gvegidy commented Dec 28, 2021

> HTTP generally uses CRLF, so maybe it's that? I haven't looked too deep into it though.

Yes, this looks likely to me. I just recorded the post request in the browser (firefox on linux) and this was in the POST-data:
_xsrf=2%7Cfb72ec3c%7C75219b05f93d9119e20dd080d9436564%7C1640713522&lexer=text&filename=&raw=this+is+a+test%0D%0Anext+line%0D%0Anext+line%0D%0A%0D%0A&expiry=1hour
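
To make the `%0D%0A` sequences in that POST body easier to see, here is a small Python sketch that decodes the form data with the standard library and shows what the server ends up with in the `raw` field. The `_xsrf` field is left out for brevity, and the last step is only one possible normalization, not what pinnwand actually does.

```python
from urllib.parse import parse_qs

# The recorded form body from above, minus the _xsrf token.
body = (
    "lexer=text&filename=&raw=this+is+a+test%0D%0Anext+line"
    "%0D%0Anext+line%0D%0A%0D%0A&expiry=1hour"
)

fields = parse_qs(body, keep_blank_values=True)
raw = fields["raw"][0]
print(repr(raw))  # the CRLF pairs are already present in the submitted value

# One possible fix (an assumption, not the project's code): store LF only.
normalized = raw.replace("\r\n", "\n")
print(repr(normalized))
```

This matches the browser behavior: HTML form submission normalizes line breaks in textarea values to CRLF, so the conversion happens before the request ever reaches the server.
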
