Preserve line separator format (Unix vs. Windows) #121

petsuter · 2015-04-11T16:30:45Z

I tried running python-modernize on Windows on source files that use Unix line separators. This changed them over to Windows line separators. That's not very helpful. A diff now shows every single line as changed. The previous line separator format should be preserved.

takluyver · 2015-04-11T18:40:39Z

Or maybe we should modernise them all to Unix style ;-)

We'll need to look at 2to3 and see how much code we'd need to duplicate in order to fix this ourselves. If it doesn't require duplicating too much, it should be OK.

forivall · 2015-04-11T21:30:11Z

For now, you can set up your .gitattributes file so that Git ignores line-ending changes and always commits files with Unix line separators.

techtonik · 2015-04-13T08:35:42Z

Most likely 2to3 needs to open the file in binary mode for writing. https://github.com/python/cpython/blob/master/Lib/lib2to3/refactor.py#L527 (write_file)

daira · 2015-04-13T15:47:39Z

The implementation is different depending on whether python-modernize is running on Python 2 or 3: https://github.com/python/cpython/blob/master/Lib/lib2to3/refactor.py#L113 . On Python 2 at least, setting os.linesep to "\n" will make it use Unix newlines when writing, without any duplication of code.

The Right Thing is to preserve the line-ending style of each file, but that looks significantly more difficult to do, unless we detect the line-ending style ourselves and set os.linesep accordingly.
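A minimal sketch of the detection step mentioned above: count each separator in the raw bytes and pick the most common one. The name `detect_newline` and the fallback to `"\n"` for files with no separators are assumptions for illustration; nothing like this is claimed to exist in lib2to3 or python-modernize.

```python
def detect_newline(data, default=b"\n"):
    """Return the most common line separator found in `data` (bytes).

    Hypothetical helper, not part of lib2to3 or python-modernize.
    """
    crlf = data.count(b"\r\n")
    # Bare LF and CR counts exclude the bytes that belong to a CRLF pair.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    counts = {b"\r\n": crlf, b"\n": lf, b"\r": cr}
    best = max(counts, key=counts.get)
    # A file with no separators at all falls back to the default.
    return best if counts[best] > 0 else default
```

The result could then be used to decide which separator to write back for that file, instead of relying on the platform default.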

techtonik · 2015-10-17T08:27:58Z

@daira why not just read the input file as binary? You will still get strings, but without any newline transformations.

techtonik · 2015-10-17T11:16:01Z

So it looks like subclassing lib2to3.refactor.RefactoringTool is required.

takluyver · 2015-10-18T11:04:45Z

2to3 already opens the file as binary, but it normalises the line endings to Unix style before processing the code, and then converts to platform native when writing it again. That presumably means that Windows style line endings can't be passed through 2to3's machinery, or 2to3 wouldn't bother normalising them.

We'd have to subclass RefactoringTool, duplicate _read_python_source, refactor_file, processed_file and write_file to detect and pass through the original newlines. I think the extra code outweighs the benefits, since it's easy to convert back to your preferred line endings after running modernize.

techtonik · 2015-10-18T11:11:25Z

It is easy to do once, but not every single time. So how about a hack that reads the line-ending stats when "-w" is supplied and rewrites the file after processing?
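The record-then-rewrite idea can be sketched as a context manager that wraps whatever code rewrites the file. This is only an illustration: `preserve_newlines` is a hypothetical name, a file with mixed endings is simply treated as CRLF, and nothing is restored if the wrapped code raises.

```python
import contextlib


@contextlib.contextmanager
def preserve_newlines(path):
    """Restore a file's CRLF or LF style after the body has rewritten it.

    Hypothetical wrapper; a file containing any CRLF is treated as CRLF.
    """
    with open(path, "rb") as f:
        was_crlf = b"\r\n" in f.read()
    yield
    with open(path, "rb") as f:
        data = f.read()
    # Collapse to LF first so existing CRLF pairs are not doubled.
    data = data.replace(b"\r\n", b"\n")
    if was_crlf:
        data = data.replace(b"\n", b"\r\n")
    with open(path, "wb") as f:
        f.write(data)
```

Running the refactoring inside `with preserve_newlines(path):` would then leave the file with the separator style it started with, whatever the tool wrote in between.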

takluyver · 2015-10-18T11:21:12Z

> It is easy to do once, but not every single time.

That's what scripting is for.

daira · 2015-10-18T17:27:29Z

As I pointed out, setting os.linesep (as an option, say --linesep=unix or --linesep=windows) may be sufficient to make the existing implementation do what we want.

[Edit: the monkey-patch below is probably better, if it works.]
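As a rough sketch of the option suggested above, the two values could be mapped to separator strings like this. It uses `argparse` purely for illustration; the function name `parse_linesep` is hypothetical and python-modernize's real option handling may look different.

```python
import argparse

# Maps the option values suggested above to separator strings.
LINESEP_CHOICES = {"unix": "\n", "windows": "\r\n"}


def parse_linesep(argv):
    """Return the separator chosen with --linesep, or None if it was not given.

    Illustrative only; not python-modernize's actual command-line code.
    """
    parser = argparse.ArgumentParser(prog="python-modernize")
    parser.add_argument("--linesep", choices=sorted(LINESEP_CHOICES), default=None)
    # Other options and file names are ignored in this sketch.
    args, _unknown = parser.parse_known_args(argv)
    return None if args.linesep is None else LINESEP_CHOICES[args.linesep]
```

The caller could assign a non-None result to `os.linesep` before running the refactoring; whether that has any effect depends on the Python version, as the rest of this thread shows.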

petsuter · 2015-10-18T18:15:47Z

It's easy to do, but too much code? Everyone should script it, but doing it right once from the beginning is not worth it? :(

daira · 2015-10-18T19:58:08Z

I think this should work (untested):

import sys

from lib2to3 import refactor

def _identity(obj):
    return obj

refactor._from_system_newlines = _identity
refactor._to_system_newlines = _identity

if sys.version_info >= (3, 0):
    # Force newline mode to '', i.e. 
    # * on input, "universal newline mode is enabled, but line endings 
    #   are returned to the caller untranslated";
    # * on output, "no translation takes place".

    def _open_with_encoding(file, mode='r', buffering=-1, encoding=None, 
                            errors=None, newline=None, closefd=True):
        return open(file, mode=mode, buffering=buffering, encoding=encoding,
                    errors=errors, newline='', closefd=closefd)

    refactor._open_with_encoding = _open_with_encoding
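To see what `newline=''` does in isolation, independent of lib2to3, here is a small self-contained check for Python 3. The helper name `roundtrip` is made up for this example; it reads a file as text and writes the same text back using the given newline mode.

```python
import os
import tempfile


def roundtrip(data, newline):
    """Write `data` (bytes) to a temp file, read it and rewrite it as text
    using the given `newline` mode, and return the resulting bytes."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        with open(path, "r", encoding="utf-8", newline=newline) as f:
            text = f.read()
        with open(path, "w", encoding="utf-8", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# With newline='' both separator styles survive the round trip unchanged.
unchanged = roundtrip(b"a\r\nb\n", newline="") == b"a\r\nb\n"
```

With `newline=None` instead, the read step would translate every separator to `"\n"` and the write step would emit the platform default, which is the behaviour this issue complains about.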

daira · 2015-10-18T21:18:48Z

Hmm, not sure that will do the right thing on Windows for lines created by a fixer, though.

takluyver · 2015-10-18T23:12:14Z

I suspect that 2to3's parser expects Unix style newlines, otherwise it wouldn't bother normalising them in the first place. That could be a red herring, though.

daira · 2015-10-18T23:35:33Z

I used the os.linesep approach for the pull request. I believe this should also work on Python 3 based on the API docs; we'll see what Travis says.

daira · 2015-10-19T00:57:07Z

Nope, setting os.linesep doesn't work on Python 3, despite the docs implying that it should. I've run out of time to work on this now; perhaps someone else can have a go.
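A quick way to check this on a given interpreter is to write the same text twice with different `os.linesep` values and compare the bytes on disk. This is a sketch under the assumption that CPython 3's text layer picks its output separator per platform rather than consulting `os.linesep` at write time; `write_with_linesep` is a hypothetical helper name.

```python
import os
import tempfile


def write_with_linesep(linesep):
    """Write one line in text mode while os.linesep is temporarily set to
    `linesep`, and return the bytes that ended up on disk."""
    saved = os.linesep
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.linesep = linesep
        with open(path, "w") as f:
            f.write("a\n")
        with open(path, "rb") as f:
            return f.read()
    finally:
        # Always restore the real value and clean up the temp file.
        os.linesep = saved
        os.remove(path)


# If os.linesep were honoured, these two results would differ.
same = write_with_linesep("\n") == write_with_linesep("\r\n")
```

If `same` comes out true, the interpreter is ignoring the patched `os.linesep` for text-mode writes, which matches the failure described above.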

daira · 2015-10-19T01:22:32Z

I had another idea. Fixed now -- please review.

techtonik · 2015-10-19T10:03:01Z

https://github.com/python/cpython/blob/master/Lib/lib2to3/refactor.py#L540 explicitly converts the file to system-specific newlines without any option to turn this off! What is the ill logic behind that?


daira · 2015-10-19T10:55:32Z

Hmm, although the approach I used in #132 works, subclassing RefactoringTool has the advantage that we can also log which files changed, as needed for #127. So I'm leaning toward that approach now.

daira · 2015-10-20T12:28:48Z

I think we should use something like the auto-detection method in @techtonik's patch, rather than the -line-endings options. (@techtonik makes a good argument that they look like fixer options and someone might expect them to operate for files that are not otherwise fixed.) I see how to do that (and how to test it) now; I'll have time to work on it ~~tomorrow~~ this weekend.

techtonik · 2015-10-22T04:31:39Z

@daira you can take my patch and work on top of it.

daira · 2015-11-03T04:40:33Z

Actually it looks as though I won't have time to do this before I go on holiday on Thursday (until the 28th November). So someone else should probably look at it.

daira added bug problem with fissix labels Apr 13, 2015

techtonik mentioned this issue Oct 17, 2015

Release 0.5 #125

Closed

daira mentioned this issue Oct 18, 2015

121 add line ending options #129

Closed

takluyver mentioned this issue Oct 19, 2015

Line ending options #130

Closed

techtonik added a commit to techtonik/python-modernize that referenced this issue Oct 19, 2015

Code to preserve original linefeeds (issue PyCQA#121)

b5e8eb5

daira added a commit that referenced this issue Oct 19, 2015

Add options to override how line endings are written. fixes #121

6b63419

Signed-off-by: Daira Hopwood <[email protected]>

daira mentioned this issue Oct 19, 2015

121 add line ending options 1 #132

Closed
