Fix block scalar mangling bug #231 #232
803493d to 751a7e6
Pull Request Test Coverage Report for Build 4628587345
Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
💛 - Coveralls
Thank you for the quick PR, @wrouesnel. I'm a bit concerned about the duplication of work, though, since it will make the code slower.
If you find a way to fix this without that duplication, I'll be happy to merge it.
#Comment with some whitespace below
""" # noqa: W293
)
fixed_source = dedent(
If fixed_source == source, I think it will be cleaner to use result == source. If they are different, I was not able to tell them apart :S
@@ -481,6 +481,14 @@ def test_fix_code_functions_emit_debug_logs(
"Restoring jinja2 variables...",
"Restoring double exclamations...",
"Fixing comments...",
# Fixing comments causes a re-run of fixers, so we get duplicates from here |
I don't like re-running the previous fixers. Can't you fix the comments earlier so that we don't have to rerun everything?
I think doing this is really bad for the performance of the tool.
Looking at the code, weren't you able to tweak the existing
Check my comment on this issue in case it might help simplify the pull request
751a7e6 to 0b4c13b
The regex-based parsing for fixing comments was breaking block scalars. By using the ruyaml round-trip handler instead, the comment formatting can now correctly identify block scalars and avoid mangling them.
0b4c13b to 0ed70b0
The problem with solving this issue is that the fixer in question takes a line-oriented approach to processing the files, so it can't handle constructs that span multiple lines. The specific issue is here, in yamlfix/src/yamlfix/adapters.py, line 584 in 51410d6:
# Comment in the middle of the line, but it's not part of a string
if (
    config.comments_min_spaces_from_content > 1
    and " #" in line
    and line[-1] not in ["'", '"']
):
    line = re.sub(r"(.+\S)(\s+?)#", rf"\1{comment_start}", line)
This code has no way to know whether the "#" is part of a string, because it can't know whether the line is part of a block scalar, which has a number of different possible presentations. I did try several ideas for detecting block-scalar membership, but when you get right down to it, that's just re-implementing a half-baked YAML parser when there's a much better YAML parser already out there. IMO it's more important for the fixer to be correct (i.e. not change key values) rather than fast, and I can't see another way to handle this (one that isn't just trying to half-bake a YAML parser).
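To make the failure mode concrete, here is a minimal standalone sketch of that substitution applied line by line. The helper name fix_inline_comment and the min_spaces parameter are hypothetical stand-ins for the real fixer and for config.comments_min_spaces_from_content; only the condition and the regex are taken from the snippet quoted above.

```python
import re


def fix_inline_comment(line: str, min_spaces: int = 2) -> str:
    """Hypothetical helper: applies the quoted substitution to one line."""
    # Assumption: comment_start is min_spaces spaces followed by "#".
    comment_start = " " * min_spaces + "#"
    if min_spaces > 1 and " #" in line and line[-1] not in ["'", '"']:
        line = re.sub(r"(.+\S)(\s+?)#", rf"\1{comment_start}", line)
    return line


# The second line is content of a literal block scalar, not a comment.
source = "script: |\n  echo one # two\nkey: value # real comment"
fixed = "\n".join(fix_inline_comment(line) for line in source.splitlines())
print(fixed)
```

Because each line is inspected in isolation, the line inside the block scalar is rewritten in exactly the same way as the genuine trailing comment, which changes the value of the script key.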
I'm sorry, but I still feel that this PR adds a lot of complexity to the code just to fix #231. I don't see myself maintaining this code. Why don't you take a simpler approach: have another fixer run after the comments fixer that iterates over the source code and:
The regex-based parsing for fixing comments was breaking block scalars. By using the ruyaml round-trip handler instead, the comment formatting can now correctly identify block scalars and avoid mangling them.
Resolves #231