Spaces in default value #46

Open
barbagus opened this issue Jan 10, 2023 · 4 comments

@barbagus

I would like to set a default value containing spaces, such as " - " (space, hyphen, space). It does work:

--name-sep=<sep>  name field separator [default:  - ]   

But it is not pleasing to the eye and is prone to misinterpretation. I don't know what the standard is here, but I would imagine accepting a quoted default value, as in:

--name-sep=<sep>  name field separator [default: " - "]   

That could be one way to go. Or am I missing something?

@NickCrews
Contributor

This seems like a reasonable idea, but I would want to get it right, and there are a few rough cases:

  • How to do escaping. For instance, I might want the string Dwayne "The Rock" Johnson. The rule should be easy to explain.
  • How to deal with a ] inside the value. E.g., if the string contains a ], then the current re.findall(r"\[default: (.*)\]") would fail. Of course, this also currently fails if you want a ] in your default, but currently it is a bit more obvious that this should fail. Once you add quoting, I think we would need this to work.

Once we figure those out, and add tests and docs, then it sounds great!

@barbagus
Author

Indeed, I see the Pandora's box of string quoting.

The way I implemented it in #47 cowardly avoids having to deal with it. It works as before, with re.findall(r"\[default: (.*)\]"), but adds a check after a successful match: if the match starts and ends with the same quote character (both single or both double), then it strips them, and that's it.

This example just works and results in Dwayne "The Rock" Johnson:

--main-actor=<name>    name of the movie's main actor  [default: Dwayne "The Rock" Johnson]

This might not be intuitive or elegant but it is easy to use and always has a solution, even for strings that do start and end with quotes:

  • enclose it with "the other" quote character [default: '"The Rock"']
  • enclose it with "the same" quote character [default: ""The Rock""]
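
As a rough illustration of the behavior described above, here is a minimal sketch. It is not the code from #47: the helper name default_from_description is hypothetical, and only the regular expression is taken from this thread.

```python
import re

# Pattern quoted in this thread; everything else here is an illustrative assumption.
DEFAULT_RE = re.compile(r"\[default: (.*)\]")


def default_from_description(description):
    """Hypothetical helper: return the declared default value, or None."""
    matches = DEFAULT_RE.findall(description)
    if not matches:
        return None
    value = matches[0]
    # Strip one pair of quotes only when both ends carry the same quote character.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ('"', "'"):
        value = value[1:-1]
    return value


print(repr(default_from_description('name field separator [default: " - "]')))
print(repr(default_from_description("nickname [default: '\"The Rock\"']")))
```

Only the outermost pair is removed, which is why wrapping a quoted string in a second pair of quotes keeps the inner pair.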

However, confusion may arise when users of the library do not expect quoting to be implemented and, expecting "The Rock" with the quotes included, write this:

--main-actor-nickname=<name>   name of the movie's main actor  [default: "The Rock"]

Possible ways around it would be to:

  1. specifically enable this feature when needed: arguments = docopt(__doc__, quoted_defaults=True)
  2. document it properly and expect developers using the library to apply extra care when dealing with quotes anyway

@barbagus
Author

As for strings containing ], because .* is greedy, I reckon it doesn't pose any problem. The following example already works as expected (resulting in []), with or without quoting.

--group-enclosure=<pair>             open/close grouping characters [default: []]
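
The greedy match can be checked directly with the pattern quoted in this thread. The second description line is a made-up edge case that was not raised in the discussion, included only to show where the greedy match ends.

```python
import re

pattern = r"\[default: (.*)\]"  # pattern quoted in this thread

# Greedy .* extends to the last ] on the line, so the inner ] is kept.
brackets = re.findall(pattern, "open/close grouping characters [default: []]")
print(brackets)  # ['[]']

# Assumed edge case: a later ] on the same line is swallowed as well.
trailing = re.findall(pattern, "separator [default: x] (see [1])")
print(trailing)  # ['x] (see [1']
```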

@NickCrews
Contributor

Sorry for taking so long, but I think this has a path to life.

specifically enable this feature when needed: arguments = docopt(__doc__, quoted_defaults=True)

This is a reasonable idea, but I want docopt to be as opinionated and un-optioned as possible. Let's just go with the one least-bad behavior. I think it can be explained as

Quotes are not required (anything after `default: ` is taken as the value), but they *can* be used to make whitespace or other characters more clear.
If a default value is quoted (after stripping whitespace, begins and ends with `"` or begins and ends with  `'`), then those quotes are removed:
- `default:  leading space` -> ` leading space`
- `default: ' leading space'` -> ` leading space`
- `default:                 "Dwayne "The Rock" Johnson"           ` -> `Dwayne "The Rock" Johnson`

OK, I think that expected behavior you laid out seems reasonable. If you implement these test cases, I will accept #47:

[
    pytest.param(' leading space', ' leading space', id="leading_space_unquoted"),
    pytest.param('" leading space"', ' leading space', id="leading_space_quoted"),
    pytest.param(' "whitespace before"', "whitespace before", id="whitespace_before"),
    pytest.param('"whitespace after" ', "whitespace after", id="whitespace_after"),
    pytest.param('Dwayne "The Rock" Johnson', 'Dwayne "The Rock" Johnson', id="basic"),
    pytest.param('"Dwayne "The Rock" Johnson"', 'Dwayne "The Rock" Johnson', id="nested"),
    pytest.param('"Dwayne \'The Rock\' Johnson"', "Dwayne 'The Rock' Johnson", id="nested_mixed"),
    pytest.param('"The Rock"', "The Rock", id="unneeded_quotes"),
    pytest.param('"The Rock', '"The Rock', id="leading"),
    pytest.param('The Rock"', 'The Rock"', id="trailing"),
    pytest.param('"Dwayne "The Rock" Johnson', '"Dwayne "The Rock" Johnson', id="leading_with_inner"),
    pytest.param('Dwayne "The Rock" Johnson"', 'Dwayne "The Rock" Johnson"', id="trailing_with_inner"),
    pytest.param('"The Rock\'', '"The Rock\'', id="mixed"),
    pytest.param('[]', "[]", id="unquoted_brackets"),
    pytest.param('"[]"', "[]", id="quoted_brackets"),
    pytest.param('"]"', "]", id="quoted_trailing_bracket"),
]
...
doc = "default: " + inp

Please note the whitespace_before and whitespace_after cases; we haven't talked about those yet. I think if there is some sneaky extra whitespace before/after a quote, it should get stripped, but I'm open to counterarguments.

If you can think of other ones, please add them!
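
For reference, one possible reading of the rule above, sketched as a standalone function. The name unquote_default and the CASES table are hypothetical and are not part of docopt or #47; the function takes the raw text that follows "default: " and checks for quotes only after stripping surrounding whitespace.

```python
def unquote_default(raw):
    """Hypothetical sketch: drop one pair of matching quotes, ignoring outer whitespace."""
    stripped = raw.strip()
    is_quoted = (
        len(stripped) >= 2
        and stripped[0] == stripped[-1]
        and stripped[0] in ('"', "'")
    )
    # Quoted: whitespace outside the quotes is discarded along with the quotes.
    # Unquoted: the raw text is returned untouched, leading space included.
    return stripped[1:-1] if is_quoted else raw


# (input, expected) pairs mirroring some of the cases listed above.
CASES = [
    (" leading space", " leading space"),
    ('" leading space"', " leading space"),
    (' "whitespace before"', "whitespace before"),
    ('"whitespace after" ', "whitespace after"),
    ('"Dwayne "The Rock" Johnson"', 'Dwayne "The Rock" Johnson'),
    ('"The Rock', '"The Rock'),
    ('"The Rock\'', '"The Rock\''),
    ('"]"', "]"),
]

for raw, expected in CASES:
    assert unquote_default(raw) == expected, (raw, expected)
```

Under this reading, an unquoted value keeps its leading whitespace, while whitespace outside a quoted value is dropped, which matches the whitespace_before and whitespace_after cases.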
