Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI: Optional sorting for verdi code export #6345

Merged
merged 4 commits into from
May 13, 2024

Conversation

GeigerJ2
Copy link
Contributor

When running verdi code export, the generated yaml file is automatically sorted. However, the resulting alphabetical ordering is not what is typically found in the code yaml files in the aiida-code-registry and aiida-resource-registry. This PR turns the sorting off by default, and provides a command-line option to turn on the sorting.

If we don't find any other use cases for this flag, I'd also be fine to just turn off the yaml sorting alltogether, as I don't think this is the format that we'd usually want in the yaml file?

I found the previous behavior to be a slight annoyance, while setting up some unfamiliar codes by exporting them from an existing archive and modifying the resulting yaml files accordingly.

@GeigerJ2
Copy link
Contributor Author

Sry, forgot to update the tests. Will do so if this change is desired.

@sphuber
Copy link
Contributor

sphuber commented Apr 12, 2024

Even if you remove the explicit sorting, the actual order is not going to be the same as those on the registries, right? It would depend on whether the pydantic.BaseModel.fields have any particular ordering.

@GeigerJ2
Copy link
Contributor Author

GeigerJ2 commented Apr 16, 2024

Even if you remove the explicit sorting, the actual order is not going to be the same as those on the registries, right? It would depend on whether the pydantic.BaseModel.fields have any particular ordering.

That's a good point, which I didn't really consider. I just tested it, and it seems that when calling code.Model.model_fields as is done in the export function, they are returned in the same order as they are defined under class Model(BaseModel) of AbstractCode. This is then followed by the additional fields added by the derived classes (e.g. computer and filepath_executable for InstalledCode), while unset, optional fields are excluded in verdi code export.

When changing the ordering of the fields in the yaml file, importing the code, and then exporting it again, the initial, modified ordering from the yaml file is not retained, but the fields are automatically sorted, as described in the paragraph above. I think this should actually be the intended behavior, as it defines a standard ordering that derives from the actual implementation. If the sorting is different for yaml files in the registry, that should be the responsibility of the person who uploaded it there, and not the code export functionality of AiiDA. In any case, I don't think we can guarantee 100% the same sorting all the time, but at least with sort_keys=False in yaml.dump, append_text wouldn't be shown as the first field, which is something I found quite confusing with the previous behavior.

@sphuber
Copy link
Contributor

sphuber commented Apr 16, 2024

but at least with sort_keys=False in yaml.dump, append_text wouldn't be shown as the first field, which is something I found quite confusing with the previous behavior.

And perhaps others will find it surprising if the order changes and/or is not alphabetical. In the end though, these are just configuration files that are machine readable. Just thought that this change wouldn't be that impactful especially if the order is still not necessarily going to be uniform anywhere. But I don't really care that much about it going in either, so let's go ahead and add this change. Could you just please add a test for the option. And since this is (for now) just used in verdi code export not sure if it deserves a generic option. May as well just define it directly on the command for now

@GeigerJ2
Copy link
Contributor Author

GeigerJ2 commented Apr 16, 2024

And perhaps others will find it surprising if the order changes and/or is not alphabetical.

Yeah, I guess this just boils down to personal preference in the end. But at least having the option will be nice. Eventually, also having verdi computer export1 would be nice, so it might be an option there, as well. But for now it could just be specific for these two commands.

Footnotes

  1. Ofc being aware of the different implementation of a Computer as compared to a Code, and the need for setup and configure as separate steps. Might have a first look at this if I get around.

@GeigerJ2 GeigerJ2 force-pushed the code_export_sort branch 2 times, most recently from 0647449 to fc08e1d Compare May 7, 2024 09:31
@GeigerJ2
Copy link
Contributor Author

GeigerJ2 commented May 7, 2024

Alright, this should be ready for merge. I actually changed the option from --sort-outputs to --unsorted, with the default as false, so that people don't get surprised by the behavior suddenly changing. For testing via file_regression.check(), I added a new function, as I wasn't sure how to modify the existing test_code_export to check for the unsorted output.

Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @GeigerJ2 few comments

Comment on lines 239 to 245
@click.option(
'-u',
'--unsorted',
is_flag=True,
default=True,
help="Don't alphabetically sort output yaml configuration file.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this works? The default seems to be True so it will always be unsorted and there is no way to change the value. You would be better off to use a switch in this case rather than a flag. I would suggest

Suggested change
@click.option(
'-u',
'--unsorted',
is_flag=True,
default=True,
help="Don't alphabetically sort output yaml configuration file.",
)
@click.option(
'--sort/--no-sort',
default=False,
help="Sort the keys of the output YAML.",
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked and for me it works:
image
But why should it not work? The default being True means it's unsorted by default, leading to sort_keys=True in the call to yaml.dump() and as it is a flag, using the -u option should turn unsorted to False, meaning sort_keys=False in the call to yaml.dump().

I'd also be fine with your proposed change, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, using the --sort/--no-sort option in that case changes the default to being unsorted, which I wanted to avoid as people might be used to the output being sorted by default as that was the previous behavior. And it requires using the longer --sort option for actually getting the sorted file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default being True means it's unsorted by default, leading to sort_keys=True

That's not true though, right? If the default is True, then sort_keys=True as you say, but then the output will be sorted. I think it is kind of confusing to have it defined this way. If unsorted is True than you are sorting 🤔

The downside of this approach is still that it will be difficult to change the default. Imagine that we would want to change the behavior in the future to make the output unsorted by default, then you would either have to change the option name or force users to specify --unsorted if they want the output sorted 😅

The standard practice for these kinds of switches is to use the --option/--no-option switch. This will allow changing the default without having to change the options and it is a lot clearer as to what is happening.

Though, using the --sort/--no-sort option in that case changes the default to being unsorted, which I wanted to avoid as people might be used to the output being sorted by default as that was the previous behavior.

You can still set the default value to be sorted, i.e. do:

@click.option('--sort/--no-sort', is_flag=True, default=True)
def main(sort):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. Not sure why it seemed fine, but I just checked again, and the behavior is reverse, as expected from the actual code... 🙃 The advantage of the --option/--no-option switch makes sense, already seeing the confusion that was emerging now on my side. I'll update it with the --sort/--no-sort switch, then it should be ready to go. Thanks!

src/aiida/cmdline/params/options/main.py Outdated Show resolved Hide resolved
src/aiida/cmdline/params/options/main.py Outdated Show resolved Hide resolved
@@ -273,6 +273,27 @@ def test_code_export(run_cli_command, aiida_code_installed, tmp_path, file_regre
assert isinstance(new_code, InstalledCode)


@pytest.mark.usefixtures('aiida_profile_clean')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use parametrization to include the sort/not-sort variations in the existing test_code_export test. Something like

@pytest.mark.usefixtures('aiida_profile_clean')
@pytest.mark.parametrize('sort', (True, False))
def test_code_export(run_cli_command, aiida_code_installed, tmp_path, file_regression, sort):
"""Test export the code setup to str."""
if sort:
    options = ['--sort']
else:
    options = ['--no-sort']

I think the file_regression should be clever enough to create separate reference files for each parametrized test, but that I am not a 100% sure off. Pretty sure there should be a way though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I created the new function as the unsorted test was failing, as it was comparing it to the sorted reference file. Didn't consider using @pytest.mark.parametrize, though, that could make it work. Will try!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, tried this, and it works, file_regression is indeed smart enough. Based on my other comment, I'd vouch for the unsorted option, though.

Copy link
Contributor Author

@GeigerJ2 GeigerJ2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why my comments are part of an additional review 🙃 but I submit it so that they actually appear.

Comment on lines 239 to 245
@click.option(
'-u',
'--unsorted',
is_flag=True,
default=True,
help="Don't alphabetically sort output yaml configuration file.",
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked and for me it works:
image
But why should it not work? The default being True means it's unsorted by default, leading to sort_keys=True in the call to yaml.dump() and as it is a flag, using the -u option should turn unsorted to False, meaning sort_keys=False in the call to yaml.dump().

I'd also be fine with your proposed change, though.

@@ -273,6 +273,27 @@ def test_code_export(run_cli_command, aiida_code_installed, tmp_path, file_regre
assert isinstance(new_code, InstalledCode)


@pytest.mark.usefixtures('aiida_profile_clean')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I created the new function as the unsorted test was failing, as it was comparing it to the sorted reference file. Didn't consider using @pytest.mark.parametrize, though, that could make it work. Will try!

Comment on lines 239 to 245
@click.option(
'-u',
'--unsorted',
is_flag=True,
default=True,
help="Don't alphabetically sort output yaml configuration file.",
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, using the --sort/--no-sort option in that case changes the default to being unsorted, which I wanted to avoid as people might be used to the output being sorted by default as that was the previous behavior. And it requires using the longer --sort option for actually getting the sorted file?

src/aiida/cmdline/params/options/main.py Outdated Show resolved Hide resolved
GeigerJ2 added 3 commits May 8, 2024 13:44
When running `verdi code export`, the generated yaml file is
automatically sorted. However, the resulting alphabetical ordering is
not what is typically found in the code yaml files in the
`aiida-code-registry` and `aiida-resource-registry`. This PR turns the
sorting off by default, and provides a command-line option to turn on
the sorting.

If we don't find any other use cases for this flag, I'd also be fine to
just turn off the yaml sorting alltogether, as I don't think this is
the format that we'd usually want in the yaml file?

I found the previous behavior to be a slight annoyance, while setting up
some unfamiliar codes by exporting them from an existing archive and
modifying the resulting yaml files accordingly.
@GeigerJ2 GeigerJ2 force-pushed the code_export_sort branch from 368fae7 to 2ca74f6 Compare May 8, 2024 11:44
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @GeigerJ2 all good now

@sphuber sphuber merged commit 80c6068 into aiidateam:main May 13, 2024
19 checks passed
@GeigerJ2
Copy link
Contributor Author

As always, thanks for your help and assistance @sphuber, very much appreciated 🙏

@GeigerJ2 GeigerJ2 deleted the code_export_sort branch October 30, 2024 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants