CLI: First proof-of-concept implementation of file dump for workchains.
This commit builds on the [pull request](aiidateam#6276) by @qiaojunfeng. It implements the following changes:

- The `_get_input_filename` function is removed.
- `workchain inputsave` is split up and modified: the recursive logic that traverses the `ProcessNodes` of the workchain is moved to `_recursive_get_node_path`, the directory creation to `_workchain_maketree`, and the file dumping to `workchain_filedump`.
- `_recursive_get_node_path` builds up a list of tuples of "paths" and `CalcJobNodes`. The "paths" are based on the layout of the workchain and use the `link_labels`, the `process_labels`, and the "iteration" counter in the `link_labels`.
  -> This might not be the best data structure here, but it allows the return value to be extended during recursion.
  -> Rather than using the `ProcessNodes` directly, one could also store only the `pks` and load the nodes when needed.
- In the `PwBandsWorkChain` used for development, the top-level processes had the `link_labels` set, so they were missing any numbering. Thus, I added it via `_number_path_elements`. Right now, this is a quick fix that only works for the top level, though such a function could possibly take care of the numbering of all levels. Ideally, one would extract the numbering directly from the data contained in the `WorkChain`, but I think that's difficult if some steps might be missing the iteration counter in their label.
- Eventually, I think it would be nice to be able to create just the empty directory tree, without dumping input/output files, so `_workchain_maketree` is somewhat of a placeholder for that.
- `calcjob_inputdump` and `calcjob_outputdump` are added to `cmd_calcjob`.

So far, there's not really any error handling, and the code probably contains quite a few issues (for example, the "path" naming breaks in complex cases like the `SelfConsistentHubbardWorkChain`). Still, I wanted to get some feedback and make sure I'm on a reasonable trajectory before generalizing and improving things. Regarding our discussion in PR aiidateam#6276: an implementation of a *complete* version that makes the steps fully re-submittable might be an additional, future step, for which @sphuber could hopefully provide me some pointers (for now, I added a warning about that).

The current commands don't require any plugin, only `core` and the data.

The result of `verdi workchain filedump <wc_pk> --path ./wc-<wc_pk>` for an example `PwBandsWorkChain`:

```shell
Warning: Caution: No provenance. The retrieved input/output files are not guaranteed to be complete for a full restart of the given workchain. Instead, this utility is intended for easy inspection of the files that were involved in its execution. For restarting workchains, see the `get_builder_restart` method instead.

./wc-3057/
├── 01-relax
│   ├── 01-PwBaseWC
│   │   └── 01-PwCalc
│   │       ├── aiida.in
│   │       ├── aiida.out
│   │       ├── _aiidasubmit.sh
│   │       ├── data-file-schema.xml
│   │       ├── _scheduler-stderr.txt
│   │       └── _scheduler-stdout.txt
│   └── 02-PwBaseWC
│       └── 01-PwCalc
│           ├── aiida.in
│           ├── aiida.out
│           ├── _aiidasubmit.sh
│           ├── data-file-schema.xml
│           ├── _scheduler-stderr.txt
│           └── _scheduler-stdout.txt
├── 02-scf
│   └── 01-PwCalc
│       ├── aiida.in
│       ├── aiida.out
│       ├── _aiidasubmit.sh
│       ├── data-file-schema.xml
│       ├── _scheduler-stderr.txt
│       └── _scheduler-stdout.txt
└── 03-bands
    └── 01-PwCalc
        ├── aiida.in
        ├── aiida.out
        ├── _aiidasubmit.sh
        ├── data-file-schema.xml
        ├── _scheduler-stderr.txt
        └── _scheduler-stdout.txt

9 directories, 24 files
```
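
For readers who want the gist of the recursion without reading the diff, here is a rough, standalone sketch of how a list of (path, node) tuples could be built up for a layout like the one above. It is not the code in this commit and does not use the AiiDA API: `FakeNode`, `short_label`, and `collect_paths` are hypothetical names made up for illustration, a node without children stands in for a `CalcJobNode`, and the numbering is taken from the position among siblings rather than from the iteration counter in the `link_labels`.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class FakeNode:
    """Hypothetical stand-in for an AiiDA process node (not the real API)."""

    process_label: str
    link_label: str = ""
    children: list = field(default_factory=list)


def short_label(process_label):
    """Abbreviate labels, e.g. 'PwBaseWorkChain' -> 'PwBaseWC'."""
    return process_label.replace("WorkChain", "WC").replace("Calculation", "Calc")


def collect_paths(node, parent, top_level=True):
    """Return a flat list of (path, leaf_node) tuples for all leaves below `node`."""
    results = []
    for index, child in enumerate(node.children, start=1):
        # Assumption: top-level steps are named by link label,
        # deeper ones by their abbreviated process label.
        if top_level and child.link_label:
            name = child.link_label
        else:
            name = short_label(child.process_label)
        path = parent / f"{index:02d}-{name}"
        if child.children:
            # Sub-workflow: recurse and extend the flat result list.
            results.extend(collect_paths(child, path, top_level=False))
        else:
            # Leaf: plays the role of a CalcJobNode in this sketch.
            results.append((path, child))
    return results


def calc():
    return FakeNode("PwCalculation")


def base(link_label=""):
    return FakeNode("PwBaseWorkChain", link_label, [calc()])


# Mock tree that mirrors the example PwBandsWorkChain layout.
root = FakeNode(
    "PwBandsWorkChain",
    children=[
        FakeNode("PwRelaxWorkChain", "relax", [base(), base()]),
        base("scf"),
        base("bands"),
    ],
)

pairs = collect_paths(root, Path("wc-3057"))
for path, node in pairs:
    print(path.as_posix(), node.process_label)
```

Returning a flat list keeps the recursion simple, since each level just extends the list it gets back from its children. Storing only `pks` instead of node objects would be a small change to the tuple contents.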