Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing only the XML output (of PW) #353

Open
borellim opened this issue Jul 9, 2019 · 3 comments · May be fixed by #940
Open

Parsing only the XML output (of PW) #353

borellim opened this issue Jul 9, 2019 · 3 comments · May be fixed by #940
Assignees

Comments

@borellim
Copy link
Member

borellim commented Jul 9, 2019

This issue is open-ended and meant for tracking our discussion on the topic.

Now we parse both XML and textual output of PW, and then we merge the two sources of information. Textual output, being less structured, is harder to parse and to validate, so we would like to only use the XML output. However, removing the ability to parse text creates a backwards-incompatibility problem, because:

  • versions between 6.0 and 6.1 (included) only enable the new XML format on request (when configured with --enable-xml or compiled -D__XSD), otherwise they use the old XML format;
  • versions before 6.0 only output in the old XML format;
  • in general, the new XML format/schema has improved with time, so older versions may have missing or incorrect information (which we must parse from the text).

One approach could be to start ignoring the text output from a certain version of QE onwards, and keep the rest of the code for backwards compatibility. After some time, old versions of QE can be deprecated and the text-parsing code removed completely.

One concern to keep in mind is our desire to keep consistent the results of the parser between different versions of the plugin; this may further constrain our choices.

@sphuber
Copy link
Contributor

sphuber commented Feb 9, 2023

@mbercx I think it is time to bite the bullet on this one. If the comment above is correct and all versions of QE v6.2 and above will always have the XML with the schema file, and the latest version of aiida-quantumespresso only supports QE v6.6 and up, we can ditch parsing from the stdout altoghether, correct?

@mbercx
Copy link
Member

mbercx commented Feb 14, 2023

Possibly, but we have to check if all of the information parsed from the stdout is currently in the XML. One example that came to mind immediately was the site charge and magnetization, which was only in the stdout a while back. It seems to have been added now though.

An exercise I still wanted to do is make a list of outputs that are still parsed from the stdout, and see which we can parse from the XML instead. Then we can also request that these outputs are also added to the XML from the QE folks.

EDIT: When making such a list, we should also add the QE version that the XML output was added, to make sure we maintain compatibility as promised. I think it only makes sense to add XML parsing in case:

  1. All QE versions we support have the XML output.
  2. The stdout parsing is broken in a new QE version, but XML output is provided.

@sphuber
Copy link
Contributor

sphuber commented Feb 14, 2023

Just FYI, I have been working on this already and almost have something ready. Will open a PR soon and then we can discuss online. But have some other pressing stuff that I need to deal with first now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants