report some additional information on export of artifacts #249

Open
gregcaporaso opened this issue Aug 18, 2020 · 1 comment

Comments

@gregcaporaso
Member

Improvement Description
To facilitate linking of exported data to pre-export data provenance, it would be useful for export to print some additional information to the terminal. Users could direct that to a log file, if they're interested, or ignore it if they're not interested.

Current Behavior
Currently, export prints three things to stdout: the name of the file being exported, the format of the export, and the path it is being exported to. For example:

$ qiime tools export --input-path xyz.qza --output-path xyz
Exported xyz.qza as DNASequencesDirectoryFormat to directory xyz

Proposed Behavior
This could be expanded to include the UUID of the artifact being exported, the MD5 checksum of each exported file, and maybe even the QIIME 2 version that the export command was run with. Together, these would let an interested user link an exported file back to its provenance.

I store this information when I'm exporting artifacts for collaborators who need to do something with the data outside of QIIME 2. Reporting this could help us to promote good data management practices.

For example, matching the current text-based description, this could become something like:

$ qiime tools export --input-path xyz.qza --output-path xyz
Exported xyz.qza (UUID: 55a808d0-6713-4e27-b5f2-3f6a1ac87c52) as DNASequencesDirectoryFormat to directory xyz with QIIME 2 2020.2. MD5 sums of the exported files are as follows:
xyz/dna-sequences.fasta = 0876c93a44fb8a23b231be52fa9a7b68
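
To make the proposal concrete, here is a minimal sketch of how the report above could be assembled. It is not the existing export code: `md5_of_file` and `describe_export` are hypothetical helper names, and the UUID, format name, and QIIME 2 version are passed in as plain strings rather than read from an artifact, so the sketch only depends on the Python standard library.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_export(input_path, uuid, fmt, output_dir, version):
    """Build the proposed report text for one export (hypothetical helper).

    uuid, fmt and version are plain strings here (an assumption of this
    sketch); in practice they would come from the artifact being exported
    and the running QIIME 2 installation.
    """
    lines = [
        f"Exported {input_path} (UUID: {uuid}) as {fmt} to directory "
        f"{output_dir} with QIIME 2 {version}. "
        "MD5 sums of the exported files are as follows:"
    ]
    # Sort the files so the report is stable from run to run.
    for path in sorted(Path(output_dir).rglob("*")):
        if path.is_file():
            lines.append(f"{path.as_posix()} = {md5_of_file(path)}")
    return "\n".join(lines)
```

Calling `print(describe_export("xyz.qza", uuid, "DNASequencesDirectoryFormat", "xyz", "2020.2"))` after an export would produce one summary line followed by one `path = checksum` line per exported file. Reading in chunks keeps memory use flat for large exported files.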

We could also provide an option that presents that information in a tabular format, to facilitate creating a log by appending information from multiple export commands.
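
As a rough illustration of the tabular option, the sketch below appends one tab-separated row per exported file to a log, writing the header only when the log is new. The column names and the `append_export_log` helper are assumptions made for this example, not an existing QIIME 2 interface.

```python
import csv
import os

# Hypothetical column layout for the export log.
FIELDS = ["input_path", "uuid", "format", "qiime2_version", "exported_file", "md5"]


def append_export_log(log_path, records):
    """Append one row per exported file to a TSV log (hypothetical helper).

    records is an iterable of dicts keyed by the names in FIELDS.
    """
    # Write the header only when the log does not exist yet or is empty.
    needs_header = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    with open(log_path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="\t")
        if needs_header:
            writer.writeheader()
        writer.writerows(records)
```

Because the file is opened in append mode, running this after each export builds up a single log that can be loaded later as a table, with one header row at the top.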

@thermokarst
Contributor

I like it!
