Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds a --metadata-only option to pack #72

Merged
merged 18 commits into from
Jan 25, 2024
Merged

adds a --metadata-only option to pack #72

merged 18 commits into from
Jan 25, 2024

Conversation

glyg
Copy link
Contributor

@glyg glyg commented Dec 5, 2023

Here is a PR related to #69

Sorry for the messy diff, I forgot to disable automated black formatting in my editor. I'll try to revert that and just show the additions tomorrow.

Mainly I just changed from the line 494 to the end of the function

I added a very naïve test, but I did not yet manage to have the test suite running.

Copy link
Contributor Author

@glyg glyg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I highlighted the relevant changes

README.md Outdated
Comment on lines 64 to 71
`--server` creates the transfer.xml file but does not copy data
or generate an archive. The last cli argument is the path where the `transfer.xml`
file will be written

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new documentation in README

README.md Outdated

Examples:
```
omero transfer pack Image:123 transfer_pack.tar
omero transfer pack --zip Image:123 transfer_pack.zip
omero transfer pack Dataset:1111 /home/user/new_folder/new_pack.tar
omero transfer pack 999 tarfile.tar # equivalent to Project:999
omero transfer pack --server Dataset:1111 /home/user/new_folder
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new example

Comment on lines 248 to 250
pack.add_argument(
"--server",
help="Only generate the xml file, don't create the archive",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argument added

Comment on lines 463 to 420
if args.server and args.simple:
raise ValueError("The --server and --simple options are incompatible")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--server and --simple are mutually exclusive (I think?)

Comment on lines 494 to 503
if not args.server:
tar_path = Path(args.filepath)
folder = str(tar_path) + "_folder"
else:
tar_path = Path(args.filepath)
if tar_path.suffix:
folder = os.path.splitext(tar_path)[0]
print("Output will be written to {folder}")
else:
folder = tar_path
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filepath argument change

Comment on lines 524 to 470
if not args.server:
print("Starting file copy...")
self._copy_files(path_id_dict, folder, self.gateway)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't copy

Comment on lines 538 to 577
elif not args.server:
self._package_files(os.path.splitext(tar_path)[0], args.zip, folder)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't pacakge

@glyg glyg changed the title adds a --server option to pack adds a --metadata-only option to pack Dec 6, 2023
@erickmartins
Copy link
Collaborator

That was fast! I have a busy day today but will have a closer look tomorrow. Thanks a lot!

README.md Outdated
`--metadata` allows you to specify which transfer metadata will be saved in `transfer.xml` as possible MapAnnotation values to the images. Defaults to image ID, timestamp, software version, source hostname, md5, source username, source group.
`--metadata` allows you to specify which transfer metadata will be saved in `transfer.xml` as possible MapAnnotation values to the images. Defaults to image ID, timestamp, software version, source hostname, md5, source username, source group.

`--metadata_only` creates the transfer.xml file but does not copy data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this parallelism, I wonder if this is actually a --binary flag similar to the --metadata flag which takes options: files (default), none, and then in the future we could ask it to convert to OME-TIFF/OME-Zarr, for example. That doesn't help with the final argument handling or the exclusivity with simple/rocrate/etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here "binary" means all data that is not metadata?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, "big things that might be downloaded". File attachments might end up being a special case with my other examples (OME-TIFF/Zarr). This is certainly where --server was more straight-forward because it meant, "I have access to all of the binaries and can do what I want with them"

Copy link
Collaborator

@erickmartins erickmartins Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at some point if we decide to be granular about which kinds of annotations are saved with the images as well. maybe 3 similar options covering what can be stored (--metadata, --annotations, --binaries) is the way to go?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that, with there just being some confusing overlap between metadata and annotation (and in the case of FileAnnotations, all three)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the confusion is lessened if this is ordered metadata < annotation < full metadata would then be a single file (xml, tsv, jsonld) and annotation everything but the binaries.

I don't see why you would want only annotations or binaries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK where ROIs sit though

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will probably need some rework in the near future anyway (current --metadata would probably be better as --provenance or something like that) and there are good questions (ROIs? FileAnnotations? Where do they fit?). But in general I like --binaries as a solution for now ("none" would fall into the case for this PR, "all" would be default).

@joshmoore
Copy link
Member

A minor heads up that my test of:

omero transfer pack --binaries=none Image:42665 transfer.tar

was slightly confusing in that it didn't tar but created the "transfer" directory.

@glyg
Copy link
Contributor Author

glyg commented Jan 5, 2024

Yes, this is not ideal! Maybe in that case (if the last positional argument ends with .tar a single file archive should be created?

@erickmartins
Copy link
Collaborator

Yes, this is not ideal! Maybe in that case (if the last positional argument ends with .tar a single file archive should be created?

that makes sense to me. (in fact, this should probably be true for everything: if it ends in .tar it generates a tar file, if it ends in .zip it generates a zip, if it's a folder path it creates that folder (with relevant files inside). Future improvements!

@glyg
Copy link
Contributor Author

glyg commented Jan 23, 2024

Hi @erickmartins I can't find how to re-run the tests, it does not look like something I can fix on my side.

Is there anything I should do now to advance the PR? I can implement what @joshmoore was saying about argument parsing

@joshmoore
Copy link
Member

Assuming #75 is merged in or this is rebased on top, then I'd assume:

git clone [email protected]:ome/omero-test-infra .omero
.omero/docker cli

should run the tests. (@jburel can let us know if a special branch of omero-test-infra is needed)

@jburel
Copy link
Member

jburel commented Jan 23, 2024

the tests are now passing in #75.
When #75 is merged, this PR should be green after a rebase

@erickmartins
Copy link
Collaborator

#75 is now merged. Note that you do need to set an environment variable with POLICY_BINARY_ACCESS=+read,+write,+image,+plate to get tests to pass locally.

@jburel
Copy link
Member

jburel commented Jan 24, 2024

@glyg you need to rebase, the files have now been removed. This repository uses the upstream omero-test-infra

@glyg
Copy link
Contributor Author

glyg commented Jan 24, 2024

Yes sorry, rebasing now

@jburel
Copy link
Member

jburel commented Jan 24, 2024

strange that f073695 is necessary I will have assumed that this should be included with the rebase

@glyg
Copy link
Contributor Author

glyg commented Jan 25, 2024

might be my bad, I solved a lot of merge conflict, maybe I wrongly rm'ed .omero at some point

@erickmartins erickmartins merged commit 97cf7ef into ome:main Jan 25, 2024
1 check passed
@glyg glyg deleted the server branch January 25, 2024 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants