-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Video files organize #841
Video files organize #841
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Saksham20!
Very preliminary review to possible help align ongoing development. Extra overall
- I think we need docstrings for all those functions since even though name is descriptive somewhat, the purpose and what is returned is not clear
- also please use type annotations
- Need to come up with unittest(s)
Codecov Report
@@ Coverage Diff @@
## master #841 +/- ##
==========================================
+ Coverage 86.57% 86.65% +0.08%
==========================================
Files 61 61
Lines 7291 7435 +144
==========================================
+ Hits 6312 6443 +131
- Misses 979 992 +13
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
…elative renaming of video files wrt to nwbfiles
@bendichter @yarikoptic @rly @oruebel This is the strategy that I am currently thinking of to implement supporting video files on dandi: Currently there are 3 CLI options during the
Since we would need to change the nwbfile (rewrite the
|
todo:
|
@Saksham20 - have you considered the external name in NWB to be some form of a template possibly following a URI scheme. for example: the potential advantage of such a scheme is that it could use either relative or absolute paths. an appropriate tool (e.g., pynwb, dandi cli) could also determine how to interpret host properly. this does require interpreting the location, but even currently the user has to do that. |
@satra (and attn @bendichter who was participating straight from an airport seat) In NWB-DANDI sync call we arrived at an agreement that the best way to proceed ATM would be
|
@yarikoptic - does that mean no possibility for templating? |
I would say, no plan to support it ATM. Possibility can always exist! ;-) what particular extra use cases, in presence of sidecar file support, templating would allow for? |
@yarikoptic - just a quick pointer to an old comment :) - #70 (comment) - more seriously, external files are not interpreted by pynwb at present as anything other than a string right? as in there is no representation of seeking into a file, or that it is a file. i'm assuming that the link could be a url and pynwb won't complain. so having a template which is resolved by a reader would be fine without even any changes to pynwb (i think). and this would allow for external referencing of a related file and potentially even bytestream and offset without the need to rewrite any component of the file. i think this becomes a better standard operating procedure moving forward than a fixed file. if one wanted to encode specific version, one could optionally embed the checksum into the link as well. using a schema uri would offer a lot of benefits in the long run i think and would still support an absolute path in the short term. |
This is true only for the special case of ImageSeries types which define an external reference to a non-HDF5 file. External links to HDF5 files are resolved automatically by HDF5.
In principle yes, but this is only because PyNWB currently does not check that the paths are valid (but it arguably should). In terms of the schema, I would argue that this sort of thing would require an update to the schema, as it explitly states this should be "Paths to one or more external file(s)."
Could you provide an example of how such a template would look like (i.e., what are the components of the template) and how a reader (e.g., PyNWB) would be able to determine the parameters for the template from the file without additional input from the user? |
Wouldn't the URI change once uploaded on DANDI? Then we would have to rewrite the value in the nwbfile |
@Saksham20 - the ideal scheme would require no renaming inside the nwb file at all. i'm not sure we can achieve ideal, but some examples are below.
sure @oruebel . it may be useful to read this article on URIs and browse this list of schemes. if you look at the list of schemes, what a scheme is completely up to us to define and convey. example 1: locally pynwb or other reader would interpret this as example 2: more bids like: basebidsname would remove the suffix, which is about technique/assay type typically in bids. example 3: add offset example 4: relative path: example 5: add checksum: my use of example 5alt: add checksum: example 6: absolute file |
@satra thanks for those examples. It wasn't clear to me that you were thinking about URI schemes. It seems this is effectively means designing a REST API around
Does this mean, the user would need to know the original source of the file (e.g., that this file was orignially downloaded from DANDI) to interpret the path? |
@oruebel - i'm not sure this would require a REST API. this is more like jinja templating. regarding the link between files for local or downloading, that's going to be the case independent of this suggestion. if you are local, the external file needs to be in a specific location relative to the current file (unless an absolute path is used). if the relative or absolute path changes, the link breaks. if you are remote same thing holds and something needs to know/represent where the other file is. in either case a reader can inform the user what the filename should look like, so in the worst case scenario, a user could then download the file and place it manually.
i don't quite follow what you mean by a sidecar file? are you talking about an external json file? or the video file itself. and the path scheme can be as simple as pointing to a file, but can be as complex as providing a template for a file. |
Sorry, I was referring to the ability to update datasets/attributes via a JSON file, that we are discussing here hdmf-dev/hdmf#677 |
@Saksham20 This PR previously contained a function for validating a movie file using cv2, which was removed when PR #853 was merged into this branch. Was the removal intentional or accidental? |
@jwodder we decided to remove that in order to make this PR smaller and easier to integrate. That section of code added opencv as a requirement and we thought that might be too heavy for this package. We are happy to add that back in a future PR. |
Co-authored-by: John T. Wodder II <[email protected]>
Co-authored-by: John T. Wodder II <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments.
Co-authored-by: Yaroslav Halchenko <[email protected]> Co-authored-by: John T. Wodder II <[email protected]>
Thank you @Saksham20 !!! All comments were addressed, @jwodder has approved the PR. I can now only do the honors of pressing the green button ;-) |
fixes #785
This PR adds the functionality to upload on DANDI additional files of different extensions that are be linked to the nwbfile with the
external_file
argument.Prior to upload this requires editing the nwbfile's
external_file
argument to reflect the new location of the video in the DANDI re-organized folder structure as it would be on the cloud. This would keep external links consistent for another user that downloads that dandiset. This is a multi-step process: we need a strategy to organise the video files likeDandi organise
, edit the nwbfile and then upload this.