This repository has been archived by the owner on Mar 7, 2024. It is now read-only.

WebDAV as Documents API contract? #11

Open
devonsparks opened this issue Jul 7, 2020 · 5 comments

Comments

@devonsparks

Forgive me if this is a silly question; I'm trying to get up to speed on the latest work here. Has the team considered using an existing neutral file management API like WebDAV for the Documents API? WebDAV gives most of what you'd expect to find in a file API (files, directories, movement commands, locking, etc.). Extensions like DeltaV support revision control too. Is there too much of an impedance mismatch between WebDAV and existing industry file systems? Is there too much overhead in implementing the full protocol?

@GeorgDangl
Member

Hi @devonsparks,

thank you for the input! So far, WebDAV has not been considered. I’m not familiar with the protocol, so I don’t yet have an opinion on this. The general approach, though, is to rely on easy-to-implement REST APIs, so I’m not sure whether a full file transfer protocol would be an appropriate integration for the openCDE group.

@devonsparks
Author

Thanks @GeorgDangl. Is it right to think that the Documents API intends to be a minimal interface over backend document databases that supports:

  1. Asking the system for a "bucket" in which the client may upload a (possibly modified) document based on one or more metadata fields (e.g., Type, Discipline, Phase)
  2. Uploading a document to that bucket (assuming the client has appropriate permissions to do so)
  3. Later retrieving one or more documents from one or more buckets by searching over the metadata fields

If so, a few questions around mechanics to check my understanding:

  1. What should implementers do if their backend system only generates document IDs after file upload events? What should the register-file-upload property of UploadSessionCreatedResponse return? Box's API might be a simple example to test the idea here.
  2. Has the "browse" API endpoint described in (3) and shown on the later slides of the Summit Doc been fleshed out yet?
  3. Who decides what the "filing" criteria (Type, Discipline) in the select-documents endpoint is?

I appreciate the intent to simplify the user experience, so I'm just eager to dig in and work out the details :)

Thanks!

@ykulbak
Collaborator

ykulbak commented Jul 13, 2020

@devonsparks thank you for your suggestion.

The reason we haven't considered WebDAV for the documents API is that WebDAV, if I understand correctly, is designed for file and file system management but not for document management. Your third question, "Who decides what the "filing" criteria (Type, Discipline) in the select-documents endpoint is?", touches the heart of the problem: document management and control in the construction industry requires comprehensive document metadata management, which is not supported by WebDAV. The document metadata problem is made even harder by the substantially different document control paradigms supported by different vendors.

The documents API is currently designed to allow exchanging files without standardising any aspect of document metadata. Furthermore, standardising document metadata has been expressly made a "non-goal" for the initial versions of the documents API.

@devonsparks
Author

Hey @ykulbak - thanks for the notes. Makes sense. WebDAV does support resource metadata through the PROPFIND and PROPPATCH methods. It's not clear to me one way or the other whether they're sufficient for a majority of AECO document management workflows. Just figured I'd ask :)
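For reference, here is a rough sketch of what the request bodies for those two methods could look like when carrying document metadata. PROPFIND and PROPPATCH and their `DAV:` XML elements come from the WebDAV specification; the `urn:example:aeco-metadata` namespace, the property names `Discipline` and `Phase`, and the helper function names are hypothetical and not defined by WebDAV or the Documents API.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Hypothetical namespace for custom document metadata (an assumption for illustration).
META_NS = "urn:example:aeco-metadata"

ET.register_namespace("D", DAV_NS)
ET.register_namespace("m", META_NS)


def build_proppatch(properties: dict) -> bytes:
    """Build a PROPPATCH body that sets custom (dead) properties on a resource."""
    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop_el = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    for name, value in properties.items():
        ET.SubElement(prop_el, f"{{{META_NS}}}{name}").text = value
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_propfind(names: list) -> bytes:
    """Build a PROPFIND body that asks for specific custom properties by name."""
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop_el = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    for name in names:
        ET.SubElement(prop_el, f"{{{META_NS}}}{name}")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    print(build_proppatch({"Discipline": "Structural", "Phase": "Design"}).decode())
    print(build_propfind(["Discipline", "Phase"]).decode())
```

A client would send these bodies with the `PROPPATCH` and `PROPFIND` HTTP methods against a resource URL. Whether flat name/value properties like these are expressive enough for a given document control workflow is exactly the open question.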

Would you ever consider placing slightly stronger constraints on the Documents API? Currently the /select-documents endpoint returns an HTML document representing a UI to assist the user in document filing. This approach isn't amenable to form processing by machine, as @bigdoods points out in #4, because clients can't tell what the data contract of the resulting HTML form might be. What if instead a GET /select-documents returned, say, a JSON-Schema description, where each JSON-Schema attribute matched a field in the associated form? Machines would be able to read this schema directly, including hypermedia links to related resources. Those looking for a better (human) user experience could bolt on one of the many available JSON-Schema-based form UI libraries (like uniforms) to dynamically generate the document filing form at runtime. The same schema definition could then be used by humans or machines for document filing. Submitted forms could have Content-Type multipart/form-data, where the first part holds the form instance data validated against the form's JSON-Schema, while the second part holds the binary data of the file. Thoughts on the general approach? I'm just targeting some way to keep the filing requirements to a minimum while still ensuring the document selection can be driven programmatically. Thanks!
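To make the proposal concrete, here is a minimal sketch of the client side under stated assumptions. The schema contents, the part names `metadata` and `file`, and the helpers `validate` and `build_multipart` are all hypothetical; nothing here is part of the Documents API specification. The validator only checks `required`, string `type`, and `enum`, where a real client would presumably use a full JSON Schema validator.

```python
import json
import uuid

# Hypothetical schema a server might return from GET /select-documents.
FILING_SCHEMA = {
    "type": "object",
    "properties": {
        "Type": {"type": "string", "enum": ["Drawing", "Specification"]},
        "Discipline": {"type": "string"},
    },
    "required": ["Type", "Discipline"],
}


def validate(instance: dict, schema: dict) -> list:
    """Return a list of error messages; an empty list means the instance passed."""
    errors = []
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required field: {name}")
    for name, value in instance.items():
        rules = schema.get("properties", {}).get(name)
        if rules is None:
            continue  # unknown fields are ignored in this sketch
        if rules.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{name} must be a string")
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name} must be one of {rules['enum']}")
    return errors


def build_multipart(instance: dict, filename: str, file_bytes: bytes, boundary: str = ""):
    """Build a two-part multipart/form-data body: JSON metadata first, file second.

    The filename is inserted as-is, so it must not contain quotes or line breaks.
    """
    boundary = boundary or uuid.uuid4().hex
    metadata_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="metadata"\r\n'
        "Content-Type: application/json\r\n\r\n"
        f"{json.dumps(instance)}\r\n"
    ).encode("utf-8")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = metadata_part + file_header + file_bytes + closing
    return f"multipart/form-data; boundary={boundary}", body


if __name__ == "__main__":
    form = {"Type": "Drawing", "Discipline": "Structural"}
    problems = validate(form, FILING_SCHEMA)
    if not problems:
        content_type, body = build_multipart(form, "plan.pdf", b"example file bytes")
        print(content_type, len(body))
```

The same `FILING_SCHEMA` could drive a generated form for human users, while a machine client validates locally and posts the body with the returned `Content-Type` header.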

@ykulbak
Collaborator

ykulbak commented Jul 16, 2020

Hi @devonsparks, your observation about the current Documents API specification is absolutely correct. As currently specified, the Documents API only caters for interactive use cases where a human can navigate the websites presented by GET /select-documents or POST /upload-documents.

We believe that interactive use cases are important enough to stick with the dedicated, simplistic exchange. For this reason @bigdoods is now leading a subgroup which is working towards a draft specification for the machine-to-machine (non-interactive) use cases. We will be looking for opportunities to align both specifications once the subgroup completes its work.

Please contact @bigdoods to join the subgroup. Your suggestions are interesting and I'm sure that your contribution would be greatly appreciated.
