This repository has been archived by the owner on Apr 13, 2022. It is now read-only.

Note that only empty containers can be deleted #172

Closed
michielbdejong opened this issue May 17, 2019 · 28 comments

Comments

@michielbdejong
Contributor

I added that note in the WAC spec, but it actually belongs here in the api-rest document.

The LDP spec seems to leave this out of scope.

This implies a 'no' to the question of whether deleting a container should also delete all its members and members of members (descendants).

@michielbdejong
Contributor Author

we discussed this in the weekly meeting, and I think we have consensus

@RubenVerborgh
Contributor

we discussed this in the weekly meeting, and I think we have consensus

Being? 🙂

@fabiancook

I feel like the specification should allow this. Oftentimes you need to do a recursive delete of all information within a container; for example, a pod provider deleting a directory because the agent who owns that container has made a request to be forgotten.

Instead of stating that a container can't be deleted if it has members, state that an agent without sufficient access to delete all members of a container cannot delete the container. This would mean an agent could delete all members within a container in one request, provided they have sufficient access to each. An example of where this would be relevant: an agent requests all their information to be deleted, but someone is processing a payment that was verified beforehand, so the agent can't delete their bank account, because an associated payment resource tied to that payment method belongs to the person being paid.

This is a really rough example of where this would be super handy, but I think it warrants consideration as to how it fits into the discussion. One thing we need to support from the specification up is agents' rights, and to consider exactly what should always be available to them, no matter who the provider is. One of those rights, I believe, is the right to move across the network with zero lock-in, whether that is the agent doing a general migration between machines or moving between providers.

@michielbdejong
Contributor Author

@fabiancook the specification does allow deleting an entire subtree by deleting resources one by one; it will just take multiple DELETE requests instead of a single one.
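The one-by-one approach above can be sketched as a client-side post-order walk: delete all descendants first, so each container is empty by the time its own DELETE is issued (rmdir semantics). This is a minimal sketch, assuming an in-memory `Map` as a stand-in for the pod; real code would issue one HTTP DELETE per resource, and the names here are hypothetical.

```javascript
// Depth-first, post-order subtree deletion against a server that only
// deletes empty containers. `store` maps a container URL to its member
// URLs; leaf resources have no entry. Real code would send HTTP DELETEs.
function deleteSubtree(store, url) {
  const members = store.get(url) || [];
  for (const member of members) {
    deleteSubtree(store, member); // descendants first
  }
  store.delete(url); // container is now empty, so a single DELETE succeeds
}
```

Note that each resource costs one round trip, which is exactly the overhead fabiancook objects to below.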

@fabiancook

I'm aware of being able to do multiple deletes, but it feels like overstepping at the spec level when the only technical limitation seems to be a member having different access requirements. If the agent can delete all members anyway, they should be allowed to do it in one request.

@michielbdejong
Contributor Author

they should be allowed to do it in one request

OK, then we disagree! :)
So let's go forward as follows:

  • we document what node-solid-server, Trellis and inrupt pod-server all implement, namely that a container delete does not delete the subtree, but only deletes the container itself, and only if it's empty (like rmdir)
  • we create a separate issue to document your proposal (subtree is also deleted, like rm -rf) as a proposed spec change, following the regular process for proposed spec changes.

would that be ok with you?

@acoburn
Member

acoburn commented May 27, 2019

Just be aware that a recursive delete (rm -rf) has some thorny issues when doing this at scale on a distributed platform. At some level, a recursive delete will require/assume some degree of consistency and atomicity (i.e. locking) across the entire operation. Performing this operation over a small container when the server is implemented on a single node over a filesystem is really not much of an issue; when there are millions of sub resources spread across multiple nodes, this becomes extremely difficult.

The other issue is that consistency of behavior is important across implementations. If DELETE operations behave very differently on different implementations of Solid, that will make interoperability much more complicated.

By restricting DELETE operations to empty containers, this does make DELETE operations more onerous on the client if the intent is to remove an entire sub-tree, but it also makes the interaction more explicit.

@fabiancook

would that be ok with you?

Yep, that's perfect. Below are some more details, as well as details for each implementation.


Recursive delete can be optional: a delete of that container would return something like 202 Accepted if it's a background task, 204 No Content if it's already done, or 405 Method Not Allowed if we can't delete that container; if 405 is returned, fall back to recursive delete from the client.
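The fallback described above could look like this on the client: send one DELETE and branch on the status code. A hedged sketch only; these status-code meanings come from this comment, not from any spec, and the function name is made up (later in the thread, 409 is suggested instead of 405 for the refusal case).

```javascript
// Interpret the status of a single DELETE on a container, per the
// optional-recursive-delete proposal in this comment (not spec text).
function interpretDeleteStatus(status) {
  switch (status) {
    case 202: return "accepted";          // server deletes subtree in background
    case 204: return "done";              // subtree already deleted
    case 405: return "recurse-on-client"; // server refuses; client deletes members itself
    default:  return "error";
  }
}
```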

From the LDP spec's point of view the only thing that needs to be done is:

5.2.5.1 When a contained LDPR is deleted, the LDPC server must also remove the corresponding containment triple, which has the effect of removing the deleted LDPR from the containing LDPC.
5.2.5.2 When a contained LDPR is deleted, and the LDPC server created an associated LDP-RS (see the LDPC POST section), the LDPC server must also delete the associated LDP-RS it created.

So the only thing we should be doing is removing the reference for the container, not all its children. I read this as "If my container is deleted, I no longer exist", rather than "Delete the members when the container is deleted": all the members are orphaned instantly, and whether they're actually deleted or not is a clean-up task for the service.

If I'm reading it wrong, are there some more resources we could reference on this topic?
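The LDP 5.2.5.1 requirement quoted above is narrow: on deletion of a contained resource, the server drops the matching containment triple from the container. A minimal sketch, representing triples as plain `[subject, predicate, object]` arrays purely for illustration (a real server would use an RDF store):

```javascript
// LDP 5.2.5.1 sketch: when a contained resource is deleted, remove the
// corresponding ldp:contains triple from the container's representation.
const LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains";

function removeContainmentTriple(containerTriples, deletedResource) {
  return containerTriples.filter(
    ([, p, o]) => !(p === LDP_CONTAINS && o === deletedResource)
  );
}
```

Nothing here touches the contained resources themselves, which is the reading fabiancook argues for above.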

node-solid-server

node-solid-server currently only deletes a container if it is empty, which is a logical condition, not a technical limitation.

Given this PR, it will also delete the container if its only members are .acl and .meta files: nodeSolidServer/node-solid-server#1180

Trellis

As far as I can tell, all resources are deleted the same way: https://github.com/trellis-ldp/trellis/blob/a9f4c235dea3919e5c0ff343f466bb9049853267/core/http/src/main/java/org/trellisldp/http/impl/DeleteHandler.java#L149

A BasicContainer is a Resource AFAIK, so deleting a container would delete everything within.

Inrupt pod-server

This implementation (with the in-memory target store) will retrieve all descendants and delete them:

https://github.com/inrupt/wac-ldp/blob/85989783456a6d2211e429c413af5c295037eee0/src/lib/storage/BlobTreeInMem.ts#L52

Open Network

This implementation will delete by way of rimraf: https://github.com/o-network/http-store/blob/master/src/fs-store/methods/delete.ts#L86

@RubenVerborgh
Contributor

405 Method Not Allowed if we can't delete on that container

That's not right; 405 means that the method is not supported by the target resource at all, whereas here it is supported but depends on other resources being deleted first.

@fabiancook

Ah yes, I was unsure what should be used for that negative case. What status code would be more relevant?

@RubenVerborgh
Contributor

409 Conflict https://tools.ietf.org/html/rfc7231#section-6.5.8

@fabiancook

It could also be 403 Forbidden, which would cover the case when the agent doesn't have sufficient access to delete all members.

@RubenVerborgh
Contributor

403 only applies to current resource permissions

@fabiancook

The 403 (Forbidden) status code indicates that the server understood the request but refuses to authorize it.

This seems to be only about the request rather than the resource, or are we talking LDP-specific? A single request could encompass many resources.

@RubenVerborgh
Contributor

If you make an HTTP request for resource X, and the server replies with a 403, this means that you do not have the right permissions on resource X to perform that operation.

@fabiancook

If the resource is a container, does that not mean the container and everything contained is that resource? It says that the actual definition of a resource isn't strict. I would think the resource in the context of a delete is the resource node with all its members, as if the resource itself is a full graph.

@fabiancook

If not 403, what's a suitable status code to forbid the action because of the child resources? 409 is about the state of the resource, rather than the access controls around it, right?

@RubenVerborgh
Contributor

RubenVerborgh commented May 28, 2019

If the resource is a container, does that not mean the container and everything contained is that resource?

A 403 is solvable by logging in as a different user; this case is not.

If not 403, what's a suitable status code to forbid the action because of the children resources? 409 is about state of the resource, rather that access controls around it right?

Spec:

The 409 (Conflict) status code indicates that the request could not
be completed due to a conflict with the current state of the target
resource.

The state of the resource is "non-empty", which is not the right state for deletion of a container. Hence, 409.
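Ruben's rule above reduces to a simple server-side check: look at the container's remaining members and answer 409 while any exist, 204 once it is empty. A sketch under that assumption (the function name is hypothetical; `members` would come from the container's `ldp:contains` triples):

```javascript
// rmdir-style rule from this thread: DELETE on a non-empty container is
// refused with 409 Conflict; an empty container is deleted with 204.
function statusForContainerDelete(members) {
  return members.length > 0 ? 409 : 204;
}
```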

@michielbdejong
Contributor Author

@fabiancook I thought we agreed that the current situation is rmdir, not rm -rf. https://github.com/trellis-ldp/trellis/wiki/Trellis-Architecture#recursion clearly states the recursive DELETE is not available.

You're right that the BlobTreeInMemory class from wac-ldp's master branch (still) implements rm -rf, but that's how I implemented it before I started thinking about this question. My intention was always to follow whatever NSS does, so I'm updating that to rmdir behaviour soon.

Importantly, both main pod providers currently run NSS. So then I think it's fair to say that rmdir is the current real-world situation which this spec should describe, and that your proposal of switching to rm -rf would (if it were adopted by pod implementers) be a change from what we have currently?

@fabiancook

So there are two problems here:

  • A technical limitation that is implementation-dependent, depending on whether that implementation wants to implement it
  • A non-technical limitation, where access controls will not allow deletion if members within the container cannot be deleted

My proposal isn't to go either way: if the implementation doesn't have that limitation, and it can verify all members can be deleted, then the client should be able to delete the container.

So rather than noting that it cannot be done, note that it could be implemented, but also the reasons why it might not be, and, if it is implemented, what rules must be followed (e.g. check that all members can be deleted).

So instead of:

Deleting containers

A container can only be deleted if it contains neither member resources nor ACL documents.

We have

Deleting containers

A container can only be deleted if the requesting agent can delete all (recursive) members contained within according to their individual access controls

An implementation may choose to only delete a container if it is empty (excluding orphaned ACL descriptors and any direct container resources, for example meta & ACL files)
It is common for an implementation to exclude recursive deletion of a container due to technical limitations; for example, a reference implementation, Trellis LDP, gives its reasoning here

This covers all implementations, doesn't restrict them, and shows the intention and reasoning.
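The proposed rule above (delete only if the agent can delete every recursive member per its own access controls) can be sketched as a pre-check the server runs before accepting the DELETE. Everything here is hypothetical: `canDelete` stands in for a real WAC permission check, and `membersOf` for a containment lookup.

```javascript
// Proposed-rule sketch: a container may be deleted only if the agent has
// delete access on it and, recursively, on every member it contains.
// `membersOf(url)` returns member URLs (or undefined for leaf resources);
// `canDelete(agent, url)` is a stand-in for a per-resource ACL check.
function canDeleteContainer(agent, url, membersOf, canDelete) {
  if (!canDelete(agent, url)) return false;
  for (const member of membersOf(url) || []) {
    if (!canDeleteContainer(agent, member, membersOf, canDelete)) return false;
  }
  return true;
}
```

A single protected member anywhere in the subtree makes the whole container delete fail, which matches the bank-account example earlier in the thread.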

@michielbdejong
Contributor Author

An implementation may choose

That's not how specs work :) It leaves too much unnecessary freedom for server implementers to invent how the interface between a Solid app and a Solid pod works, and it creates a risk of divergence between pod servers, and of apps that display "Sorry, this app has only been tested with pod server X".

As I said, you and I disagree on what we think is the desired situation, and that's fine. We describe the current situation, and open a discussion issue about your desired situation, and maybe if the community agrees with you, there will eventually be a spec change.

Writing down your or my desired situation in the spec as if it were the current one is not a good idea.

Writing both down, stating that the spec doesn't know and it will be different from pod server to pod server is also not a good idea.

So we should just write down the current situation, removing (not inserting) ambiguity, which is what I did (to the best of my knowledge) in #177.

If you think the current situation is the other one (rm -rf instead of rmdir), then we should keep talking until we agree.

As soon as we agree on what the current situation is, we can open the discussion on what the next situation should be! :)

@csarven
Member

csarven commented May 30, 2019

The spec does not prescribe. It describes, non-normatively (i.e. "may"), what happens to the container, containment triples, and the contained resources. That non-normative "may" is even looser than a normative "MAY" (generally, as per RFC).

Resources may end up in a container, e.g. physically in a "directory" if you will, but that does not mean the container is tracking them. Deleting a container typically involves the assets the container is aware of in its universe, i.e. those with containment triples. If an implementation decides to delete a container and actually wipes out everything supposedly under it (conceptually speaking), that'd be clearly outside of the spec; not wrong, just unpredictable and probably undesirable, and arguably more problematic to handle across implementations. I raise this as an example because we need to clearly distinguish that without assuming what's actually (or loosely) contained from the perspective of LDP.

The LDP spec stays close to RFC 7231 without overstepping. DELETE is optional in the spec, and it does not go out of its way to prescribe anything.

I think the Solid spec would benefit from:

  1. Prescribe container deletion only with respect to the containers and membership information.
  2. Prescribe container deletion with respect to LDP-BC, LDP-DC, LDP-IC.

To facilitate simpler implementation designs, I'd suggest that DELETE only removes the LDPC URI and updates (removes) the membership information; it does not touch the contained LDPRs. The interface should be as simple as possible and not turn into a magic API doing all sorts of stuff behind the scenes. The client should have a clear enough idea of what the action entails. Affordances are obvious that way, i.e. the client acts on a particular URI and not a bunch of URIs. What happens to the LDPRs and membership information will depend on whether the LDPC is a BC, DC, or IC.
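csarven's suggestion above can be sketched in a few lines: the DELETE removes only the container resource and its containment entries, while contained resources remain (orphaned) for a separate clean-up task. The `Map`s here are hypothetical stand-ins for a real triple store.

```javascript
// Container-only DELETE sketch: remove the LDPC and its containment
// entries, but intentionally leave the contained resources in place.
// `resources` maps URL -> representation; `contains` maps container URL
// -> member URLs.
function deleteContainerOnly(resources, contains, url) {
  resources.delete(url); // the LDPC URI no longer resolves
  contains.delete(url);  // its containment information goes with it
  // members formerly listed under `url` stay in `resources`, now orphaned
}
```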

@michielbdejong
Contributor Author

All LDPCs are BCs. I see your point about how we could design Solid in theory, but that's not the aim of this issue. I'm just trying to make the spec more clearly describe what NSS does, so that new pod server implementations can behave in the same way, thus minimizing surprises for app developers.

@csarven
Member

csarven commented May 30, 2019

What is the aim of this issue if not to iron out how DELETE should work in the context of LDP, toward the "Solid spec"?

If you want to describe what a particular implementation does, that belongs in its own documentation; certainly not abstracting a single tool's behaviour to spec level. This is especially true when we already know that NSS has issues and in some cases doesn't implement the specs correctly. There are other servers that in fact implement LDP more properly.

I think the LDP spec sufficiently distinguishes between BC, DC, IC such that I would not flat out treat them all simply as BCs.

If you want to be more methodical, create a table of some sort comparing how the LDP implementations out there handle container deletion. Use the commonality as the starting point for the "Solid spec", and only if it ends up being more prescriptive than, or different from, LDP's DELETE.

@michielbdejong
Contributor Author

@csarven (and others): I’m mainly interested in getting https://github.com/solid/solid-spec/pull/177/files merged, can you +1 it?

@michielbdejong
Contributor Author

To answer your question: yes, I think the Solid spec should describe what Solid app developers can expect from a pod server, and so it's more prescriptive than, and different from, LDP's DELETE.

@jaxoncreed

Is there a conclusion to this? We want to be sure nodeSolidServer/node-solid-server#1180 is safe to merge in.

@michielbdejong
Contributor Author

My conclusion is here: inrupt/pod-server#15 (comment) (bottom one). I'll create a new spec issue about whether nodeSolidServer/node-solid-server#1180 should be merged or not.
