Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix/Child objects ACL #283

Merged
merged 4 commits into from
Feb 5, 2024
Merged

fix/Child objects ACL #283

merged 4 commits into from
Feb 5, 2024

Conversation

carpawell
Copy link
Member

No description provided.

@carpawell carpawell self-assigned this Jan 12, 2024
@carpawell carpawell changed the title fix/Child-objects-acl fix/Child objects ACL Jan 12, 2024
object/types.proto Outdated Show resolved Hide resolved
@roman-khimov
Copy link
Member

roman-khimov commented Jan 18, 2024

Judging from nspcc-dev/neofs-sdk-go#543 we're missing at least offset data for #264. And I'm worried about #263 as well.

@carpawell
Copy link
Member Author

Judging from nspcc-dev/neofs-sdk-go#543 we're missing at least offset data for #264. And I'm worried about #263 as well.

Yes, I separate them to make every change smaller but merge them with a few PRs. Do you think we need to change them at once?

@roman-khimov
Copy link
Member

Let's make a complete spec for the new format first, then implement it step by step fixing along the way if needed.

@carpawell
Copy link
Member Author

A few intermediate questions:

  1. Should the link object be a separate object type? There are no requirements (at least for now) for it. Making a separate type for it certainly makes it harder to adopt in the node and may increase the number of unexpected bugs. Although, I just feel like it should be a separate object type.
  2. Is it OK to have a "broken" parent header in the initial object? It just cannot have any correct payload-dependent fields.
  3. Do we need some structure for the initial object too? Should it have a payload or parent object is ok to be in the header like it was before?

My answers:

  1. Yes (but be ready it takes much more time to implement)
  2. No
  3. Not sure. Want to say "yes" since it is clearer to me but it would be too many different messages and types IMO (especially with 2.).

@roman-khimov
Copy link
Member

  1. A separate type likely will make it easier to adopt, you'll be able to separate the old one from the new easily.
  2. It can have a lot of data in many cases, what's broken about it? Hashes?
  3. It has a special type, then it depends on what else we need. Payload should OK.

@carpawell
Copy link
Member Author

carpawell commented Jan 24, 2024

  1. A separate type likely will make it easier to adopt

If we make it from scratch -- yes. But currently a lot of places think that link object is a regular object. I can not even imagine how many. Moreover, backward compatibility is a question here: there are places that are not ready to get anything except REGULAR, TOMB, LOCK. But I do support you.

  1. It can have a lot of data in many cases, what's broken about it? Hashes?

No ID, no payload length (in a general case), no hashes. That is not a header I would say. Previously, "parent" was a ready term in NeoFS. If you saw "parent" somewhere, you know what it was and were able to get anything you want about it. But not finished header solves our problem for now.

  1. It has a special type, then it depends on what else we need. Payload should OK.

So you think INIT object should have a payload? Or not? Currently it does not have it and parent is placed in the header's split.parent field. If it is a separate message in its payload, we may think about storing parent header not like a header but like a new struct without "payload" fields.

// IDs. IDs MUST be ordered according to the original payload split, meaning the
// first payload part holder MUST be placed at the first place in the corresponding
// link object.
message Link {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really deserves a separate package? to me its overhead. object fits well, or at least in refs

Copy link
Member Author

@carpawell carpawell Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depends on the final solution about a separate object type. if it is a separate object type, i do not see any difference with tombstone, storagegroup and lock pkgs. but if you ask me, should it be like that, my answer is no, but we have what we have, do not want this object type to be different in any way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo the previously used approach only complicates the structuring and, as a consequence, the knowledge of the protocol concepts. All these entities are inextricably linked with objects - they are about objects, they are in the payload of objects. Сurrent packetization is extremely redundant to me

we have what we have

obviously, but we are not burdened with creating a new package (). Keep the existing ones, and this message can be left in object to comply with the system architecture hierarchy

being a newbie, i'll see acl, container, object and other large sections of the system. And then link - specific detail elevated to the top level of the system with a non-descript name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to comply with the system architecture hierarchy

so do i. type of the objects are placed where they always have been

the issue was created #284

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for issue, link is still under the question

link/types.proto Outdated Show resolved Hide resolved
link/types.proto Outdated Show resolved Hide resolved
link/types.proto Outdated Show resolved Hide resolved
link/types.proto Outdated Show resolved Hide resolved
object/types.proto Outdated Show resolved Hide resolved
object/types.proto Outdated Show resolved Hide resolved
object/types.proto Outdated Show resolved Hide resolved
link/types.proto Outdated Show resolved Hide resolved
@cthulhu-rider
Copy link
Contributor

cthulhu-rider commented Jan 24, 2024

separate type link

the concept is very simple: if payload format is strict, then separate type is required. Separate type also grants efficient SEARCH power (if needed). Ofc some servers could deny service of unsupported types, so LINK objects must be essentially optional within the current protocol version. It’s nice that link objects in the current scheme are optional, this will allow both of its variations to be present in the same network

Do we need some structure for the initial object too?

it cannot do without logical features expressed in structural differences. For example, there should be no incompatible collisions with the current split schema. About the structure itself, i think it should be as efficient as possible to solve exact problems regardless of the server implementation

// ID. It is NOT required for the original object assembling. It MUST have ALL
// the "child objects" IDs. Child objects MUST be ordered according to the
// original payload split, meaning the first payload part holder MUST be placed
// at the first place in the corresponding link object. Sizes MUST NOT be omitted
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we wanna use it, we should be able to rely on it so there is MUST word. but then, we need to validate it and that is N head requests per link object PUT (N is num of objects in the chain)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

N head requests per link object

the problem is more specific - when assembling a split object, the list of descriptors of child objects can diverge with the stored children (not only the sizes btw, IDs too). The proposed validation is correct, but since we cannot guarantee it at the protocol level (this is a server implementation detail), the assembler - one who operates with split chain - must be prepared for the divergence in any case

protocol-level solution is to additionally store such metadata in the child object headers themselves (mentioned here). Then the link object will remain a pure stand-alone helper

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but since we cannot guarantee it at the protocol level (this is a server implementation detail)

what do you mean? nodes may not accept such link objects

must be prepared for the divergence in any case

there is no solution for #264 then. if you cannot trust the link object, you cannot be sure you are answering with a correct range

protocol-level solution is to additionally store such metadata in the child object headers themselves

should also be validated somehow. a node can get Nth object with some offset in it. what does it do? just believes it is true? searches for the previous ones and calculates their sizes?

@roman-khimov

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean? nodes may not accept such link objects

nodes may do whatever they want. They may accept linking objects w/o any validation

there is no solution for #264 then

i dont deny current solution, im just highligting that linking object can be unavailable or diverge with the split chain and any network node must be ready for this. In these cases, extra metadata in child objects may be a painkiller in some cases

what does it do?

its own implementation detail, each node can only control what to do itself and what it receives from other nodes


or maybe i misunderstood ur original comment. I dont see any problem with current structure and requirements. If u didnt mean that N head requests per link object PUT is a problem, then everything is ok to me

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nodes may do whatever they want

what does it do?

its own implementation detail, each node can only control what to do itself and what it receives from other nodes

i meant we need to find a balance to implement our "perfect" node that may exist in the world of "worst" nodes. if we may not rely on link object, i think the whole work is useless. but if we want to rely on it, we should validate it somehow

extra metadata in child objects may be a painkiller in some cases

here i want to say that ensuring child object data is ok is the same to ensuring similar info about the link object

If u didnt mean that N head requests per link object PUT is a problem, then everything is ok to me

yes. this and the consequences

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation should be implemented. Then I'd be optimistic wrt the link object contents. If it's not correct --- either you get an error (trying to access inexisting object) or reply with a wrong result, but it's not node's fault, garbage in -- garbage out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation should be implemented.

Yes, sure. Just not sure about some link object replication: imagine you have a 1к parts object and receive a link to it for the first time: you need to do 1k head objects before you finish with this object. And you may be an attacker that sends 1k of such 1k-parts objects. 64mb data for a HEAD chaos in the NeoFS network.

object/types.proto Outdated Show resolved Hide resolved
@carpawell carpawell marked this pull request as ready for review January 24, 2024 15:19
// ID. It is NOT required for the original object assembling. It MUST have ALL
// the "child objects" IDs. Child objects MUST be ordered according to the
// original payload split, meaning the first payload part holder MUST be placed
// at the first place in the corresponding link object. Sizes MUST NOT be omitted
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

N head requests per link object

the problem is more specific - when assembling a split object, the list of descriptors of child objects can diverge with the stored children (not only the sizes btw, IDs too). The proposed validation is correct, but since we cannot guarantee it at the protocol level (this is a server implementation detail), the assembler - one who operates with split chain - must be prepared for the divergence in any case

protocol-level solution is to additionally store such metadata in the child object headers themselves (mentioned here). Then the link object will remain a pure stand-alone helper

link/types.proto Outdated Show resolved Hide resolved
object/types.proto Outdated Show resolved Hide resolved
object/types.proto Outdated Show resolved Hide resolved
object/types.proto Outdated Show resolved Hide resolved
// IDs. IDs MUST be ordered according to the original payload split, meaning the
// first payload part holder MUST be placed at the first place in the corresponding
// link object.
message Link {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo the previously used approach only complicates the structuring and, as a consequence, the knowledge of the protocol concepts. All these entities are inextricably linked with objects - they are about objects, they are in the payload of objects. Сurrent packetization is extremely redundant to me

we have what we have

obviously, but we are not burdened with creating a new package (). Keep the existing ones, and this message can be left in object to comply with the system architecture hierarchy

being a newbie, i'll see acl, container, object and other large sections of the system. And then link - specific detail elevated to the top level of the system with a non-descript name

cthulhu-rider
cthulhu-rider previously approved these changes Jan 25, 2024
Copy link
Contributor

@cthulhu-rider cthulhu-rider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format is ok, left some discussions about docs

Copy link
Member

@roman-khimov roman-khimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a behavior spec in https://github.com/nspcc-dev/neofs-spec/? I guess most of mechanics stays the same, except for INIT.

I'm also not sure how split_id/init interact, do we need split_id if init is present?

object/types.proto Outdated Show resolved Hide resolved
// ID. It is NOT required for the original object assembling. It MUST have ALL
// the "child objects" IDs. Child objects MUST be ordered according to the
// original payload split, meaning the first payload part holder MUST be placed
// at the first place in the corresponding link object. Sizes MUST NOT be omitted
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation should be implemented. Then I'd be optimistic wrt the link object contents. If it's not correct --- either you get an error (trying to access inexisting object) or reply with a wrong result, but it's not node's fault, garbage in -- garbage out.

@carpawell
Copy link
Member Author

Can we have a behavior spec in https://github.com/nspcc-dev/neofs-spec/?

You mean before this PR merge? Let's have at least approve+draft to ensure we agree about this PR. I can't continue before we are done here, it has already changed 3 times.

I'm also not sure how split_id/init interact, do we need split_id if init is present?

I'm not sure init object changes something here: if the link and the init objects are lost and there is no SplitID, there is no ability to assembly the original object (except you find somehow the latest part and compare its parent header). Also, you cannot update any indexes related to the original object if you meet some object part before the init or the link object (possible situation, why not?). cc @cthulhu-rider

@roman-khimov
Copy link
Member

To me, init can completely replace split_id. But if it's better to keep split_id for compatibility/recovery cases --- ok. The problem with missing spec is that these objects/fields are just data, while it's important how exactly we treat this data in various circumstances (init/split_id for example). At the same time, data-wise this seems to be complete, split_id can't be removed anyway.

roman-khimov
roman-khimov previously approved these changes Jan 30, 2024
carpawell added a commit to nspcc-dev/neofs-spec that referenced this pull request Jan 31, 2024
carpawell added a commit to nspcc-dev/neofs-spec that referenced this pull request Feb 1, 2024
carpawell added a commit to nspcc-dev/neofs-spec that referenced this pull request Feb 1, 2024
…#283

Keep both versions of the split objects and mark them as v1 and v2.

Signed-off-by: Pavel Karpy <[email protected]>
It describes future protocol version's link object payload. Child objects list
will be moved from the header to the payload. This is done due to the header
size restrictions. Closes #263.

Signed-off-by: Pavel Karpy <[email protected]>
@carpawell
Copy link
Member Author

  • no INIT object, the first part is used as a parent object's header holder
  • LINK object is a separate type
  • SplitID is deprecated, the first part is used instead (so it is placed in every part now)

@cthulhu-rider
Copy link
Contributor

@carpawell closes #264?

It allows faster seeking through a split object without fetching the whole
chain. Closes #264.

Signed-off-by: Pavel Karpy <[email protected]>
This commit makes it easier to differ link objects from the other types. Object
split hierarchy rework increases the link object's structure and makes it
more strictly formatted, so now it plays a more important role in the split
chains (and the split rules became more complex too).

Signed-off-by: Pavel Karpy <[email protected]>
There is no need to generate some UUID with non-specified rules if the first
object part allows the same identification routines but uses hashes widely
accepted in the protocol.

Signed-off-by: Pavel Karpy <[email protected]>
@carpawell
Copy link
Member Author

closes #264?

Yes, added it to one of the commits.

@cthulhu-rider cthulhu-rider merged commit 533950f into master Feb 5, 2024
3 checks passed
@cthulhu-rider cthulhu-rider deleted the fix/child-objects-acl branch February 5, 2024 17:08
carpawell added a commit to nspcc-dev/neofs-spec that referenced this pull request Feb 5, 2024
…#283

Keep both versions of the split objects and mark them as v1 and v2.

Signed-off-by: Pavel Karpy <[email protected]>
carpawell added a commit to nspcc-dev/neofs-spec that referenced this pull request Feb 6, 2024
…#283

Keep both versions of the split objects and mark them as v1 and v2.

Signed-off-by: Pavel Karpy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants