Clarify what is notarized on the ledger #321

Open
SteveLasker opened this issue Dec 16, 2024 · 24 comments

@SteveLasker
Collaborator

From @mcr,

https://mailarchive.ietf.org/arch/msg/scitt/sOGI7xcUaOJx5Zag8uqsygooLrc/

it seems that the receipt received from service A can easily be added to the statement submitted to service B. It's unclear to me that the signed statement submitted to service A can then be updated. Does the Append-Only Log record just the protected and payload of the Sign1, or does it also witness the unprotected header?

We don't seem to clarify what content is used to append to the ledger. Is it just the to-be-signed-bytes, or the entirety of the signed statement, including the unprotected header?
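To make the two candidates concrete, here is a minimal sketch, assuming the COSE_Sign1 layout and the `Sig_structure` from RFC 9052. The encoder is hand-rolled and only covers the CBOR heads needed here; the function names are illustrative and do not come from any SCITT draft.

```python
def cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR head (major type + argument) in its shortest form."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, width in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < (1 << (8 * width)):
            return bytes([(major << 5) | info]) + n.to_bytes(width, "big")
    raise ValueError("value too large for a CBOR head")


def bstr(b: bytes) -> bytes:
    return cbor_head(2, len(b)) + b


def tstr(s: str) -> bytes:
    encoded = s.encode("utf-8")
    return cbor_head(3, len(encoded)) + encoded


def to_be_signed(protected: bytes, payload: bytes) -> bytes:
    """Candidate 1: Sig_structure = ["Signature1", protected, external_aad, payload].

    Neither the unprotected header nor the signature appears in these bytes.
    """
    return (cbor_head(4, 4) + tstr("Signature1") + bstr(protected)
            + bstr(b"") + bstr(payload))


def sign1_empty_unprotected(protected: bytes, payload: bytes, signature: bytes) -> bytes:
    """Candidate 2: tagged COSE_Sign1 whose unprotected header is the empty map."""
    return (bytes([0xD2]) + cbor_head(4, 4) + bstr(protected)
            + bytes([0xA0]) + bstr(payload) + bstr(signature))
```

The first form is what the issuer's signature covers; the second is a complete, verifiable message that still carries no unauthenticated content.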

@achamayou
Collaborator

We don't seem to clarify what content is used to append to the ledger. Is it just the to-be-signed-bytes, or the entirety of the signed statement, including the unprotected header?

I think we should mandate that it's at least (*) the signed statement with an empty unprotected header, because:

  1. It's a form of the statement from the issuer that does not contain un-authenticated content.
  2. It's convenient for clients/verifiers/tooling to deal with in a way that the TBS is not.

(*) Implementations could decide that they want to allow some unprotected headers in, perhaps as supporting evidence (such as x5chain where it's sent unprotected), and should be encouraged to validate it as much as possible before they do so. Exposing what unprotected headers are counter-signed, if any, could be done in the transparency config, for the benefit of the verifiers.

Allowing unprotected headers wholesale amounts to blindly counter-signing something that hasn't been signed in the first place, and should at the very least be heavily discouraged if not forbidden.

@SteveLasker
Collaborator Author

Notes from editors meeting:
We should create a PR for text to be added to the registration section: 6.2. Registration of Signed Statements

  • Set the value of the unprotected header to an empty map

This would include the protected-header, payload and signature.
We're specifically calling out emptying the unprotected header.

@robinbryce
Collaborator

Per the conversation here https://youtu.be/DK40XJ651Cw I don't object to the requirement that unprotected headers be stripped. I can see this could be left to "client choice", but I do understand the challenges that could create for scanning the log for the entry. While the client knows what was registered, I can see auditors may not.

One concern I'd like to be clear on in any resulting changes: did the conversation settle on permitting the ledger to include additional content in the actual ledger entry on disk?

Concrete example from the datatrails ledger: we define the Merkle leaf of our ledger like this

H(DOMAIN-SEPARATOR || EXTRABYTES || IDTIMESTAMP || SERIALIZEDBYTES)

Our hope is that SERIALIZEDBYTES is the result of the normalization process discussed on the call (regardless of what is decided with respect to the inclusion of unprotected headers), and that it's for the TS registration policies to define additional content. Is this what is intended?

Additional context:

In our case the log is actually a mog, in that it includes a trie-friendly leaf index as well as the more regular Merkle tree data. So all material needed to verify the receipt can be obtained (efficiently) from the combination of the original statement, the receipt, and the index entry in the log. This means there is a step, specific to the VDS/receipt format, to assemble the leaf value.

We do this so that while we may have duplicate SERIALIZEDBYTES, all entries in the ledger leaves have a unique and time sortable component to their pre-image. And one which is established by the process which commits the VDS to persistent storage.
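A minimal sketch of a leaf of the shape `H(DOMAIN-SEPARATOR || EXTRABYTES || IDTIMESTAMP || SERIALIZEDBYTES)`. The hash function (SHA-256), the separator value, and the 8-byte big-endian timestamp width are all assumptions made for illustration; the actual field sizes and encodings are defined by the ledger, not here.

```python
import hashlib

# Assumed values for illustration only; not the ledger's real parameters.
DOMAIN_SEPARATOR = b"\x00"
IDTIMESTAMP_WIDTH = 8


def leaf_hash(extra_bytes: bytes, idtimestamp: int, serialized_bytes: bytes) -> bytes:
    """Hash the concatenation of the four leaf components."""
    h = hashlib.sha256()
    h.update(DOMAIN_SEPARATOR)
    h.update(extra_bytes)
    h.update(idtimestamp.to_bytes(IDTIMESTAMP_WIDTH, "big"))
    h.update(serialized_bytes)
    return h.digest()


# Registering the same SERIALIZEDBYTES twice still gives two distinct leaves,
# because the committing process assigns each entry its own IDTIMESTAMP.
statement = b"normalized signed statement bytes"
first = leaf_hash(b"", 1734500000000, statement)
second = leaf_hash(b"", 1734500000001, statement)
```

Because the timestamp sits in the pre-image, a verifier needs the index entry from the log as well as the statement and receipt to rebuild the leaf.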

@rjb4standards
Collaborator

rjb4standards commented Dec 18, 2024 via email

@achamayou
Collaborator

@robinbryce I believe additional content was always allowed, we certainly do it in our implementation, as reflected in the CCF tree profile.

This is (again, as I understand it) orthogonal to how the Signed Statement is normalised prior to inclusion, and whether the unprotected/unsigned bits are included or not.

@robinbryce
Collaborator

@rjb4standards apologies, that was a lazy reference. "mog" is shorthand for a "map backed log", in other words a combination of an append-only log with some kind of index. It's definitely not a formal term. Some treatment of this can be found here https://transparency.dev/verifiable-data-structures/ though it doesn't specifically use "mog".

@robinbryce
Collaborator

@robinbryce I believe additional content was always allowed, we certainly do it in our implementation, as reflected in the CCF tree profile.

This is (again, as I understand it) orthogonal to how the Signed Statement is normalised prior to inclusion, and whether the unprotected/unsigned bits are included or not.

Great, that is completely aligned with how I understood things. Thanks!

@robinbryce
Collaborator

robinbryce commented Dec 18, 2024

Ok, on the specific issue of normalization: my instinct here has always been "don't mess with the bytes" in the strictest sense. A client registering a statement gets to choose, and clients may have good reasons for wanting the unprotected headers included in the normalised bytes.

The most compelling counterpoint is that, given a transparent statement with 4 receipts on it that you find under a rock, how would you know which of those receipts contributed pre-image bytes for the others?

On a transparent statement, receipts are contained in an array

https://ietf-wg-scitt.github.io/draft-ietf-scitt-architecture/draft-ietf-scitt-architecture.html#fig-transparent-statement-edn

I can't see any way that can be reconciled without knowing something about the specific receipt being verified.

@robinbryce
Collaborator

accidental close!

@robinbryce
Collaborator

robinbryce commented Dec 18, 2024

Last thought: If I wanted to register a statement which represented "this statement was issued and successfully registered, because it's got a receipt attached", given the current arch, I would make it the payload and be done, I think.

Maybe "any receipts in the unprotected headers MUST be stripped" would thread the needle, but if we are going to mess with the bytes at all, this lacks the compelling simplicity of "remove them all".

@achamayou
Collaborator

A client registering a statement gets to choose, and clients may have good reasons for wanting the unprotected headers included in the normalised bytes.

I agree, but the problem with them being unprotected is that there's no telling who set them. It may be the issuer, the client, or someone else. It could be the server implementation.

The most compelling counterpoint is that, given a transparent statement with 4 receipts on it that you find under a rock, how would you know which of those receipts contributed pre-image bytes for the others?

I think the point of stripping the unprotected headers is that you don't need to know, because the receipts don't contribute to the pre-image. All the receipts are for the same pre-image, which allows parallel registration and verification.
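The point can be sketched with a simplified model, assuming the pre-image is derived only from the protected header, payload, and signature. The `SignedStatement` class, the `"receipts"` key, and both helper names are hypothetical stand-ins, not the real COSE encoding or header label.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class SignedStatement:
    protected: bytes
    payload: bytes
    signature: bytes
    # Unauthenticated; anyone handling the message may have set it.
    unprotected: Dict[str, Any] = field(default_factory=dict)


def registration_preimage(s: SignedStatement) -> Tuple[bytes, bytes, bytes]:
    """What the service commits to: everything except the unprotected header."""
    return (s.protected, s.payload, s.signature)


def attach_receipt(s: SignedStatement, receipt: bytes) -> SignedStatement:
    """Return a copy with one more receipt in the unprotected header."""
    receipts = list(s.unprotected.get("receipts", [])) + [receipt]
    return replace(s, unprotected={**s.unprotected, "receipts": receipts})


base = SignedStatement(b"\xa1\x01\x26", b"hello", b"sig")
with_a = attach_receipt(base, b"receipt-from-A")
with_a_and_b = attach_receipt(with_a, b"receipt-from-B")

# Every service sees the same pre-image, whatever receipts are attached,
# so registration order does not matter and receipts verify independently.
same = registration_preimage(base) == registration_preimage(with_a_and_b)
```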

@robinbryce
Collaborator

Yeah, I am now leaning quite a lot to remove all. But wondering if there is any support for "remove just receipts" from the headers

@achamayou
Collaborator

I guess the possibilities are:

  1. Remove only some things (receipts, ...?)
  2. Remove everything
  3. Remove everything except some things (x5chain, ...?)

I am not keen on 1. because it amounts to what I perceive as very open-ended acceptance of completely unauthenticated content. A single issuer getting MITM'ed can lead to content unwanted by both the issuer and the ledger making it in. A dispute between the ledger and an issuer about this content also cannot be satisfactorily resolved.

  2. seems fine to me, it's what I do! It's also what @OR13 does, if I understand correctly.
  3. seems plausible, so long as the content is subject to some kind of constrained validation. Because I can't think of a use case, I'm a bit skeptical though.

@robinbryce
Collaborator

I agree, but the problem with them being unprotected is that there's no telling who set them. It may be the issuer, the client, or someone else. It could be the server implementation.

And also

A single issuer getting MITM'ed can lead to content unwanted by both the issuer and the ledger making it in

Yes, totally, but this is why this choice is so impactful: it goes to the heart of the trust assumptions.

The issuer has signed only the protected headers and the payload. This is not in question here.

The transparency service has either:
A: counter signed (by way of a receipt) only the protected headers and payload
B: counter signed protected, payload & unprotected (or some redacted subset)

For B: whether all or some are signed, the trust model is different for the unprotected headers. The registering entity is the last party in the chain. They are explicitly not necessarily the issuer in the arch.

B: has utility, but the mental model of sig + counter sig is now incomplete

In other words, it has quite an impact on the trust model, as the unprotected headers are only attested in a meaningful way by the TS. (Granting that the arch does not place trust in the authentication required to call registration endpoints.)

@robinbryce
Collaborator

If we do anything other than 2, then the TS is attesting that "this random stuff was included in the statement at time of registration". It would then be a matter for TS registration policies to declare what that means, I think?

@achamayou
Collaborator

For B: whether all or some are signed, the trust model is different for the unprotected headers. The registering entity is the last party in the chain. they are explicitly not necessarily the issuer in the arch.

Yes, completely agree. I tried to say this in the recording, but the term "counter-signed" proved controversial :)

It seems to me that the unprotected header (minus the receipt) cannot be said to be counter-signed, because it isn't signed to begin with. If the service decides to sign it, then a more obvious and explicit location to do that is in the protected header of the receipt itself. That is very unambiguously "the service says" (as opposed to "the issuer says", which is the signed statement).

I struggle to explain simply what the service signing the unprotected header means, it's something like "the service says that it heard this from a submitter, and the issuer may or may not have meant that".

@achamayou
Collaborator

achamayou commented Dec 18, 2024

If we do anything other than 2, then the TS is attesting that "this random stuff was included in the statement at time of registration". It would then be a matter for TS registration policies to declare what that means, I think?

It's possible to let the registration policy decide, but that makes validating a transparent statement a lot more complicated all of a sudden, because you need to work out what the registration policy has decided to leave in or not.

@JAG-UK
Contributor

JAG-UK commented Dec 19, 2024

I'm not sure which hat I'm wearing for this comment, but here are my thoughts for the consideration of the group.

TL;DR: The unprotected header (UH) MUST be empty when registering a signed statement.

Why?
As others have argued, the responsibility of the Transparency Service is to faithfully register exactly what the Issuer submits for Registration. Anything else seems strange: the very reason for Issuer signature is to prevent the TS or the channel from tampering or spoofing statements on the Issuer's behalf.

So then if the Issuer feels the need to put something in the UH then that should be recorded also. Think about why they might do this: to give hints to the Transparency Service? To add tags or something that for whatever reason they want to be able to change and so don't put in the meta map? Well, if these entries are to have any effect at all, then those effects might be security relevant to the application layer, or say something critical about the circumstances of Registration, and so not recording them potentially opens an avenue for a dishonest Issuer to smuggle or hide important information away whilst still getting a valid receipt.

BUT the other side of the threat graph also applies: because the UH is not covered by the Sign1 it is possible for a dishonest TS or channel to add things in here, and then record them, and the Issuer would have no recourse to say "that wasn't part of my submission".

And putting these things together one must also conclude that the supposed potential intention of an Issuer to put instructions or metadata or whatever in the UH is unsafe because of this channel/TS threat. And so the assumption must be that no security relevant info can be put in the UH, nor should an Issuer expect that anything in the UH necessarily be processed or actioned by the TS.

At which point it's completely useless and we might as well do away with it.

BUT, there is a stated use for it which remains, which is that on OUTPUT or READ you might have a bunch of useful info stuffed in there, such as SignedStatement + Receipt-in-UH => TransparentStatement. Using the UH as a convenient attached transport for related info that necessarily accrues after signing and after registration is still pretty neat, and so far the things stuffed in there have their own independent integrity measures so while they're still vulnerable to being removed altogether, there is some trust that can be built there.

@robinbryce
Collaborator

It seems to me that the unprotected header (minus the receipt) cannot be said to be counter-signed, because it isn't signed to begin with

Absolutely

but that makes validating a transparent statement a lot more complicated

Yes

(JAG-UK) And so the assumption must be that no security relevant info can be put in the UH

Indeed

A clarification I might then make is "By definition, a transparent statement can not be registered".

If that sort of thing is desired the transparent statement would need to be a payload for a different statement. And as such would be opaque to the protocol

It is implied, but the use of unprotected headers to convey receipts has contributed to the need for clarification here, I think.

To be clear, I'm supportive of "The unprotected header (UH) MUST be empty when registering a signed statement."

@robinbryce
Collaborator

I also think that

"The unprotected header (UH) MUST be empty when registering a signed statement."

Means that https://ietf-wg-scitt.github.io/draft-ietf-scitt-scrapi/draft-ietf-scitt-scrapi.html#name-register-signed-statement

Should do 400 "invalid request" rather than silently stripping. This means that validation can decode to check for the absence of headers, but the exact bytes of the original (valid) request are submitted to the ledger. I think this has practical benefits for various potential implementation issues.
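A minimal sketch of "reject, don't strip", assuming the request body is a tagged COSE_Sign1 with a definite-length protected header, and treating only the shortest encoding of an empty map (`0xA0`) as empty. `check_registration_request` is a hypothetical name. The sketch checks nothing beyond the unprotected header, so signature and policy checks would still be needed.

```python
from typing import Optional

COSE_SIGN1_TAG = 0xD2  # CBOR tag 18 in its one-byte form
ARRAY_OF_FOUR = 0x84   # CBOR array head with 4 elements
EMPTY_MAP = 0xA0       # CBOR map head with 0 entries


def _skip_bstr(data: bytes, pos: int) -> int:
    """Return the offset just past the definite-length byte string at pos."""
    if pos >= len(data) or data[pos] >> 5 != 2:
        raise ValueError("expected a byte string")
    info = data[pos] & 0x1F
    pos += 1
    if info < 24:
        length = info
    elif info <= 27:
        width = 1 << (info - 24)
        if pos + width > len(data):
            raise ValueError("truncated length")
        length = int.from_bytes(data[pos:pos + width], "big")
        pos += width
    else:
        raise ValueError("indefinite or reserved length not supported")
    if pos + length > len(data):
        raise ValueError("truncated byte string")
    return pos + length


def check_registration_request(body: bytes) -> Optional[str]:
    """Return None if acceptable, else a reason suitable for a 400 response."""
    try:
        if body[:2] != bytes([COSE_SIGN1_TAG, ARRAY_OF_FOUR]):
            raise ValueError("not a tagged COSE_Sign1 array")
        pos = _skip_bstr(body, 2)  # step over the protected header
        if pos >= len(body) or body[pos] != EMPTY_MAP:
            raise ValueError("unprotected header must be an empty map")
    except ValueError as err:
        return str(err)
    return None
```

When the check returns `None`, the caller can append `body` to the ledger unchanged, so the service never re-serialises what the client sent.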

@achamayou
Collaborator

"The unprotected header (UH) MUST be empty when registering a signed statement."

That would mean that a client submitting an already transparent statement would need to strip the receipt prior to sending. It's not a major problem (easy enough to ship the CLI tool/utility function), but just to make sure that's what's intended.

Although our implementation strips the UH now, I think I am warming up to the idea of forbidding it altogether, because it's a more definitive solution to canonical serialisation issues, and it ticks the "don't mess with the bytes" box at the same time.

@JAG-UK
Contributor

JAG-UK commented Dec 19, 2024

That would mean that a client submitting an already transparent statement would need to strip the receipt prior to sending.

Yes, I can see the practical inconvenience there but I think it's cleaner to just forbid it altogether in the formal spec and make it an SDK problem as you say

@achamayou
Collaborator

Fair enough, it definitely is cleaner and clearer to only accept valid inputs.

@SteveLasker
Collaborator Author

Summarizing the journey:

  • The unprotected header is intended for the TS to return information after registration or for the client to attach information, such as the receipt, for convenience of storage and transport of associated metadata.
  • Upon registration, the unprotected header MUST be empty, or a 400 is returned via SCRAPI, to provide consistency and avoid potential MITM attacks.
  • If we wish to register a Transparent Statement, it would need to be submitted as the payload, with a known content-type, and could use the same subject to associate them. For this last point, we should consider how we handle the scenario and why it's needed.
