Strict mode for CBOR #2662
(Updated with more info from the spec.) |
Yes, it sounds very reasonable. I do not mind adding it at some point in the future or via contribution. |
Great, I may try to work up a PR. Where would you recommend getting started? I found |
@timmc From a format perspective, it recognizes that this is a collection serializer. It then calls the update decoder (at some point this was part of `DeserializationStrategy`, but it no longer is) with the previous list value. That update creates or updates the collection with the new value (or not). It would be possible to create a new version of this function that takes a strict argument, with the default implementation of the previous one passing false for strictness.
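To make the "new version of the function with a strict argument" idea concrete, here is a minimal sketch. It is a simplified stand-in, not the library's actual update path: `CollectionUpdater`, `MapUpdater`, and both `update` signatures are hypothetical names chosen for illustration.

```kotlin
// Hypothetical, simplified model of a collection "update" step.
// None of these names come from kotlinx.serialization.
interface CollectionUpdater<K, V> {
    // Assumed new entry point that carries a strictness flag.
    fun update(previous: Map<K, V>, incoming: List<Pair<K, V>>, strict: Boolean): Map<K, V>

    // Existing-style entry point: delegates with strict = false,
    // so callers that never pass the flag see no change in behaviour.
    fun update(previous: Map<K, V>, incoming: List<Pair<K, V>>): Map<K, V> =
        update(previous, incoming, strict = false)
}

class MapUpdater<K, V> : CollectionUpdater<K, V> {
    override fun update(previous: Map<K, V>, incoming: List<Pair<K, V>>, strict: Boolean): Map<K, V> {
        val result = LinkedHashMap<K, V>(previous)
        for ((key, value) in incoming) {
            // In strict mode a key that is already present is an error;
            // otherwise the new value silently replaces the old one.
            if (strict && result.containsKey(key)) {
                throw IllegalArgumentException("Duplicate key: $key")
            }
            result[key] = value
        }
        return result
    }
}
```

Calling the two-argument `update` keeps last-one-wins semantics, while passing `strict = true` turns a repeated key into an exception.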
Would you be able to point me to a code location? I'm not finding useful references to |
OK, found it -- |
I've found it straightforward to reject duplicate keys. Making it configurable would be a larger undertaking—I can either plumb it through via the serializers or via the descriptors, but there's not really a clean way to do it since there's no config object being passed along. Instead, I would have to add Would you accept a PR that just forbids duplicate keys in maps and objects, with an option to later add a configuration option that permits duplicate keys? |
@timmc I don't think it is a good idea to change existing behavior so that people start receiving exceptions in code that worked before, especially without an option to bring the old behavior back. So it's better to have the option. Flags are already stored in |
@timmc The code involved is: kotlinx.serialization/core/commonMain/src/kotlinx/serialization/internal/CollectionSerializers.kt Lines 99 to 113 in 51cb8e8
and kotlinx.serialization/core/commonMain/src/kotlinx/serialization/internal/CollectionSerializers.kt Lines 26 to 51 in 51cb8e8
It is important to note that for some formats it is valid to read list/map items one by one (for example in Protobuf): Lines 210 to 218 in 51cb8e8
This latter case requires special handling in the MapSerializer. So you have two separate cases to support this generically: reading multiple items at a time, directly in MapSerializer; and repeatedly reading sublists.
I'm not sure it is a good idea to add changes to |
@sandwwraith Indeed modifying CollectionSerializer would be a major undertaking with various compatibility concerns/challenges. |
I don't think Maybe it would make sense for |
I like this solution (even though some bits of the name/semantics could be "improved"). The function can have a default implementation that results in the current behaviour. Note that you should probably also think about the set serializer (sets also don't allow duplicate values/only keep the last one). I had a look at the library source code. You probably want to do something that checks the result of |
Fixes Kotlin#2662 by adding a `visitKey` method to `CompositeDecoder`; map and set serializers should call this so that decoders have an opportunity to throw an error when a duplicate key is detected. Also fixes a typo in an unrelated method docstring.
Fixes Kotlin#2662 by adding a `visitKey` method to `CompositeDecoder`; map and set serializers should call this so that decoders have an opportunity to throw an error when a duplicate key is detected. A new config option `Cbor.allowDuplicateKeys` can be set to false to enable this new behavior. This can form the basis of a Strict Mode in the future. Also fixes a typo in an unrelated method docstring.
Yeah, it seems to work out OK -- I have #2681 which still needs some work but should be on the right path. |
I'm coming back to this work and before I try to get my PR updated against recent changes, I'm wondering if there's still any interest in this issue? There are some significant merge conflicts and I don't want to work through them if the changes would be unwelcome. I also noticed that there is now a COSE compliance mode, but the COSE spec indicates that parsing duplicate keys is illegal:
(from https://www.rfc-editor.org/rfc/rfc8152.txt section 14 "CBOR Encoder Restrictions") Did this restriction get implemented somewhere after all, or is it still missing? |
Fixes Kotlin#2662 by adding a `visitKey` method to `CompositeDecoder`; map and set serializers should call this so that decoders have an opportunity to throw an error when a duplicate key is detected. A new config option `forbidDuplicateKeys` can be set to true to enable this new behavior. This can form the basis of a Strict Mode in the future.
(I ended up just replaying my changes on top of dev, since that was more expedient. But I'd still like to know if the changes are wanted, and the answer to my COSE question.) |
Fixes Kotlin#2662 by adding a `visitKey` method to `CompositeDecoder`; map and set serializers should call this so that decoders have an opportunity to throw an error when a duplicate key is detected. Calls to this method are added to: - `MapLikeSerializer.readElement` to handle maps - `CborReader.decodeElementIndex` to handle data classes A new config option `forbidDuplicateKeys` can be set to true to enable this new behavior. This can form the basis of a Strict Mode in the future.
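The hook described in these commit messages can be modelled roughly as follows. This is a sketch under stated assumptions, not the code from the PR: `KeyVisitingDecoder` stands in for `CompositeDecoder`, the `visitKey(key: Any?)` signature is a guess, and `readMap` stands in for a map serializer's read loop.

```kotlin
// Simplified stand-in for a composite decoder; NOT the real kotlinx.serialization interface.
interface KeyVisitingDecoder {
    // Assumed hook signature: called by map/set serializers for each key they read.
    // The default does nothing, which preserves the existing last-one-wins behaviour.
    fun visitKey(key: Any?) {}
}

// A decoder that does not override the hook behaves exactly as before.
class LenientDecoder : KeyVisitingDecoder

// A decoder that opts in to duplicate detection.
// Note: a real decoder would need one "seen" set per map being decoded;
// this sketch only handles a single flat map.
class StrictDecoder : KeyVisitingDecoder {
    private val seen = HashSet<Any?>()

    override fun visitKey(key: Any?) {
        // HashSet.add returns false when the element was already present.
        if (!seen.add(key)) throw IllegalStateException("Duplicate key: $key")
    }
}

// Stand-in for a map serializer's read loop: it reports each key to the decoder
// before storing the entry.
fun readMap(decoder: KeyVisitingDecoder, entries: List<Pair<String, Int>>): Map<String, Int> {
    val result = LinkedHashMap<String, Int>()
    for ((key, value) in entries) {
        decoder.visitKey(key)
        result[key] = value
    }
    return result
}
```

Because the default `visitKey` is a no-op, formats that do not care about duplicates need no changes; only a decoder that overrides the hook starts rejecting repeated keys.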
What is your use-case and why do you need this feature?
My app needs to be able to sign and verify serialized data, which requires that the parse be unambiguous. The main problem is duplicate map keys. If there are repeated keys, then two recipients may disagree on what the signed data says, which is obviously a problem. :-) I've written on this kind of parser mismatch vulnerability previously: https://www.brainonfire.net/blog/2022/04/29/preventing-parser-mismatch/
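As a small illustration of the disagreement described above, the sketch below models two recipients that resolve repeated keys differently. Both functions and the sample payload are hypothetical; they are plain Kotlin and do not involve any CBOR library.

```kotlin
// Recipient A: later entries overwrite earlier ones (last one wins).
fun decodeLastWins(entries: List<Pair<String, String>>): Map<String, String> {
    val result = LinkedHashMap<String, String>()
    for ((key, value) in entries) {
        result[key] = value
    }
    return result
}

// Recipient B: the first occurrence of a key is kept (first one wins).
fun decodeFirstWins(entries: List<Pair<String, String>>): Map<String, String> {
    val result = LinkedHashMap<String, String>()
    for ((key, value) in entries) {
        if (key !in result) result[key] = value
    }
    return result
}

// The same signed bytes, modelled as an ordered list of entries with a repeated key.
val signedPayload = listOf("recipient" to "alice", "amount" to "10", "amount" to "1000")
```

Here `decodeLastWins(signedPayload)["amount"]` is `"1000"` while `decodeFirstWins(signedPayload)["amount"]` is `"10"`, so both recipients can verify the same signature yet act on different amounts.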
The CBOR implementation currently accepts duplicate keys by using the last one encountered, but the spec says in §2.1:
§3.10 "Strict Mode" then waffles a bit and puts the onus on the sender to "avoid ambiguously decodable data" and defines an optional strict mode that decoders are not required to implement, particularly calling out duplicate key detection as a possibly heavy-weight requirement.
However, kotlinx.serialization already has the ability to detect duplicate keys because it builds up a HashMap or similar which can simply be consulted before insertion. (There's no streaming interface that I'm aware of.) The language in §3.10 is likely aimed at embedded and other highly constrained environments, which isn't particularly relevant to Kotlin.
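The "consult the map before insertion" check can be sketched in a few lines. This is an illustration in plain Kotlin, not library code; `DuplicateKeyException` and `buildMapRejectingDuplicates` are hypothetical names.

```kotlin
// Hypothetical exception type used only in this sketch.
class DuplicateKeyException(key: Any?) : IllegalArgumentException("Duplicate map key: $key")

// Builds a map from decoded entries, failing on the first repeated key.
fun <K, V> buildMapRejectingDuplicates(entries: Iterable<Pair<K, V>>): Map<K, V> {
    val result = LinkedHashMap<K, V>()
    for ((key, value) in entries) {
        // containsKey is used instead of checking the return value of put,
        // because a previously stored null value would look like "absent".
        if (result.containsKey(key)) throw DuplicateKeyException(key)
        result[key] = value
    }
    return result
}
```

The extra cost is one hash lookup per entry on a map that is being built anyway, which is why the memory concern raised in §3.10 does not really apply here.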
Describe the solution you'd like
A strict mode for the `Cbor` implementation, enabled by default to comply with §2.1.
(I'd also love for duplicate key rejection to be enabled by default for all formats, as this is a known vulnerability in many, many protocols and people should be protected by default, while still being able to explicitly opt out of this protection if they know what they're doing. Major version bump, I know.)
Note: #1990 covers rejection of duplicate keys, but it wasn't clear if that was meant to be specific to JSON.