Shouldn't take object properties order into consideration #65
Here's a simple test case for the scenario described:
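A minimal sketch of such a test, assuming minimal-json's chainable JsonObject.add() and JUnit 4; with the current behavior the assertion fails, because equals() takes member order into account:

```java
import static org.junit.Assert.assertEquals;

import com.eclipsesource.json.JsonObject;
import org.junit.Test;

public class MemberOrderTest {

  @Test
  public void objectsWithSameMembersInDifferentOrder() {
    // Same name/value pairs, added in a different order.
    JsonObject first = new JsonObject().add("id", 1).add("name", "foo");
    JsonObject second = new JsonObject().add("name", "foo").add("id", 1);

    // Fails today: JsonObject.equals() is order-sensitive.
    assertEquals(first, second);
  }
}
```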
Right now names and values are stored using lists, so an order-independent comparison could be as expensive as O(m·n), where m and n are the numbers of members of the two objects. How do other reference libraries do it? I don't care about this from a practical standpoint (it never came up for me until now), but from a design perspective, this seems to be an issue. Good catch, @awvalenti.
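For illustration, this is what an order-independent check looks like when written directly against two parallel name/value lists (a simplified stand-in for a list-backed object, not actual minimal-json code); the nested scan is what makes it O(m·n):

```java
import java.util.List;

final class NaiveComparison {

  // names.get(i) is paired with values.get(i), mirroring a list-backed object.
  static boolean membersEqualIgnoringOrder(List<String> namesA, List<Object> valuesA,
                                           List<String> namesB, List<Object> valuesB) {
    if (namesA.size() != namesB.size()) {
      return false;
    }
    // For every member of A, scan all members of B: O(m * n) comparisons.
    for (int i = 0; i < namesA.size(); i++) {
      boolean found = false;
      for (int j = 0; j < namesB.size(); j++) {
        if (namesA.get(i).equals(namesB.get(j)) && valuesA.get(i).equals(valuesB.get(j))) {
          found = true;
          break;
        }
      }
      if (!found) {
        return false;
      }
    }
    // Note: this also lets one member of B match several members of A,
    // which matters for the duplicate-key question discussed later in the thread.
    return true;
  }
}
```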
Actually, considering objects with the same set of members but a different order as not equal was a conscious decision. However, I understand the request and suppose that it would be helpful to ignore the order, e.g. in unit tests. How about adding an additional equals method like equalsIgnoreOrder()?
@ralfstx I like the idea. I don't know if your reasoning for the current equals() behavior is documented anywhere; if not, it should be. Also, I would appreciate it if the internal representation of JsonObject didn't change. This should not cause a performance hit, right?
@mafagafogigante right, we should add JavaDoc for equals(). As for the performance, I don't think it's a problem if the new equalsIgnoreOrder() is slower than equals().
@ralfstx glad we agree on Javadoc'ing what we consider JsonObject equality. Documenting the decision is even more important than the decision itself. I don't really care about the performance of equalsIgnoreOrder(), I just don't want the internal representation to change to allow for this. Again, good that we are on the same page about that too. @awvalenti I suppose that this completely addresses the raised issue (as far as the design goes). Do you have any remarks or comments on points we missed?
Hello, I see now that it was a conscious decision, especially after I found a test case specifically stating that. I don't quite agree that JsonObject and JsonArray should represent a "JSON text", although I understand your point of view. An equalsIgnoreOrder() method would work for me. Another possible solution would be to provide toMap() and toList() conversion methods. What do you think?
I think I disagree, mainly because of the design rationale of this library. We start by giving it an equalsIgnoreOrder() because it seems almost a requirement to conform to JSON, then we get toMap() and toList(), and two days later this is a clone of Jackson. It is not my call in the end, but I think that such conveniences do not belong in a minimal library.
Clone of Jackson? I've used Jackson a little and had problems with it, but why would these features lead minimal-json to become a copy of Jackson? And what's your opinion about test output on failure?
minimal-json cannot and should not get "nice features" for the sake of convenience.
I've missed that one, please provide a brief summary of what the question is.
The question is: when an equality assertion on two JsonObjects fails, the test output only says something like "expected true, found false", which gives no hint of how the objects actually differ.
I do agree that "expected true, found false" is not very informative, as any reasonable developer would. However, I don't think that we should expose internals or add methods to address this type of issue. Better messages in your assertion calls would remedy it, as in "the JsonArray does not have the expected elements". Furthermore, if you want the explicit values to be shown, it shouldn't be hard to do that in a high-level language such as Java, locally in your own project. minimal-json (emphasis mine, again) is meant to stay away from such verbosity and expose only a clean and short API, which it has done successfully so far. It is not meant to be a fully-fledged, we-solve-everything, gargantuan library. This is why I think that this sort of conversion method does not belong here.
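As a sketch of the "handle it locally" suggestion: a small test-side helper (hypothetical, not part of minimal-json) that puts both serialized objects into the failure message, relying on JsonObject.toString() producing the JSON text:

```java
import static org.junit.Assert.assertTrue;

import com.eclipsesource.json.JsonObject;

final class JsonAssert {

  private JsonAssert() {
  }

  // Fails with both JSON texts in the message instead of a bare boolean mismatch.
  static void assertJsonEquals(JsonObject expected, JsonObject actual) {
    assertTrue("expected <" + expected + "> but was <" + actual + ">",
        expected.equals(actual));
  }
}
```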
You got me convinced that minimal-json should in fact be minimal :). However, my opinion remains that the library itself should offer an order-independent comparison. The O(m·n) solution keeps the current implementation of JsonObject and just changes the comparison logic. The O(n) solution would be to use a LinkedHashMap as the internal representation. I don't see any drawbacks in the second suggestion. Do you?
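A sketch of what the second suggestion would mean (hypothetical, not minimal-json's actual internals): a LinkedHashMap preserves insertion order for iteration and serialization, while Map.equals() compares entries order-independently in linear time. Note that a map also silently collapses duplicate names, a point the discussion comes back to below.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical map-backed member storage, sketched for comparison only.
final class MapBackedMembers {

  private final Map<String, Object> members = new LinkedHashMap<>();

  void add(String name, Object value) {
    members.put(name, value); // duplicate names overwrite the previous value
  }

  // Map.equals() ignores order: one lookup per entry, O(n) overall.
  boolean sameMembers(MapBackedMembers other) {
    return members.equals(other.members);
  }
}
```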
O(1) was a typo, right? The second solution is linear, not constant.
Equality checks would be linear indeed. However, populating a properly written dynamic array (ArrayList here) is substantially faster than populating a LinkedHashMap. The doubly linked list is not very expensive, but every insertion would now call hashCode(), which is expensive in some cases, while ArrayList's add() never calls it. If we had to comply with the standard, your latter solution would be a right way to do it. However, as this does not seem to be a requirement, I am against it. The performance hit is not trivial even for the use cases I have. I agree with you that we shouldn't ask everyone to write ordering-independent equality checks themselves. I just need to make it very clear that the increased cost of insertion would introduce a noticeable performance regression, and this would affect every user of minimal-json. If it wasn't for this, I would also defend the ordering-independent equality behavior.
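The cost difference being described, shown side by side (illustrative only; neither snippet is minimal-json code):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InsertionCostSketch {

  public static void main(String[] args) {
    // List-backed storage: add() is amortized O(1), never calls hashCode(),
    // and allocates no per-entry node.
    List<String> names = new ArrayList<>();
    List<Object> values = new ArrayList<>();
    names.add("id");
    values.add(17);

    // Map-backed storage: put() is also O(1) on average, but it hashes the key
    // on every insertion and allocates a linked hash-table entry.
    Map<String, Object> members = new LinkedHashMap<>();
    members.put("id", 17);

    System.out.println(names + " / " + values + " vs " + members);
  }
}
```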
Oops, O(1) was a typo, indeed. Sorry! I see your point about performance and agree with it. I didn't understand this part: "ask everyone to write ordering-independent equality checks". Which checks would those be? About toMap() and toList(), it seems to me now that they could simply be implemented on the test side as small helpers.
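A sketch of that test-side approach: a recursive helper (hypothetical, not part of minimal-json) that converts a JsonObject into plain Java collections so tests can compare with Map.equals(). It assumes minimal-json's Member iteration and the isObject/isArray accessors on JsonValue, and duplicate names collapse to the last value:

```java
import com.eclipsesource.json.JsonArray;
import com.eclipsesource.json.JsonObject;
import com.eclipsesource.json.JsonValue;

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class JsonTestUtil {

  private JsonTestUtil() {
  }

  // Recursively converts a JsonObject to a Map so tests can compare with
  // Map.equals(), which ignores member order.
  static Map<String, Object> toMap(JsonObject object) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (JsonObject.Member member : object) {
      result.put(member.getName(), toJava(member.getValue()));
    }
    return result;
  }

  static List<Object> toList(JsonArray array) {
    List<Object> result = new ArrayList<>();
    for (JsonValue value : array) {
      result.add(toJava(value));
    }
    return result;
  }

  private static Object toJava(JsonValue value) {
    if (value.isObject()) {
      return toMap(value.asObject());
    }
    if (value.isArray()) {
      return toList(value.asArray());
    }
    // Leave scalar values as JsonValue; their equals() does not depend on ordering.
    return value;
  }
}
```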
Wait a second... I thought the map to be used would be a generic Map keyed by Strings, and String caches its hashCode(). Are there examples in which this would raise performance issues even so?
I simply meant that I believe the library should provide an equalsIgnoreOrder() method. This is not something trivial to delegate to users. It is to some users, but not to all of them, I suppose. Also, by simply having an equals() and an equalsIgnoreOrder() method, it is made clear that equals() may not ignore ordering. Documentation in code at its best. I am not sure if String hashCode() is cached on all implementations. However, I am fairly confident that it is on all that I use. Just remember that, in the end, even if it "works" for me and you, we shouldn't ignore the rest of the world. Still on the caching issue, I am fairly certain that the library would not reuse the String objects of one JSON object in other JSON objects that also contain them. This implies that the effective hashCode caching would not happen as you think it will. See this line of code to understand what I am talking about. There is also the case in which you are creating JsonObject objects from Java code, in which you could ensure String reuse (and therefore hashCode caching) yourself, but the JsonParser cannot do it at parsing time.
This is a good point. However, "usually" is not good enough for a library that needs to deal with large amounts of data (sometimes in real time). Additionally, there are usually tons of objects, each with several properties of its own, which implies a lot of hashCode() calls that are not going to be cached.
@awvalenti we don't use a map internally at the moment, so there is no hashing involved.
@mafagafogigante I'm not generally opposed to the idea of an order-ignoring equals method. The implementation of toMap(), however, raises questions of its own, for example how duplicate names and nested values should be represented. I wouldn't consider performance to be an issue, as this function would not be used in the context of parsing or writing JSON, would it?
@ralfstx I think you misunderstood me a little bit. I don't really care if toMap() and toList() are implemented, as long as the internal representation of the current objects is left unchanged. I think they shouldn't be implemented in a lean library, and your arguments against a toMap() seem to make this even more obvious.
You mean equalsIgnoreOrdering(), right? Well, I haven't seen duplicate keys used in practice, but as far as I remember the standard is indifferent about them. However, for instance, this reference implementation does not allow duplicate keys. This library does, which is also correct according to the JSON standard. Ultimately, I do not believe that the standard defines when two JSON objects are equal. I think that a sensible thing to do is to check whether each and every key-value pair of one object has an identical key-value pair in the other object, without allowing the same key-value pair to be matched multiple times. This would make {a: b} != {a: b, a: b} but {a: b, a: b} == {a: b, a: b}. Implementation-wise: sort by a chained comparator made of a key comparator and a value comparator, walk from the first to the last element, and fail as soon as you get two entries that differ. This assumes you also fail fast when the numbers of entries differ. Then we get "fast" comparison in O(n) with what we currently have and "correct" comparison in O(n lg n), which is only required in a few cases. Selection sort would be attractive if most objects happened to be different, as bailing out midway would reduce the running time substantially, but simply using a standard sort seems a simpler (and more likely correct) solution. Just don't let users specify at runtime which algorithm equalsIgnoreOrdering() should use internally.
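A sketch of the sort-then-walk comparison just described. The method is hypothetical (not part of minimal-json), it assumes JsonObject's size() and Member iteration, and it compares values by their JSON text as a stand-in for a real value comparator:

```java
import com.eclipsesource.json.JsonObject;

import java.util.ArrayList;
import java.util.List;

final class JsonEquality {

  private JsonEquality() {
  }

  // Fail fast on size, sort both member lists by (name, value), then walk them
  // in lockstep and stop at the first difference: O(n log n) overall.
  static boolean equalsIgnoreOrdering(JsonObject a, JsonObject b) {
    if (a.size() != b.size()) {
      return false;
    }
    List<JsonObject.Member> membersA = sortedMembers(a);
    List<JsonObject.Member> membersB = sortedMembers(b);
    for (int i = 0; i < membersA.size(); i++) {
      JsonObject.Member ma = membersA.get(i);
      JsonObject.Member mb = membersB.get(i);
      if (!ma.getName().equals(mb.getName()) || !ma.getValue().equals(mb.getValue())) {
        return false;
      }
    }
    return true;
  }

  private static List<JsonObject.Member> sortedMembers(JsonObject object) {
    List<JsonObject.Member> members = new ArrayList<>();
    for (JsonObject.Member member : object) {
      members.add(member);
    }
    // Chained comparator: name first, then the value's serialized form.
    members.sort((m1, m2) -> {
      int byName = m1.getName().compareTo(m2.getName());
      return byName != 0
          ? byName
          : m1.getValue().toString().compareTo(m2.getValue().toString());
    });
    return members;
  }
}
```

Nested values are still compared with the ordinary, order-sensitive equals() in this sketch; whether the comparison should be deep in the order-ignoring sense is the question that comes up again further down.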
This "duplicate key" thing is a bit scary :)... I was thinking about it yesterday. I thought the JSON spec prohibited it, but it simply says nothing about it, leaving for the implementation to decide. By the way, if duplicate keys are deliberately accepted by minimal-json, I believe it should have unit tests for that. Does it? In my opinion, toMap would only be useful if recursive. What if we simply add |
@mafagafogigante About the caching issue, you're right, it wouldn't work as expected. I was going to suggest calling String.intern() on the names.
I don't know. It is well documented, so I never even bothered checking it. But you may check it yourself. And yes, there should be some test cases using duplicate keys. Additionally, a quick read of the JsonObject source tends to make things very clear, but not many users are willing to read through the library code.
And that's what most users would expect it to be.
I agree with you about implementing it. But if we are going to do it, I want it to be done right. Your solution allows {a: b} == {a: b, a: b}, and I think those are not the same JSON object, so the comparison should produce false.
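Expressed as a test against the equalsIgnoreOrdering sketch above; it relies on add() appending a second "a" member rather than replacing the first, which is the duplicate-key behavior discussed in this thread:

```java
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import com.eclipsesource.json.JsonObject;
import org.junit.Test;

public class DuplicateMemberEqualityTest {

  @Test
  public void duplicateMembersAreNotCollapsed() {
    JsonObject single = new JsonObject().add("a", "b");
    JsonObject doubled = new JsonObject().add("a", "b").add("a", "b");

    // Different member counts: not equal, even when ignoring order.
    assertFalse(JsonEquality.equalsIgnoreOrdering(single, doubled));

    // Identical multisets of members: equal.
    assertTrue(JsonEquality.equalsIgnoreOrdering(
        doubled, new JsonObject().add("a", "b").add("a", "b")));
  }
}
```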
Let's not go there. The implementation has changed a lot in the past few years, and I don't think a library should be messing with the string table. I think this is going to end with implementing equalsIgnoreOrdering(), however that is done.
Like I said, I gave up suggesting the LinkedHashMap-based representation anyway.
You're right. If the library allows {a:b, a:b}, equals shouldn't behave this way. (by the way, I created a pull request with two new unit tests to document the "multi-keys" behavior)
I agree. I'm not sure I understood your proposed solution, but it seems O(n·log n) to me if it involves sorting before comparing.
Yes, the asymptotic worst case of the simplest approach is O(n·log n), which shouldn't be a big problem. In addition, I couldn't think of a way to test for equality the way we want in linear time.
Explain the conditions which imply equality for JsonObject and JsonArray. In particular, point out that JsonObjects are considered equal only if the members have the same order. See #65
I think I agree on that. Considering a deep comparison of JsonObjects, an equalsIgnoreOrdering() would presumably also have to ignore ordering in nested objects, not only at the top level.
Probably. Isn't the ordinary equals() already deep?
Original issue description: I liked your library a lot, much simpler than most of the others. I found that a test here failed because the "id" property was at the end of the object instead of at the beginning. According to json.org: "An object is an unordered set of name/value pairs".