-
-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thoth parses invalid JSON #52
Comments
Thoth.Json.Net is not directly responsible for parsing the JSON. It is using Newtonsoft under the hood. It seems like Newtonsoft is permissive by default... Unfortunately, there isn't much we can do on Thoth.Json.Net side about it right now. |
open System.Text.Json
let json =
"""
{
"x": 123,
"y": 456,
}
"""
let x = JsonDocument.Parse json
printfn $"{x}"
What are your thoughts on building a third implementation of Thoth backed by Happy to contribute toward this. |
The plan right now, is more to stop having 2 implementations of the library and use a single/unified package. The reason behind this choice is to reduce the number of packages to maintain and also to allow people to easily create libraries/package depending on Thoth.Json. Right now, if they want there package to be compatible with .NET and Fable they needed to depends on both Thoth.Json and Thoth.Json.Net and use compiler directives in there code. It would be better if they didn't have to do that. I am still unsure if I want to hard code the implementation or create an interface which contains more of less the Using the interface means losing a bit of performance but it was never a priority with Thoth.Json. Solution could also be a mix:
But I feel like the mix is going harder to create and perhaps easy to break. If I am using a custom JSON parser I can unify almost all the code and only worry about .NET / Fable difference in the reflection API. However, creating a JSON parser is not that simple as if we want to be compliant to the spec it is almost impossible because the spec itself doesn't specify everything... And after that people say that JSON is simple :p My message got a bit side tracked but I hope it answer a bit your question. Feel free to comment if you want to speak more about it |
@MangelMaxime @njlr I have suddenly become very interested in this comment:
We are using Thoth extensively in an API/back-end F# project, and it is working really well with us. However, we have some hot, high-load endpoints/code paths where performance is really critical, and one thing that has come up is that Hypothetically, if one was going to fork this repo to something like If it's really a mostly straightforward transition, we could consider pitching in and help with that fork. On the other hand, if it would be a virtual re-write and take months of work to complete, that might sway us in a different direction. (Btw, @MangelMaxime really appreciate all your work for the F# ecosystem. I've learned a ton from your work and benefited from your work on several F# libraries that we use!) |
Hello @olo-ntaylor First of all thank you for you kind words and always happy to know that my work is used and benefit others :). If one wanted to fork this repo the roadmap would be:
In theory For Example: Lines 23 to 41 in dde3551
For Lines 11 to 42 in dde3551
I don't think you would need to change the Auto API because that code use the manual decoder internally and the rest of the code is related to the reflection which should be the same.
Lines 74 to 107 in dde3551
FYI I still have the plan to work on the next major version of Thoth.Json and I am still debating how I want it to be done. I have 2 ideas:
Cons: Several package to use so it is not a total unification of the packages. Thoth.Json + (Thoth.Json.Fable | Thoth.Json.Newtonsoft | Thoth.Json.System.Text.Json | etc.) Pros: Use battle tested JSON parser and benefit from their work on performance.
Cons: Need to write a custom JSON parser, meaning it will probably be slower than the others. Pros: Have the same JSON parser, so the exact same behaviour everywhere + only 1 package instead of several. |
The main issue (which you may not care about anyway) is building libraries on top of the Thoth packages. For example, the If we added a third implementation, The issue compounds as the dependency tree gets deeper! I would suggest making an interface so that you can switch at run-time ( I do not believe that static configuration (using SRTP?) is possible but would love to be proven wrong :) This would be a big change to If you want something that works today, Fleece has a System.Text.Json backend (but none for Fable). Edit: Beat me to it! I strongly support proposal 1. A from-scratch, pure F# parser that works everywhere could be provided as a backend, giving of the benefits of 1 and 2. |
@njlr You are right that libraries build on top of Thoth.Json would have more compiler directives in the code.
It is true that, option 1 is not exclusive of option 2. When option 2 is indeed exclusive. Option 2, was at the time tried because using an interface means the helpers are not |
Gathering up what's been said so far... it sounds like the preference would be to not introduce a separate At first glance, that seems like a ton more work 😅 Not sure if my intuition is correct on that... how long do you expect such a refactor would take, (assuming you had at least some community contributions)? |
Creating your own fork means:
Having Thoth.Json support several backend natively is a more viable solution over time IHMO.
Hum such a work does indeed require some work. Most of the code already exist but I need change the types from: type Decoder<'T> = string -> JsonValue -> Result<'T, DecoderError>
type Encoder<'T> = 'T -> JsonValue to something like type Decoder<'T> = IDecoderHelpers<'Value> -> string -> JsonValue -> Result<'T, DecoderError>
type Encoder<'T> = IEncoderHelpers<'Value> -> 'T -> JsonValue The other big change is to merge Thoth.Json.Net and Thoth.Json code into a single package to have the logic in one package only. I can't really gave an estimate but my last complete rewrite of Thoth.Json took me 1 week to have a beta version. I didn't release it because I was not happy with a few things and also I made some improvement on how some types where represented but didn't track it correctly and so was not able to ensure compatibility with previous version. I now have learn from this error 😅 The initial bulk work would need to be done by me otherwise I don't think I could understand the implications of the changes. After that, community contributions to report errors / send PR would be welcome indeed. @olo-ntaylor I can't give an estimate of when I will be able to work on that project. But in the past, I had some companies sponsoring some of my work time so I could focus on working on their specifics needs. If you would like to discuss a possibility like that, I am available on F# Slack by the handle @MangelMaxime. |
Thanks for the quick responses, @MangelMaxime!! I think one approach we may consider is forking the repo into our private GH and doing the swap to System.Text.Json there so we don't pollute the ecosystem of packages available, and that way we're not going to screw anyone else over if we miss something 😅 Then we could potentially re-evaluate our needs and how urgent it would be to move to the updated framework-agnostic version in the future. |
Seems like a good middle ground to me. |
I was curious what Option 1 might look like so I sketched this out. It works in an F# script. // Core definitions, agnostic to the particular JSON backend
type IEncoderOps<'json> =
abstract member Encode : int -> 'json
abstract member Encode : 'json list -> 'json
type IDecoderOps<'json> =
abstract member TryAsInt : 'json -> int option
abstract member TryAsArray : 'json -> ('json list) option
type IEncoder<'t> =
abstract member Encode<'json> : ops : IEncoderOps<'json> * value : 't -> 'json
type IDecoder<'t> =
abstract member Decode<'json> : ops : IDecoderOps<'json> * path : string * json : 'json -> Result<'t, string>
[<RequireQualifiedAccess>]
module Encode =
let int : IEncoder<int> =
{
new IEncoder<int> with
member this.Encode<'json>(ops : IEncoderOps<'json>, t : int) =
ops.Encode(t)
}
let list (element : IEncoder<'t>) : IEncoder<'t list> =
{
new IEncoder<'t list> with
member this.Encode<'json>(ops : IEncoderOps<'json>, xs : 't list) =
xs
|> List.map (fun x -> element.Encode(ops, x))
|> ops.Encode
}
[<RequireQualifiedAccess>]
module Decode =
let int : IDecoder<int> =
{
new IDecoder<int> with
member this.Decode<'json>(ops : IDecoderOps<'json>, path : string, json : 'json) =
match ops.TryAsInt(json) with
| Some i -> Ok i
| None -> Error $"%s{path}: Expected an int, but found %A{json}"
}
let list (element : IDecoder<'t>) : IDecoder<'t list> =
{
new IDecoder<'t list> with
member this.Decode<'json>(ops : IDecoderOps<'json>, path : string, json : 'json) =
match ops.TryAsArray(json) with
| Some xs ->
let rec loop (index : int) (xs : 'json list) (acc : 't list) =
match xs with
| [] -> Ok (List.rev acc)
| x :: xs ->
match element.Decode(ops, $"%s{path}[%i{index}]", x) with
| Ok x -> loop (index + 1) xs (x :: acc)
| Error error -> Error error
loop 0 xs []
| None -> Error $"%s{path}: Expected an array, but found %A{json}"
} // Backend for System.Text.Json
module SystemTextJson =
open System.Text.Json
type private EncoderOps() =
interface IEncoderOps<JsonElement> with
member this.Encode(i : int) =
JsonSerializer.SerializeToElement(i)
member this.Encode(xs : JsonElement list) =
JsonSerializer.SerializeToElement(xs)
type private DecoderOps() =
interface IDecoderOps<JsonElement> with
member this.TryAsInt(element : JsonElement) =
match element.ValueKind with
| JsonValueKind.Number ->
Some (element.GetInt32())
| _ ->
None
member this.TryAsArray(element : JsonElement) =
match element.ValueKind with
| JsonValueKind.Array ->
Some
[
let mutable en = element.EnumerateArray()
while en.MoveNext() do
yield en.Current
]
| _ ->
None
[<RequireQualifiedAccess>]
module Encode =
let private ops =
EncoderOps()
:> IEncoderOps<JsonElement>
let toString (encoder : IEncoder<'t>) (t : 't) : string =
encoder.Encode(ops, t).ToString()
[<RequireQualifiedAccess>]
module Decode =
let private ops =
DecoderOps()
:> IDecoderOps<JsonElement>
let fromString (decoder : IDecoder<'t>) (json : string) : Result<'t, string> =
try
let jsonDoc = JsonDocument.Parse(json)
decoder.Decode(ops, "$", jsonDoc.RootElement)
with
| :? JsonException as exn ->
Error exn.Message // Demo
open SystemTextJson
let encoder =
Encode.list Encode.int
let decoder =
Decode.list Decode.int
let xs = [ 1; 2; 3 ]
printfn $"%A{xs}"
let json = Encode.toString encoder xs
printfn $"%s{json}"
match Decode.fromString decoder json with
| Ok xs ->
printfn "%A" xs
| Error err ->
printfn "Error: %s" err Some findings:
I would be happy to contribute if / when the next version is started. |
Glad to see that the concept works in theory. You did it a bit differently that I would have done it. Indeed, I would have kept the let list (decoder : Decoder<'value>) : Decoder<'value list> =
fun decoderHelpers path value -> // ... To me it seems simpler than using object expression. |
Also, I re-created a new issue in Thoth.Json repository to track this new re-design work. I will add to that issue, the list of all the improvements I want to include because if we are to do a major rewrite it is the good time to improve other stuff too. |
Maybe I'm missing something, but I'm not sure this can work. What is the type of I think it needs an extra type-parameter to be filled by |
The types would be something like that: type IDecoderHelpers<'JsonValue> =
abstract GetField : FieldName : string -> value : 'JsonValue -> 'JsonValue
abstract IsString : value : 'JsonValue -> bool
abstract IsBoolean : value : 'JsonValue -> bool
abstract IsNumber : value : 'JsonValue -> bool
abstract IsArray : value : 'JsonValue -> bool
abstract IsObject : value : 'JsonValue -> bool
abstract IsNullValue : value : 'JsonValue -> bool
abstract IsIntegralValue : value : 'JsonValue -> bool
abstract IsUndefined : value : 'JsonValue -> bool
abstract AnyToString : value : 'JsonValue -> string
abstract ObjectKeys : value : 'JsonValue -> string seq
abstract AsBool : value : 'JsonValue -> bool
abstract AsInt : value : 'JsonValue -> int
abstract AsFloat : value : 'JsonValue -> float
abstract AsFloat32 : value : 'JsonValue -> float32
abstract AsString : value : 'JsonValue -> string
abstract AsArray : value : 'JsonValue -> 'JsonValue[] I never tested my idea of using that interface so it is possible that indeed the compiler will want something more that just a function for the decoder/encoder definition. |
@MangelMaxime @njlr Just FWIW, in my hacking yesterday I realized that I only mention that because I think you'd have to keep in mind that |
Hello @njlr, I started working on the unification of Thoth.Json and you were right. Using object expression allows to avoid surfacing the It allows to write: type User =
{
FirstName: string
LastName: string
}
module User =
let decoder: Decoder<User> =
Decode.map2
(fun firstName lastName ->
{
FirstName = firstName
LastName = lastName
}
)
(Decode.field "firstName" Decode.string)
(Decode.field "lastName" Decode.string) instead of something like And I am happy to report that the experimentation seems to be really promising. Implementing a new runtime/target is as simple as implementing let encoderHelpers =
{ new IDecoderHelpers<obj> with
member _.isString jsonValue = jsonValue :? string
// ...
member _.anyToString jsonValue =
emitJsStatement jsonValue
"""
JSON.stringify($0, null, 4) + ''
"""
}
module Decode =
module Helpers =
[<Emit("$0 instanceof SyntaxError")>]
let isSyntaxError (_ : obj) : bool = jsNative
let fromString (decoder : Decoder<'T>) =
fun value ->
try
let json = JS.JSON.parse value
Decode.fromValue encoderHelpers "$" decoder json
with
| ex when Helpers.isSyntaxError ex ->
Error("Given an invalid JSON: " + ex.Message) I have a working minimal prototype which works for JavaScript and Newtonsoft.Json. I am now working on transposing all the manual API to the new implementation in order to checks if it works on a much larger scales. @olo-ntaylor Thank you for the note. Once, I have completed the work of JavaScript and Newtonsoft.Json, I will give it a try to use System.Text.Json in order to make sure that the point you raised is supported. Progress on the rewrite will be reported in thoth-org/Thoth.Json#175 |
This should not parse since the trailing comma is out of spec.
The text was updated successfully, but these errors were encountered: