Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework GFromJSON instances to enable elimination of Generics #653

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

remyoudompheng
Copy link

@remyoudompheng remyoudompheng commented Jul 22, 2018

The Parser monad signature makes inlining very difficult
because GHC only inlines saturated applications, which is
hard to achieve with Parser.

We work around this by defining an alternate method in GFromJSON
which manipulates the semantically equivalent IResult monad
(Parser is technically more expressive but not for the terms appearing
in GFromJSON). The existing method is left for API compatibility.

The resulting code achieves rougly the same compilation speed but
reaches large performance improvements, for a generated code size
not far from the manually written instances in benchmarks.

The object file for Twitter.Generic grows to 332kB (from 285kB),
but the Twitter.Manual module is 300kB large. The inlining level of
typeMismatch has been reduced to curb code blow-up.

Due to inlining levels differences, the Generic fromJSON is now
often quicker than its TH or Manual counterpart in benchmarks.

Benchmark results (Core i5 Haswell)

TH Generic (before) Generic (after) Manual
D/fromJSON 1.720 µs 5.076 µs 1.505 µs -
BigRecord/fromJSON 3.720 µs 8.450 µs 2.964 µs -
BigProduct/fromJSON 2.412 µs 3.427 µs 1.812 µs -
BigSum/fromJSON 173 ns 1230 ns 171 ns -
twitter100 1.611 ms 2.063 ms 1.334 ms 1.516 ms
jp100 1.844 ms 2.18 ms 1.481 ms 1.677 ms

The Parser monad signature makes inlining very difficult
because GHC only inlines saturated applications, which is
hard to achieve with Parser.

We work around this by defining an alternate method in GFromJSON
which manipulates the semantically equivalent IResult monad
(Parser is technically more expressive but not for the terms appearing
in GFromJSON). The existing method is left for API compatibility.

The resulting code achieves rougly the same compilation speed but
reaches large performance improvements, for a generated code size
not far from the manually written instances in benchmarks.

The object file for Twitter.Generic grows to 332kB (from 285kB),
but the Twitter.Manual module is 300kB large. The inlining level of
typeMismatch has been reduced to curb code blow-up.

Due to inlining levels differences, the Generic fromJSON is now
often quicker than its TH or Manual counterpart in benchmarks.

Benchmark results (Core i5)

                    TH        G(before) G(after)  Manual
D/fromJSON          1.720 µs  5.076 µs  1.505 µs  -
BigRecord/fromJSON  3.720 µs  8.450 µs  2.964 µs  -
BigProduct/fromJSON 2.412 µs  3.427 µs  1.812 µs  -
BigSum/fromJSON       173 ns   1230 ns    171 ns  -
twitter100          1.611 ms  2.063 ms  1.334 ms  1.516 ms
jp100               1.844 ms  2.18  ms  1.481 ms  1.677 ms
@Lysxia
Copy link
Collaborator

Lysxia commented Jul 22, 2018

That's very cool!

It's unfortunate that gParseJSON is exported in the toplevel modules Data.Aeson/Data.Aeson.Types. Users should really not depend on these Generics internals. It would be a good idea to hide this at some point, if not now, so we can just remove gParseJSON.

@remyoudompheng
Copy link
Author

What is the process for contributions ?

@Lysxia
Copy link
Collaborator

Lysxia commented Jul 25, 2018

Just wait for bergmark to look at the PR.

@bergmark
Copy link
Collaborator

Thanks for the contribution! And sorry for the delay. I'll try to review this and #652 tonight.

@@ -240,6 +240,15 @@ class GFromJSON arity f where
-- or 'liftParseJSON' (if the @arity@ is 'One').
gParseJSON :: Options -> FromArgs arity a -> Value -> Parser (f a)

-- | An internal method that return an IResult, which is easier to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is not a breaking change since the method is not exported and it has a default?

@bergmark
Copy link
Collaborator

This looks great to me! As for #652 I was surprised that it didn't make compilation speed a lot worse. If someone else could do a review as well, I'd appreciate it.

It may be that this will cause issues with older GHCs, as with #652, but if so we can consider dropping some support.

Re gparseJson being exported - I wonder if anyone uses it? I can create a separate ticket for that as this change doesn't seem to require a major bump.

@remyoudompheng
Copy link
Author

This patch may increase compilation time of records on GHC 7.10 and 8.0 (see #652)

This may be alleviated by adding {-# NOINLINE gParseJSON #-} whenever gParseJSON' is already defined (since the other one becomes unused). Tests on Text.Pandoc.Definition from pandoc-types show 500MB of resident memory reduction out of 1.5-2GB.

@bergmark
Copy link
Collaborator

bergmark commented Aug 6, 2018

I'm not sure I understand what you mean by "whenever gParseJSON is already defined", Is that something we could do as a part of this PR?

@bergmark bergmark removed this from the 1.4.1.0 milestone Sep 24, 2018
@guaraqe
Copy link

guaraqe commented May 22, 2019

The result of Generics here seem even better than the TH ones. What is blocking this PR?

@bergmark
Copy link
Collaborator

Sorry for the delay. I had a question in my last comment, and i just internalized this PR as "needs more work" but really I think we should merge this. I'll try to look into the merge conflict some time unless someone beats me to it.

@bergmark bergmark added this to the next milestone Jun 23, 2019
@bergmark bergmark removed this from the next milestone Sep 7, 2019
@phadej
Copy link
Collaborator

phadej commented May 16, 2020

This conflicts, and I'm worried that class now has two-fields. It's most likely affects optimisation as it's no more newtype-like.

After a bit of thought, let us leave this from 1.5 (It's indeed could be fixed in minor version, as we don't export internals).

@arybczak
Copy link

I'm worried that class now has two-fields. It's most likely affects optimisation as it's no more newtype-like.

The OP demonstrates a significant runtime performance increase though.

This PR isn't very complicated, might be a good idea to solve conflicts and revisit it. I might do that at some point (and simultaneously look into compile time performance of generic deriving of instances too as it's currently slow).

@runeksvendsen
Copy link

runeksvendsen commented Dec 13, 2024

Is it possible for me to modify my instances to realize these performance gains, thus avoiding needing to wait for this PR to be merged?

@phadej
Copy link
Collaborator

phadej commented Dec 13, 2024

@runeksvendsen don't use aesons Generics. You may use TH. But in general I advice to not use any meta-programming derivation of To/FromJSON instance if you care about performance or stability of your API.

TL;DR The existing Generics in aeson are bad, and are there mostly to not break existing code.

@Lysxia
Copy link
Collaborator

Lysxia commented Dec 13, 2024

You could check out https://hackage.haskell.org/package/deriving-aeson

@runeksvendsen
Copy link

You could check out https://hackage.haskell.org/package/deriving-aeson

@Lysxia does this library incorporate the performance improvements in this PR? Or is there another reason you recommend it?

@Lysxia
Copy link
Collaborator

Lysxia commented Dec 13, 2024

I don't know but I would expect that it would be easier to push for those changes if they are still necessary; the library may already naturally be more efficient due to its more modern architecture.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants