-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: sconcat
and stimes
.
#580
base: master
Are you sure you want to change the base?
Conversation
Tada, all checks have passed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing benchmarks!
src/Data/Text.hs
Outdated
@@ -361,6 +361,8 @@ instance Read Text where | |||
-- | @since 1.2.2.0 | |||
instance Semigroup Text where | |||
(<>) = append | |||
stimes = replicate . P.fromIntegral |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat cautious about potential overflow in fromIntegral
. Let's throw an error if Text
is non-empty and n
does not fit into Int
, same as base
does for ByteArray
: https://hackage.haskell.org/package/base-4.19.1.0/docs/src/Data.Array.Byte.html#stimesPolymorphic
(We do not need to check that len * n
fits into Int
, because this is validated by Data.Text.replicate
itself)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not follow why we should evaluate to the empty text when given a negative number but evaluate to an error when given a number that is bigger than maxBound ∷ Int
, but your wish is my command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, we are not going to check the input in the same way as base
because we should stay abstract. The way integers are represented is an implementation detail that had changed in the past and may change in the future. Supporting all versions of GHC we test with will need a lot of CPP without significant improvement in performance. So, I shall do an equality check instead of matching on the constructors of Integer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not follow why we should evaluate to the empty text when given a negative number but evaluate to an error when given a number that is bigger than maxBound ∷ Int, but your wish is my command.
Ah, I did not remember this peculiarity of replicate
.
I'd be in favor of stimes
throwing an error on negative arguments, even if replicate
does not mind to swallow it silently. @Lysxia how do you feel about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the behaviour on lists:
ghci> replicate (-1) [1..10]
[]
ghci> import Data.Semigroup
ghci> stimes (-1) [1.. 10]
*** Exception: stimes: [], negative multiplier
So, stimes
is not defined even while replicate
is defined on lists and negative numbers. We can do the same. Let me patch it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this is a shocking revelation — the property checks of text
overall compare the behaviour of functions on Text
to their correspondents on String
, but the check for stimes
is missing both for strict and lazy Text
. It would be consistent to add such a check for stimes
for both strict and lazy Text
and adjust the definitions as needed for it to pass. @Bodigrim Should I add a check for lazy Text
and make it pass, or should I only add a check for strict Text
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Asking to "replicate a string n times" with negative n is nonsense. There must be an error in the definition of n, so throwing an exception lets users be aware of that error and fix it.
- The default definition of
stimes
already throws an exception forn <= 0
. People haven't complained about it. Extending the definition forn = 0
is reasonable for a monoid. - There could still be a case made in favor of making
stimes
less partial and more similar toreplicate
(I think "replicate a string n times" is nonsense as a sentence in natural language, but I don't have a strong argument that code must follow natural language). Until someone makes a good case for extendingstimes
, throwing an exception for negative arguments is forward-compatible: we can extend the function later (it would only break code that catches the exception, a fishy thing to do). If we made it total now and changed our minds later, that would be a more breaking change.
You can add the stimes
test for strict Text
now and add lazy stimes
in another PR (small PR = good!)
2a4b0fc
to
9deed15
Compare
b653fe4
to
b425505
Compare
The constructors of `Integer` and the module they are exported from all changed between GHC 8 and 9.
b425505
to
c314ce2
Compare
-- | Beware: this function will evaluate to error if the given number does | ||
-- not fit into an @Int@. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly it turns out Haddock does not see comments to instance methods. I looked at the documentation generated by cabal haddock
— this comment is not rendered. This also seems to be confirmed on the Internet.
- https://stackoverflow.com/questions/17758681/haddock-documentation-for-instance-functions-with-quirks-replaced-by-default-cl
- Comments on instance methods haddock#123
I am going to move this comment to the instance.
Resolve #288.
There are two commits here.
Pure
section: one forsconcat
and one forstimes
.sconcat
andstimes
to the instance ofSemigroup
for strictText
.The benchmarks can be run like so:
This will take a few minutes. You should see better times almost everywhere — only the
tiny.sconcat
will show worse times.This is the report as seen on my machine: