Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add foldlM #543

Merged
merged 3 commits into from
Mar 13, 2024
Merged

Add foldlM #543

merged 3 commits into from
Mar 13, 2024

Conversation

noughtmare
Copy link
Contributor

@noughtmare noughtmare commented Oct 30, 2023

I haven't added tests or benchmarks yet, because I'm not quite sure how to do it.

Open questions:

  • Can QuickCheck generate monadic functions?
  • And what would be a good benchmark?
  • Do we need a strict foldlM'?
  • Why are the Data.Text.Internal.Fusion.Common folds marked INLINE [0]?

For motivation, see this recent stackoverflow question. There the problem is to convert a text into a vector of characters, but that requires extraneous allocation if done naively with unfoldr Text.uncons. If a foldlM function were exposed like I implemented in this PR then you can write it efficiently like this:

import Data.Text as T
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import Test.Tasty.Bench

createFoldMethod :: Text -> VU.Vector Char
createFoldMethod t = VU.create $ do
  m <- VUM.new (T.length t)
  _ <- T.foldlM (\i x -> VUM.write m i x *> pure (i + 1)) 0 t
  pure m

main = defaultMain
  [ bench (show (10 ^ i) ++ " characters") $ whnf createFoldMethod $ T.replicate (10 ^ i) (T.pack "0")
  | i <- [1..5]
  ]

Benchmark results:

  10 characters:     OK
    33.1 ns ± 2.7 ns, 111 B  allocated,   0 B  copied, 6.0 MB peak memory
  100 characters:    OK
    228  ns ±  21 ns, 471 B  allocated,   0 B  copied, 6.0 MB peak memory
  1000 characters:   OK
    2.16 μs ± 169 ns, 3.9 KB allocated,   0 B  copied,  10 MB peak memory
  10000 characters:  OK
    21.0 μs ± 1.7 μs,  39 KB allocated,   2 B  copied,  10 MB peak memory
  100000 characters: OK
    210  μs ±  15 μs, 386 KB allocated,  21 B  copied,  11 MB peak memory

This matches the performance of an implementation I wrote that loops over the internal stream representation manually.

@Bodigrim
Copy link
Contributor

As suggested in comments (was it @thomasjm asking?), the simplest solution is encodeUtf32LE + unsafeFromForeignPtr indeed. There is nothing particularly unsafe here.

But yes, foldlM makes sense to have. We already got spanM, for instance.

@noughtmare
Copy link
Contributor Author

noughtmare commented Oct 30, 2023

I tried that:

encodeMethod :: Text -> VS.Vector Char
encodeMethod t = 
  case encodeUtf32LE t of
    BS ptr len -> 
      runST $ VS.unsafeFreeze $ 
        VSM.unsafeFromForeignPtr0 (castForeignPtr ptr) (len `quot` 4)))

But it requires reaching into internal parts of the bytestring library* and more importantly it wasn't efficient:

  10 characters:     OK
    281  ns ±  24 ns, 3.3 KB allocated,   0 B  copied, 6.0 MB peak memory
  100 characters:    OK
    2.32 μs ± 181 ns,  31 KB allocated,   2 B  copied, 6.0 MB peak memory
  1000 characters:   OK
    22.3 μs ± 1.8 μs, 302 KB allocated,  20 B  copied, 6.0 MB peak memory
  10000 characters:  OK
    221  μs ±  11 μs, 3.0 MB allocated, 212 B  copied, 6.0 MB peak memory
  100000 characters: OK
    2.31 ms ± 169 μs,  30 MB allocated,  85 KB copied, 7.0 MB peak memory

* I guess perhaps I could use unsafeUseAsCStringLen instead somehow?

src/Data/Text.hs Outdated
@@ -1070,6 +1071,11 @@ foldl1' :: HasCallStack => (Char -> Char -> Char) -> Text -> Char
foldl1' f t = S.foldl1' f (stream t)
{-# INLINE foldl1' #-}

-- | /O(n)/ A monadic version of 'foldl'.
foldlM :: Monad m => (a -> Char -> m a) -> a -> Text -> m a
foldlM f z t = S.foldlM f z (stream t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this implementation match Data.Foldable.foldlM, especially with regards to short-circuiting and the order of effects? Would be nice to have a test, at least for m ~ Maybe and m ~ Either.

@Bodigrim
Copy link
Contributor

But it requires reaching into internal parts of the bytestring library* and more importantly it wasn't efficient

Sigh. encodeUtf32LE should be implemented as a loop in ST instead of relying on streaming. Thanks for checking.

@thomasjm
Copy link

Whoa, thanks for doing this @noughtmare! Yes, that was me asking on StackOverflow :)

@noughtmare
Copy link
Contributor Author

I just found out that we can define foldlM efficiently using just the public API of text:

foldlM k z t = T.foldr (\x go s -> k s x >>= go) pure t z 

This optimizes to a tight loop.

@Lysxia
Copy link
Contributor

Lysxia commented Mar 13, 2024

I added some tests and rebased. To answer some of the questions:

Can QuickCheck generate monadic functions?

We could but we don't need to. Because of foldlM's polymorphism, we can simplify tests by specializing foldlM' to (1) only the state monad (2) only one special function argument and initial accumulator. (So the only remaining thing for quickcheck to generate is the text.)

Specializing foldlM to the state monad still determines its behavior in all monads. The idea is to use the state monad to log all the function calls made by foldlM, and parametric polymorphism guarantees that the behavior of foldlM in any other monad follows that log.

And what would be a good benchmark?

Idk about an actual benchmark, but I added an inspection test that the simplified Core is the same as foldl' in the Identity case.

Do we need a strict foldlM'?

What you defined is actually already foldlM' because there is a bang on the accumulator. I've renamed the function accordingly.

Given that the lazy foldl/foldlM are known to be footguns and not that useful, I'm inclined to not include foldlM in the first place. Only add the strict foldlM'.

I have no clue about the INLINE [0] annotation.

foldlM k z t = T.foldr (\x go s -> k s x >>= go) pure t z

Both the direct implementation and the foldr implementation look fine to me. I leave it up to you.

Although it's a one-liner with foldr, it's not obvious for newcomers. foldlM seems a quite accessible primitive so +1 for adding it.

@Bodigrim
Copy link
Contributor

@Lysxia looks good to me, thanks for wrapping it up. Could you please fix the CI job with -fdeveloper?

@Lysxia
Copy link
Contributor

Lysxia commented Mar 13, 2024

Fixed by inspection-testing S.foldlM' instead of T.foldlM' (which is a wrapper around it, and with the +developer flag there's a HasCallstack constraint that confuses inspection-testing.)

tests/Tests/Properties/Folds.hs Show resolved Hide resolved
@Bodigrim Bodigrim merged commit 66fa38f into haskell:master Mar 13, 2024
25 of 27 checks passed
@Bodigrim
Copy link
Contributor

Thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants