Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Net Core 3 support to compare new implementation of hashmap #2

Open
mvkara opened this issue Jan 14, 2020 · 5 comments
Open

Net Core 3 support to compare new implementation of hashmap #2

mvkara opened this issue Jan 14, 2020 · 5 comments

Comments

@mvkara
Copy link

mvkara commented Jan 14, 2020

Just found this repo from the mention on fsprojects/FSharp.Data.Adaptive#17. I recently had some performance issues with F#'s built in map and so wrote my own implementation (https://github.com/mvkara/fsharp-hashcollections) to use on a project I've been working on. So far performance looks good compared to the collection types I've found although admittedly I've only tested lookup's. Didn't realise there was other implementations out there. I've put my benchmarks on the README.md file in the repo.

I was thinking how easy would it be, or do you have a net core 3 version of this? Wanted to try out my library on your benchmarks to compare to other libraries out there to see if its worth finishing my library and my impl requires netcoreapp3.1. Don't want to fracture the ecosystem with many different versions of the same collection if its already been implemented.

@mvkara mvkara changed the title Net Core 3 support for further benchmarks Net Core 3 support to compare new implementation of hashmap Jan 14, 2020
@krauthaufen
Copy link
Owner

Hey, this repo was mainly intended for benchmarking different implementations.
I completed the Okasaki-style implementation and moved it to FSharp.Data.Adaptive.

The current implementation has some minor bugs with netcore3 bit intrinsics, but as long as you don't compile it with USE_INTRINSICS it should be just fine (performance gain was actually quite disappointing anyways)

You could simply use the FSharp.Data.Adaptive package or copy the file atm.
We could also create a separate package only containing the HashCollections if you want to use them.

@mvkara
Copy link
Author

mvkara commented Jan 14, 2020

Thanks for pointing me to the package and making me aware of the library. In case your interested I tried using the package and bench marking it but FSharp.Data.Adaptive is still much slower than my implementation especially as the collection size grows. Not sure if there is any plans to move any of these collections to FSharp.Core; I would prefer it there but it should still be fast otherwise what's the point of using it over the standard F# Map.

EDIT: The full benchmarks with other collections I've tried is at: fsprojects/fsharp-hashcollections#5

|                   Method | CollectionSize |      Mean |    Error |   StdDev |
|------------------------- |--------------- |----------:|---------:|---------:|
|               GetHashMap |             10 |  11.07 ns | 0.023 ns | 0.022 ns |
| GetFSharpDataAdaptiveMap |             10 |  21.05 ns | 0.024 ns | 0.021 ns |

|               GetHashMap |            100 |  13.17 ns | 0.011 ns | 0.010 ns |
| GetFSharpDataAdaptiveMap |            100 |  43.17 ns | 0.091 ns | 0.085 ns |

|               GetHashMap |           1000 |  11.16 ns | 0.013 ns | 0.013 ns |
| GetFSharpDataAdaptiveMap |           1000 |  65.02 ns | 0.043 ns | 0.038 ns |

|               GetHashMap |         100000 |  23.24 ns | 0.022 ns | 0.021 ns |
| GetFSharpDataAdaptiveMap |         100000 | 140.15 ns | 2.027 ns | 1.797 ns |

|               GetHashMap |         500000 |  69.15 ns | 1.056 ns | 0.936 ns |
| GetFSharpDataAdaptiveMap |         500000 | 302.25 ns | 1.888 ns | 1.766 ns |

|               GetHashMap |         750000 |  98.61 ns | 0.074 ns | 0.062 ns |
| GetFSharpDataAdaptiveMap |         750000 | 352.38 ns | 0.350 ns | 0.311 ns |

|               GetHashMap |        1000000 | 109.27 ns | 0.044 ns | 0.041 ns |
| GetFSharpDataAdaptiveMap |        1000000 | 399.60 ns | 7.830 ns | 6.941 ns |

|               GetHashMap |        5000000 | 133.62 ns | 1.209 ns | 1.131 ns |
| GetFSharpDataAdaptiveMap |        5000000 | 610.57 ns | 3.078 ns | 2.880 ns |

|               GetHashMap |       10000000 | 151.67 ns | 0.146 ns | 0.129 ns |
| GetFSharpDataAdaptiveMap |       10000000 | 705.60 ns | 1.254 ns | 1.173 ns |

@krauthaufen
Copy link
Owner

Hey, cool
The only thing alienating me here is that is in my benchmarks the fsharpx implementation was significantly slower than ours.
The test seems to be quite identical, except that I explicity used containsKey instead of tryFind but that shouldn't play much of a role. The other difference here is that i use random entries for the table...
Maybe I'll just add your implementation to our benchmarks too to understand this better.
Nonetheless this looks promising and I'd totally love my implementation to be replaced with something faster 👍
One comcern here is that the okasaki variant also supports efficient binary operations (symmetricdiff, etc) which may be tricky for other implementations...

@mvkara
Copy link
Author

mvkara commented Jan 15, 2020

That's why I created this issue in the first place; I'm skeptical of benchmarks in general including my own and wanted to see if it was possible to incorporate yours into the mix. If I put something out there for others to use I want to make sure I haven't missed something.

The use case I was aiming for in my current project was a drop in replacement for F#'s Map which most people typically use as a dictionary - i.e. tryFind, add and remove rather than for the operations you describe. I focused more on the tryFind case as it is the case I needed to optimise for my own reasons and is the one most suspectible to small changes (e.g. equality dispatch, struct vs classes, intrinsic use, etc). The add and remove case where copying of data swamps most of these concerns at least in my benchmarking wasn't as critical for me so I didn't optimise those operations heavily.

If symmetricdiff and such are important maybe this algorithm in general isn't the best fit? Would need to dig into this more. In any case there is room for other data structures that optimise these operations over others I would think.

@davidglassborow
Copy link

Also be aware there has been ongoing work in the fsharp repo about optimising maps and sets with various benchmarking going on, e.g. dotnet/fsharp#5365

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants