NSString: Cache ICU collator in thread-local storage #450
Apple's CoreFoundation caches a `UCollator` instance for a language in TSD (thread-specific data). It turns out that instantiating a collator is a very expensive operation. Reusing existing collators greatly improves runtime when repeatedly comparing strings (e.g. when sorting).
Benchmarks
Here are some selected benchmarks to validate the collator optimisation and the KVC optimisation from #445.
I've exported the titles of a large media library (~70000 songs) for the following micro benchmarks.
Benchmarks were performed on an AMD Ryzen 7 5700G (Freq. scaling disabled), 16GB of DDR4 RAM, Fedora 40.
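To make concrete what the "Collator Opts" build changes, here is a minimal sketch of per-thread collator caching. It is not the code from this PR: `Collator`, `collator_open`, `collator_close`, `collator_for_locale` and `collator_open_count` are hypothetical stand-ins (real code would wrap ICU's `ucol_open` and `ucol_close`), and the single-slot cache held in C11 `_Thread_local` storage is an assumption made for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an ICU UCollator. Real code would hold the
 * pointer returned by ucol_open() instead of this struct. */
typedef struct {
    char locale[32];
} Collator;

/* Counts constructor calls on the current thread (illustration only). */
static _Thread_local int collator_open_count = 0;

/* Stand-in for the expensive constructor. */
static Collator *collator_open(const char *locale) {
    Collator *c = malloc(sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    strncpy(c->locale, locale, sizeof c->locale - 1);
    c->locale[sizeof c->locale - 1] = '\0';
    collator_open_count++;
    return c;
}

/* Stand-in for the destructor; accepts NULL like free(). */
static void collator_close(Collator *c) {
    free(c);
}

/* One cached collator per thread, so no locking is needed. */
static _Thread_local Collator *cached_collator = NULL;

/* Returns the calling thread's cached collator for `locale`, creating
 * it only on a cache miss. The result is owned by the cache and is
 * invalidated by the next call with a different locale. */
Collator *collator_for_locale(const char *locale) {
    assert(locale != NULL);
    if (cached_collator != NULL &&
        strcmp(cached_collator->locale, locale) == 0) {
        return cached_collator; /* hit: skip the expensive constructor */
    }
    Collator *fresh = collator_open(locale);
    if (fresh == NULL) {
        return NULL; /* keep the old entry if allocation fails */
    }
    collator_close(cached_collator);
    cached_collator = fresh;
    return fresh;
}
```

With this shape, a sort that performs many comparisons in the same locale pays the construction cost once per thread instead of once per comparison. The sketch does not release the cached collator when a thread exits; a TSD key with a destructor would be one way to handle that.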
Comparison
GNUstep Base master
GNUstep Base with Collator Opts
Sorting
I cannot share the full micro benchmark as I am using the same sorting logic as implemented in Djay (isolated into `mediaObjectAttributeValueCompare`).
For `BM_DjayMediaLibrarySort`, I am loading the titles into an `NSArray`, randomising all entries, and running a sort operation with a comparator block. I track the number of comparisons by swizzling `compare:range:mask:locale:` with `compareOptionsRangeLocaleStatisticsIMP`.
Here, I am using a sort descriptor instead of a comparator, as I am accessing properties of instances of `DummyClass` instead of directly comparing array entries. This makes heavy use of KVC to resolve the value of the property, so it is interesting to see how the KVC optimisations in #445 perform in practice.
GNUstep Base master
GNUstep Base with Collator Opts
GNUstep Base with Collator and KVC Opts
There is still room for more improvements: