LINQ implementation using specialized value type enumerators.
The goal of this project is to provide an option to reduce the number of heap allocations (in best case completely avoid) while you can still preserve the convenience of using LINQ-like operators.
By default, enumerables and their enumerator pairs are reference types. So, every time you call a .Select(d => d.Id)
or a .Where(u => u.Enabled)
, you allocate at least one or two instances to the heap. If you work in a performance critical environment, like measuring samples at a high frequency, or reading data as a server, you have two choices:
- Write a specialized algorithm the iterative way, so you can decide what to put where to avoid allocations, but also lose the readability and maintainability compared to a functional implementation what LINQ is good at.
- Use LINQ but allocate to the heap in each itaration, literally producing garbage, putting high pressure on GC.
IEnumerable<T>
and IEnumerator<T>
are interfaces, which means that even if you would have a value type (struct
) implementation that implements those interfaces, but you would refer to the instance as the interface, you would end up with boxing, and create a reference on the heap anyway. But less people know, that if you have a foreach
statement or either use the query syntax, the C# compiler doesn't require the collection to implement those interfaces, but uses patterns instead. So even if the type doesn't implement the interface, but implements an appropriate GetEnumerator()
method, it is able to use that. Also, it is smart enough to choose a value type implementation over the interface implementation, if you have one. For this reason, many of the built-in types (e.g. StringTokenizer) use value type enumerators to avoid allocations.
So, this project provides you a (limited) 3rd option where you can have the advantages of both worlds: write more readable, functional code, while not having to worry about allocations.
The extensions methods are in the System.Linq
namespace, but they are all prefixed with Value
.
You can use them in high performance scenarios, like parsing GPS data from a CSV-like (NMEA) format. The following code snippet is easy to read and maintain, while it is allocation free and parses each line only once.
if (line.StartsWith("$GPGGA"))
{
var parts = new StringTokenizer(line, Separators);
foreach (var (index, part) in parts.ValueSelect((s, i) => (index: i, part: s)))
{
switch (index)
{
case 2:
Double.TryParse(part, NumberStyles.Float, CultureInfo.InvariantCulture, out latitude);
break;
case 4:
Double.TryParse(part, NumberStyles.Float, CultureInfo.InvariantCulture, out longitude);
break;
case 6:
Int32.TryParse(part, NumberStyles.Integer, CultureInfo.InvariantCulture, out quality);
break;
}
}
}
Or you can use them with regular reference type collections, like this:
foreach (var item in products.ValueWhere(p => p.Price < 1000))
{
Console.WriteLine(item.Name);
}
Operator | IEnumerable<T> | IReadOnlyList<T> | StringTokenizer | ReadOnlySpan<T> |
---|---|---|---|---|
references | arrays, collections, lists | string segments | ||
Append |
X | |||
Concat |
X | |||
Except |
X | |||
Intersect |
X | |||
Join |
X | |||
Prepend |
X | |||
Select |
X | X | X | X |
SelectMany |
X | |||
Skip |
X | |||
SkipWhile |
X | |||
Take |
X | |||
TakeWhile |
X | |||
Where |
X | X | ||
Zip |
X |
Support is planned for T[]
, ReadOnlySpan<T>
and ReadOnlySequence<T>
as well.
This library provides some value-type collections as well, which hold a predefined number of values.
var one = ValueEnumerable.Create(1);
var two = ValueEnumerable.Create(1, 2);
Or you can create an empty one as well:
var empty = ValueEnumerable.Empty<int>();
All of these collection types implement IEnumerable<T>
, IReadOnlyCollection<T>
, and IReadOnlyList<T>
.
Note: for many usecases Array.Empty<int>()
may be a sufficient or even better choice. Also, you can take a look at the type StringValues
, if you need to work with string
instances.
Each test has the following three implementations to compare:
- using built-in LINQ (built on reference types) (usually with most allocations)
- using the value type enumerators of this project (usually with less allocations, sometimes faster, sometimes slower)
- custom implementation of an algorithm, that uses no LINQ-like functions (usually the fastest, with minimum allocations)
Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|
Select | 150.919 ns | 2.9182 ns | 2.7297 ns | 0.0076 | - | - | 48 B |
ValueSelect | 124.028 ns | 0.4206 ns | 0.3935 ns | 0.0050 | - | - | 32 B |
SelectIterative | 9.359 ns | 0.0268 ns | 0.0238 ns | - | - | - | - |
Where | 86.296 ns | 0.2998 ns | 0.2658 ns | 0.0076 | - | - | 48 B |
ValueWhere | 121.756 ns | 0.5245 ns | 0.4906 ns | 0.0050 | - | - | 32 B |
WhereIterative | 12.382 ns | 0.0485 ns | 0.0453 ns | - | - | - | - |
The cost here is the allocation of the IEnumerator<T>
.
Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|
Select | 141.082 ns | 0.5028 ns | 0.4703 ns | 0.0076 | - | - | 48 B |
ValueSelect | 131.368 ns | 0.5197 ns | 0.4607 ns | - | - | - | - |
SelectIterative | 6.125 ns | 0.0280 ns | 0.0248 ns | - | - | - | - |
Compared to IEnumerable<T>
s, in this case we can completely replace the enumerators with our custom ones, so avoid its allocation.
Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|
Select | 454.9 ns | 3.62 ns | 3.39 ns | 0.0253 | - | - | 160 B |
ValueSelect | 276.3 ns | 0.40 ns | 0.35 ns | - | - | - | - |
SelectIterative | 208.7 ns | 1.04 ns | 0.87 ns | - | - | - | - |
Where | 474.6 ns | 0.56 ns | 0.53 ns | 0.0267 | - | - | 168 B |
ValueWhere | 297.7 ns | 1.71 ns | 1.34 ns | - | - | - | - |
WhereIterative | 220.7 ns | 0.44 ns | 0.41 ns | - | - | - | - |
Measured on: BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.329 (2004/?/20H1) Intel Core i7-8700K CPU 3.70GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
- Chaining operators may result in boxing.