stop being clever with reduction result types #20560

StefanKarpinski · 2017-02-10T14:44:46Z

Let's play "guess the result type":

julia> typeof(sum(rand(Bool, 4)))
Int64

julia> typeof(sum(rand(Int8, 4)))
Int32

julia> typeof(sum(rand(UInt8, 4)))
UInt32

julia> typeof(sum(rand(Int16, 4)))
Int32

julia> typeof(sum(rand(UInt16, 4)))
UInt32

julia> typeof(sum(rand(Int32, 4)))
Int64

julia> typeof(sum(rand(UInt32, 4)))
UInt64

julia> typeof(sum(rand(Int64, 4)))
Int64

julia> typeof(sum(rand(UInt64, 4)))
UInt64

julia> typeof(sum(rand(Int128, 4)))
Int128

julia> typeof(sum(rand(UInt128, 4)))
UInt128

This isn't entirely inconsistent, but it's not super simple either. It's not entirely obvious why 32 bits is some kind of special threshold on a 64-bit machine. I propose that we simply return the type that + would have produced instead and if people want a different result type for their reduction, we should allow them to choose it by specifying a zero element of that type. This applies to prod and other built-in reductions as well.

The text was updated successfully, but these errors were encountered:

martinholters · 2017-02-10T15:11:52Z

So should typeof(sum(xs)) == typeof(x[1] + x[2] + ... + x[N]) but typeof(sum(xs, v0)) == typeof(v0)? Or should typeof(sum(xs)) == typeof(zero(eltype(xs)) + zero(eltype(xs)))?

I like the former, because it generalizes to reduce for any weird type-untable op. (I guess that's what we do if we don't hit a special case?) But this generalization would also suggest that typeof(sum(xs, v0)) == typeof(v0 + x[1] + x[2] + ... + x[N]). Then the zero element could no longer be used to choose the result type, only to influence it (especially ensuring a minimum bit size). But that's probably sufficient in practice?

StefanKarpinski · 2017-02-10T15:47:45Z

I think the zero element should dictate the result type if given, not influence it. I guess I feel like the former behavior is better but I don't think there should be any difference for reasonable types, so it probably doesn't matter much.

timholy · 2017-02-10T16:35:40Z

The issue with returning just the result of x[1] + x[2] + ... is that many people will suffer from overflow without knowing it. Moments ago I just submitted something rather similar over at JuliaImages/ImageFiltering.jl#22. The strategy I took was to accumulate using a wider type, hoping for the InexactError when storage didn't work out. If that's viewed as desirable, we could make the last line of mapreduce be oftype(v0, result) and get similar behavior.

StefanKarpinski · 2017-02-10T18:55:37Z

We don't really coddle people when it comes to overflow anywhere else in the language, although this is a bit different since having at native size accumulator is pretty efficient. It would be better not to make the return type user-visible. I still worry that we won't be guaranteed to catch all overflows – we'll just catch some of them.

JeffBezanson · 2017-02-10T19:58:23Z

I do generally agree that less magic is better, but in the case of really small types I worry that making the default sum(Int8[...]) -> Int8 is effectively useless.

felixrehren · 2017-02-13T08:05:19Z

Xref #20561

StefanKarpinski · 2017-02-13T20:25:56Z

I'm fine with it being something "useful" but the current guessing game of result type is not ok. If it was Int or UInt for every small integer type, I'd be fine with that, and it's way easier to explain/predict than what we're doing now.

stevengj · 2017-02-13T21:19:16Z

+1 for defaulting to Int (or UInt for unsigned) for every integer type narrower than Int.

oscardssmith · 2017-02-13T21:24:28Z

I like this as a default, but it would be great if there was a way for users to override this behavior.

oscardssmith · 2017-02-13T21:26:26Z

Otherwise things like adding an array of elements close to zero to each element of the array (or anything similar) would allocate a ton.

StefanKarpinski · 2017-02-13T21:27:58Z

The override would be by providing a "neutral" element (zero) of the desired type. So if you wanted the useless sum(Int8[...]) => Int8 behavior, you could write sum(zero(Int8), v).

StefanKarpinski · 2017-02-13T21:28:32Z

Otherwise things like adding an array of elements close to zero to each element of the array (or anything similar) would allocate a ton.

I don't understand this. What do you mean?

JeffBezanson · 2017-02-13T21:46:58Z

I'm also fine with defaulting to Int and UInt.

StefanKarpinski · 2017-02-13T22:56:48Z

Ok, seems like we have a decision here and this just needs a PR.

Petr-Hlavenka · 2017-02-15T08:45:55Z

I've run into very same problem with JuliaMath/FixedPointNumbers.jl#41 where we discussed widening of Normed{T,f} to Fixed{Int64, f} for all math operations to protect ourselves from over/underflow errors. From my point of view, there is no use case for overflow math in reduction operations and I fully support widening of UInt8 directly to UInt64 or even better to Int64.

kmsquire · 2017-02-15T16:45:01Z

From my point of view, there is no use case for overflow math in reduction operations and I fully support widening of UInt8 directly to UInt64 or even better to Int64.

There are other ways to do this in Julia, but what about reducing to find the max along a particular axis?

StefanKarpinski · 2017-02-15T19:56:20Z

Not all reductions have to behave identically. For minimum, maximum and extrema it seems sensible to return the actual values from the input.

sbromberger · 2017-03-18T07:13:09Z

I think this is reasonable as long as there's a way to dictate the return type. We have too many cases in my package where memory is at a premium, and we're introducing abstractions specifically to allow larger data structures with fewer allocations.

StefanKarpinski · 2017-07-13T19:56:25Z

@TotalVerb, would you be willing to tackle this issue and #21523?

TotalVerb · 2017-07-13T21:34:30Z

Certainly. So the consensus is for the mathy operations that make sense to widen (sum, prod) to widen to Int/UInt always, and for reduce to never widen, right.

JeffBezanson · 2017-07-13T23:21:08Z

Awesome, thanks. Yes I think that's the consensus.

StefanKarpinski added this to the 1.0 milestone Feb 10, 2017

StefanKarpinski added needs decision A decision on this change is needed maths Mathematical functions labels Feb 10, 2017

martinholters mentioned this issue Feb 13, 2017

give sum(T::Type, xs) = zero(T) when xs is empy #20561

Closed

felixrehren mentioned this issue Feb 14, 2017

RFC reduce promotion: default to system default #20607

Closed

5 tasks

stevengj mentioned this issue Apr 24, 2017

Unexpected type promotion in reduce and mapreduce #21523

Closed

TotalVerb mentioned this issue Jul 15, 2017

RFC: Continuation of #20607, revision of sum, prod, and reduce #22825

Closed

14 tasks

JeffBezanson closed this as completed in 26f2333 Oct 15, 2017

nalimilan mentioned this issue Jun 25, 2023

Mean overflows when using smaller types (e.g. Float16) JuliaStats/Statistics.jl#140

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

stop being clever with reduction result types #20560

stop being clever with reduction result types #20560

StefanKarpinski commented Feb 10, 2017 •

edited

Loading

martinholters commented Feb 10, 2017

StefanKarpinski commented Feb 10, 2017

timholy commented Feb 10, 2017

StefanKarpinski commented Feb 10, 2017

JeffBezanson commented Feb 10, 2017

felixrehren commented Feb 13, 2017

StefanKarpinski commented Feb 13, 2017

stevengj commented Feb 13, 2017

oscardssmith commented Feb 13, 2017

oscardssmith commented Feb 13, 2017

StefanKarpinski commented Feb 13, 2017

StefanKarpinski commented Feb 13, 2017

JeffBezanson commented Feb 13, 2017

StefanKarpinski commented Feb 13, 2017

Petr-Hlavenka commented Feb 15, 2017

kmsquire commented Feb 15, 2017

StefanKarpinski commented Feb 15, 2017

sbromberger commented Mar 18, 2017

StefanKarpinski commented Jul 13, 2017

TotalVerb commented Jul 13, 2017

JeffBezanson commented Jul 13, 2017

stop being clever with reduction result types #20560

stop being clever with reduction result types #20560

Comments

StefanKarpinski commented Feb 10, 2017 • edited Loading

martinholters commented Feb 10, 2017

StefanKarpinski commented Feb 10, 2017

timholy commented Feb 10, 2017

StefanKarpinski commented Feb 10, 2017

JeffBezanson commented Feb 10, 2017

felixrehren commented Feb 13, 2017

StefanKarpinski commented Feb 13, 2017

stevengj commented Feb 13, 2017

oscardssmith commented Feb 13, 2017

oscardssmith commented Feb 13, 2017

StefanKarpinski commented Feb 13, 2017

StefanKarpinski commented Feb 13, 2017

JeffBezanson commented Feb 13, 2017

StefanKarpinski commented Feb 13, 2017

Petr-Hlavenka commented Feb 15, 2017

kmsquire commented Feb 15, 2017

StefanKarpinski commented Feb 15, 2017

sbromberger commented Mar 18, 2017

StefanKarpinski commented Jul 13, 2017

TotalVerb commented Jul 13, 2017

JeffBezanson commented Jul 13, 2017

StefanKarpinski commented Feb 10, 2017 •

edited

Loading