Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce needed? #20

Open
russel opened this issue Mar 28, 2014 · 15 comments
Open

Reduce needed? #20

russel opened this issue Mar 28, 2014 · 15 comments

Comments

@russel
Copy link

russel commented Mar 28, 2014

I may just be missing it, but there appears not to be a reduce operation on streams.

@timyates
Copy link
Owner

You're right...

There's inject if your coming at it from a groovy angle, but nothing Java interfacing for doing this...

Now, do I call it inject to align with Groovy, or reduce or foldl to align with everything else?

@russel
Copy link
Author

russel commented Mar 28, 2014

Given there is map and filter, it should be reduce. Given there is collect, it should be inject.

@timyates
Copy link
Owner

Ah, the joys of naming things... Think I'll go for reduce. As you say, I've already gone off piste with map and filter

@timyates
Copy link
Owner

Is this purely from a Java perspective as you can inject from groovy?

@russel
Copy link
Author

russel commented Mar 28, 2014

Reduce works for me :-)

I'm not sure what the question re Java is asking (possibly just me being slow, sorry).

@timyates
Copy link
Owner

As of v0.8.0, you can do this in Java using the library:

        Map<String,List<Integer>> map = new HashMap<String,List<Integer>>() {{
            put( "x", Arrays.asList( 1, 2, 3 ) ) ;
            put( "y", Arrays.asList( 4, 5, 6 ) ) ;
        }} ;
        Stream.from( map )
                .map( e -> e.get( "x" ) + e.get( "y" ) )
                .filter( e -> e % 2 == 1 )
                .forEachRemaining( System.out::println ) ;

Or, in pre java 8:

Stream<Integer> s = Stream.from( map )
        .map( new Function<Map<String, Integer>, Integer>() {
            @Override
            public Integer call( Map<String, Integer> map ) {
                return map.get( "x" ) + map.get( "y" );
            }
        } )
        .filter( new Predicate<Integer>() {
            @Override
            public boolean call( Integer val ) {
                return val % 2 == 1;
            }
        } ) ;
while( s.hasNext() ) {
    System.out.println( s.next() );
}

So from this Java point of view, there's no reduce functionality baked in to the library.

But from Groovy, you could use the standard inject method like so:

@Grab( 'com.bloidonia:groovy-stream:0.8.1' )
import groovy.stream.*

// print the sum of the lengths of the words in dict/words
println new File( '/usr/share/dict/words' ).withReader { r ->
    Stream.from( r ).map { s -> s.length() }.inject { a, b -> a + b }
}

Which behaves as a reduce method would do (and is not written by me, so is more likely to be bug free) ;-)

@timyates
Copy link
Owner

I guess the question is, is there any new functionality (over the stock inject method) that I could supply?

@russel
Copy link
Author

russel commented Mar 28, 2014

Java 8 Streams do have reduce methods.

My worry about using the Groovy inject is that it might force the creation of a data structure rather than continuing to work with the iterator.

@timyates
Copy link
Owner

Ahhh, I think I understand...

Are you saying you want a reduce method that returns a Stream containing only a single element when iterated?

In Java 8, Stream.reduce is a terminal operation, so it returns an Optional, similar to calling (in Groovy) groovyStream.inject (apart from you get a value or NullObject from groovy's inject). You can't continue to use Streaming methods from either of these unless you wrap the response itself in another Stream.

I might be missing the point entirely though (as is often the case)

And as I said before, as you have no inject in Java, this reduction step is noticeably absent if you are using groovy-stream from Java

@russel
Copy link
Author

russel commented Apr 2, 2014

Actually the issue is where the input is not a sequence of scalars, but a sequence of non-scalars. So for example:

def s = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
]

def a = Stream.from(s).flatMap{i -> i}.sum()
assert a == 45
assert a instanceof Integer

def b = Stream.from(s).sum()
assert b == [1, 2, 3, 4, 5, 6, 7, 8, 9]
assert b instanceof List

def c = Stream.from(s).sum().sum()
assert c == 45
assert c instanceof Integer

def d = Stream.from(s).reduce{t, i -> t + i}.sum()
assert c == 45
assert c instanceof Integer

The expression for d fails as reduce doesn't exist on Streams, it is a place holder for what might be interesting (or not).

The expression for c works but only because the first sum is terminal and create a List on which sum operates. This is indicating the problem I see, the Stream pipeline is "terminated" earlier than would be good.

Maybe this is just a theoretical problem and will never be met in reality?

PS Is this Stream system seen as being replaced by the Java 8 streams?

@timyates
Copy link
Owner

timyates commented Apr 2, 2014

Yeah, for d currently, you can do:

def d = Stream.from(s).inject {t, i -> t + i}.sum()
assert d == 45
assert d instanceof Integer

c is an interesting foible of Groovy where sum acts as flatten when passed a List where the first element is a List

assert [ [ 1 ], 2, [ 3, 4 ] ].sum() == [ 1, 2, 3, 4 ]

As for the Java 8 question, my gut feeling tells me yes... The only thing I add is faux list comprehension (which could probably be written as a custom Supplier), and the zip method (which was removed from Java Stream at some point early in its development). Java 8 Streams add parallelisation (albeit with some caveats around pool blocking).

@russel
Copy link
Author

russel commented Apr 2, 2014

The problem with d using inject is that the inject on the Stream creates an ArrayList instead of keeping a Stream, so the sum is Collection.sum rather than the Stream terminating sum.

I find the c case entirely consistent since sum is just a reduce with +.

I hadn't spotted that zip had gone; there must be an idiom…

@timyates
Copy link
Owner

timyates commented Apr 2, 2014

Yeah, I've searched for the reason zip was removed, but can't find it anywhere...if I had to guess it would be that it didn't play at all well with parallel, but that's a complete guess whilst my google-fu is in a slump ;-)

I see what you mean now about reduce vs inject, but I'm struggling to think of a way I could implement reduce that doesn't just construct an array list in memory -- and still works for things like

// Return a Stream containing a single Integer 6
Stream.from( [ 1, 2, 3 ] ).reduce { a, b -> a + b }

Wouldn't I have to exhaust the input iterator to the reduce and then wrap that in a Stream?

Maybe in this instance, flatmap is the way to go?

@russel
Copy link
Author

russel commented Apr 2, 2014

flatMap works fine for the case where all the leaf items are from the same domain and the sequence structure is irrelevant. My earlier example was perhaps not a good example of the (potentially academic) problem I am thinking of, where what I want is a reduction by one dimension of a problem that is still stream based but not soluble with flatMap because the structure under the first level is important to the problem.

I'll try and come up with an example.

@timyates
Copy link
Owner

timyates commented Apr 3, 2014

An example would be brilliant 😸

I had a thought, do you mean some sort of feedback iterator, where it passes the previous result and the next element from the upstream iterator? Something like:

// Feedback with `null` as the initial last element
def elements = Stream.from( [ 1, 2, 3 ] ).feedback { last, next -> [ last, next ] }.collect()
assert elements == [ [ null, 1 ], [ [ null, 1 ], 2 ], [ [ [ null, 1 ], 2 ], 3 ] ]

Or (as a more simple example)

// Feedback starting with an initial value of 0 for `last`
def elements = Stream.from( [ 1, 2, 3 ] ).feedback( 0 ) { last, next -> last + next }.collect()
assert elements == [ 1, 3, 6 ]

Not sure if this would cover your use-case, or indeed if feedback is a good name for it ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants