Details
-
Improvement
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
1.4.1
-
All
Description
reduceByWindow has an optional invReduceFunc argument which allows the reduction to be performed incrementally.
The incremental reduction performed in ReducedWindowedDStream only depends on the reduction being associative (as shown by the reduce applied to oldValues), but does not require those functions to be commutative.
In particular, if the inverse reduction is the non-commutative, non-associative substraction (e.g. what you're computing is a running sum), it's necessary to know that the intermediate result (to be substracted from) is the first argument of invReduceFunc and that the second argument is the old value to substract.
It's only in the commutative case that we don't care which is which.
The Scaladoc for the various overloads of reduceByWindow should let the user know which is the accumulator, and which is the old value. A concise, unambiguous way to state this is to write an inversion law in the Scaladoc:
invReduceFunc(reduceFunc(x, y), y) = x