[SPARK-9819] reduceBy(KeyAnd)Window should specify which is the accumulator argument in invReduceFunc - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 1.4.1
Fix Version/s: 2.0.0
Component/s: Documentation, DStreams
Labels:
- documentation
- streaming
Environment: All

Description

reduceByWindow has an optional invReduceFunc argument which allows the reduction to be performed incrementally.

The incremental reduction performed in ReducedWindowedDStream relies only on reduceFunc being associative (as shown by the reduce applied to oldValues); it does not require reduceFunc or invReduceFunc to be commutative.

In particular, if the inverse reduction is subtraction, which is neither commutative nor associative (e.g. what you're computing is a running sum), it's necessary to know that the intermediate result (to be subtracted from) is the first argument of invReduceFunc and that the second argument is the old value to subtract.
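To make the argument order concrete, here is a minimal plain-Scala sketch of one incremental window update for a running sum. It is a simplified model and not the Spark source: the object `IncrementalWindowSketch` and the helper `slide` are hypothetical names introduced for illustration, and the only thing borrowed from the description above is that the old values are first combined with reduceFunc and then removed from the accumulated value with invReduceFunc.

```scala
// Simplified model of an incremental window update; this is not Spark code.
object IncrementalWindowSketch {
  // Forward reduction: a running sum.
  val reduceFunc: (Int, Int) => Int = _ + _

  // Inverse reduction: the first argument is the accumulated window value,
  // the second argument is the (reduced) old value leaving the window.
  val invReduceFunc: (Int, Int) => Int = _ - _

  // Hypothetical helper: slide the window by removing `oldValues`
  // from `previous` and then adding `newValues`.
  def slide(previous: Int, oldValues: Seq[Int], newValues: Seq[Int]): Int = {
    val afterRemoval =
      if (oldValues.isEmpty) previous
      else invReduceFunc(previous, oldValues.reduce(reduceFunc))
    if (newValues.isEmpty) afterRemoval
    else reduceFunc(afterRemoval, newValues.reduce(reduceFunc))
  }

  def main(args: Array[String]): Unit = {
    // The window held 1, 2, 3 (sum 6); 1 leaves the window and 4 enters it.
    val updated = slide(previous = 6, oldValues = Seq(1), newValues = Seq(4))
    // The incremental result matches a full recomputation over 2, 3, 4.
    assert(updated == Seq(2, 3, 4).reduce(reduceFunc))
    println(updated)
  }
}
```

If the two arguments of invReduceFunc were swapped in this model, the removal step would compute 1 - 6 instead of 6 - 1, and the window value would be wrong from that point on.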

It's only in the commutative case that we don't care which is which.

The Scaladoc for the various overloads of reduceByWindow should let the user know which is the accumulator, and which is the old value. A concise, unambiguous way to state this is to write an inversion law in the Scaladoc:

invReduceFunc(reduceFunc(x, y), y) = x
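The law can also be checked on sample values before passing a pair of functions to a windowed reduction. The following is a small sketch in plain Scala, with no Spark dependency; `InversionLawSketch`, `satisfiesInversionLaw`, and the sample pairs are hypothetical names and data chosen for illustration, not part of any Spark API.

```scala
// Spot-check of the inversion law on sample pairs; this is not Spark code.
object InversionLawSketch {
  // Returns true when invReduceFunc(reduceFunc(x, y), y) == x
  // holds for every (x, y) in `samples`.
  def satisfiesInversionLaw[A](
      reduceFunc: (A, A) => A,
      invReduceFunc: (A, A) => A,
      samples: Seq[(A, A)]): Boolean =
    samples.forall { case (x, y) => invReduceFunc(reduceFunc(x, y), y) == x }

  val samples: Seq[(Int, Int)] = Seq((1, 2), (5, 3), (-4, 7), (0, 0))

  val add: (Int, Int) => Int = _ + _
  // Accumulator first, old value second: satisfies the law.
  val subtract: (Int, Int) => Int = (acc, old) => acc - old
  // Arguments in the wrong order: violates the law.
  val swapped: (Int, Int) => Int = (acc, old) => old - acc

  def main(args: Array[String]): Unit = {
    println(satisfiesInversionLaw(add, subtract, samples)) // true
    println(satisfiesInversionLaw(add, swapped, samples))  // false
  }
}
```

A check on a handful of samples does not prove the law in general, but it is usually enough to catch a swapped argument order.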

Issue Links

links to

[Github] Pull Request #8103 (huitseeker)

People

Assignee: François Garillot
Reporter: François Garillot
Votes: 0
Watchers: 2

Dates

Created: 11/Aug/15 12:28
Updated: 03/May/16 18:45
Resolved: 03/May/16 18:45