The relationship between batch interval, window length, and slide duration in Spark Streaming
What the batch interval means
“Batch interval” is the basic interval at which the system will receive the data in batches. This is the interval set when creating a StreamingContext. For example, if you set the batch interval to 2 seconds, then any input DStream will generate RDDs of received data at 2-second intervals.
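To make the batching concrete, here is a minimal sketch in plain Python (it does not use Spark). It assumes a simplified model in which batch k covers the half-open time range [k * batch_interval, (k + 1) * batch_interval); the function name `assign_to_batches` and the sample arrival times are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def assign_to_batches(event_times, batch_interval):
    """Group event arrival times (in seconds) into consecutive batches.

    Simplified model (an assumption, not Spark's internals): batch k holds
    every event with k * batch_interval <= t < (k + 1) * batch_interval.
    """
    batches = defaultdict(list)
    for t in event_times:
        batch_index = int(t // batch_interval)
        batches[batch_index].append(t)
    return dict(batches)


# Hypothetical arrival times, grouped with a 2-second batch interval.
arrivals = [0.3, 1.9, 2.0, 3.5, 4.1, 7.8]
batches = assign_to_batches(arrivals, batch_interval=2)
for index in sorted(batches):
    print(f"batch {index}: {batches[index]}")
```

In this toy model each batch plays the role of one RDD produced by the input DStream. Unlike this sketch, which simply skips intervals with no events, a real DStream still produces an (empty) RDD for an interval in which nothing arrived.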
What window length and slide duration mean
A window operator is defined by two parameters -
- Window duration - the length of the window
- Slide duration - the interval at which the window will slide or move forward
It's a bit hard to explain the sliding of a window in words, so slides may be more useful. Take a look at slides 27-29 in the attached slides.
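For readers without the slides, the sliding can also be sketched in plain Python (again, no Spark involved). This is a simplified model under stated assumptions: both the window duration and the slide duration are whole multiples of the batch interval, the first window is emitted one slide duration after the start, and early windows simply contain fewer batches. The function name `windows_over_batches` and the chosen numbers are hypothetical.

```python
def windows_over_batches(num_batches, batch_interval, window_length, slide_duration):
    """List which batch indices fall into each window.

    Returns a list of (window_end_time, [batch indices]) pairs.
    Assumes window_length and slide_duration are whole multiples
    of batch_interval; raises ValueError otherwise.
    """
    if window_length % batch_interval or slide_duration % batch_interval:
        raise ValueError(
            "window length and slide duration must be multiples of the batch interval"
        )
    batches_per_window = window_length // batch_interval
    batches_per_slide = slide_duration // batch_interval

    result = []
    # A window is emitted every `batches_per_slide` batches and looks back
    # over the most recent `batches_per_window` batches.
    for end in range(batches_per_slide, num_batches + 1, batches_per_slide):
        start = max(0, end - batches_per_window)
        result.append((end * batch_interval, list(range(start, end))))
    return result


# 2-second batches, a 6-second window, sliding forward every 4 seconds.
windows = windows_over_batches(
    num_batches=6, batch_interval=2, window_length=6, slide_duration=4
)
for end_time, batch_ids in windows:
    print(f"window ending at t={end_time}s covers batches {batch_ids}")
```

With these numbers each full window spans three batches and moves forward by two, so consecutive windows share one batch. Choosing a slide duration equal to the window duration would give non-overlapping windows instead.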
How the three relate
Both the window duration and the slide duration must be mult