Combining a windowed groupBy with mapGroupsWithState (groupByKey) in Spark Structured Streaming
I am currently using Spark 2.2.0 Structured Streaming.
Given a watermarked stream of timestamped data, is there a way to combine (1) a groupBy operation that windows on a timestamp field together with other grouping criteria, and (2) a groupByKey operation that applies mapGroupsWithState to the groups for a user session?
Or do I have to embed the windowing and the other grouping logic in groupByKey myself?
For context:
- Calling groupBy on a Dataset, which supports windowing, returns a RelationalGroupedDataset, which does not have mapGroupsWithState.
- Calling groupByKey, which supports mapGroupsWithState, returns a KeyValueGroupedDataset, which has no support for windowing.
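
To make the second option concrete, here is a rough sketch of what embedding the window in the grouping key might look like. It is plain Java with no Spark calls. The class WindowedSessionKey, the userId field and the 10-minute window used in the example are all made up for illustration. It only covers tumbling (non-overlapping) windows aligned to the epoch, and it assumes event time is available as epoch milliseconds.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Objects;

// Hypothetical composite key: a user id plus the start of the tumbling
// window the event falls into. Not part of any Spark API.
final class WindowedSessionKey {
    final String userId;
    final long windowStartMillis;

    WindowedSessionKey(String userId, long windowStartMillis) {
        this.userId = userId;
        this.windowStartMillis = windowStartMillis;
    }

    // Floor the event time to the start of its tumbling window.
    // floorDiv keeps the result correct for timestamps before the epoch.
    static long windowStart(long eventTimeMillis, long windowMillis) {
        return Math.floorDiv(eventTimeMillis, windowMillis) * windowMillis;
    }

    // Build the key that a groupByKey function would return for one event.
    static WindowedSessionKey of(String userId, Instant eventTime, Duration window) {
        long start = windowStart(eventTime.toEpochMilli(), window.toMillis());
        return new WindowedSessionKey(userId, start);
    }

    // Value equality, so events in the same window and for the same user
    // compare as the same key.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof WindowedSessionKey)) return false;
        WindowedSessionKey that = (WindowedSessionKey) other;
        return windowStartMillis == that.windowStartMillis
                && userId.equals(that.userId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(userId, windowStartMillis);
    }
}
```

For example, WindowedSessionKey.of("u1", eventTime, Duration.ofMinutes(10)) would be the value returned by the function passed to groupByKey, so that mapGroupsWithState then keeps one state per user and window. Wiring this into Spark (a key encoder, the state update function, timeouts driven by the watermark) is left out here, and sliding windows would need one key per overlapping window, which this sketch does not handle.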