Where does Apache Spark's reduceByWindow function run?
I'm trying to learn Apache Spark, and I can't figure out from the documentation how windows work.
I have two worker nodes, and I am using the Kafka Spark Utils to create a DStream from a topic.
On this DStream, I apply the functions map and reduceByWindow.
I can't figure out whether reduceByWindow is executed on each worker or in the driver.
I have searched on Google without finding an answer.
Can someone explain this to me?
Both receiving and processing of data take place on the worker nodes. The driver creates receivers (on the worker nodes) that are responsible for collecting data, and it periodically launches jobs to process the collected data. Everything else is pretty much standard RDDs and normal Spark jobs.
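To make that division of work a bit more concrete, here is a plain-Python model of a windowed reduce. It does not use Spark at all and is not Spark's implementation; it is only a sketch of the idea. The names reduce_partition, reduce_batch and reduce_by_window are hypothetical, as is the layout of two partitions per batch, and the sketch assumes an associative reduce function, non-empty batches, and a window that slides by one batch.

```python
from functools import reduce


def reduce_partition(partition, func):
    """Models the work on one partition of a batch (hypothetical helper)."""
    return reduce(func, partition)


def reduce_batch(partitions, func):
    """Reduce one micro-batch: per-partition partial results, then combine them."""
    partials = [reduce_partition(p, func) for p in partitions if p]
    return reduce(func, partials)


def reduce_by_window(batches, func, window_len):
    """Reduce every batch, then reduce over a sliding window of batch results.

    Assumption: the window slides by one batch and covers up to
    `window_len` of the most recent batches.
    """
    per_batch = [reduce_batch(b, func) for b in batches]
    results = []
    for i in range(len(per_batch)):
        window = per_batch[max(0, i - window_len + 1): i + 1]
        results.append(reduce(func, window))
    return results


# Three micro-batches, each split into two partitions (made-up data).
batches = [
    [[1, 2], [3]],
    [[4], [5, 6]],
    [[7, 8], [9]],
]

window_sums = reduce_by_window(batches, lambda a, b: a + b, window_len=2)
print(window_sums)  # [6, 21, 39]
```

In this model the per-batch sums are 6, 15 and 24, and each window result combines at most the two most recent of them. Per the answer above, in a real Spark Streaming application this kind of processing runs as tasks on the worker nodes, while the driver is the part that periodically launches the jobs.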