How to auto-update %spark.sql results in Zeppelin for a Structured Streaming query
I am running Structured Streaming (Spark 2.1.0 with Zeppelin 0.7) on data coming from Kafka, and I am trying to visualize the streaming result using spark.sql,
as shown below:
%spark2
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{expr, sum}

val spark = SparkSession
  .builder()
  .appName("Spark structured streaming Kafka example")
  .master("yarn")
  .getOrCreate()
import spark.implicits._  // encoders needed for .as[String]

// Read raw records from the Kafka topic "st"
val inputstream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "n11.hdp.com:6667,n12.hdp.com:6667,n13.hdp.com:6667,n10.hdp.com:6667,n9.hdp.com:6667")
  .option("subscribe", "st")
  .load()

// Parse the comma-separated value and sum upload + download per pre_post_paid
val stream = inputstream.selectExpr("CAST(value AS STRING)").as[String].select(
    expr("(split(value, ','))[0]").cast("string").as("pre_post_paid"),
    expr("(split(value, ','))[1]").cast("double").as("DataUpload"),
    expr("(split(value, ','))[2]").cast("double").as("DataDownload"))
  .filter("DataUpload is not null and DataDownload is not null")
  .groupBy("pre_post_paid")
  .agg((sum("DataUpload") + sum("DataDownload")).as("size"))

// Write the full aggregate to an in-memory table named "test"
val query = stream.writeStream
  .format("memory")
  .outputMode("complete")
  .queryName("test")
  .start()
After starting the query, I select from the in-memory table "test" as below:
%sql
select *
from test
The result only updates when I run the paragraph manually. My question is how to make it update as new data is processed (streaming visualization), as in the following example:
Analysis Without Compromise: Using Structured Streaming in Apache Spark
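One workaround that might be worth trying, sketched here under assumptions rather than as a confirmed fix, is to poll the in-memory table "test" from a %spark2 paragraph at a fixed interval. The helper name pollTable, the five-second interval, and the round count are hypothetical; spark is assumed to be the session created above. This only prints text snapshots and does not refresh a %sql chart.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper: re-run a query against the memory sink table at a fixed interval.
def pollTable(spark: SparkSession, table: String, rounds: Int, intervalMs: Long): Unit = {
  for (_ <- 1 to rounds) {
    // Each call reads whatever the streaming query has written to the table so far.
    spark.sql(s"select * from $table").show()
    Thread.sleep(intervalMs)
  }
}

// Assumes the streaming query named "test" has already been started.
pollTable(spark, "test", rounds = 10, intervalMs = 5000L)
```

The loop blocks the paragraph while it runs, so the round count is kept small here.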