Why does from_json fail with "not found: value from_json"?
I am reading a Kafka topic with Spark 2.1.1 (Kafka 0.10+); the payload is a JSON string. I would like to parse each line against a schema and then move on to my business logic.
Everyone seems to say that I should be using from_json
to parse the JSON strings, but it doesn't compile in my project. The error is:
not found: value from_json
.select(from_json($"json", txnSchema) as "data")
When I try the same lines in a spark-shell session, everything works just fine:
val df = stream
.select($"value" cast "string" as "json")
.select(from_json($"json", txnSchema) as "data")
.select("data.*")
Any idea what I might be doing wrong, so that this works in the shell but fails to compile in the IDE?
Here's the code:
import org.apache.spark.sql._

object Kafka10Cons3 extends App {
  val spark = SparkSession
    .builder
    .appName(Util.getProperty("AppName"))
    .master(Util.getProperty("spark.master"))
    .getOrCreate

  val stream = spark
    .readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", Util.getProperty("kafka10.broker"))
    .option("subscribe", src_topic)
    .load

  val txnSchema = Util.getTxnStructure
  val df = stream
    .select($"value" cast "string" as "json")
    .select(from_json($"json", txnSchema) as "data")
    .select("data.*")
}
You probably just didn't add the appropriate import: import org.apache.spark.sql.functions._
You imported spark.implicits._ and org.apache.spark.sql._, but neither of those brings the standalone functions from the org.apache.spark.sql.functions object (where from_json is defined) into scope.
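As a minimal sketch of the fix (assuming a local Spark session and a toy two-field schema standing in for the question's Util.getTxnStructure; the app name and field names here are illustrative), adding the functions import makes from_json resolve:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._                        // provides from_json
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

val spark = SparkSession.builder
  .appName("from_json-demo")
  .master("local[*]")
  .getOrCreate
import spark.implicits._                                       // provides the $"col" syntax

// Toy schema in place of Util.getTxnStructure
val txnSchema = new StructType()
  .add("id", IntegerType)
  .add("name", StringType)

// A static DataFrame stands in for the Kafka stream here,
// so the example runs without a broker
val df = Seq("""{"id": 1, "name": "a"}""").toDF("json")
  .select(from_json($"json", txnSchema) as "data")
  .select("data.*")

df.show()
```

The same import fixes the streaming version in the question; only the source of the "json" column differs.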
In my case I had also imported
com.wizzardo.tools.json
, which exposes a from_json function of its own. The compiler was picking that one up (since it was imported first), and it is not compatible with my version of Spark.
Make sure you are not importing from_json
from some other JSON library, as that library may not be compatible with the Spark version you are using.
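If you cannot drop the conflicting import, one way out (a sketch, not the only option) is to alias Spark's functions object and call from_json fully qualified; the session setup and one-field schema below are illustrative stand-ins for the question's code:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.{functions => F}   // alias avoids clashing with any other from_json in scope
import org.apache.spark.sql.types.{StringType, StructType}

val spark = SparkSession.builder
  .appName("qualified-from_json")
  .master("local[*]")
  .getOrCreate
import spark.implicits._

// Illustrative one-field schema
val schema = new StructType().add("name", StringType)

val parsed = Seq("""{"name": "txn-1"}""").toDF("json")
  .select(F.from_json($"json", schema) as "data")  // fully qualified call, unambiguous
  .select("data.*")
```

The qualified call F.from_json compiles even when another library's from_json is imported into the same file.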