How do I set spark.akka.frameSize in spark shell?

For a particular spark-shell session, I am trying

spark-shell -Dspark.akka.frameSize=10000 --executor-memory 4g


Inside the shell, I get this:

System.getProperty("spark.executor.memory")
res0: String = 4g
System.getProperty("spark.akka.frameSize")
res1: String = null


Maybe that check is the wrong way to look the property up, but I do get a frameSize error when I call take() on my dataset.

org.apache.spark.SparkException: Job aborted due to stage failure: Serialized task 6:0 was 12518780 bytes which exceeds spark.akka.frameSize (10485760 bytes). Consider using broadcast variables for large values.


That shows the default frameSize of 10 MB, so my setting did not take effect. Maybe my syntax is wrong. Please help. Thanks!



2 answers


This is described in the Spark configuration guide under "Dynamically Loading Spark Properties":

The spark-shell and spark-submit tools support two ways to load configurations dynamically. The first is command-line options, such as the --master flag shown above. spark-submit can accept any Spark property using the --conf flag, but it uses special flags for properties that play a part in launching the Spark application.



For example:

./bin/spark-submit --name "My app" --master local[4] --conf spark.akka.frameSize=100 --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" myApp.jar 
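
For the spark-shell session in the question, the same flag should work, assuming Spark 1.1.0 or later, where spark-shell passes its options through to spark-submit:

spark-shell --executor-memory 4g --conf spark.akka.frameSize=100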




This syntax works in the Spark shell:

spark-shell  --executor-memory 4g --driver-java-options "-Dspark.akka.frameSize=100"
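
To confirm the setting actually took effect, you can read it back from the shell's SparkContext (a quick check, assuming a Spark 1.x shell where sc.getConf is available):

sc.getConf.getOption("spark.akka.frameSize")
// expected to return Some(100) if the flag was picked up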




This was terribly unclear in the Spark documentation; that part of the docs clearly still needs work.

This was on Spark 1.0.1. For 1.1.0+, Josh's --conf answer above looks like the way to go.
