Using multiple executors and workers in a Spark job

I am running Spark in standalone mode with the configuration below:

export SPARK_WORKER_INSTANCES=4
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=4g

      

With this, I can see 4 workers on the Spark master UI (port 8080).

Now the thing is, the number of executors shown on my application UI (port 4040) is just one. How can I increase this to, say, 2 per worker node?

Also, when I run a small piece of code in Spark, it uses only one executor. Do I need to make any configuration change to ensure that multiple executors across the workers are used?

Any help is appreciated.


2 answers


Set spark.master to local[k], where k is the number of threads you want to use. You are better off passing these parameters on the spark-submit command line instead of setting them with export.
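As a sketch of what that might look like (the main class name and jar are placeholders, not from the question):

    # k = 4 worker threads inside a single local JVM
    spark-submit \
      --master "local[4]" \
      --driver-memory 4g \
      --class com.example.MyApp \
      my-app.jar

Note that local[k] runs everything in one JVM on one machine. To actually use the standalone workers started with the exports above, you would instead pass the cluster's master URL on the same command line, e.g. --master spark://<master-host>:7077.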





Parallel processing is based on the number of RDD partitions. If your RDD has multiple partitions, it will be processed in parallel.

Make a small change in your code (call repartition on the RDD) and it should work.
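For instance, a minimal spark-shell sketch (the input path and the partition count of 8 are only illustrative):

    // check how many partitions the RDD currently has
    val rdd = sc.textFile("hdfs:///data/input.txt")   // hypothetical input path
    println(rdd.partitions.length)

    // spread the data across 8 partitions so tasks can run in parallel on several executors
    val repartitioned = rdd.repartition(8)
    repartitioned.map(line => (line, 1)).reduceByKey(_ + _).count()   // action that triggers the job

Each partition becomes one task, so the number of partitions puts an upper bound on how much of the job can run in parallel at once.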







