Spark on YARN uses fewer vcores than requested
I am using Spark on a YARN (HDP 2.4) cluster with the following setup:
- 1 master node
  - 64 GB RAM (50 GB usable)
  - 24 cores (19 cores usable)
- 5 slave nodes
  - 64 GB RAM each (50 GB usable)
  - 24 cores each (19 cores usable)
- YARN settings (see the yarn-site.xml sketch below)
  - memory for all containers (per host): 50 GB
  - minimum container size = 2 GB
  - maximum container size = 50 GB
  - vcores = 19
  - minimum # vcores per container = 1
  - maximum # vcores per container = 19
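For reference, these values would typically correspond to yarn-site.xml properties along the following lines. This is only a sketch reconstructed from the numbers above, assuming the standard YARN property names, not a copy of my actual config:

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>51200</value>   <!-- 50 GB of container memory per host -->
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>2048</value>    <!-- minimum container size: 2 GB -->
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>51200</value>   <!-- maximum container size: 50 GB -->
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>19</value>      <!-- vcores per host -->
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-vcores</name>
      <value>1</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-vcores</name>
      <value>19</value>
    </property>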
When I run my Spark application with

    spark-submit --num-executors 30 --executor-cores 3 --executor-memory 7g \
        --driver-cores 1 --driver-memory 1800m ...

YARN creates 31 containers (one for each executor process plus one for the driver process) with the following resources:
- Correct: one driver container with 1 core and ~1800 MB RAM
- Correct: 30 executor containers with ~7 GB RAM each
- BUT: according to the YARN ResourceManager UI, each executor container gets only 1 vcore instead of 3 (the UI shows 31 of 95 vcores in use, not the expected 91 = 30 * 3 + 1), see screenshot below
My question: why does the spark-submit parameter --executor-cores 3 appear to have no effect?
1 answer
OK, this seems to be the same problem as the one discussed here: yarn does not honor yarn.nodemanager.resource.cpu-vcores. The solution described there also worked for me.
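In short, the fix from that question is to make the CapacityScheduler account for CPU as well as memory by switching its resource calculator. A sketch of the relevant capacity-scheduler.xml property, assuming the default CapacityScheduler is in use (as on HDP):

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>

With the default DefaultResourceCalculator, only memory is considered when sizing containers, so every container is reported with 1 vcore regardless of --executor-cores; after changing the calculator and restarting YARN, the ResourceManager UI should report the requested vcores.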