Get the number of available executors

I am spinning up an EMR 5.4.0 cluster with Spark installed. I have a job whose performance degrades badly if it gets scheduled onto executors that are not available (for example, there are about 16 executors on a cluster with 2 m3.xlarge core nodes).

Is there a way to expose this number to my application? I can detect the hosts by doing this:

    sc.range(1, 100, 1, 100).pipe("hostname").distinct().count()

but I am hoping there is a better way to get a sense of the size of the cluster that Spark is running on.
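For what it's worth, here is a minimal sketch of one possible alternative, assuming the Spark 2.x that ships with EMR 5.4.0: `SparkContext.getExecutorMemoryStatus` returns a map keyed by executor `"host:port"` strings, so its size gives a registration count (the map includes an entry for the driver, hence the `- 1`):

    // Sketch: count executors registered with the driver at call time.
    // Assumes spark-shell, where `sc` is the active SparkContext.
    val executorCount = sc.getExecutorMemoryStatus.size - 1

    // Distinct hosts, comparable to the pipe("hostname") trick above;
    // keys look like "host:port", so strip the port before deduplicating.
    val hostCount = sc.getExecutorMemoryStatus.keys
      .map(_.split(":")(0))
      .toSet
      .size

One caveat: this reflects the executors registered at the moment of the call, so if dynamic allocation is enabled (EMR turns it on by default) the count can change over the lifetime of the application.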
