Adding New Spark Workers to AWS EC2 - Access Error
I have an existing Spark cluster (stopped and restarted periodically) that was launched with the spark-ec2
script. I am trying to add a new slave by following these steps (the corresponding spark-ec2 commands are sketched after the list):
- Stop cluster
- On the AWS console, use "Launch more like this" on one of the slave instances
- Start the cluster
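
For context, the stop/start steps are done with the spark-ec2 script itself. A rough sketch, where the cluster name, key pair, identity file and region are placeholders for your own values:

    # stop the running cluster before cloning a slave in the AWS console
    ./spark-ec2 -k my-keypair -i ~/.ssh/my-keypair.pem --region=us-east-1 stop my-cluster

    # ... clone a slave instance with "Launch more like this" in the console ...

    # bring the cluster, including the new slave, back up
    ./spark-ec2 -k my-keypair -i ~/.ssh/my-keypair.pem --region=us-east-1 start my-cluster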
Although the new instance is added to the same security group, and I can successfully SSH into it with the same private key, the spark-ec2 ... start
call cannot access this machine for some reason:
Running setup-slave on all cluster nodes to mount file systems, etc.
[1] 00:59:59 [FAILURE] xxx.compute.amazonaws.com
Exit with error code 255 Stderr: Permission denied (publickey).
Naturally, plenty of other errors followed when trying to deploy Spark to this instance.
The reason is that the Spark master machine has no rsync access to this new slave, even though port 22 is open ...
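
One way to confirm this (a sketch, with placeholder hostnames; it assumes the spark-ec2 default root login and the cluster-internal key in the master's ~/.ssh/id_rsa) is to log into the master and try to reach the new slave the same way spark-ec2 does:

    # from your workstation: log into the master with your EC2 key pair
    ssh -i ~/.ssh/my-keypair.pem root@ec2-xx-master.compute.amazonaws.com

    # on the master: try to reach the new slave with the cluster's internal key
    ssh -i ~/.ssh/id_rsa root@ec2-xx-new-slave.compute.amazonaws.com 'echo ok'
    # a "Permission denied (publickey)" here reproduces the spark-ec2 failure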
The problem was that the SSH key generated on the Spark master was not shared with this new slave. The spark-ec2 script's start
command skips this step. The solution is to use the launch
command with the --resume
parameter. Then the SSH key is carried over to the new slave and everything goes smoothly.
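
A sketch of that invocation (cluster name, key pair, identity file and region are placeholders; --resume re-runs the setup on the existing cluster, which is the step that distributes the master's SSH key to the slaves):

    ./spark-ec2 -k my-keypair -i ~/.ssh/my-keypair.pem --region=us-east-1 --resume launch my-cluster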
Another solution is to add the master's public key (~/.ssh/id_rsa.pub) to the newly added slave's ~/.ssh/authorized_keys. (I got this tip on the Spark mailing list.)
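
A minimal sketch of that manual fix, run from your workstation (hostnames are placeholders; it assumes the spark-ec2 default root login):

    # fetch the master's public key ...
    ssh -i ~/.ssh/my-keypair.pem root@ec2-xx-master.compute.amazonaws.com \
        'cat ~/.ssh/id_rsa.pub' > master_id_rsa.pub

    # ... and append it to the new slave's authorized_keys
    cat master_id_rsa.pub | \
        ssh -i ~/.ssh/my-keypair.pem root@ec2-xx-new-slave.compute.amazonaws.com \
        'cat >> ~/.ssh/authorized_keys'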