Can a parallel IPython controller have local and remote ipengines?

IPython parallel docs mention:

c = Client(profile='myprofile')

      

or

c = Client('/path/to/my/ipcontroller-client.json')

      

for local ipengines (IIUC) and

c = Client('/path/to/my/ipcontroller-client.json', sshserver='me@myhub.example.com')

      

if my ipengines are on a different server.

But what do I need to do to have a single parallel IPython controller manage, say, 8 ipengines on the local node and 8 ipengines on a remote node connected via SSH?

Or is this not possible without moving to a full-blown HDFS/Hadoop setup?

My goal is to have a single client (or controller?) interface to which I can send a bunch of load-balanced computations, where I don't care where or when they run.



1 answer


The sshserver arg for Client is only for cases where the controller is not directly accessible from the client (for example, a client on a laptop and a controller behind a firewall on a remote network). The client never needs to know or care where the engines are. Also, SSH tunnels are only required when machines cannot reach each other directly. I will assume you don't need SSH tunneling, for simplicity.
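Concretely, reusing the calls from the question (a sketch; the path and login are placeholders):

from IPython import parallel

# client on the same machine or LAN as the controller: no tunnel needed
rc = parallel.Client()

# pass sshserver= only if the client cannot reach the controller directly
# (e.g. a laptop outside the controller's network):
# rc = parallel.Client('/path/to/my/ipcontroller-client.json', sshserver='me@myhub.example.com')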

The simplest case:

  • host1 is where you want to start the controller, the client, and 5 engines.
  • host2 is another compute machine on the same local network, where you want to run 8 engines.

No configuration

  • start the controller listening on all interfaces (so engines can connect from elsewhere on the LAN)

    [host1] ipcontroller --ip=*
    
          

  • (skip if using a shared filesystem) copy the connection files to host2

    [host1] rsync -av $HOME/.ipython/profile_default/security/ host2:.ipython/profile_default/security/
    
          

  • run engines on host1

    [host1] ipengine
    # or start multiple engines at once:
    [host1] ipcluster engines -n 5
    
          

  • run engines on host2

    [host2] ipengine
    # or start multiple engines at once:
    [host2] ipcluster engines -n 8
    
          

  • open the client on host1:

    [host1] ipython
    In[1]: from IPython import parallel
    In[2]: rc = parallel.Client()
    
          

You should now have access to the engines on both machines: 5 on host1 and 8 on host2.
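Since the goal is load-balanced work where you don't care which engine runs what, here is a minimal sketch of that last step (load_balanced_view and map_async are standard IPython.parallel calls; they are not part of the original answer):

In[3]: lview = rc.load_balanced_view()                    # one view over all 13 engines
In[4]: ar = lview.map_async(lambda x: x ** 2, range(32))  # tasks go to whichever engine is free
In[5]: ar.get()[:5]
Out[5]: [0, 1, 4, 9, 16]

The scheduler decides where each task runs, so it does not matter whether an engine lives on host1 or host2.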

With configuration

You can also express all of this with configuration. To initialize the configuration files:

[host1] ipython profile create --parallel

      

Tell ipcontroller to listen on all interfaces in ipcontroller_config.py:



c.HubFactory.ip = '*'

      

Tell ipcluster to start the engines via SSH on both host1 and host2 in ipcluster_config.py:

c.IPClusterEngines.engine_launcher_class = 'SSH'
c.SSHEngineSetLauncher.engines = {
    'host1': 5,
    'host2': 8,
}

      

Start everything with ipcluster:

[host1] ipcluster start

      

The SSH launcher will take care of copying the connection files to the remote machines.
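A quick way to confirm that the launcher brought everything up (same client code as in the no-configuration case):

[host1] ipython
In[1]: from IPython import parallel
In[2]: rc = parallel.Client()
In[3]: len(rc.ids)    # should be 13: 5 engines on host1 + 8 on host2

rc.ids lists the ids of all engines currently registered with the controller.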

If you need SSH tunneling, you can specify

c.IPControllerApp.ssh_server = u'host1'

      

in ipcontroller_config.py. IPython should be able to tell whether engines or clients are local to host1 and skip tunneling when it is not needed. If it cannot figure that out, you can either specify the SSH server manually where it is needed and leave it out of the config, or put it in the config and manually disable it where it is not needed, whichever is more convenient for you.
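For instance, the first option (specify the SSH server manually on the client and keep it out of the config) is just the call from the question, pointed at host1 (a sketch; the path and login are placeholders):

rc = parallel.Client('/path/to/my/ipcontroller-client.json', sshserver='me@host1')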
