R Parallel - connecting to remote cores

2019-03-20 19:16发布

Working in R 2.14.1, on Windows 7

Using the package parallel in R, I'm trying to take advantage of cores outside of my local machine available on my network, where all remote hosts I am connecting to are identical Windows machines.

The basic form of the commands are as such to make the connection.

library(parallel)
#assume 8 cores per machine
cl<-makePSOCKcluster(c(rep("localhost", 8), rep("otherhost", 8)))

Of course, trying to debug these things can be pretty tricky, but here is where I'm at with it.

If I specify the manual = TRUE flag as below

cl<-makePSOCKcluster(c(rep("localhost", 8), rep("otherhost", 8)), manual=TRUE)

there are no problems connecting to the remote host, and running a parallel process. The computers have identical setups to the one that I am working on. Yet, when this manual flag is not set, the connection command hangs.

This seems to indicate to me that since the manual flag bypasses ssh to make the connection to the host, that ssh is the problem when manual=FALSE.

It is not guaranteed at the moment that the remote computers have ssh on them. The question is, given that I have all the pertinent windows login information for my remote hosts, and that I cannot change the settings on the remote computers, how would I connect to cores on remote machines with the package parallel in R without specifying manual = true?

Alternatively, if ssh must be installed for this to happen, let's assume all computers have ssh on them. How would I connect to cores on the remote machines without circumventing ssh?

If you need any more information please let me know, I appreciate the time.

UPDATE 1

8-26-14

Thanks to Steve Weston for his insights. I will provide an update with the exact tools and setup I use to get my system working when it's up and running.

Feel free to comment or post if you have anything else to add as to what may be the best route to go in remote connecting to a windows machine from a windows machine via makePSOCKcluster, where the manual flag is set to FALSE.

2条回答
干净又极端
2楼-- · 2019-03-20 20:02

When creating a PSOCK cluster with manual=FALSE, the only way to start a worker on a remote machine is with "ssh", "rsh", or something command-line compatible, such as "plink" from PuTTY. The reason is that makePSOCKcluster starts the remote workers using the "system" function to execute commands of the form:

ssh -l user otherhost '/usr/lib/R/bin/Rscript' -e 'parallel:::.slaveRSOCK()' MASTER=myhost PORT=10187 OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE

You can confirm this by looking at the source code for the newPSOCKnode function in the file snowSOCK.R from the parallel package.

For this to work, the ssh-compatible command must be available on the local machine and a corresponding ssh daemon must be running on each of the remote machines, otherwise makePSOCKcluster will simply hang. I've found that installing a good, working ssh daemon is the difficult part on Windows.

Unfortunately, manual=TRUE is generally the easiest way to create a PSOCK cluster on multiple Windows machines.

查看更多
【Aperson】
3楼-- · 2019-03-20 20:20

Helle everyone, I had the same problem and I managed to solve it. It is June 2018 when I'm writing this answer, my OS is windows 10 and the R version is 3.2.2. It is surprising to see this problem still exists after 4 years. I hope it can be fixed in the following release.

Before you move on, please make sure you can access the server in cmd using ssh. I didn't put any password in my code because I have the private key, you don't need to do that and you will see the reason later.

Fixing The problem

  1. File directory

Since the function makePSOCKcluster works when manually start the workers, my first trying is to let manual=TRUE, and see what's the output. Here is my result:

machineAddresses <-list(list(host='192.168.1.220',user='jeff'))
cl <- makePSOCKcluster(spec,manual = F)
> Manually start worker on 192.168.1.220 with
     "C:/PROGRA~1/R/R-32~1.2/bin/x64/Rscript" -e 
"parallel:::.slaveRSOCK()" MASTER=DESKTOP-U5JA32O PORT=11756 
OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE 

Ok, Here is the first problem. The Rscript location is incorrect(The location of Rscript in the server). Generally, it locates in C:\Program Files. In my server is C:\Program Files\R\R-3.2.2\bin. So we need to correct them by adding more option to tell this stupid code where the Rscript is:

machineAddresses <-list(list(host='192.168.1.220',
user='jeff',rscript="C:/Program Files/R/R-3.3.2/bin/Rscript"))
  1. CMD problem

Once you fix the directory problem, you will find that the code still hangs forever. Then we need to check if we can manually access the server in R, my code is:

system("ssh jeff@192.168.1.220")
> GetConsoleMode on STD_INPUT_HANDLE failed with 6

I honestly don't know what does this error mean, but we just need to fix that. Inspired by @Steve Weston, I decide to use PuTTY, so I install it, and change my code to:

machineAddresses <-list(list(host='192.168.1.220',user='jeff',rscript="C:/Program Files/R/R-3.3.2/bin/Rscript",rshcmd="plink -pw qwer"))

The option -pw means the password. Because I'm a newbie to PuTTY, I don't know how to let the private key automatically work in PuTTY. Therefore, I use the easiest way to deal with that: put your password! The above code is equivalent to the following in cmd:

plink -pw qwer jeff@192.168.1.220 Rscript -e parallel:::.slaveRSOCK() MASTER=DESKTOP-U5JA32O PORT=11063 OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE

And this is exactly what we will do if we manually create the workers. For those who are new like me, you need to add the PuTTY directory in PATH in your environmental variables to run plink. Here are my final codes:

machineAddresses <-list(list(host='192.168.1.220',user='jeff',rscript="C:/Program Files/R/R-3.3.2/bin/Rscript",rshcmd="plink -pw qwer"))
cl <- makePSOCKcluster(machineAddresses,manual = F)

I run it with no problem at all. In summary, the function makePSOCKcluster makes two mistakes:

  1. Assuming a wrong R directory in the server(At least it should assume the same directory as my local computer, but it didn't! I don't know where that strange directory comes from)

  2. Using ssh command to start the connection, which does not work in R. It works well in cmd, but not in R. I don't know the reason.

If you are still not able to use makePSOCKcluster, here is one trick: Try to connect to the server in R using system function first. It can give you some error code, that may instruct you where the problem is. Here is my debugging code:

system("plink -pw qwer jeff@192.168.1.220 Rscript -e parallel:::.slaveRSOCK() MASTER=DESKTOP-U5JA32O PORT=11063 OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE")
查看更多
登录 后发表回答