Errors in makeCluster (multicore): cannot open connection

I have the following question.

Why does everything work fine when I submit a job to a standard node (max. 56 cores), but the same job and code fails with an error on a large_memory node (max. 128 cores)?

Parallelization code in R:

> no_cores <- detectCores() - 1

> cl <- makeCluster(no_cores, outfile=paste0('./info_parallel.log'))

      

Error

Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b",  :
  cannot open the connection

Calls: <Anonymous> ... doTryCatch -> recvData -> makeSOCKmaster -> 
  socketConnection

In addition: Warning message:

In socketConnection(master, port = port, blocking = TRUE, open = "a+b",  :
  localhost:11232 cannot be opened
Execution halted

Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Execution halted

Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode ->  unserialize
Execution halted

      


As I said, the R code works fine on standard nodes, so I guess this is an issue with the large_memory node. What could it be?



1 answer


Finally, I figured it out.

The error is caused by R's default limit on connections, which is 128 (three of those are reserved for stdin, stdout and stderr). makeCluster opens one socket connection per worker, so in this code the number of connections grows with the number of cores requested.
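The connection budget described above can be inspected from inside a session. The following is only a minimal sketch: the helper name free_connection_slots is hypothetical, and the figure of 128 is the default limit mentioned in this answer, hard-coded here as an assumption rather than read from R itself.

```r
# Assumption: the default connection limit of 128 discussed in this answer.
r_connection_limit <- 128L

# Hypothetical helper: estimate how many connection slots are still free.
# showConnections(all = TRUE) returns one row per allocated connection,
# including the three standard ones (stdin, stdout, stderr).
free_connection_slots <- function(limit = r_connection_limit) {
  in_use <- nrow(showConnections(all = TRUE))
  limit - in_use
}

free_connection_slots()
```

If the number of workers requested from makeCluster is larger than this value, the master process would be expected to run out of connections while the workers connect.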

In the code, the error occurred on the line "cl <- makeCluster(...)":

no_cores <- detectCores() - 1

cl <- makeCluster(no_cores, outfile = paste0('./info_parallel.log'))

Here the detectCores() function returns the number of cores on the node.

Standard cluster nodes have fewer than 128 cores per node, so the R code works well there. The large_memory nodes have 128 cores per node in my case, so detectCores() - 1 requests 127 workers, which runs into R's connection limit. That is why the error is displayed as:



cannot open connection

I tried setting the number of cores to 120 to run jobs on the large_memory node (max cores = 128). No errors; the code works well.

cl <- makeCluster(120, outfile = paste0('./info_parallel.log'))
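Rather than hard-coding 120, the same idea can be written so that one script runs on both node types. This is a sketch only: safe_worker_count is a hypothetical helper, and the ceiling of 120 is simply the value that worked in my case, not a limit documented by R.

```r
library(parallel)

# Hypothetical helper: use detectCores() - 1 workers, but never more than
# max_workers (120 is the value that worked above; treat it as an assumption).
safe_worker_count <- function(detected = detectCores(), max_workers = 120L) {
  # detectCores() can return NA when the core count is unknown
  if (is.na(detected)) detected <- 2L
  # always keep at least one worker
  max(1L, min(detected - 1L, max_workers))
}

no_cores <- safe_worker_count()
```

The result can then be passed on as before, for example makeCluster(no_cores, outfile = './info_parallel.log'). On a 56-core standard node this gives 55 workers, and on a 128-core large_memory node it gives 120.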

Thanks!
