Errors in makeCluster (multicore): cannot open connection
I have the following question.
Why does everything work fine when I submit a job to a standard node (max. 56 cores), but fail with an error when I submit the same job and code to a large_memory node (max. 128 cores)?
Parallelization code in R:
library(parallel)
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores, outfile=paste0('./info_parallel.log'))
Error
Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b", :
cannot open the connection
Calls: <Anonymous> ... doTryCatch -> recvData -> makeSOCKmaster ->
socketConnection
In addition: Warning message:
In socketConnection(master, port = port, blocking = TRUE, open = "a+b", :
localhost:11232 cannot be opened
Execution halted
Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Execution halted
Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Execution halted
As I said, the R code works fine on standard nodes, so I guess this is an issue with the large_memory node. What could it be?
The error was caused by R's default limit on connections, which is 128 (three of those are reserved for stdin, stdout and stderr). makeCluster opens one socket connection per worker, so here the number of "connections" is effectively the number of cores requested in the code.
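To see that each worker takes up one connection, here is a minimal sketch, assuming a default local socket cluster. It counts the open user connections before and after starting two workers; the variable names `before` and `after` are only for illustration.

```r
library(parallel)

before <- nrow(showConnections())   # user connections open right now
cl <- makeCluster(2)                # start two local socket workers
after <- nrow(showConnections())    # count again with the cluster running
print(after - before)               # connections added by the cluster
stopCluster(cl)                     # always release the workers
```

The difference should match the number of workers, which is why a request for 127 workers runs into the limit.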
In the code, the error occurred on the "cl <- makeCluster(...)" line:
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores, outfile = paste0('./info_parallel.log'))
Here the detectCores() function returns the number of cores on the node.
Standard cluster nodes have fewer than 128 cores per node, so the R code works on them. The large_memory node has 128 cores in my case, so detectCores() - 1 asks for 127 workers, which runs into the connection limit. That is why the error is displayed as:
cannot open connection
I set the number of workers to 120 to run jobs on the large_memory node (max cores = 128). No errors; the code works.
cl <- makeCluster(120, outfile = paste0('./info_parallel.log'))
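Rather than hard-coding 120, the cap could be computed so the same script runs on both node types. This is only a sketch: `safe_worker_count` and its `headroom` parameter are hypothetical names, and it assumes the default limit of 128 connections with 3 reserved.

```r
library(parallel)

# Hypothetical helper: keep the worker count below R's connection limit.
# Assumes the default of 128 connections, 3 of them reserved for
# stdin, stdout and stderr. `headroom` leaves a few connections free
# for anything else the script opens.
safe_worker_count <- function(detected, max_connections = 128L,
                              reserved = 3L, headroom = 5L) {
  if (is.na(detected)) detected <- 1L        # detectCores() may return NA
  limit <- max_connections - reserved - headroom
  max(1L, min(detected - 1L, limit))         # at least 1, at most `limit`
}

no_cores <- safe_worker_count(detectCores())
cl <- makeCluster(no_cores, outfile = "./info_parallel.log")
# ... parallel work goes here ...
stopCluster(cl)
```

With these assumed defaults, a 128-core node gets 120 workers and a 56-core node gets 55, so the standard-node behaviour is unchanged.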
Thanks!