H2O model will not fit into driver node memory error

I ran a GBM model via R code in H2O and got the error below. The same code had worked fine for a couple of weeks. Is this an error on the H2O side, or a configuration issue on my system?

water.exceptions.H2OModelBuilderIllegalArgumentException: Invalid argument for GBM model: gbm-2017-04-18-15-29-53. Details: ERRR in field: _ntrees: the tree model will not fit into the driver node memory (23.2 MB per tree x 1000> 3.32 GB) - try decreasing ntrees and / or max_depth or increasing min_rows!



2 answers


The fix that worked for me was to set both the minimum and maximum memory sizes when initializing H2O.

This fails without specifying the minimum or maximum memory size:

localH2O <- h2o.init(ip='localhost', nthreads=-1)

INFO: Java heap totalMemory: 1.92 GB
INFO: Java heap maxMemory: 26.67 GB
INFO: Java version: Java 1.8.0_121 (from Oracle Corporation)
INFO: JVM launch parameters: [-ea]
INFO: OS version: Linux 3.10.0-327.el7.x86_64 (amd64)
INFO: Machine physical memory: 1.476 TB

This fails when only specifying the maximum memory size:



localH2O <- h2o.init(ip='localhost', nthreads=-1,
                     max_mem_size='200G')

INFO: Java availableProcessors: 64
INFO: Java heap totalMemory: 1.92 GB
INFO: Java heap maxMemory: 177.78 GB
INFO: Java version: Java 1.8.0_121 (from Oracle Corporation)
INFO: JVM launch parameters: [-Xmx200G, -ea]
INFO: OS version: Linux 3.10.0-327.el7.x86_64 (amd64)
INFO: Machine physical memory: 1.476 TB

This succeeds when both the minimum and maximum memory sizes are specified:

localH2O <- h2o.init(ip='localhost', nthreads=-1,
                     min_mem_size='100G', max_mem_size='200G')

INFO: Java availableProcessors: 64
INFO: Java heap totalMemory: 95.83 GB
INFO: Java heap maxMemory: 177.78 GB
INFO: Java version: Java 1.8.0_121 (from Oracle Corporation)
INFO: JVM launch parameters: [-Xms100G, -Xmx200G, -ea]
INFO: OS version: Linux 3.10.0-327.el7.x86_64 (amd64)
INFO: Machine physical memory: 1.476 TB



The 3.32 GB figure in your post is calculated by H2O from activity in the job itself, so it is hard to reproduce without knowing exactly what your job did. Still, 40 GB per node is very different from 3.32 GB, so it is worth verifying that each H2O node actually received the memory you asked for.
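As a rough sanity check on the numbers in the error message itself: 23.2 MB per tree and ntrees = 1000 are taken from the message above, and the arithmetic below is only a back-of-envelope estimate, not H2O's exact internal formula:

```shell
# Back-of-envelope: estimated total tree storage vs. available driver heap.
# 23.2 MB/tree, 1000 trees, and the 3.32 GB heap come from the error
# message; H2O's real estimate may differ in detail.
awk 'BEGIN {
  per_tree_mb = 23.2; ntrees = 1000; heap_gb = 3.32
  est_gb = per_tree_mb * ntrees / 1024
  printf "estimated model size: %.1f GB (available heap: %.2f GB)\n", est_gb, heap_gb
}'
```

So the requested forest is estimated at roughly 22.7 GB against a 3.32 GB heap, which is why the builder rejects the arguments up front rather than failing mid-training.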

After the H2O Hadoop job completes, you can review the YARN logs to confirm that the container is indeed receiving the expected amount of memory.

Use the following command (which will be printed for you by the h2odriver output after execution completes):

yarn logs -applicationId application_nnn_nnn

For me, the (slightly cropped) output for one of the H2O node containers looks like this:



Container: container_e20_1487032509333_2085_01_000004 on mr-0xd4.0xdata.loc_45454
===================================================================================
LogType:stderr
Log Upload Time:Sat Apr 22 07:58:13 -0700 2017
...

LogType:stdout
Log Upload Time:Sat Apr 22 07:58:13 -0700 2017
LogLength:7517
Log Contents:
POST 0: Entered run
POST 11: After setEmbeddedH2OConfig
04-22 07:57:56.979 172.16.2.184:54323    11976  main      INFO: ----- H2O started  -----
04-22 07:57:57.011 172.16.2.184:54323    11976  main      INFO: Build git branch: rel-turing
04-22 07:57:57.011 172.16.2.184:54323    11976  main      INFO: Build git hash: 34b83da423d26dfbcc0b35c72714b31e80101d49
04-22 07:57:57.011 172.16.2.184:54323    11976  main      INFO: Build git describe: jenkins-rel-turing-8
04-22 07:57:57.011 172.16.2.184:54323    11976  main      INFO: Build project version: 3.10.0.8 (latest version: 3.10.4.5)
04-22 07:57:57.011 172.16.2.184:54323    11976  main      INFO: Build age: 6 months and 11 days
04-22 07:57:57.012 172.16.2.184:54323    11976  main      INFO: Built by: 'jenkins'
04-22 07:57:57.012 172.16.2.184:54323    11976  main      INFO: Built on: '2016-10-10 13:45:37'
04-22 07:57:57.012 172.16.2.184:54323    11976  main      INFO: Java availableProcessors: 32
04-22 07:57:57.012 172.16.2.184:54323    11976  main      INFO: Java heap totalMemory: 9.86 GB
04-22 07:57:57.012 172.16.2.184:54323    11976  main      INFO: Java heap maxMemory: 9.86 GB
04-22 07:57:57.012 172.16.2.184:54323    11976  main      INFO: Java version: Java 1.7.0_67 (from Oracle Corporation)

Note that the log output of the main application container looks different, so just look for the output for any of the H2O node containers.

Find the line "Java heap maxMemory". In my case I asked for "-mapperXmx 10g" on the command line, so this looks right: 9.86 GB is close to "10g" once the JVM's small overhead is accounted for.
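One way to pull that line out, sketched here against a saved log file (the file path and contents are stand-ins; on a real cluster you would pipe `yarn logs -applicationId <id>` into the same `grep`):

```shell
# Sketch: filter a captured YARN node log for the heap line.
# /tmp/h2o_node_log.txt is a hypothetical saved copy of the log output.
cat > /tmp/h2o_node_log.txt <<'EOF'
INFO: Java heap totalMemory: 9.86 GB
INFO: Java heap maxMemory: 9.86 GB
EOF
grep 'Java heap maxMemory' /tmp/h2o_node_log.txt
```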

If this is not what you expect, you have a Hadoop configuration issue: some Hadoop settings override the amount of memory that you ask for on the command line.







