Memory leak with TensorFlow for Java

The following test code exhibits a memory leak:

import org.tensorflow.DataType;
import org.tensorflow.Graph;
import org.tensorflow.Output;
import org.tensorflow.Session;
import org.tensorflow.Tensor;

private static final float[] X = new float[]{1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0};

public void testTensorFlowMemory() {
    // create a graph and session
    try (Graph g = new Graph(); Session s = new Session(g)) {
        // create a placeholder x and a const for the dimension to do a cumulative sum along
        Output x = g.opBuilder("Placeholder", "x").setAttr("dtype", DataType.FLOAT).build().output(0);
        Output dims = g.opBuilder("Const", "dims").setAttr("dtype", DataType.INT32).setAttr("value", Tensor.create(0)).build().output(0);
        Output y = g.opBuilder("Cumsum", "y").addInput(x).addInput(dims).build().output(0);
        // loop a bunch to test memory usage
        for (int i=0; i<10000000; i++){
            // create a tensor from X
            Tensor tx = Tensor.create(X);
            // run the graph and fetch the resulting y tensor
            Tensor ty = s.runner().feed("x", tx).fetch("y").run().get(0);
            // close the tensors to release their resources
            tx.close();
            ty.close();
        }

        System.out.println("non-threaded test finished");
    }
}


Is there something obvious I am doing wrong? The main flow is simple: create a graph and a session on that graph, then create a placeholder and a constant to compute a cumulative sum over a tensor fed in as x. After running the resulting y operation, I close both the x and y tensors to free their memory resources.

Things I find so far to help:

  • This is not a Java object memory issue: the heap doesn't grow, and other JVM memory doesn't grow either, according to jvisualvm. Nor does it appear to be a JVM native memory leak, according to Java's Native Memory Tracking (commands to enable NMT for the test run are sketched below, after the mvn invocation).
  • The tensor close operations do help: if they are not there, memory grows by leaps and bounds. With them it still grows fairly quickly, though not nearly as fast as without them (a try-with-resources variant that guarantees the closes is sketched just after this list).
  • The Cumsum operator is not essential to the problem; the leak also occurs with Sum and other operators.
  • This happens on Mac OS with TF 1.1, and on CentOS 7 with TF 1.1 and 1.2_rc0.
  • Commenting out the Tensor ty lines removes the leak, so it appears to originate there.
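
For what it's worth, here is a minimal sketch of the inner loop rewritten with try-with-resources, assuming the TF 1.x Java API where Tensor implements AutoCloseable; it guarantees both close calls run even if run() throws, though it should not change the leak behavior described above:

for (int i = 0; i < 10000000; i++) {
    // create a tensor from X and run the graph; both tensors are
    // closed automatically at the end of the try block
    try (Tensor tx = Tensor.create(X);
         Tensor ty = s.runner().feed("x", tx).fetch("y").run().get(0)) {
        // nothing to do with ty here; this test only exercises memory
    }
}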

Any ideas? Thank you! Also, here's a GitHub project that demonstrates this issue, both with a threaded test (to speed up memory growth) and a non-threaded test (to show it is not due to threading). It uses Maven and can be run with a simple:

mvn test
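
As an aside, here is one way to reproduce the Native Memory Tracking check mentioned above while the test runs; this assumes the default Surefire-forked test JVM picks up the argLine property, and <pid> is a placeholder for the actual test JVM process id:

mvn test -DargLine="-XX:NativeMemoryTracking=summary"
jcmd <pid> VM.native_memory summary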



1 answer


I believe there is indeed a leak (in particular, a missing TF_DeleteStatus corresponding to an allocation in the JNI code). Thanks for the detailed instructions for reproducing it.

I would encourage you to file an issue at http://github.com/tensorflow/tensorflow/issues, and hopefully it will be fixed before the final 1.2 release.



(On a side note, you also have a leak outside of the loop, since the Tensor object created by Tensor.create(0) is never closed.)
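
For illustration, a minimal sketch of one way to plug that particular leak, assuming, as the upstream Java examples do, that the value is copied into the graph when the Const op is built, so the backing Tensor can be closed immediately afterwards:

Output dims;
// the value is copied into the Const attribute at build time, so the
// backing Tensor can be closed as soon as the op has been created
try (Tensor t = Tensor.create(0)) {
    dims = g.opBuilder("Const", "dims")
            .setAttr("dtype", DataType.INT32)
            .setAttr("value", t)
            .build().output(0);
}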

UPDATE: This has been fixed, and 1.2.0-rc1 will no longer have this problem.
