Memory leak when plotting with Seaborn in a loop in a Jupyter notebook

I am having trouble managing memory in a function called from a Jupyter notebook. The function performs a pairwise analysis of the data, and as it iterates, the loop generates plots and writes them to disk.

However, even after closing each figure, RAM usage keeps growing at roughly 40 MB/s.

Here's a simplified example that leaks at this rate:

from itertools import combinations

import matplotlib.pyplot as plt
import seaborn as sns

for x in combinations(range(m.num_parameters + 1), 2):
    # Plot one pair of parameter columns and save the figure to disk
    _plot = sns.pairplot(data=m_df, x_vars=[x[0]], y_vars=[x[1] - 1],
                         height=5)
    _plot.savefig(directory + '{}_{}'.format(x[0], x[1] - 1))

    # Try to release the figure
    plt.clf()
    plt.close()

    # I can even run the following, to no avail:
    del _plot

And for reference, my data structure looks something like this:

import pandas as pd

m_df = pd.DataFrame.from_dict(m.parameters)
m_df.columns = m.parameters.keys()

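(For anyone trying to reproduce this: `m` is my own model object, so `FakeModel` below is a made-up stand-in, assuming only the two attributes the loop actually uses.)

import numpy as np

# Hypothetical stand-in for my model object -- the real `m` comes from
# elsewhere in the module; only .parameters and .num_parameters matter here.
class FakeModel:
    def __init__(self, n_params=5, n_samples=1000):
        self.parameters = {i: np.random.rand(n_samples)
                           for i in range(n_params)}
        self.num_parameters = n_params

m = FakeModel()
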
I've done a lot of memory profiling inside Jupyter with various tools (%mprun, pympler, etc.). The key hint came when I simply commented out the plotting calls, as in the example above, and noticed that RAM no longer filled up.
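
For reference, the sort of pympler check that surfaces the growth looks roughly like this (a sketch only; `run_plotting_loop` is a placeholder for the pairplot loop above):

from pympler import muppy, summary

# Snapshot object counts before and after running the plotting loop
before = summary.summarize(muppy.get_objects())
run_plotting_loop()  # placeholder for the pairplot loop above
after = summary.summarize(muppy.get_objects())

# Show which object types grew between the two snapshots
summary.print_(summary.get_diff(before, after))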

To clarify, none of these plots are displayed in the notebook: the snippet above lives in a module that is imported, and the data is passed to it for processing.

I also found that I can free the memory by manually running garbage collection in the notebook after the procedure completes (or after a KeyboardInterrupt).
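
In other words, something like the following reclaims the leaked memory after the fact:

import gc

# Force a manual garbage-collection pass once the loop has finished
gc.collect()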

Does anyone have any idea why these plot objects don't seem to be closing properly?
