Matplotlib animation.save for animated gif is very slow
I am animating a convergence process that I simulate in an IPython 3.1 notebook. I render the result as a scatter plot in a matplotlib animation, which I write to an animated gif via ImageMagick. There are 3000 frames, each with about 5000 points.
I'm not sure exactly how matplotlib creates these animation files, but it seems to cache a batch of frames and then write them out together: when I look at CPU usage, it is dominated by python at the beginning and then by convert toward the end.
Writing out the gif is extremely slow. It takes over an hour to write a 70 MB file to an SSD on a modern MacBook Pro. convert uses the equivalent of about 90% of a single core on a machine with 4 cores (8 with hyperthreading).
It takes about 15 minutes to write the first 65 MB, and over 2 hours to write the last 5 MB.
The relevant code snippets follow; if anything else would be helpful, please let me know.
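import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation

# co (the simulation object) and mags (per-frame step magnitudes) are defined elsewhere.

def updateAnim(i, cg, scat, mags):
    # Skip frames with a zero step; otherwise advance the simulation and move the points.
    if mags[i] == 0:
        return scat,
    cg.convergeStep(mags[i])
    scat.set_offsets(cg._chrgs[::2, 0:2])
    return scat,

fig = plt.figure(figsize=(6, 10))
plt.axis('equal')
plt.xlim(-1.2, 1.2)
plt.ylim(-1, 3)
c = np.where(co._chrgs[::2, 3] > 0, 'blue', 'red')
scat = plt.scatter(co._chrgs[::2, 0], co._chrgs[::2, 1], s=4, color=c, marker='o', alpha=0.25)
ani = animation.FuncAnimation(fig, updateAnim, frames=mags.size, fargs=(co, scat, mags), blit=True)
ani.save('Files/Capacitance/SpherePlateAnimation.gif', writer='imagemagick', fps=30)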
Any idea what the bottleneck might be, or how I can speed it up? I would prefer the recording time to be short compared to the simulation time.
Version: ImageMagick 6.9.0-0 Q16 x86_64 2015-05-30 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2014 ImageMagick Studio LLC
Features: DPC Modules
Delegates (built-in): bzlib cairo djvu fftw fontconfig freetype gslib gvc jbig jng jp2 jpeg lcms lqr ltdl lzma openexr pangocairo png ps rsvg tiff webp wmf x xml zlib
ps -aef reports:
convert -size 432x720 -depth 8 -delay 3.3333333333333335 -loop 0 rgba:- Files/Capacitance/SpherePlateAnimation.gif
Update
Please read the original answer below before doing anything suggested in this update.
If you want to debug this in some depth, you can separate out the ImageMagick part to pinpoint where the problem is. To do this, I would locate your ImageMagick convert program like this:
which convert # result may be "/usr/local/bin/convert"
and then change to the containing directory, like this:
cd /usr/local/bin
Now rename your original convert program to convert.real. You can always change it back later by swapping the last two parameters below:
mv convert convert.real
Now save the following file as convert
#!/bin/bash
dd bs=128k > $HOME/plot.rgba 2> /dev/null
and make this executable by doing
chmod +x convert
Now when you run matplotlib again, it will execute the script above rather than ImageMagick, and the script will save the raw RGBA data in your home directory in a file named plot.rgba. This will tell you two things: first, whether matplotlib runs faster now that there is no ImageMagick processing, and second, whether the file is about 4GB in size, as I expect.
Once matplotlib has finished, you can use ImageMagick to process the file with a 10GB memory limit:
convert.real -limit memory 10GiB -size 432x720 -depth 8 -delay 3.33 -loop 0 $HOME/plot.rgba Files/Capacitance/SpherePlateAnimation.gif
You might also consider splitting the file into 2 (or 4) pieces using dd, processing the pieces in parallel, and joining the results to see if that helps. Ask if you would like to explore this option.
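If you do want to try the split, the point to watch is that each piece must contain a whole number of frames, otherwise convert will see misaligned pixel data. Here is a minimal Python sketch of the split step only, assuming the dump is the plot.rgba file described above with 432x720 RGBA frames at 8 bits per channel. The name split_raw_frames and the .partN.rgba file names are hypothetical, not part of matplotlib or ImageMagick:

```python
import os

# Assumption: one frame is width * height * 4 bytes (RGBA, 8 bits per channel).
FRAME_BYTES = 432 * 720 * 4


def split_raw_frames(src_path, n_parts, frame_bytes=FRAME_BYTES):
    """Split a raw RGBA dump into n_parts files, cutting only on frame boundaries."""
    total_frames = os.path.getsize(src_path) // frame_bytes
    base, extra = divmod(total_frames, n_parts)
    out_paths = []
    with open(src_path, "rb") as src:
        for part in range(n_parts):
            # The first `extra` parts each take one additional frame.
            n_frames = base + (1 if part < extra else 0)
            out_path = f"{src_path}.part{part}.rgba"  # hypothetical naming scheme
            with open(out_path, "wb") as dst:
                for _ in range(n_frames):
                    dst.write(src.read(frame_bytes))
            out_paths.append(out_path)
    return out_paths
```

Each piece could then be passed to convert.real with the same -size and -depth options as the full file. Joining the resulting gifs is a separate step that this sketch does not cover.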
Original Answer
I am thinking out loud here in the hope that it either helps you directly or prompts someone else to work out the problem...
It seems from the command line you reported that matplotlib is writing directly to the stdin of the ImageMagick convert tool. I can see that from the rgba:- parameter, which tells me it is passing raw RGB plus Alpha (transparency) values on stdin.
This means there are no intermediate files that I could suggest placing on a RAM disk, which is where I was heading with my comment...
Second, since raw pixel data is being sent, every single pixel is computed and sent by matplotlib. The data volume is therefore independent of the 5000 points in your simulation, so reducing or optimizing the number of points will not help.
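To see why the point count does not matter, note that the frame size in the reported convert command follows from the figure alone: figsize=(6,10) at 72 dpi gives 432x720. The 72 dpi value is inferred from those two numbers, not stated in the question. A small sketch of the arithmetic, with frame_pixels and frame_bytes as hypothetical helper names:

```python
def frame_pixels(figsize_inches, dpi):
    """Pixel dimensions of a rendered frame: inches times dots per inch."""
    width_in, height_in = figsize_inches
    return round(width_in * dpi), round(height_in * dpi)


def frame_bytes(figsize_inches, dpi, bytes_per_pixel=4):
    """Raw size of one RGBA frame at 8 bits per channel."""
    width, height = frame_pixels(figsize_inches, dpi)
    return width * height * bytes_per_pixel


print(frame_pixels((6, 10), 72))  # (432, 720), matching -size 432x720
print(frame_bytes((6, 10), 72))   # 1244160 bytes per frame
print(frame_bytes((6, 10), 36))   # 311040 bytes, a quarter of the above
```

The number of scatter points never appears in the calculation; only the figure size and dpi do. Since Animation.save accepts a dpi argument, a lower dpi or a smaller figsize would be the way to shrink the stream, if the reduced resolution is acceptable.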
Another thing to note is that you are using a Q16 build of ImageMagick, that is, one with a 16-bit quantum depth (see Q16 in your version string). Compared with a Q8 build, this effectively doubles the memory requirement, so if you can easily recompile ImageMagick for an 8-bit quantum depth, it might help.
Now let's look at the input stream. RGBA at -depth 8 means 4 bytes per pixel, and 432x720 pixels per frame gives about 1.2 MB per frame. You have 3000 frames, which is about 3.7 GB, plus an output file of 75 MB. I suspect this is close to ImageMagick's default memory limit, which would explain why it slows down toward the end. My suggestion is therefore to check ImageMagick's memory limits and consider raising them to 4GB-6GB or more if you have the RAM.
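The estimate above can be checked with a few lines of Python. This is a rough sketch, not a measurement: the Q16 figure assumes that a Q16 build holds each of the four channels as a 16-bit sample in memory, and the 2 GiB threshold is taken from the resource listing in this answer.

```python
WIDTH, HEIGHT = 432, 720  # from -size 432x720
FRAMES = 3000             # from the question
GIB = 1024**3


def stream_bytes(bytes_per_channel):
    """Total bytes for all frames with 4 channels (RGBA) per pixel."""
    return WIDTH * HEIGHT * 4 * bytes_per_channel * FRAMES


raw_input_bytes = stream_bytes(1)  # 8-bit samples, as sent by matplotlib
q16_estimate = stream_bytes(2)     # assumption: 16-bit samples held by a Q16 build

for label, size in [("raw input", raw_input_bytes), ("Q16 estimate", q16_estimate)]:
    print(f"{label}: {size / GIB:.2f} GiB, exceeds 2 GiB limit: {size > 2 * GIB}")
```

Both figures come out well above a 2 GiB memory limit, which fits the slowdown toward the end of the run.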
To check memory and other resource limits:
identify -list resource
Resource limits:
Width: 214.7MP
Height: 214.7MP
Area: 4.295GP
Memory: 2GiB <---
Map: 4GiB
Disk: unlimited
File: 192
Thread: 1
Throttle: 0
Time: unlimited
Since you cannot raise the memory limit on the command line that matplotlib runs, you can do so through an environment variable that you export before starting matplotlib, like this:
export MAGICK_MEMORY_LIMIT=4294967296
identify -list resource
Resource limits:
Width: 214.7MP
Height: 214.7MP
Area: 4.295GP
Memory: 4GiB <---
Map: 4GiB
Disk: unlimited
File: 192
Thread: 1
Throttle: 0
Time: unlimited
You can also change it in your policy.xml file, but that is more involved, so try this method first and ask if you get stuck!
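Because you are running in a notebook, an export in a shell only takes effect if the notebook server is started from that same shell. An alternative is to set the variable from Python before calling ani.save. This is a minimal sketch, assuming matplotlib launches convert as a child of the kernel process so that it inherits the kernel's environment:

```python
import os

# 4 GiB expressed in bytes, the same value as in the export line above.
FOUR_GIB = 4 * 1024**3

# Child processes started after this point (including convert) inherit the setting.
os.environ["MAGICK_MEMORY_LIMIT"] = str(FOUR_GIB)

print(os.environ["MAGICK_MEMORY_LIMIT"])
```

Run this in a notebook cell before the cell that calls ani.save.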
Please provide feedback on this, as I may suggest other things depending on whether it works. Also run identify -list configure, then edit your question and paste the output there.