Numpy.dot is slow but blas and lapack are installed, how to fix it?
I am running ArchLinux with Python 2.7.8, and the BLAS and LAPACK packages are installed:
% pacman -Qs blas; pacman -Qs lapack
local/blas 3.5.0-1
Basic Linear Algebra Subprograms
local/lapack 3.5.0-1
Linear Algebra PACKage
Numpy is installed via sudo pip2 install numpy, and it confirms that it sees both BLAS and LAPACK:
>>> numpy.show_config()
blas_info:
libraries = ['blas']
library_dirs = ['/usr/lib64']
language = f77
lapack_info:
libraries = ['lapack']
library_dirs = ['/usr/lib64']
language = f77
atlas_threads_info:
NOT AVAILABLE
blas_opt_info:
libraries = ['blas']
library_dirs = ['/usr/lib64']
language = f77
define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
NOT AVAILABLE
openblas_info:
NOT AVAILABLE
lapack_opt_info:
libraries = ['lapack', 'blas']
library_dirs = ['/usr/lib64']
language = f77
define_macros = [('NO_ATLAS_INFO', 1)]
openblas_lapack_info:
NOT AVAILABLE
atlas_info:
NOT AVAILABLE
lapack_mkl_info:
NOT AVAILABLE
blas_mkl_info:
NOT AVAILABLE
atlas_blas_info:
NOT AVAILABLE
mkl_info:
NOT AVAILABLE
However, my speed test for an np.dot operation takes over 30 seconds, when I know it runs in much less than 10 seconds on a similar machine. How do I fix the speed problem? Am I missing something when installing numpy with BLAS and LAPACK support?
OK, here's the whole story. First, the initial setup was slow because its BLAS is the reference implementation, which is not designed to be fast. Specifically, the blas package in the ArchLinux Extra repository is currently the reference implementation. See the section Presentation here for details.
Secondly, there are optimized versions of BLAS (quite a few, actually: ATLAS, OpenBLAS, GotoBLAS, MKL, and no doubt many others). They can be quite difficult to install. I ended up installing OpenBLAS; here is a step-by-step overview for ArchLinux:
- Install the openblas-lapack package from the AUR.
- Install the python2-numpy-openblas package from the AUR. As I understand it, it differs from the regular python2-numpy package by the site.cfg configuration file, which instructs numpy to look for the openblas libraries that we installed in the first step.
These steps solved the problem for me: the test mentioned in the question now runs in less than 1 second. numpy also shows that it was compiled with openblas:
>>> np.show_config()
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/lib']
language = f77
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/lib']
language = f77
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/lib']
language = f77
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/lib']
language = f77
blas_mkl_info:
NOT AVAILABLE
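To compare timings before and after switching BLAS libraries, a small benchmark along the following lines can be used. This is only a sketch and not the test from the question: the helper name time_dot, the matrix size, and the number of repeats are assumptions chosen for illustration.

```python
import timeit

import numpy as np


def time_dot(n=1000, repeats=3, seed=0):
    """Return the best wall-clock time (seconds) of np.dot on two n x n matrices.

    time_dot is a hypothetical helper; n and repeats are arbitrary choices.
    """
    rng = np.random.RandomState(seed)
    a = rng.rand(n, n)
    b = rng.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = timeit.default_timer()
        np.dot(a, b)
        elapsed = timeit.default_timer() - start
        best = min(best, elapsed)
    return best


if __name__ == "__main__":
    # Works on both python2 and python3.
    print("np.dot best of 3: %.3f s" % time_dot())
```

Running the same script against the reference blas build and the openblas build should make the difference visible; the absolute numbers depend on the machine and the matrix size.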
I find that the process for setting up an openblas-based numpy for python3 looks very similar.