Optimizing histogram intersection kernel in MATLAB
I want to try an svm classifier using histogram intersection kernel for a dataset of 153 images, but it takes a long time. This is my code:
a = load('...'); %vectors
b = load('...'); %labels
g = dataset(a,b);
error = crossval(g,libsvc([],proxm([],'ih'),100),10,10);
error1 = crossval(g,libsvc([],proxm([],'ih'),10),10,10);
error2 = crossval(g,libsvc([],proxm([],'ih'),1),10,10);
My kernel implementation in proxm function:
...
case {'dist_histint','ih'}
[m,d]=size(A);
[n,d1]=size(B);
if (d ~= d1)
error('column length of A (%d) != column length of B (%d)\n',d,d1);
end
% With the MATLAB JIT compiler the trivial implementation turns out
% to be the fastest, especially for large matrices.
D = zeros(m,n);
for i=1:m % m is number of samples of A
if (0==mod(i,1000)) fprintf('.'); end
for j=1:n % n is number of samples of B
D(i,j) = sum(min([A(i,:);B(j,:)]));%./max(A(:,i),B(:,j)));
end
end
I need some matlab optimizer for this code!
source to share
You can get rid of this kernel loop to compute D
using bsxfun
based vectorized
-
D = squeeze(sum(bsxfun(@min,A,permute(B,[3 2 1])),2))
Or avoid squeeze
with this modification -
D = sum(bsxfun(@min,permute(A,[1 3 2]),permute(B,[3 1 2])),3)
If calculations D
include max
instead min
, just replace @min
with @max
there.
Explanation: The principle of operation bsxfun
is that it expands the dimensions of the Singleton and performs the operation specified in @
within its call. Now this extension is mainly about how to get vectorized solutions that replace for-loops. In singleton dimensions
arrays, we mean dimensions 1
.
In many cases the dimensions of the singleton are not yet present and for vectorization with bsxfun
we need to create singleton dimensions
. One of the tools for this is permute
. It's basically all about how the previously applied vector approach will be used.
So your kernel code is -
...
case {'dist_histint','ih'}
[m,d]=size(A);
[n,d1]=size(B);
if (d ~= d1)
error('column length of A (%d) != column length of B (%d)\n',d,d1);
end
% With the MATLAB JIT compiler the trivial implementation turns out
% to be the fastest, especially for large matrices.
D = zeros(m,n);
for i=1:m % m is number of samples of A
if (0==mod(i,1000)) fprintf('.'); end
for j=1:n % n is number of samples of B
D(i,j) = sum(min([A(i,:);B(j,:)]));%./max(A(:,i),B(:,j)));
end
end
boils down to -
...
case {'dist_histint','ih'}
[m,d]=size(A);
[n,d1]=size(B);
if (d ~= d1)
error('column length of A (%d) != column length of B (%d)\n',d,d1);
end
D = squeeze(sum(bsxfun(@min,A,permute(B,[3 2 1])),2))
%// OR D = sum(bsxfun(@min,permute(A,[1 3 2]),permute(B,[3 1 2])),3)
I'm guessing the line: is if (0==mod(i,1000)) fprintf('.'); end
not important to the computation as it prints some kind of message.
source to share