Optimizing histogram intersection kernel in MATLAB

I want to try an svm classifier using histogram intersection kernel for a dataset of 153 images, but it takes a long time. This is my code:

a = load('...'); %vectors
b = load('...'); %labels
g = dataset(a,b);

error = crossval(g,libsvc([],proxm([],'ih'),100),10,10);
error1 = crossval(g,libsvc([],proxm([],'ih'),10),10,10);
error2 = crossval(g,libsvc([],proxm([],'ih'),1),10,10);

      

My kernel implementation in proxm function:

...
case {'dist_histint','ih'}
    [m,d]=size(A);
    [n,d1]=size(B);
    if (d ~= d1)
        error('column length of A (%d) != column length of B (%d)\n',d,d1);
    end

    % With the MATLAB JIT compiler the trivial implementation turns out
    % to be the fastest, especially for large matrices.
    D = zeros(m,n);
    for i=1:m % m is number of samples of A 
        if (0==mod(i,1000)) fprintf('.'); end
        for j=1:n % n is number of samples of B
            D(i,j) = sum(min([A(i,:);B(j,:)]));%./max(A(:,i),B(:,j)));
        end            
    end

      

I need some matlab optimizer for this code!

+3


source to share


1 answer


You can get rid of this kernel loop to compute D

using bsxfun

based vectorized

-

D = squeeze(sum(bsxfun(@min,A,permute(B,[3 2 1])),2))

      

Or avoid squeeze

with this modification -

D = sum(bsxfun(@min,permute(A,[1 3 2]),permute(B,[3 1 2])),3)

      

If calculations D

include max

instead min

, just replace @min

with @max

there.


Explanation: The principle of operation bsxfun

is that it expands the dimensions of the Singleton and performs the operation specified in @

within its call. Now this extension is mainly about how to get vectorized solutions that replace for-loops. In singleton dimensions

arrays, we mean dimensions 1

.



In many cases the dimensions of the singleton are not yet present and for vectorization with bsxfun

we need to create singleton dimensions

. One of the tools for this is permute

. It's basically all about how the previously applied vector approach will be used.

So your kernel code is -

...
case {'dist_histint','ih'}
    [m,d]=size(A);
    [n,d1]=size(B);
    if (d ~= d1)
        error('column length of A (%d) != column length of B (%d)\n',d,d1);
    end

    % With the MATLAB JIT compiler the trivial implementation turns out
    % to be the fastest, especially for large matrices.
    D = zeros(m,n);
    for i=1:m % m is number of samples of A 
        if (0==mod(i,1000)) fprintf('.'); end
        for j=1:n % n is number of samples of B
            D(i,j) = sum(min([A(i,:);B(j,:)]));%./max(A(:,i),B(:,j)));
        end            
    end

      

boils down to -

...
case {'dist_histint','ih'}
    [m,d]=size(A);
    [n,d1]=size(B);
    if (d ~= d1)
        error('column length of A (%d) != column length of B (%d)\n',d,d1);
    end
    D = squeeze(sum(bsxfun(@min,A,permute(B,[3 2 1])),2))
    %// OR D = sum(bsxfun(@min,permute(A,[1 3 2]),permute(B,[3 1 2])),3)

      

I'm guessing the line: is if (0==mod(i,1000)) fprintf('.'); end

not important to the computation as it prints some kind of message.

+3


source







All Articles