Matlab sum (XY) vs sum (X) - sum (Y)

If we have two matrices X

and Y

both two-dimensional, we can now mathematically say sum(X-Y)=sum(X)-sum(Y)


What's more efficient in Matlab? Which is faster?


source to share

3 answers


Works a little faster on my machine for small arrays, but sum(x)-sum(y)

much faster for large arrays. For comparison, I am using MATLAB R2015a on a Windows 7 computer with 32GB memory.

n = ceil(logspace(0,4,25));

for i = 1:numel(n)
    x = rand(n(i));
    y = rand(n(i));
    t1(i) = timeit(@()sum(x-y));
    t2(i) = timeit(@()sum(x)-sum(y));
    clear x y

figure; hold on
plot(n, t1)
plot(n, t2)
legend({'sum(x-y)', 'sum(x)-sum(y)'})
xlabel('n'); ylabel('time')
set(gca, 'XScale', 'log', 'YScale', 'log')


enter image description here



I'm all curious about you, and I decided to run some kind of benchmark. By the time I finished it seems like knedlsepp has this right as it sum(X-Y)

gets pretty slow for large matrices .

The crossover seems to be going around the elements 10^3



%% // Benchmark code

nElem = (1:9).'*(10.^(1:6)) ; nElem = nElem(:) ;    %'//damn pretifier
nIter = numel(nElem) ;

res = zeros(nIter,2) ;
for ii=1:nIter
    X = rand(nElem(ii) ,1) ;
    Y = rand(nElem(ii) ,1) ;

    f1 = @() sum(X-Y) ;
    f2 = @() sum(X)-sum(Y) ;

    res(ii,1) = timeit( f1 ) ;
    res(ii,2) = timeit( f2 ) ;
    clear f1 f2 X Y



I ran this several times and the results on my machine are quite consistent. I blew my memory trying to go above 10 ^ 7 elements. Feel free to extend the test, but I don't think the relationship will change much.

Specifications: Windows 8.1 Pro / Matlab R2013a: specs



Assuming both x and y have N x M = K elements, then For the sum (x) -sum (y) you have:

K memory access to read x, K memory access to read y, one memory access to write the result → 2K + 1 memory access (Assuming the sub-sum inside the sum function will be held in the CPU register).

Sum operations 2K + subtraction 1 → 2k + 1 CPU operations.

For sum (x - y): Matlab will compute x - y and store the result and then compute the sum, so we have K memory access to read x, K memory access to read y, K memory access to write the result of the subtraction , memory access K to read subtraction again the result for the sum function, then 1 memory access to write the result of the sum → 4K + 1 memory operations.

k subtractions + k summations → 2K operations.

As we can see, the sum (x - y) consumes a lot of memory, so in a large number of elements it can consume a higher time, but I have no explanation why it is faster for a small number of elements.



All Articles