Fast sum of outer products over all row pairs with NumPy

I use an algorithm that requires each example to have a matrix, say Xi, of shape ai x b, and for each of the O(n^2) pairs of examples I find the difference between each pair of rows, Xiu - Xjv, then sum the outer products: sum_u sum_v np.outer(Xiu - Xjv, Xiu - Xjv).

Unfortunately, this inner double sum is quite slow and makes runtimes blow up on large datasets. Right now I'm just using Python loops for it. Is there some pythonic way to speed up this inner operation? I've tried to come up with one, with no luck.

To clarify: for each of the n examples there is a matrix Xi with dimensions ai x b, where ai differs from example to example. For each pair (Xi, Xj) I want to go through all O(ai * aj) pairs of rows between the two matrices, find Xiu - Xjv, take its outer product with itself, np.outer(Xiu - Xjv, Xiu - Xjv), and finally sum all those outer products.

For example: suppose D is [[1,2],[3,4]] and we use it as both matrices. Then one step would be np.outer(D[0] - D[1], D[0] - D[1]), which gives the matrix [[4,4],[4,4]]. The row pairs (0,0) and (1,1) each give the zero matrix, while (0,1) and (1,0) each give that all-4s matrix, so the final sum over all four pairs of outer products is [[8,8],[8,8]].
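Spelled out as code, the brute-force version of that example is just a double loop over rows (a minimal sketch of the loop approach described above, using D for both matrices):

```python
import numpy as np

D = np.array([[1, 2], [3, 4]], dtype=float)

# Sum np.outer(d, d) over every ordered pair of rows (u, v),
# where d = D[u] - D[v].
total = np.zeros((D.shape[1], D.shape[1]))
for u in D:
    for v in D:
        d = u - v
        total += np.outer(d, d)

print(total)  # -> [[8. 8.] [8. 8.]]
```

This is the O(ai * aj) inner operation I'd like to vectorize.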


1 answer


OK, that was fun. I still can't help feeling that it should all be doable with one ingenious call to numpy.tensordot, but in any case the following seems to have removed all Python-level loops:

import numpy

def slow( a, b=None ):

    if b is None: b = a
    a = numpy.asarray( a, dtype=float )
    b = numpy.asarray( b, dtype=float )

    out = 0.0
    for ai in a:
        for bj in b:
            dij = bj - ai
            out += numpy.outer( dij, dij )
    return out

def opsum( a, b=None ):

    if b is None: b = a
    a = numpy.asarray( a, dtype=float )
    b = numpy.asarray( b, dtype=float )

    RA, CA = a.shape
    RB, CB = b.shape
    if CA != CB: raise ValueError( "input matrices should have the same number of columns" )

    # accumulate the four terms of the expanded outer product
    out = -numpy.outer( a.sum( axis=0 ), b.sum( axis=0 ) )
    out = out + out.T   # avoid in-place += with a transposed view
    out += RB * ( a.T @ a )
    out += RA * ( b.T @ b )
    return out

def test( a, b=None ):
    print( "ground truth:" )
    print( slow( a, b ) )
    print( "optimized:" )
    print( opsum( a, b ) )
    print( "max abs discrepancy:" )
    print( numpy.abs( opsum( a, b ) - slow( a, b ) ).max() )
    print( "" )

# OP example
test( [[1,2], [3,4]] )

# non-symmetric example
a = numpy.array( [ [ 1, 2, 3 ], [-4, 5, 6 ], [7, -8, 9 ], [ 10, 11, -12 ] ], dtype=float )
b = a[ ::2, ::-1 ] + 15
test( a, b )

# non-integer example
test( numpy.random.randn( *a.shape ), numpy.random.randn( *b.shape ) )
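To see why opsum works (my summary, not part of the answer): expand each outer product and sum term by term, so that no pair loop remains. Writing a_u, b_v for the rows and s_a = a.sum(axis=0), s_b = b.sum(axis=0):

```latex
\sum_{u=1}^{R_A}\sum_{v=1}^{R_B} (a_u - b_v)(a_u - b_v)^{\top}
  = R_B\, a^{\top} a + R_A\, b^{\top} b - s_a s_b^{\top} - s_b s_a^{\top}
```

These are exactly the four terms accumulated in opsum.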

      



With this (rather arbitrary) example input, opsum (timed with timeit opsum(a, b) in IPython) comes out about 3-5x faster than slow. But of course it scales much better: multiply the number of data points by 100 and the number of features by 10, and we'd be looking at a speedup of around 10,000x.
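As for the single-tensordot version hinted at above: it does exist if you're willing to materialize the full tensor of pairwise row differences via broadcasting. It keeps everything in one contraction but costs O(RA * RB * C) memory, unlike the closed-form algebra in opsum (a sketch under that assumption, shown on the question's example):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]], dtype=float)
b = a  # same matrix for both, as in the question's example

# Broadcast all pairwise row differences into one (RA, RB, C) tensor ...
diff = a[:, None, :] - b[None, :, :]

# ... then contract the two pair axes in a single tensordot call:
# result[c, d] = sum_{u, v} diff[u, v, c] * diff[u, v, d]
result = np.tensordot(diff, diff, axes=([0, 1], [0, 1]))

print(result)  # -> [[8. 8.] [8. 8.]]
```

So it trades memory for a single BLAS-backed contraction; for large RA * RB the closed form above is still the better deal.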
