Replace specific values in a matrix with Python

I have an m x n matrix where each row is a pattern and each column is a class. Each row contains the softmax probabilities for each class. I want to replace the maximum value in each row with 1 and all other values with 0. How can I do this efficiently in Python?



3 answers


I think the best answer to your specific question is to use a matrix type object.

A sparse matrix should be the most memory-friendly way to store large numbers of these large matrices, given that most of each matrix is filled with zeros. For matrices that are very large in both dimensions, this should be better than using numpy arrays in terms of memory, if not in terms of computation speed.

import numpy as np
import scipy.sparse   # a plain `import scipy` is not enough on older versions

matrix = np.matrix(np.random.randn(10, 5))
# column index of the maximum in each row, flattened to 1-D
# (was .A[:,0], slightly faster, but .A1 seems more readable)
maxes = matrix.argmax(axis=1).A1
n_rows = len(matrix)  # could do matrix.shape[0], but that is slower
data = np.ones(n_rows)
row = np.arange(n_rows)
sparse_matrix = scipy.sparse.coo_matrix((data, (row, maxes)),
                                        shape=matrix.shape,
                                        dtype=np.int8)

      

This sparse_matrix object should be very lightweight relative to a regular matrix object that would uselessly keep track of every zero in it. To materialize it as a normal matrix:

sparse_matrix.todense()

      



returns:

matrix([[0, 0, 0, 0, 1],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 1],
        [1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0]], dtype=int8)

      

We can compare this with matrix:

matrix([[ 1.41049496,  0.24737968, -0.70849012,  0.24794031,  1.9231408 ],
        [-0.08323096, -0.32134873,  2.14154425, -1.30430663,  0.64934781],
        [ 0.56249379,  0.07851507,  0.63024234, -0.38683508, -1.75887624],
        [-0.41063182,  0.15657594,  0.11175805,  0.37646245,  1.58261556],
        [ 1.10421356, -0.26151637,  0.64442885, -1.23544526, -0.91119517],
        [ 0.51384883,  1.5901419 ,  1.92496778, -1.23541699,  1.00231508],
        [-2.42759787, -0.23592018, -0.33534536,  0.17577329, -1.14793293],
        [-0.06051458,  1.24004714,  1.23588228, -0.11727146, -0.02627196],
        [ 1.66071534, -0.07734444,  1.40305686, -1.02098911, -1.10752638],
        [ 0.12466003, -1.60874191,  1.81127175,  2.26257234, -1.26008476]])
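As a rough check of the memory argument above, the sketch below compares the bytes held by the COO representation with the bytes of the equivalent dense int8 array. The shape (1000 x 50), the seed, and the names `dense`, `one_hot`, `sparse_bytes` and `dense_bytes` are illustrative assumptions, not part of the original answer, and the byte counts ignore Python object overhead.

```python
import numpy as np
import scipy.sparse

# Illustrative input: 1000 patterns, 50 classes (sizes are an assumption).
rng = np.random.default_rng(0)
dense = rng.standard_normal((1000, 50))

n_rows = dense.shape[0]
cols = dense.argmax(axis=1)  # column of the maximum in each row

# One stored entry per row: value 1 at (row, argmax column).
one_hot = scipy.sparse.coo_matrix(
    (np.ones(n_rows, dtype=np.int8), (np.arange(n_rows), cols)),
    shape=dense.shape,
)

# COO keeps three 1-D arrays: the values plus their row and column indices.
sparse_bytes = one_hot.data.nbytes + one_hot.row.nbytes + one_hot.col.nbytes
dense_bytes = one_hot.toarray().nbytes  # same content as a dense int8 array

print(f"sparse: {sparse_bytes} bytes, dense: {dense_bytes} bytes")
```

The saving grows with the number of columns, since the sparse form stores one entry per row regardless of how many classes there are.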

      



Some sample data:

>>> a = np.random.rand(5, 5)
>>> a
array([[ 0.06922196,  0.66444783,  0.2582146 ,  0.03886282,  0.75403153],
       [ 0.74530361,  0.36357237,  0.3689877 ,  0.71927017,  0.55944165],
       [ 0.84674582,  0.2834574 ,  0.11472191,  0.29572721,  0.03846353],
       [ 0.10322931,  0.90932896,  0.03913152,  0.50660894,  0.45083403],
       [ 0.55196367,  0.92418942,  0.38171512,  0.01016748,  0.04845774]])

      

In one line:



>>> (a == a.max(axis=1)[:, None]).astype(int)
array([[0, 0, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 1, 0, 0, 0]])

      

A more efficient (and more verbose) approach:

>>> b = np.zeros_like(a, dtype=int)
>>> b[np.arange(a.shape[0]), np.argmax(a, axis=1)] = 1
>>> b
array([[0, 0, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 1, 0, 0, 0]])
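The two approaches above agree on this sample, but they can differ when a row contains tied maxima: the equality test marks every element equal to the row maximum, while `np.argmax` returns only the first occurrence. A small sketch with a hand-made array (the values are chosen here purely to force a tie):

```python
import numpy as np

# First row has a tie for the maximum; second row does not.
a = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.2, 0.7]])

# Equality against the row maximum: every tied maximum becomes 1.
by_equality = (a == a.max(axis=1)[:, None]).astype(int)

# Fancy indexing with argmax: exactly one 1 per row (the first maximum).
by_argmax = np.zeros_like(a, dtype=int)
by_argmax[np.arange(a.shape[0]), np.argmax(a, axis=1)] = 1

print(by_equality)  # [[1 1 0], [0 0 1]]
print(by_argmax)    # [[1 0 0], [0 0 1]]
```

If each row must contain exactly one 1, the `argmax` version is probably the safer choice.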

      



This approach, using core numpy and a list comprehension, works, but is the least performant. I am leaving this answer here as it might be somewhat instructive. First, we create a numpy matrix:

import numpy as np

matrix = np.matrix(np.random.randn(2, 2))

      

matrix is then, e.g.:

matrix([[-0.84558168,  0.08836042],
        [-0.01963479,  0.35331933]])

      

Now map 1 to the new matrix if the element is max, 0 otherwise:

newmatrix = np.matrix([[1 if i == row.max() else 0 for i in row]
                       for row in np.array(matrix)])

      

newmatrix is now:

matrix([[0, 1],
        [0, 1]])
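To get a feel for the performance gap, the list comprehension can be timed against vectorized indexing with `np.argmax`. This is only a sketch: the function names `with_comprehension` and `with_argmax`, the 200 x 20 input, and the repeat count are assumptions made for illustration, and timings will vary by machine, so no figures are quoted.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
a = rng.random((200, 20))  # illustrative size, not from the original answer

def with_comprehension(arr):
    # Same logic as the nested comprehension above, on a plain ndarray.
    return np.array([[1 if x == row.max() else 0 for x in row] for row in arr])

def with_argmax(arr):
    # Vectorized alternative: write a 1 at each row's argmax position.
    out = np.zeros_like(arr, dtype=int)
    out[np.arange(arr.shape[0]), np.argmax(arr, axis=1)] = 1
    return out

t_loop = timeit.timeit(lambda: with_comprehension(a), number=5)
t_vec = timeit.timeit(lambda: with_argmax(a), number=5)
print(f"comprehension: {t_loop:.4f}s, argmax indexing: {t_vec:.4f}s")
```

Part of the cost of the comprehension is that `row.max()` is recomputed for every element; hoisting it out of the inner loop helps, though element-by-element Python iteration remains.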

      
