Replace specific values in a matrix with Python

I have an m×n matrix where each row is a pattern and each column is a class. Each row contains the softmax probabilities for each class. I want to replace the maximum value in each row with 1 and all other values with 0. How can I do this efficiently in Python?

+3

I think the best answer to your specific question is to use a sparse matrix object.

A sparse matrix should be the most efficient way to store many of these large matrices in a memory-friendly way, given that most of each matrix is filled with zeros. This should be better than using numpy arrays, especially for matrices that are very large in both dimensions, in terms of memory if not in terms of computation speed.

```python
import numpy as np
import scipy.sparse  # explicit submodule import; older versions need it

matrix = np.matrix(np.random.randn(10, 5))
maxes = matrix.argmax(axis=1).A1  # column index of each row's maximum, flattened
# was .A[:,0], slightly faster, but .A1 seems more readable
n_rows = len(matrix)  # could do matrix.shape[0], but that is slower
data = np.ones(n_rows)
row = np.arange(n_rows)
sparse_matrix = scipy.sparse.coo_matrix((data, (row, maxes)),
                                        shape=matrix.shape,
                                        dtype=np.int8)
```

This `sparse_matrix` object should be very lightweight relative to a regular matrix object, which would needlessly store every zero. To materialize it as a normal matrix:

```python
sparse_matrix.todense()
```

returns:

```
matrix([[0, 0, 0, 0, 1],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 1],
        [1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0]], dtype=int8)
```

We can compare this with the original `matrix`:

```
matrix([[ 1.41049496,  0.24737968, -0.70849012,  0.24794031,  1.9231408 ],
        [-0.08323096, -0.32134873,  2.14154425, -1.30430663,  0.64934781],
        [ 0.56249379,  0.07851507,  0.63024234, -0.38683508, -1.75887624],
        [-0.41063182,  0.15657594,  0.11175805,  0.37646245,  1.58261556],
        [ 1.10421356, -0.26151637,  0.64442885, -1.23544526, -0.91119517],
        [ 0.51384883,  1.5901419 ,  1.92496778, -1.23541699,  1.00231508],
        [-2.42759787, -0.23592018, -0.33534536,  0.17577329, -1.14793293],
        [-0.06051458,  1.24004714,  1.23588228, -0.11727146, -0.02627196],
        [ 1.66071534, -0.07734444,  1.40305686, -1.02098911, -1.10752638],
        [ 0.12466003, -1.60874191,  1.81127175,  2.26257234, -1.26008476]])
```
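To put a rough number on the memory argument, here is a minimal sketch that compares the bytes held by the COO arrays with the bytes of the equivalent dense `int8` array. The shape (1000 patterns by 50 classes) and the names `scores`, `winners` and `one_hot` are made up for illustration, and the exact figures depend on the index dtype SciPy picks. With only a handful of classes the index arrays can outweigh the saving, so it is worth measuring on your own shapes.

```python
import numpy as np
import scipy.sparse

rng = np.random.default_rng(0)
scores = rng.standard_normal((1000, 50))  # hypothetical: 1000 patterns x 50 classes

n_rows = scores.shape[0]
winners = scores.argmax(axis=1)  # column index of each row's maximum
one_hot = scipy.sparse.coo_matrix(
    (np.ones(n_rows, dtype=np.int8), (np.arange(n_rows), winners)),
    shape=scores.shape,
)

# Dense storage keeps every zero; COO keeps one value plus two indices per 1.
dense_bytes = one_hot.toarray().nbytes
sparse_bytes = one_hot.data.nbytes + one_hot.row.nbytes + one_hot.col.nbytes
print("dense:", dense_bytes, "bytes; sparse:", sparse_bytes, "bytes")
```

Note that `toarray()` gives a plain `ndarray`, whereas `todense()` gives an `np.matrix`.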
0

Some sample data:

```python
>>> a = np.random.rand(5, 5)
>>> a
array([[ 0.06922196,  0.66444783,  0.2582146 ,  0.03886282,  0.75403153],
       [ 0.74530361,  0.36357237,  0.3689877 ,  0.71927017,  0.55944165],
       [ 0.84674582,  0.2834574 ,  0.11472191,  0.29572721,  0.03846353],
       [ 0.10322931,  0.90932896,  0.03913152,  0.50660894,  0.45083403],
       [ 0.55196367,  0.92418942,  0.38171512,  0.01016748,  0.04845774]])
```

In one line:

```python
>>> (a == a.max(axis=1)[:, None]).astype(int)
array([[0, 0, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 1, 0, 0, 0]])
```
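One caveat with the equality comparison, sketched here on a made-up array `t`: if a row contains its maximum more than once, every tied position becomes 1, so that row is no longer one-hot. Real softmax outputs rarely tie exactly, but it can happen with rounded or low-precision values.

```python
import numpy as np

# Hypothetical data: the first row has a tie for its maximum
t = np.array([[0.4, 0.4, 0.2],
              [0.1, 0.7, 0.2]])

marked = (t == t.max(axis=1)[:, None]).astype(int)
print(marked)
# [[1 1 0]
#  [0 1 0]]

# Row sums show which rows ended up with more than one 1
print(marked.sum(axis=1))  # [2 1]
```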

A more efficient (but more verbose) approach:

```python
>>> b = np.zeros_like(a, dtype=int)
>>> b[np.arange(a.shape[0]), np.argmax(a, axis=1)] = 1
>>> b
array([[0, 0, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 1, 0, 0, 0]])
```
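If you need this in more than one place, the `argmax` indexing can be wrapped in a small helper. This is only a sketch: the name `one_hot_rows` and the random test data are my own, not part of NumPy. Because `argmax` returns the first maximum, each row gets exactly one 1 even when values tie.

```python
import numpy as np

def one_hot_rows(a):
    """Return an int array with 1 at each row's first maximum, 0 elsewhere."""
    a = np.asarray(a)
    out = np.zeros(a.shape, dtype=int)
    out[np.arange(a.shape[0]), a.argmax(axis=1)] = 1
    return out

rng = np.random.default_rng(0)
a = rng.random((5, 5))  # made-up data; exact ties are practically impossible here
b = one_hot_rows(a)

# Exactly one 1 per row, and it agrees with the equality-based one-liner
assert (b.sum(axis=1) == 1).all()
assert np.array_equal(b, (a == a.max(axis=1)[:, None]).astype(int))

# With a tie, only the first maximum is marked
print(one_hot_rows([[0.4, 0.4, 0.2]]))  # [[1 0 0]]
```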
+2

This approach, using core numpy and a list comprehension, works, but it is the least performant. I am leaving this answer here as it might be somewhat instructive. First, we create a numpy matrix:

```python
import numpy as np
matrix = np.matrix(np.random.randn(2, 2))
```

`matrix` is now, e.g.:

```
matrix([[-0.84558168,  0.08836042],
        [-0.01963479,  0.35331933]])
```

Now build a new matrix with 1 where an element is the maximum of its row, and 0 otherwise:

```python
newmatrix = np.matrix([[1 if i == row.max() else 0 for i in row]
                       for row in np.array(matrix)])
```

`newmatrix` is now:

```
matrix([[0, 1],
        [0, 1]])
```
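To see how much slower the list comprehension is on your machine, a rough timing sketch along these lines may help. The function names, the 200×20 shape and the repeat count are arbitrary choices of mine, and the numbers will vary from run to run, so treat them as indicative only.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((200, 20))  # made-up test data

def with_list_comprehension(arr):
    # Recomputes row.max() for every element, as in the answer above
    return np.array([[1 if x == row.max() else 0 for x in row] for row in arr])

def with_argmax(arr):
    # Vectorized alternative: index the position of each row's maximum
    out = np.zeros(arr.shape, dtype=int)
    out[np.arange(arr.shape[0]), arr.argmax(axis=1)] = 1
    return out

# Both give the same result when no row has tied maxima
same = np.array_equal(with_list_comprehension(a), with_argmax(a))

t_loop = timeit.timeit(lambda: with_list_comprehension(a), number=5)
t_vec = timeit.timeit(lambda: with_argmax(a), number=5)
print(f"same result: {same}; list comprehension: {t_loop:.4f}s; argmax: {t_vec:.4f}s")
```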
0
