NumPy - create a one-hot tensor from a 2D NumPy array

I have a 2D array with values from 0 to 59.

For those familiar with deep learning, and image segmentation in particular: I create an array (call it L) from a .png image, and the value of each pixel L[x, y] is the class that the pixel belongs to (out of 60 classes).

I want to create a one-hot tensor Lhot in which Lhot[x, y, z] == 1 if L[x, y] == z, and 0 otherwise.

I want to create it with some kind of broadcasting / indexing (one or two lines), with no loops.

It should be functionally equivalent to this piece of code (Dtype corresponds to the dtype of L):

Lhot = np.zeros((L.shape[0], L.shape[1], 60), dtype=Dtype)
for i in range(L.shape[0]):
    for j in range(L.shape[1]):
        Lhot[i,j,L[i,j]] = 1

      

Does anyone have an idea? Thanks!



2 answers


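It is much faster and cleaner to use pure NumPy: index an identity matrix with the label array.

Lhot = np.eye(60, dtype=Dtype)[L]  # shape (n, m, 60); the class axis is already last, so no transpose is needed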
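As a quick sanity check, the sketch below compares identity-matrix indexing against the double loop from the question on a small random label array. The array size, the seed, and the `uint8` dtype are assumptions chosen for illustration; only the `np.eye` indexing idea comes from the answer.

```python
import numpy as np

k = 60  # number of classes, as in the question
rng = np.random.default_rng(0)  # seed chosen arbitrarily for the demo
L = rng.integers(0, k, size=(4, 5), dtype=np.uint8)  # stand-in for the label image

# Reference: the explicit double loop from the question
expected = np.zeros((L.shape[0], L.shape[1], k), dtype=L.dtype)
for i in range(L.shape[0]):
    for j in range(L.shape[1]):
        expected[i, j, L[i, j]] = 1

# Row L[i, j] of the identity matrix is the one-hot vector for class L[i, j]
Lhot = np.eye(k, dtype=L.dtype)[L]

print(Lhot.shape)
print(np.array_equal(Lhot, expected))
```

Indexing with an (n, m) integer array returns an (n, m, k) result directly, so the class axis ends up last, matching Lhot[x, y, z] in the question.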

      



The problem you will run into with multidimensional one-hot arrays is that they get really big and very sparse, and there is no good way to handle sparse arrays with more than two dimensions in NumPy / SciPy (or sklearn, or many other ML packages, I think). Do you really need an n-dimensional one-hot array?
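To make the size concern concrete, here is a rough sketch comparing the memory footprint of a label map with its one-hot expansion. The 256 x 256 image size is an assumption for illustration; the class count of 60 comes from the question.

```python
import numpy as np

n, m, k = 256, 256, 60  # image size is hypothetical; k = 60 as in the question
L = np.zeros((n, m), dtype=np.uint8)  # label map: one byte per pixel

dense_f64 = np.eye(k)[L]                 # default float64 one-hot
dense_u8 = np.eye(k, dtype=np.uint8)[L]  # same encoding, one byte per entry

print(L.nbytes)          # 65536 bytes
print(dense_f64.nbytes)  # 31457280 bytes
print(dense_u8.nbytes)   # 3932160 bytes
```

Only one entry in every 60 is nonzero, so a compact dtype such as `uint8` or `bool` reduces the cost by a constant factor, but the expansion is still 60 times larger than the label map itself.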



Since typical one-hot encoding is defined for 1D vectors, all you have to do is flatten your matrix, use the one-hot encoder from scikit-learn (or any other library with one-hot encoding), and reshape back.

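import numpy as np
from sklearn.preprocessing import OneHotEncoder
n, m = L.shape
k = 60
# n_values was removed in scikit-learn 0.22; categories fixes the class list instead
encoder = OneHotEncoder(categories=[np.arange(k)])
# fit_transform returns a sparse matrix of shape (n*m, k); densify, then reshape to 3D
Lhot = encoder.fit_transform(L.reshape(-1, 1)).toarray().reshape(n, m, k)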

      



Of course, you can do it manually too.

n, m = L.shape
k = 60
Lhot = np.zeros((n*m, k)) # zero-filled flat array, one row per pixel
Lhot[np.arange(n*m), L.flatten()] = 1 # one-hot encoding for 1D
Lhot = Lhot.reshape(n, m, k) # reshaping back to 3D tensor
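
The flatten-and-reshape step can also be skipped. The sketch below shows two alternatives that work directly on the 2D label array: `np.put_along_axis`, and a broadcast comparison against `np.arange(k)`. The small random `L`, the seed, and the `uint8` output dtype are assumptions for illustration and are not part of the original answer.

```python
import numpy as np

k = 60  # number of classes, as in the question
rng = np.random.default_rng(0)  # seed chosen arbitrarily for the demo
L = rng.integers(0, k, size=(4, 5))  # stand-in label array

# Option 1: write a 1 at index L[i, j] along the last axis
hot_put = np.zeros(L.shape + (k,), dtype=np.uint8)
np.put_along_axis(hot_put, L[..., None], 1, axis=-1)

# Option 2: broadcast (n, m, 1) against (k,) and compare elementwise
hot_cmp = (L[..., None] == np.arange(k)).astype(np.uint8)

print(np.array_equal(hot_put, hot_cmp))
```

Both variants produce an (n, m, k) array with exactly one 1 per pixel. The comparison builds a temporary boolean array of the full output size, which may matter for large images.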

      
