How can I convert an RGB image to a color-based one-hot 3D array using numpy?

Question

Simply put, what I am trying to do is similar to this question: Convert an RGB image to an index image, but instead of a single-channel index image, I want an n-channel image where img[h, w] is a one-hot encoded vector. For example, if the input image is [[[0, 0, 0], [255, 255, 255]]], index 0 is assigned to black, and index 1 is assigned to white, then the desired output is [[[1, 0], [0, 1]]].

Like the person who asked the previous question, I have implemented this naively, but the code is quite slow, and I believe a proper numpy solution would be significantly faster.

Also, as suggested in the previous post, I could first convert the image to grayscale and then one-hot encode it, but I want a more general solution.

Example

Let's say I want to assign white to 0, red to 1, blue to 2, and yellow to 3:

(255, 255, 255): 0
(255, 0, 0): 1
(0, 0, 255): 2
(255, 255, 0): 3

and I have an image consisting of four colors where the image is a 3D array containing the R, G, B values for each pixel:

[
    [[255, 255, 255], [255, 255, 255], [255,   0,   0], [255,   0,   0]],
    [[  0,   0, 255], [255, 255, 255], [255,   0,   0], [255,   0,   0]],
    [[  0,   0, 255], [  0,   0, 255], [255, 255, 255], [255, 255, 255]],
    [[255, 255, 255], [255, 255, 255], [255, 255,   0], [255, 255,   0]]
]

and this is what I want to get, where every pixel is replaced by its one-hot encoded index. (Since turning a 2D array of index values into a 3D array of one-hot encoded values is easy, getting a 2D array of index values is fine too.)

[
    [[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0]],
    [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]]
]

In this example, I've used colors whose RGB components are all either 255 or 0, but I don't want the solution to rely on that fact.
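The parenthetical remark in the question, that a 2D array of index values is easy to turn into a one-hot 3D array, can be sketched roughly as follows. The names `index_map` and `num_classes` are hypothetical and only illustrate the idea; the index values are the ones implied by the 4x4 example image.

```python
import numpy as np

# Hypothetical 2D index map for the 4x4 example image
# (0 = white, 1 = red, 2 = blue, 3 = yellow).
index_map = np.array([
    [0, 0, 1, 1],
    [2, 0, 1, 1],
    [2, 2, 0, 0],
    [0, 0, 3, 3],
])
num_classes = 4

# Row k of the identity matrix is the one-hot vector for class k, so
# indexing the identity matrix with the index map yields an array of
# shape (H, W, num_classes).
one_hot = np.eye(num_classes, dtype=int)[index_map]
```

Indexing an identity matrix avoids an explicit loop over pixels; taking `argmax` over the last axis recovers the index map.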

python numpy image

JiminP May 10 '17 at 5:49 am

2 answers

Divakar · Answer 1 · 2017-05-10T09:31:45+0000

We could generate a decimal equivalent for each pixel color. With each channel being either 0 or 255, there are 8 possible colors in total, but it seems we are only interested in four of them.
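As a small illustration of the decimal equivalents, the sketch below encodes the four colors from the question. The name `palette` is hypothetical, and the approach assumes every channel is exactly 0 or 255.

```python
import numpy as np

# The four colors from the question, in index order: white, red, blue, yellow.
palette = np.array([
    [255, 255, 255],
    [255,   0,   0],
    [  0,   0, 255],
    [255, 255,   0],
])

# Treat each channel as one bit (1 where the value is 255) and weight the
# bits as R=4, G=2, B=1 to get a single integer per color.
codes = (palette == 255).dot([4, 2, 1])
print(codes)  # [7 4 1 6]
```

These are the values that appear as `colors` in the two functions.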

Then there are two ways to solve it:

One is to map these decimal equivalents to consecutive indices, starting from 0 and running up to the last color, then initialize the output array and assign into it.
Another is to use a broadcasted comparison of these decimal equivalents against the colors.

Following are the two methods -

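import numpy as np

def indexing_based(a):
    b = (a == 255).dot([4,2,1])  # Decimal equivalents (assumes channels are 0 or 255)
    colors = np.array([7,4,1,6]) # Define colors decimal equivalents here
    # Lookup table: decimal equivalent -> class index
    # (entries for colors not listed above stay uninitialized)
    idx = np.empty(colors.max()+1,dtype=int)
    idx[colors] = np.arange(len(colors))
    m,n,r = a.shape
    out = np.zeros((m,n,len(colors)), dtype=int)
    # Set a single 1 per pixel, in the channel given by its class index
    out[np.arange(m)[:,None], np.arange(n), idx[b]] = 1
    return out

def broadcasting_based(a):
    b = (a == 255).dot([4,2,1])  # Decimal equivalents (assumes channels are 0 or 255)
    colors = np.array([7,4,1,6]) # Define colors decimal equivalents here
    # Compare every pixel against every color at once -> shape (m, n, len(colors))
    return (b[...,None] == colors).astype(int)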
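The two functions above rely on every channel being 0 or 255, which the question explicitly wants to avoid. One possible generalization, which is not part of this answer and is only a sketch, packs the three 8-bit channels into a single integer per pixel and then applies the same broadcasted comparison. The names `pack_rgb`, `packed_broadcasting`, `palette` and `img` are hypothetical, as is the non-binary color (200, 30, 30).

```python
import numpy as np

def pack_rgb(arr):
    """Pack the last axis (R, G, B) into one integer: R*65536 + G*256 + B."""
    arr = np.asarray(arr, dtype=np.int64)
    return arr[..., 0] * 65536 + arr[..., 1] * 256 + arr[..., 2]

def packed_broadcasting(a, palette):
    """One-hot encode image `a` of shape (H, W, 3) against `palette` of shape (K, 3)."""
    codes = pack_rgb(a)                # shape (H, W)
    palette_codes = pack_rgb(palette)  # shape (K,)
    # Broadcast (H, W, 1) against (K,) to get an (H, W, K) one-hot array.
    return (codes[..., None] == palette_codes).astype(int)

# Hypothetical palette containing one color that is not made of 0/255 values.
palette = [(255, 255, 255), (200, 30, 30), (0, 0, 255), (255, 255, 0)]
img = np.array([[[255, 255, 255], [200, 30, 30]],
                [[  0,   0, 255], [255, 255,   0]]], dtype=np.uint8)
out = packed_broadcasting(img, palette)
```

With this sketch, a pixel whose color is not in the palette ends up as an all-zero vector rather than raising an error, which may or may not be the desired behavior.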

Example run -

>>> a = np.array([
...     [[255, 255, 255], [255, 255, 255], [255,   0,   0], [255,   0,   0]],
...     [[  0,   0, 255], [255, 255, 255], [255,   0,   0], [255,   0,   0]],
...     [[  0,   0, 255], [  0,   0, 255], [255, 255, 255], [255, 255, 255]],
...     [[255, 255, 255], [255, 255, 255], [255, 255,   0], [255, 255,   0]],
...     [[255, 255, 255], [255,   0,   0], [255, 255,   0], [255,   0,   0]]])
>>> indexing_based(a)
array([[[1, 0, 0, 0],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 1, 0, 0]],

       [[0, 0, 1, 0],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 1, 0, 0]],

       [[0, 0, 1, 0],
        [0, 0, 1, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]],

       [[1, 0, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1]],

       [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 1, 0, 0]]])
>>> np.allclose(broadcasting_based(a), indexing_based(a))
True

MonsterMax · Answer 2 · 2018-04-23T11:25:17+0000

My solution looks like this and should work for any colors:

import numpy as np

color_dict = {0: (0,   255, 255),
              1: (255, 255,   0),
              ....}


def rgb_to_onehot(rgb_arr, color_dict):
    num_classes = len(color_dict)
    shape = rgb_arr.shape[:2]+(num_classes,)
    arr = np.zeros( shape, dtype=np.int8 )
    # Assumes the keys of color_dict are the class indices 0..num_classes-1
    for i in range(num_classes):
        # True where all three channels match the color of class i
        arr[:,:,i] = np.all(rgb_arr == color_dict[i], axis=-1)
    return arr


def onehot_to_rgb(onehot, color_dict):
    single_layer = np.argmax(onehot, axis=-1)
    output = np.zeros( onehot.shape[:2]+(3,) )
    for k in color_dict.keys():
        output[single_layer==k] = color_dict[k]
    return np.uint8(output)

I haven't tested it for speed yet, but at least it works :)
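A quick round-trip check might look like the sketch below. The two functions are restated in compact form so the snippet runs by itself, and the `color_dict` here is a hypothetical mapping built from the four colors in the question, with keys assumed to be the class indices 0..3.

```python
import numpy as np

# Hypothetical mapping using the four colors from the question.
color_dict = {0: (255, 255, 255), 1: (255, 0, 0), 2: (0, 0, 255), 3: (255, 255, 0)}

def rgb_to_onehot(rgb_arr, color_dict):
    num_classes = len(color_dict)
    arr = np.zeros(rgb_arr.shape[:2] + (num_classes,), dtype=np.int8)
    for i in range(num_classes):
        # True where all three channels match the color of class i
        arr[:, :, i] = np.all(rgb_arr == color_dict[i], axis=-1)
    return arr

def onehot_to_rgb(onehot, color_dict):
    single_layer = np.argmax(onehot, axis=-1)
    output = np.zeros(onehot.shape[:2] + (3,), dtype=np.uint8)
    for k in color_dict:
        output[single_layer == k] = color_dict[k]
    return output

rgb = np.array([[[255, 255, 255], [255,   0,   0]],
                [[  0,   0, 255], [255, 255,   0]]], dtype=np.uint8)
onehot = rgb_to_onehot(rgb, color_dict)
restored = onehot_to_rgb(onehot, color_dict)
print(np.array_equal(restored, rgb))  # True
```

The round trip only holds when every pixel's color is present in `color_dict`; a pixel with an unlisted color gets an all-zero vector, which `argmax` then maps back to class 0.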
