Convert RGB Image to Index Image
I want to convert a 3-channel RGB image to an index image using Python. The index image is used as training labels for a deep network for semantic segmentation. By index image I mean an image with one channel in which each pixel is a class index, starting at zero. The index image must of course have the same height and width as the input. The conversion is based on the following mapping in a Python dict:
color2index = { (255, 255, 255) : 0, (0, 0, 255) : 1, (0, 255, 255) : 2, (0, 255, 0) : 3, (255, 255, 0) : 4, (255, 0, 0) : 5 }
I have implemented a naive function:
import numpy as np

def im2index(im):
    """
    turn a 3 channel RGB image to 1 channel index image
    """
    assert len(im.shape) == 3
    height, width, ch = im.shape
    assert ch == 3
    m_label = np.zeros((height, width, 1), dtype=np.uint8)
    for w in range(width):
        for h in range(height):
            # cv2.imread() returns pixels in BGR order
            b, g, r = im[h, w, :]
            m_label[h, w, :] = color2index[(r, g, b)]
    return m_label
The input im is a NumPy array created by cv2.imread(). However, this code is really slow. Since im is a NumPy array, I first tried a NumPy ufunc with something like this:
RGB2index = np.frompyfunc(lambda x: color2index[tuple(x)], 1, 1)
indices = RGB2index(im)
But it turns out that a ufunc only takes one element at a time. I was unable to pass the function all three values of a pixel (the RGB triple) at once.
So are there any other ways to optimize this? The mapping does not have to be a dict if a more efficient data structure exists. I noticed that looking up a key in a Python dict is not the expensive part; converting each pixel from an array to a tuple (which is hashable) is.
PS: One idea I had is to implement a kernel in CUDA, but that would be more difficult.
UPDATE 1: Dan Mashek's answer works great. However, the RGB image first has to be converted to grayscale, which is a problem if two colors in the mapping have the same grayscale value.
I am pasting the working code here. I hope it helps others.
lut = np.ones(256, dtype=np.uint8) * 255
lut[[255,29,179,150,226,76]] = np.arange(6, dtype=np.uint8)
im_out = cv2.LUT(cv2.cvtColor(im, cv2.COLOR_BGR2GRAY), lut)
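The grayscale collision mentioned above can be avoided by looking up the full color instead of its gray value. The following is only a sketch under stated assumptions, not code from any of the answers: the helper names build_packed_lut and im2index_packed are hypothetical, the image is assumed to be a uint8 array in BGR order (as returned by cv2.imread()), and unknown colors are assumed to map to 255. It packs each pixel into a single 24-bit integer and indexes a table with 2**24 entries (about 16 MB as uint8).

```python
import numpy as np

# Mapping from the question: (R, G, B) -> class index.
color2index = {
    (255, 255, 255): 0,
    (0, 0, 255): 1,
    (0, 255, 255): 2,
    (0, 255, 0): 3,
    (255, 255, 0): 4,
    (255, 0, 0): 5,
}

def build_packed_lut(mapping, unknown=255):
    """Build a 2**24-entry table indexed by the packed value R<<16 | G<<8 | B."""
    lut = np.full(1 << 24, unknown, dtype=np.uint8)
    for (r, g, b), idx in mapping.items():
        lut[(r << 16) | (g << 8) | b] = idx
    return lut

def im2index_packed(im_bgr, lut):
    """Convert an H x W x 3 BGR uint8 image to an H x W uint8 index image."""
    im = im_bgr.astype(np.uint32)  # widen so the shifts cannot overflow
    packed = (im[..., 2] << 16) | (im[..., 1] << 8) | im[..., 0]
    return lut[packed]

lut = build_packed_lut(color2index)

# Tiny synthetic image; each pixel is written in BGR order.
demo = np.array([[[255, 255, 255], [255, 0, 0]],
                 [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
labels = im2index_packed(demo, lut)
```

The table is built once and reused for every image, so the per-image work is a few vectorized array operations. Pixels whose color is not in the mapping come out as 255, which makes them easy to detect.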
Have you checked the Pillow library https://python-pillow.org/ ? As far as I remember, it has several classes and methods for handling color. See: https://pillow.readthedocs.io/en/4.0.x/reference/Image.html#PIL.Image.Image.convert
Here's a small utility function to convert images (np.array) to per-pixel labels (indices), which can optionally be one-hot encoded:
def rgb2label(img, color_codes = None, one_hot_encode=False):
    if color_codes is None:
        color_codes = {val:i for i,val in enumerate(set( tuple(v) for m2d in img for v in m2d ))}
    n_labels = len(color_codes)
    result = np.ndarray(shape=img.shape[:2], dtype=int)
    result[:,:] = -1
    for rgb, idx in color_codes.items():
        result[(img==rgb).all(2)] = idx
    if one_hot_encode:
        one_hot_labels = np.zeros((img.shape[0],img.shape[1],n_labels))
        # one-hot encoding
        for c in range(n_labels):
            one_hot_labels[: , : , c ] = (result == c ).astype(int)
        result = one_hot_labels
    return result, color_codes
img = cv2.imread("input_rgb_for_labels.png")
img_labels, color_codes = rgb2label(img)
print(color_codes) # e.g. to see what the codebook is
img1 = cv2.imread("another_rgb_for_labels.png")
img1_labels, _ = rgb2label(img1, color_codes) # use the same codebook
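If color_codes is None, the function computes the color codebook from the image and returns it along with the labels. Pixels whose color is not in the codebook keep the label -1.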
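As a side note, the one-hot step in rgb2label() loops over the labels. A possible alternative, sketched here with a hypothetical helper named labels_to_one_hot, is to index an identity matrix with the label image. This assumes every label lies in the range 0 to n_labels - 1, so unlike the loop above it rejects images that still contain -1 entries instead of producing all-zero vectors for them.

```python
import numpy as np

def labels_to_one_hot(labels, n_labels):
    """Turn an H x W integer label image into an H x W x n_labels one-hot array."""
    if labels.min() < 0 or labels.max() >= n_labels:
        raise ValueError("labels must lie in [0, n_labels)")
    # Row i of the identity matrix is the one-hot vector for label i.
    return np.eye(n_labels, dtype=np.uint8)[labels]

labels = np.array([[0, 2],
                   [1, 1]])
one_hot = labels_to_one_hot(labels, 3)
```

Here one_hot has shape (2, 2, 3), and one_hot[h, w] is the one-hot vector for the label at pixel (h, w).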
I implemented a naive function: ... First I tried a NumPy ufunc with something like this: ...
I suggest using an even more naive function that converts just one pixel:
def rgb2index(rgb):
    """
    turn a 3 channel RGB color to 1 channel index color
    """
    return color2index[tuple(rgb)]
Then we can use a standard NumPy routine to apply it to every pixel; a ufunc is not needed:
np.apply_along_axis(rgb2index, 2, im)
numpy.apply_along_axis() is used here to apply our function rgb2index() to the RGB slices along the last of the three axes (0, 1, 2) for the entire image im.
We could even dispense with the function and just write:
np.apply_along_axis(lambda rgb: color2index[tuple(rgb)], 2, im)
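Two caveats apply to this approach. np.apply_along_axis() still calls the Python function once per pixel, so it is shorter to write but not necessarily faster than the explicit loops. Also, an image loaded with cv2.imread() is in BGR order, so tuple(rgb) would need to be reversed before the lookup. One way to limit the Python-level work, sketched below under those assumptions with a hypothetical helper named im2index_unique, is to find the distinct colors with np.unique() and perform the dict lookup only once per distinct color.

```python
import numpy as np

# Mapping from the question: (R, G, B) -> class index.
color2index = {
    (255, 255, 255): 0,
    (0, 0, 255): 1,
    (0, 255, 255): 2,
    (0, 255, 0): 3,
    (255, 255, 0): 4,
    (255, 0, 0): 5,
}

def im2index_unique(im_bgr, mapping):
    """Convert an H x W x 3 BGR image to an H x W index image.

    Raises KeyError if the image contains a color missing from the mapping.
    """
    h, w, _ = im_bgr.shape
    # Reverse the channel axis (BGR -> RGB) and flatten to a list of pixels.
    rgb = im_bgr[..., ::-1].reshape(-1, 3)
    # colors: distinct RGB rows; inverse: for each pixel, its row in colors.
    colors, inverse = np.unique(rgb, axis=0, return_inverse=True)
    # One dict lookup per distinct color instead of one per pixel.
    codes = np.array([mapping[tuple(int(v) for v in c)] for c in colors],
                     dtype=np.uint8)
    return codes[inverse.reshape(-1)].reshape(h, w)

# Tiny synthetic image; each pixel is written in BGR order.
demo = np.array([[[255, 255, 255], [255, 0, 0]],
                 [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
labels = im2index_unique(demo, color2index)
```

For label images with only a handful of colors, the Python loop runs a handful of times regardless of image size; the sorting inside np.unique() then dominates the cost.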