Finding RGB bounding boxes in an image
I am working with a page segmentation algorithm. Its output is an image in which the pixels of each zone are assigned a unique color. I would like to process that image to find the bounding boxes of the zones. To do so, I need to find all colors, then find all pixels of each color, and then find their bounding box.
Below is a sample image.
Currently I start with the histograms of the R, G, B channels. The histograms tell me where the data is.
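import numpy as np
from PIL import Image
img = Image.open(imgfilename)
img.load()
r, g, b = img.split()
ra, ga, ba = [np.asarray(p, dtype="uint8") for p in (r, g, b)]
# One bin per possible 8-bit value, so the bin index equals the channel value
rhist, edges = np.histogram(ra, bins=256, range=(0, 256))
ghist, edges = np.histogram(ga, bins=256, range=(0, 256))
bhist, edges = np.histogram(ba, bins=256, range=(0, 256))
print(np.nonzero(rhist))
print(np.nonzero(ghist))
print(np.nonzero(bhist))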
Output:
(array([0, 1, 128, 205, 255]),)
(array([0, 20, 128, 186, 255]),)
(array([0, 128, 147, 150, 255]),)
At this point, I am a little stumped. By visual inspection, I have the colors (0,0,0), (1,0,0), (0,20,0), (128,128,128), etc. How should I combine the nonzero outputs into pixel values for np.where()?
I am considering flattening the 3 x rows x columns array to a 2D plane of 24-bit packed RGB values (r << 16 | g << 8 | b) and searching that array. It seems like brute force and inelegant. Is there a better way in NumPy to find the bounding box of a color value?
There is no reason to treat this as an RGB color image; it is just a rendering of a segmentation that someone else produced. You can just as well treat it as a grayscale image, and for these specific colors you don't need to do anything else.
import sys
import numpy
from PIL import Image
img = Image.open(sys.argv[1]).convert('L')
im = numpy.array(img)
colors = set(numpy.unique(im))
colors.remove(255)
for color in colors:
    py, px = numpy.where(im == color)
    print(px.min(), py.min(), px.max(), py.max())
If you cannot rely on convert('L') giving unique gray values (i.e. you are using colors other than the ones shown in this image), you can pack the RGB channels into a single integer per pixel and take the unique values of that:
...
im = numpy.array(img, dtype=int)
packed = im[:,:,0]<<16 | im[:,:,1]<<8 | im[:,:,2]
colors = set(numpy.unique(packed.ravel()))
colors.remove(255<<16 | 255<<8 | 255)
for color in colors:
    py, px = numpy.where(packed == color)
    print(px.min(), py.min(), px.max(), py.max())
I would also recommend removing the small connected components before finding the bounding boxes.
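As a rough sketch of that last suggestion, and assuming SciPy is available, something like the following could drop small connected components from the 2D grayscale array before the boxes are computed. The function name remove_small_components and the min_pixels threshold are made up for illustration; scipy.ndimage.label with its default structure uses 4-connectivity.

```python
import numpy as np
from scipy import ndimage


def remove_small_components(im, min_pixels, background=255):
    """Return a copy of a 2D label image in which every connected
    component smaller than min_pixels is set to the background value.
    (Hypothetical helper, not part of the original answer.)"""
    cleaned = im.copy()
    for color in np.unique(im):
        if color == background:
            continue
        # Label the 4-connected components of this color's mask.
        labeled, count = ndimage.label(im == color)
        # sizes[k] is the pixel count of component k; index 0 is
        # everything that is not this color.
        sizes = np.bincount(labeled.ravel())
        small = np.flatnonzero(sizes < min_pixels)
        small = small[small != 0]
        cleaned[np.isin(labeled, small)] = background
    return cleaned
```

You would then run the bounding box loop on the cleaned array instead of im. A suitable threshold depends on your page resolution.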
EDIT: Putting everything together into a working program, using the image you posted:
from __future__ import division
import numpy as np
import itertools
from PIL import Image
img = np.array(Image.open('test_img.png'))
def bounding_boxes(img):
    # Values that occur in each channel; most combinations will not exist
    r, g, b = [np.unique(img[..., j]) for j in (0, 1, 2)]
    boxes = {}
    for r0, g0, b0 in itertools.product(r, g, b):
        rows, cols = np.where((img[..., 0] == r0) &
                              (img[..., 1] == g0) &
                              (img[..., 2] == b0))
        if len(rows):
            # Stored as (top, bottom, left, right)
            boxes[(r0, g0, b0)] = (np.min(rows), np.max(rows),
                                   np.min(cols), np.max(cols))
    return boxes
In [2]: %timeit bounding_boxes(img)
1 loops, best of 3: 30.3 s per loop
In [3]: bounding_boxes(img)
Out[3]:
{(0, 0, 255): (3011, 3176, 755, 2546),
(0, 128, 0): (10, 2612, 0, 561),
(0, 128, 128): (1929, 1972, 985, 1438),
(0, 255, 0): (10, 166, 562, 868),
(0, 255, 255): (2938, 2938, 680, 682),
(1, 0, 0): (10, 357, 987, 2591),
(128, 0, 128): (417, 1873, 984, 2496),
(205, 186, 150): (11, 56, 869, 1752),
(255, 0, 0): (3214, 3223, 570, 583),
(255, 20, 147): (2020, 2615, 956, 2371),
(255, 255, 0): (3007, 3013, 600, 752),
(255, 255, 255): (0, 3299, 0, 2591)}
Not very fast, even though only a small number of colors are actually tested ...
You can find the bounding box for the color r0, g0, b0 with something along the lines of:
rows, cols = np.where((ra == r0) & (ga == g0) & (ba == b0))
top, bottom = np.min(rows), np.max(rows)
left, right = np.min(cols), np.max(cols)
Instead of iterating over all 2**24 RGB color combinations, you can drastically reduce your search space by using only the Cartesian product of the nonzero histogram bins:
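# np.nonzero returns a tuple of arrays, so take element [0]
for r0, g0, b0 in itertools.product(np.nonzero(rhist)[0],
                                    np.nonzero(ghist)[0],
                                    np.nonzero(bhist)[0]):
    ...  # compute the bounding box for (r0, g0, b0) as above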
You will get some non-existent combinations, which you can filter out by checking that rows and cols are not empty. But in your example, you would reduce the search space from 2**24 combinations to 125.
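If your NumPy is recent enough for np.unique to accept an axis argument (added in NumPy 1.13, so this is an assumption about your setup), a possible variation is to ask directly for the colors that actually occur, which avoids testing non-existent combinations at all. This is only a sketch; the name bounding_boxes_unique is made up, and the result uses the same (top, bottom, left, right) order as above.

```python
import numpy as np


def bounding_boxes_unique(img):
    """Map each (r, g, b) present in img to (top, bottom, left, right).
    (Hypothetical variant, assuming img has shape rows x cols x 3 or 4.)"""
    rgb = img[..., :3]
    # One row per distinct color that actually occurs in the image.
    colors = np.unique(rgb.reshape(-1, 3), axis=0)
    boxes = {}
    for color in colors:
        # True where all three channels match this color.
        rows, cols = np.where((rgb == color).all(axis=-1))
        boxes[tuple(int(c) for c in color)] = (
            int(rows.min()), int(rows.max()),
            int(cols.min()), int(cols.max()))
    return boxes
```

It still does one full pass over the image per color, so I would expect the gain to come only from skipping the empty combinations.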
This is just a solution off the top of my head. You can iterate over the pixels of the image, from the top left to the bottom right, and keep track of the top, bottom, left and right values for each color. For a given color, top will be the first row in which you see that color, bottom will be the last row, left will be the minimum column value for pixels of that color, and right the maximum column value you find.
Then, for each color, you can draw a rectangle from top-left to bottom-right in the color you want.
I don't know if it qualifies as a good bounding box algorithm, but I think it's okay.
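A minimal sketch of that scan, assuming the image is a NumPy array of shape rows x cols x 3 (the function name scan_bounding_boxes is just for illustration):

```python
import numpy as np


def scan_bounding_boxes(img):
    """Single pass over the pixels, tracking (top, bottom, left, right)
    for every color encountered. (Hypothetical helper.)"""
    boxes = {}
    height, width = img.shape[:2]
    for row in range(height):
        for col in range(width):
            color = tuple(int(c) for c in img[row, col, :3])
            if color not in boxes:
                # First sighting: this row is the top of the box.
                boxes[color] = [row, row, col, col]
            else:
                box = boxes[color]
                # Rows are visited in increasing order, so the current
                # row is always the lowest one seen so far.
                box[1] = row
                box[2] = min(box[2], col)
                box[3] = max(box[3], col)
    return {color: tuple(box) for color, box in boxes.items()}
```

Being a pure Python loop over every pixel, this will probably be slow on a large page image, but it only visits each pixel once.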