Minimum character size for OCR

Question

I am planning the camera attributes I need for a computer vision system. I have to define some alphanumeric regions in the image and then read them with OCR using Tesseract and OpenCV. A typical example would be motorway license plate recognition (although in my project speed is not an issue).

To estimate the camera resolution, distance, and lens focal length, I need to know the minimum text height in pixels that still gives a reliable OCR conversion.

Using the thin lens equation, I derived the relationship between my text height in mm and the text height in pixels. By changing the camera distance or focal length, I get different text heights in pixels (from 10 px to 40 px).
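As a rough illustration of that relationship, here is a minimal Python sketch based on the thin lens magnification m = f / (d - f). The function name and all the numbers (80 mm character height, 5 m distance, 5 µm pixel pitch) are hypothetical values chosen for the example, not figures from my actual setup.

```python
def text_height_px(text_height_mm, distance_mm, focal_length_mm, pixel_pitch_mm):
    """Projected text height on the sensor, in pixels (thin lens model)."""
    if distance_mm <= focal_length_mm:
        raise ValueError("object must be farther away than the focal length")
    # Thin lens: 1/f = 1/d_o + 1/d_i  =>  magnification m = d_i / d_o = f / (d_o - f)
    magnification = focal_length_mm / (distance_mm - focal_length_mm)
    # Height on the sensor in mm, converted to pixels via the pixel pitch
    return text_height_mm * magnification / pixel_pitch_mm


if __name__ == "__main__":
    # Hypothetical setup: 80 mm characters, 5 m away, 5 um pixels
    for focal_length in (8, 16, 25, 35):
        height = text_height_px(80, 5000, focal_length, 0.005)
        print(f"f = {focal_length:>2} mm -> {height:.1f} px")
```

The sketch ignores lens distortion, perspective tilt of the text, and defocus, so the result is only a first-order estimate.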

Of course, I would prefer a character height of 40 pixels, but this is also the most expensive solution.
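To see what each target height costs in terms of optics, the same thin lens relation can be inverted to give the focal length needed for a given pixel height at a fixed distance. This is only a sketch: the function name and the example values (80 mm characters, 5 m distance, 5 µm pixel pitch) are assumptions made for illustration.

```python
def required_focal_length_mm(target_px, text_height_mm, distance_mm, pixel_pitch_mm):
    """Focal length that projects the text onto target_px pixels (thin lens model)."""
    # Required height of the text image on the sensor, in mm
    image_height_mm = target_px * pixel_pitch_mm
    # From image_height = text_height * f / (d - f), solved for f
    return image_height_mm * distance_mm / (text_height_mm + image_height_mm)


if __name__ == "__main__":
    # Hypothetical setup: 80 mm characters, 5 m away, 5 um pixels
    for target in (10, 15, 25, 40):
        f = required_focal_length_mm(target, 80, 5000, 0.005)
        print(f"{target:>2} px -> f = {f:.2f} mm")
```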

For this reason, I would like to know whether the OpenCV and Tesseract libraries impose any restrictions on the minimum text size for reliable recognition. I have read that various commercial OCR products recommend character heights between 25 and 40 pixels. Does this range apply to Tesseract / OpenCV too?

I did a couple of tests with a smaller character size (15 px) and the OCR worked very well, but of course that was under ideal conditions of light, contrast and background color.

opencv ocr tesseract

gingo Dec 15 '14 at 17:34

1 answer

aviimaging · Accepted Answer · 2014-12-16T00:12:27+0000

Most automatic license plate recognition (ALPR) algorithms use edge information and classification to identify a specific character (alphabetic or other language characters). With this in mind, edges should have a clearly defined thickness and sufficient contrast.

As the OP mentioned for commercial OCR, commercial ALPR algorithms also recommend a minimum character height, typically at least 20 px. This ensures that the character edges span a minimum number of pixels for most of the standard fonts used on license plates. Here's an example of a license plate with a character height of about 25 pixels, where the edges are at least 3 pixels wide. Having well-defined edges will help most ALPR algorithms. Oversharpening does not necessarily improve ALPR performance, because some blurring is usually applied anyway to remove noise before edges and connected components are detected.
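One rough way to check the edge-width guideline on a binarized character crop is to measure the horizontal run lengths of foreground pixels and take the most common one as the stroke width. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be a list of rows of 0/1 values (1 = character pixel), and it is not how any particular ALPR library works.

```python
from collections import Counter


def horizontal_runs(rows):
    """Lengths of all horizontal runs of foreground (truthy) pixels."""
    runs = []
    for row in rows:
        run = 0
        for pixel in row:
            if pixel:
                run += 1
            elif run:
                runs.append(run)
                run = 0
        if run:  # a run that reaches the right border
            runs.append(run)
    return runs


def typical_stroke_width(rows):
    """Most common horizontal run length, or 0 if there is no foreground."""
    runs = horizontal_runs(rows)
    if not runs:
        return 0
    return Counter(runs).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical "L" glyph: 3 px wide vertical bar, 3 px tall base
    glyph = [[1, 1, 1, 0, 0, 0, 0, 0]] * 8 + [[1] * 8] * 3
    print(typical_stroke_width(glyph))
```

Horizontal runs only capture vertical strokes well, so a real check would likely combine several scan directions.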

License plate image with standard font

Higher contrast (better lighting conditions) and an appropriate image resolution (character height of at least 20 pixels, but not excessively high) will help improve the speed of the ALPR algorithm.
