OCR: matching a word's position to a field on a credit card

I am developing an OCR system for credit card scanning.

After scanning the image, I get a list of words with their positions. Any tips or suggestions on the best approach for determining which words correspond to each field of the credit card (number, date, name)?

For example:

    position = 96.00 491.00
    text = CARDHOLDER


Thank you in advance



2 answers


Your first problem is that most OCR engines are not optimized for small amounts of text that take up most of the "page" (or card image, in your case) in spatially separated chunks. They expect lines or pages of text from a scanned book or newspaper, so they are unlikely to handle this well out of the box when analyzing the image.

Since the font is fairly uniform, they are likely to recognize the characters well, but the layout will confuse the page segmentation algorithm, so the text you get back may not be in the correct order. For example, the "1234" of the card number and the smaller "1234" below it make up a single column of text, as will the second two sets of four digits and the expiration date.

For specialized cases where you know the layout in advance, you really want to develop your own page segmentation algorithm to break the image into zones, e.g. card number, cardholder name, start and expiry dates. It shouldn't be too hard, because the layout of these components is standardized on credit cards. Assuming good preprocessing and binarization, you can basically take a horizontal projection histogram and split the image at its valleys.

Then extract each zone as a separate image, each containing only one line of text, and feed it to the OCR.
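
Here is a minimal sketch of that splitting step, assuming a preprocessed binary image held in a NumPy array with text pixels non-zero; the function name and the min_gap parameter are illustrative, not part of any library:

    import numpy as np

    def split_into_zones(binary_img, min_gap=5):
        """Split a binarized card image into horizontal text zones.

        binary_img: 2-D array, text pixels non-zero, background zero.
        min_gap: number of consecutive empty rows treated as a valley.
        Returns a list of (top, bottom) row indices, one per zone.
        """
        # Horizontal projection: count text pixels in each row.
        row_counts = (binary_img > 0).sum(axis=1)
        zones, start, gap = [], None, 0
        for y, count in enumerate(row_counts):
            if count > 0:
                if start is None:
                    start = y          # a new text zone begins here
                gap = 0
            elif start is not None:
                gap += 1
                if gap >= min_gap:     # valley found: close the zone
                    zones.append((start, y - gap + 1))
                    start = None
        if start is not None:
            zones.append((start, len(row_counts)))
        return zones

    # Each zone can then be cropped and fed to the OCR separately:
    # for top, bottom in split_into_zones(binary):
    #     line_img = binary[top:bottom, :]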

Alternatively (a quick and dirty approach):

  • Instruct the OCR engine that what you want to recognize is a single column (i.e. prevent it from trying to work out the page layout for itself). With Tesseract you can do this using the -psm (page segmentation mode) parameter, probably set to 6 (but experiment and see which value gives the best results).
  • Ask Tesseract for hOCR output, which you can enable via a config file. The hOCR format includes bounding boxes for the lines it finds, given relative to the whole image.
  • Write an algorithm that compares the bounding boxes in the hOCR output to where you know each card component should be (looking for some percentage of overlap; it won't match exactly, for obvious reasons). A sketch of this matching step follows the list.
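
Below is a hedged sketch of the last two steps, assuming Tesseract was invoked as something like tesseract card.png card -psm 6 hocr (producing card.hocr; older versions write card.html). The field zone coordinates and file names are placeholders, not real values, and would need to be measured from the standard card layout:

    import re

    # Hypothetical field zones, as (x0, y0, x1, y1) in image pixels.
    FIELD_ZONES = {
        "number": (40, 180, 600, 240),
        "date":   (260, 300, 430, 340),
        "name":   (40, 380, 400, 440),
    }

    def overlap_ratio(box, zone):
        """Fraction of box's area that falls inside zone."""
        ix0, iy0 = max(box[0], zone[0]), max(box[1], zone[1])
        ix1, iy1 = min(box[2], zone[2]), min(box[3], zone[3])
        if ix0 >= ix1 or iy0 >= iy1:
            return 0.0
        inter = (ix1 - ix0) * (iy1 - iy0)
        area = (box[2] - box[0]) * (box[3] - box[1])
        return inter / float(area)

    def match_fields(hocr_path, threshold=0.5):
        """Assign each hOCR text line to the card field it overlaps most."""
        with open(hocr_path, encoding="utf-8") as f:
            hocr = f.read()
        matches = {}
        # Each ocr_line span carries its box in the title attribute, e.g.
        #   <span class='ocr_line' ... title="bbox 96 491 310 520; ...">
        # (a real HTML parser would be more robust than this regex).
        pattern = r"class=.ocr_line.[^>]*?title=.bbox (\d+) (\d+) (\d+) (\d+)"
        for m in re.finditer(pattern, hocr):
            box = tuple(int(v) for v in m.groups())
            best = max(FIELD_ZONES,
                       key=lambda f: overlap_ratio(box, FIELD_ZONES[f]))
            if overlap_ratio(box, FIELD_ZONES[best]) >= threshold:
                matches.setdefault(best, []).append(box)
        return matches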


In addition to the good advice provided by Mikesname, you can greatly improve the recognition result, no matter which OCR engine you are using, if you use image processing to convert the image to bitonal (pure black and white) first, as in the attached black-and-white copy of your image.
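
For example, a minimal binarization sketch using OpenCV's Otsu thresholding (the file names are placeholders; adaptive thresholding may work better on cards with gradient backgrounds):

    import cv2

    # Read the card image in grayscale; "card.png" is a placeholder name.
    img = cv2.imread("card.png", cv2.IMREAD_GRAYSCALE)
    img = cv2.GaussianBlur(img, (3, 3), 0)   # mild denoise before thresholding
    # Otsu picks the threshold automatically; for uneven lighting, try
    # cv2.adaptiveThreshold instead.
    _, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite("card_bw.png", bw)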


