OCR of text stamped on a metal plate

I am working on an OCR project that aims to read a stamped serial number from a metal plate:

[Image: example of stamped text]

I use OpenCV to prepare the image and Tesseract for the OCR itself. The ideal process would be:

  • In the image of the whole plate, crop to the region where the serial number is usually located.
  • Prepare cropped image for OCR.
  • Apply OCR.
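
The cropping step could look something like the sketch below. It assumes the serial number sits in roughly the same place on every plate, so the region can be described as fractions of the image size. The name `crop_fraction` and the values in `SERIAL_ROI` are made up for illustration and would have to be measured on real photos.

```python
import numpy as np

# Hypothetical region of interest as fractions of the image size:
# (left, top, right, bottom). Placeholder values, not measured ones.
SERIAL_ROI = (0.5, 0.5, 1.0, 0.75)


def crop_fraction(image: np.ndarray, roi=SERIAL_ROI) -> np.ndarray:
    """Return the sub-image covered by a fractional (left, top, right, bottom) box."""
    height, width = image.shape[:2]
    left, top, right, bottom = roi
    # Convert fractions to pixel coordinates.
    x0, x1 = int(left * width), int(right * width)
    y0, y1 = int(top * height), int(bottom * height)
    # NumPy slicing returns a view, so no pixel data is copied here.
    return image[y0:y1, x0:x1]
```

Fractions rather than pixel coordinates keep the crop independent of the photo resolution, but this only works if the plate is framed consistently in every shot.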

My current process:

  • Crop to the serial number manually.
  • Convert to grayscale.
  • Sharpen.
  • Use Canny edge detection.
  • Launch Tesseract OCR.

However, I have had very limited success. My main questions:

  • What preprocessing gives the best OCR results? Is edge detection a good start?
  • Can I use the dot-matrix font to my advantage?
  • Can I use the "color" of the text (as opposed to gray metal or black / white labels) to my advantage?