Tesseract Train for Specific Words - Possible?

Question

I want to use Tesseract to extract 10-20 keywords from a document. The document will contain only English characters and words. I'm interested in something like "Age: 23", where Age is the keyword and 23 is the value I want to extract.

The first approach that comes to mind is to OCR the entire page into text and then search the recognized text for the keywords. But from a Tesseract training point of view, is there a better approach, given that I know the keywords in advance, that could lead to better accuracy?
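
A minimal sketch of that first approach, assuming the page has already been OCR'd into a plain string. The keyword list, the sample text, and the function name extract_values are hypothetical and only for illustration:

```python
import re

# Hypothetical keyword list; the real one would hold the 10-20 keywords of interest.
KEYWORDS = ["Age", "Name", "Height"]


def extract_values(ocr_text, keywords=KEYWORDS):
    """Return {keyword: value} for text of the form 'Keyword: value'."""
    results = {}
    for keyword in keywords:
        # Allow stray spaces or tabs around the colon, since OCR often inserts
        # them, and keep the value on the same line as the keyword.
        pattern = re.compile(
            r"\b" + re.escape(keyword) + r"[ \t]*:[ \t]*([^\n]+)",
            re.IGNORECASE,
        )
        match = pattern.search(ocr_text)
        if match:
            results[keyword] = match.group(1).strip()
    return results


# Made-up sample standing in for recognized text.
sample = "Name: John Smith\nAge : 23\nCity: Pune\n"
print(extract_values(sample))
```

Only the first occurrence of each keyword is kept, and a keyword with nothing after the colon is skipped rather than picking up the next line.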

I am more or less aware of the limitations of Tesseract OCR and am trying to get the most out of it within those constraints. Thanks for all your recommendations.

ocr tesseract

zolio 07 Sep 13:58

1 answer

nguyenq · Answer 1 · 2013-09-07T15:29:25+0000

Try the "bazaar" config in Tesseract, which enables matching against user-supplied words and patterns.
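
A rough sketch of how that could be wired up from Python, with several assumptions. It uses the --user-words and --user-patterns command-line options instead of the bazaar config itself; as far as I know, newer Tesseract releases accept these, while the bazaar config looks for eng.user-words and eng.user-patterns in the tessdata directory. The file names, the helper build_tesseract_command, and the sample keywords are hypothetical. The \d\d pattern (two digits) follows the user-patterns syntax as I understand it, so check it against your Tesseract version. The hints are also reported to have less effect with the LSTM engine than with the legacy one.

```python
import tempfile
from pathlib import Path


def build_tesseract_command(image_path, output_base, keywords, patterns, work_dir):
    """Write the hint files and return the tesseract command as an argument list."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # One entry per line in each hint file.
    words_file = work_dir / "keywords.user-words"
    patterns_file = work_dir / "keywords.user-patterns"
    words_file.write_text("\n".join(keywords) + "\n", encoding="utf-8")
    patterns_file.write_text("\n".join(patterns) + "\n", encoding="utf-8")

    return [
        "tesseract", str(image_path), str(output_base),
        "--user-words", str(words_file),
        "--user-patterns", str(patterns_file),
    ]


# Hypothetical inputs: page.png is a placeholder image name.
cmd = build_tesseract_command(
    "page.png", "page_out",
    keywords=["Age:", "Name:", "Height:"],
    patterns=[r"\d\d"],
    work_dir=tempfile.mkdtemp(),
)
print(" ".join(cmd))
# To actually run it (requires tesseract on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

The recognized text written to page_out.txt would still need to be searched for the keywords afterwards; the word and pattern lists only bias recognition, they do not extract the values.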
