Tesseract OCR: Parsing Table Cells

Question

I am using Tesseract-OCR v4.0.0 (alpha?) from the command line to extract text from a PNG image of a table.

I want Tesseract to read everything in one cell before moving on to the next cell. I don't want it to continue to the next word on the same line when that word belongs to a different cell.

Expected:

. . . John Smith 07 March,2017 Chicago Milwaukee Detroit Pacific Ocean . . .

Actual:

. . . John Smith 07 March,2017 Chicago Pacific Ocean Milwaukee Detroit . . .

I tried:

Changing the page segmentation mode with the -psm flag, using every value from 0 to 13. Most modes give nearly the same result with slight differences, and the rest produce unreadable output.
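For reference, the sweep over segmentation modes described above can be sketched roughly as follows. This is a minimal sketch, not the exact commands I ran: the helper names `build_psm_commands` and `run_all` and the file name `table.png` are made up for illustration, and the flag spelling is left as a parameter because I used `-psm` while other builds may expect `--psm`.

```python
import subprocess


def build_psm_commands(image_path, modes=range(0, 14), psm_flag="--psm"):
    """Return one tesseract command line per page segmentation mode.

    `psm_flag` is an assumption: pass "-psm" if your build expects the
    single-dash spelling used in the question.
    """
    # "stdout" as the output base asks tesseract to print the text
    # instead of writing it to a file.
    return [
        ["tesseract", image_path, "stdout", psm_flag, str(mode)]
        for mode in modes
    ]


def run_all(image_path, psm_flag="--psm"):
    """Run every mode and collect the recognized text, keyed by mode number."""
    results = {}
    for cmd in build_psm_commands(image_path, psm_flag=psm_flag):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # Some modes may fail or print nothing; keep whatever came back.
        results[int(cmd[-1])] = proc.stdout
    return results
```

Calling `run_all("table.png")` with Tesseract on the PATH returns a dictionary of mode number to recognized text, which makes it easy to compare the reading order each mode produces.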

Is there any other way to configure Tesseract to read the entire content of one cell before moving on to the next? If not, are there any workarounds?

ocr tesseract

James A, June 12 '17 at 6:43

No one has answered this question yet
