Error running tesseract OCR in Linux for .jpg

I have successfully installed Tesseract on my Amazon EC2 instance following this tutorial. It works fine for TIFF images, but when I try to run it on a JPG I get:

Tesseract Open Source OCR Engine v3.02.02 with Leptonica
Error in pixReadStreamJpeg: function not present
Error in pixReadStream: jpeg: no pix returned
Error in pixRead: pix not read
Unsupported image type.

      

What else do I need to install / do?



2 answers


I had this problem too. It most likely means your Leptonica build lacks JPEG support. Try building and installing Leptonica again:

$ tar -xvf leptonica-xx.tar.gz
$ cd leptonica-xx
$ ./configure
$ make
$ sudo make install

      

Once you're done, you can check if all libs are installed correctly:

$ tesseract -v

      



It should then list the four image libraries Leptonica was built against. If libjpeg is missing from the list, JPG support is still not compiled in:

tesseract 3.02.02
leptonica-1.71
libjpeg 6b : libpng 1.2.49 : libtiff 3.9.4 : zlib 1.2.3
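
To make that check less error-prone, here is a minimal sketch of a shell helper that reads the version output and prints any of the four libraries that are not mentioned. The function name missing_image_libs is made up for illustration, and the library names are simply the ones shown in the output above:

```shell
# Hypothetical helper (not part of tesseract): reads `tesseract -v`
# output on stdin and prints each expected image library that is
# not mentioned anywhere in it.
missing_image_libs() {
    version_text=$(cat)
    for lib in libjpeg libpng libtiff zlib; do
        case "$version_text" in
            *"$lib"*) ;;                 # library is listed, nothing to report
            *) printf '%s\n' "$lib" ;;   # library is absent from the output
        esac
    done
}
```

It could be used as `tesseract -v 2>&1 | missing_image_libs`. The `2>&1` is there in case the version text is written to stderr rather than stdout. No output means all four names were found.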

      

Cheers :)



I had the same problem. I had to manually install all the image libraries and then reinstall Leptonica.

Install these first:

sudo apt-get install libjpeg-dev libpng-dev libtiff4-dev

      



Then rebuild and reinstall Leptonica from its source directory:

./configure && make && sudo make install
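
Before running the build, it may help to confirm that the development headers are actually present. The following is only a sketch: the function name missing_image_headers is hypothetical, and the header names (jpeglib.h, png.h, tiff.h) and the default /usr/include location are assumptions about a typical Linux layout, not something Leptonica's build documents here:

```shell
# Hypothetical helper: prints each header from the list that cannot be
# found anywhere under the given include directory (default /usr/include).
missing_image_headers() {
    include_dir=${1:-/usr/include}
    for header in jpeglib.h png.h tiff.h; do
        # find searches subdirectories too, since some distributions
        # place headers in architecture-specific folders.
        if [ -z "$(find "$include_dir" -name "$header" 2>/dev/null | head -n 1)" ]; then
            printf '%s\n' "$header"
        fi
    done
}
```

Calling `missing_image_headers` with no argument checks /usr/include. If it prints a header name, the matching -dev package is probably still missing.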

      
