Error running tesseract OCR in Linux for .jpg
I have successfully installed Tesseract on my Amazon EC2 instance following this tutorial. It works fine for TIFF images, but when I try to run it on a JPG I get:
Tesseract Open Source OCR Engine v3.02.02 with Leptonica
Error in pixReadStreamJpeg: function not present
Error in pixReadStream: jpeg: no pix returned
Error in pixRead: pix not read
Unsupported image type.
What else do I need to install / do?
+3
Ray
2 answers
I had this problem too. The "function not present" error usually means Leptonica was built without JPEG support, so try building and installing Leptonica again:
$ tar -xvf leptonica-xx.tar.gz
$ cd leptonica-xx
$ ./configure
$ make
$ sudo make install
Once you're done, you can check whether all the libraries were picked up correctly:
$ tesseract -v
It should list the four libraries that Leptonica was built with:
tesseract 3.02.02
leptonica-1.71
libjpeg 6b : libpng 1.2.49 : libtiff 3.9.4 : zlib 1.2.3
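If you want to script that check, here is a minimal sketch. The helper name has_jpeg_support is made up for illustration, and it assumes the version output names libjpeg the way the listing above does.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the version text on stdin mentions libjpeg.
has_jpeg_support() {
    grep -q 'libjpeg'
}

# Only run the live check when tesseract is actually on PATH.
if command -v tesseract >/dev/null 2>&1; then
    # 2>&1 because some builds print the version info on stderr.
    if tesseract -v 2>&1 | has_jpeg_support; then
        echo "libjpeg is listed: JPEG input should work"
    else
        echo "libjpeg is not listed: rebuild Leptonica after installing the JPEG dev package"
    fi
fi
```

If libjpeg is missing from the list, Leptonica was most likely configured before the JPEG headers were installed.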
Cheers :)
+4
mmw5610
I had the same problem. I had to manually install all the image libraries and then reinstall Leptonica.
Install these first:
sudo apt-get install libjpeg-dev libpng-dev libtiff4-dev
Then reinstall Leptonica from its source directory:
./configure && make && sudo make install
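Before running make, you can confirm that configure actually found the image libraries. This is only a sketch: it assumes Leptonica's configure writes its results to config_auto.h using macros such as HAVE_LIBJPEG, and check_image_libs is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: report which image-library macros a
# configure-generated header defines.
# Assumption: the header contains lines like "#define HAVE_LIBJPEG 1".
check_image_libs() {
    header=$1
    status=0
    for lib in JPEG PNG TIFF; do
        if grep -Eq "^#define[[:space:]]+HAVE_LIB${lib}[[:space:]]+1" "$header"; then
            echo "HAVE_LIB${lib}: yes"
        else
            echo "HAVE_LIB${lib}: no"
            status=1
        fi
    done
    return $status
}

# Run only when the header exists, i.e. after ./configure in the source tree.
if [ -f config_auto.h ]; then
    check_image_libs config_auto.h ||
        echo "Install the missing -dev packages and rerun ./configure"
fi
```

Any line that says "no" means the matching -dev package was not found at configure time, so the resulting library will not read that format.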
+1
iLoveCamelCase