How to make tesseract not use TESSDATA_PREFIX

I had tesseract installed on my computer and it defined the TESSDATA_PREFIX environment variable. After completely uninstalling tesseract, I am trying to use the tesseract API this way:

if (myOCR->Init("C:/Projects/project/Release/tessdata/", "rus")) {
    fprintf(stderr, "Could not initialize tesseract.\n");
    exit(1);
}

      

and I get

Error opening data file C:\Program Files (x86)\Tesseract-OCR\tessdata/rus.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'rus'
Tesseract couldn't load any languages!
Could not initialize tesseract.

      

Typing TESSDATA_PREFIX in cmd tells me there is no such variable, but tesseract still remembers the old location (I don't know how). So how can I make tesseract look for the trained data in a specific folder? Thanks.

+3




3 answers


This seems useful: Tesseract - change the location of the language file

Judging by the answer in that thread, tesseract checks the TESSDATA_PREFIX environment variable and, if it is not set, falls back to a fixed location.

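The easiest way to fix this is to run "cmd" and set the variable there. As the error message says, it must point to the parent directory of "tessdata", and the value should not be quoted, because cmd stores the quotes as part of the value:



c:\Users\alex> set TESSDATA_PREFIX=C:\Projects\project\Release
c:\Users\alex> cd MyOCRProgDir
c:\Users\alex\MyOCRProgDir> MyProg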
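
If launching the program from a prepared cmd session is inconvenient, the variable could also be set from inside the program before Init is called. The following is only a sketch: the helper names current_tessdata_prefix and set_tessdata_prefix are made up for this example, and it assumes that the Tesseract library reads the environment through the same C runtime as the application, which may not hold if the DLL was built against a different runtime.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns TESSDATA_PREFIX as seen by this process,
// or an empty string if the variable is not set.
std::string current_tessdata_prefix() {
    const char* value = std::getenv("TESSDATA_PREFIX");
    return value ? std::string(value) : std::string();
}

// Hypothetical helper: sets TESSDATA_PREFIX for this process only.
// parent_dir is the directory that contains the "tessdata" folder.
// Returns true on success.
bool set_tessdata_prefix(const std::string& parent_dir) {
#ifdef _WIN32
    // Microsoft C runtime: updates the environment of the current process.
    return _putenv_s("TESSDATA_PREFIX", parent_dir.c_str()) == 0;
#else
    // POSIX equivalent; the last argument (1) overwrites an existing value.
    return setenv("TESSDATA_PREFIX", parent_dir.c_str(), 1) == 0;
#endif
}
```

Calling set_tessdata_prefix("C:/Projects/project/Release") at the start of main and then printing current_tessdata_prefix() shows what the process itself sees, which can differ from what a separate cmd window reports.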

      

Hope it helps!

+3




I had the same problem. All I did was copy the tessdata folder to the directory my application runs from.

Note: after that, be sure to set the tessdata "Copy to Output Directory" property to "Copy Always". This solves the problem.



See this YouTube video for a better demonstration. Hope this helps :)

http://www.youtube.com/watch?v=RqvvXJXuRYY

+1




I had the same problem with the training data. Instead of avoiding TESSDATA_PREFIX, I found a workaround that worked for me.

My machine is 64-bit and I am building a 32-bit binary with VS2012.

Set the environment variable. TESSDATA_PREFIX: C:\Program Files (x86)\Tesseract-OCR

Here "Tesseract-OCR" is the parent directory of the "tessdata" folder.

Edit the PATH variable. PATH: C:\tess\lib\lib;

Here "C:\tess\lib\lib" is where the lib and dll files are located: liblept168.dll, liblept168.lib, etc.

Start a new Win32 console application and set the following options. C/C++ -> General -> Additional Include Directories: C:\tess\include\include

Here "C:\tess\include\include" is the parent directory of the "tesseract" and "leptonica" folders where the include files are.

Linker -> General -> Additional Library Directories: C:\tess\lib\lib

Linker -> Input -> Additional Dependencies: liblept168.lib libtesseract302.lib (add them to the list)

C/C++ -> Preprocessor -> Preprocessor Definitions: _CRT_SECURE_NO_WARNINGS (add this to the list)

Copy the two DLLs (the ones corresponding to the .lib files) into the Debug and Release folders (not the ones inside the root directory).

Copy the tessdata folder (from inside the Tesseract installation) to the same locations.

Hope this works for you.

0

