I am using Tesseract as the OCR engine for my ANPR application. I have trained Tesseract 3.01 on the number plate font, but I need to know:
- Which files should be included in the tessdata folder?
- Should I use the same tessdata folder where Tesseract 3.01 is installed?
- I trained with Tesseract 3.01 but I am using tessnet2 in my code. Will that be a problem?
Below is the code I tried, but execution keeps exiting at the DoOCR() call:
List<tessnet2.Word> ocrText = new List<tessnet2.Word>();
tessnet2.Tesseract ocr = new tessnet2.Tesseract();
ocr.Init(@"C:\Program Files (x86)\Tesseract-OCR\tessdata", "eng", true);
ocrText = ocr.DoOCR(bmpGrayScale, new Rectangle(rect.X, rect.Y, rect.Width, rect.Height));
foreach (tessnet2.Word word in ocrText)
{
    Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
}
Does anyone have an idea as to what's wrong?
tessnet2 is built on Tesseract 2.04, so it cannot load language data trained with 3.01:
"3.01 is not backwards compatible with 2.04. The data files are different."
For a .NET library compatible with 3.01, look at the project at http://code.google.com/p/tesseractdotnet/ or https://github.com/charlesw/tesseract-ocr-dotnet.
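Since the two versions use different data files, it can help to check which format a tessdata folder actually holds before pointing a wrapper at it. The following is only a rough sketch, not part of either library: `TessdataCheck` and `DetectFormat` are hypothetical names, and the list of 2.04 file suffixes is an assumption based on the stock English data set for that version. It relies on the fact that 3.x packs each language into a single `<lang>.traineddata` file.

```csharp
using System;
using System.IO;
using System.Linq;

static class TessdataCheck
{
    // Assumed per-language file suffixes of the Tesseract 2.04 data set.
    static readonly string[] LegacySuffixes =
    {
        "DangAmbigs", "freq-dawg", "inttemp", "normproto",
        "pffmtable", "unicharset", "user-words", "word-dawg"
    };

    // Returns "3.x", "2.04", or "unknown" for the given folder and language code.
    public static string DetectFormat(string tessdataDir, string lang)
    {
        // Tesseract 3.x ships one combined file per language.
        if (File.Exists(Path.Combine(tessdataDir, lang + ".traineddata")))
            return "3.x";

        // Otherwise, look for the full set of separate 2.04 files.
        bool allLegacy = LegacySuffixes.All(
            s => File.Exists(Path.Combine(tessdataDir, lang + "." + s)));
        return allLegacy ? "2.04" : "unknown";
    }

    static void Main(string[] args)
    {
        // Default path is the one from the question; pass another folder as the first argument.
        string dir = args.Length > 0
            ? args[0]
            : @"C:\Program Files (x86)\Tesseract-OCR\tessdata";
        Console.WriteLine("{0}: {1}", dir, DetectFormat(dir, "eng"));
    }
}
```

If this reports "3.x" for the folder passed to `ocr.Init(...)`, that folder is presumably not usable with tessnet2, and either 2.04 data or one of the 3.01-compatible wrappers above would be needed.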