I'm attempting to use Tesseract-OCR-iOS in a new Swift 3.0 project. I'm using Xcode Version 8.1 (8B62). CocoaPods is version 1.1.1.
When I attempt to use tesseract.recognize(), my app crashes and I get the following output in the console:
actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 53
I found this post, which makes it sound like I'm using the wrong version of traineddata. I downloaded tessdata from the tesseract-ocr/tessdata repo, so I'm baffled as to why I'd have a mismatch on the version numbers.
Any suggestions on how to get Tesseract working would be greatly appreciated. Below is additional information about my setup.
Here's what my Podfile looks like:
# Uncomment the next line to define a global platform for your project
platform :ios, '9.0'

target 'TesseractDemo' do
  # Comment the next line if you're not using Swift and don't want to use dynamic frameworks
  use_frameworks!

  # Pods for TesseractDemo
  pod 'TesseractOCRiOS', '4.0.0'
end
I've dragged a tessdata folder containing eng.traineddata into the root directory of my project outside of Xcode and dragged a reference from Finder to Xcode's Project Navigator.
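For anyone checking the same setup: a minimal sketch, using only Foundation, that confirms the language file actually ended up at tessdata/<language>.traineddata under a given resource directory. The helper name trainedDataURL and its parameters are hypothetical, not part of Tesseract-OCR-iOS. It only tells you the file is present, not that its version matches the library.

```swift
import Foundation

// Hypothetical helper (not part of Tesseract-OCR-iOS): returns the URL of
// <resourceRoot>/tessdata/<language>.traineddata if that file exists.
func trainedDataURL(language: String, resourceRoot: URL) -> URL? {
    let url = resourceRoot
        .appendingPathComponent("tessdata", isDirectory: true)
        .appendingPathComponent("\(language).traineddata")
    // fileExists(atPath:) also returns false when the tessdata folder is missing.
    return FileManager.default.fileExists(atPath: url.path) ? url : nil
}
```

In an app you would probably pass Bundle.main.resourceURL as resourceRoot and log a message when the result is nil, before creating G8Tesseract.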
Everything works fine up to this point. No compiler errors, linker whining, etc. In a UIViewController I'm importing TesseractOCR and calling it like so:
// MARK: - OCR Methods
func scanImage(image: UIImage) {
    if let tesseract = G8Tesseract(language: "eng") {
        tesseract.delegate = self
        tesseract.image = imageToScan?.g8_blackAndWhite()
        tesseract.recognize()
        textView.text = tesseract.recognizedText
    }
}
Update: I found a link to a repo of traineddata files for version 4.0. I nuked my old eng.traineddata file and replaced it with the one from the 4.0 repo. I get the same error referencing the same line.
I had the same problem yesterday. I think the problem is with the dictionary (the eng.traineddata file): I just swapped the one from GitHub for the one from "Lyndsey Scott's brilliant Tesseract tutorial on Ray Wenderlich" (mentioned elsewhere in this thread), and it works very well. I'm on Xcode 9.4.1, and Tesseract handles Lyndsey's file differently from the GitHub file.
The current version of eng.traineddata linked above on GitHub will not work with the current version of Tesseract-OCR-iOS. The installation instructions posted on GitHub work perfectly if you've got the right <language>.traineddata file. I discovered this after dragging in the eng.traineddata from Lyndsey Scott's brilliant Tesseract tutorial on Ray Wenderlich. This repo contains the eng.traineddata file I needed to get Tesseract working. I'm not sure if that applies to all languages.
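If you want to compare two traineddata files yourself, here's a rough sketch. It assumes (I haven't confirmed this for every Tesseract release) that the file starts with a 32-bit little-endian entry count, which would be the value the assert compares against TESSDATA_NUM_ENTRIES. The function name tessdataEntryCount is made up for illustration.

```swift
import Foundation

// Hypothetical diagnostic: reads the first four bytes of a traineddata file
// as a little-endian 32-bit integer.
// ASSUMPTION: the file begins with its entry count in that layout.
func tessdataEntryCount(in data: Data) -> Int32? {
    guard data.count >= 4 else { return nil }  // too short to hold a header
    var raw: UInt32 = 0
    for (i, byte) in data.prefix(4).enumerated() {
        raw |= UInt32(byte) << UInt32(8 * i)   // least significant byte first
    }
    return Int32(bitPattern: raw)
}
```

Load each file with Data(contentsOf:) and print the result. A file that reports a larger count than the one that works is a likely candidate for tripping the assert, though the limit itself depends on the Tesseract version the pod was built against.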