I used this tutorial to get Tesseract OCR working with Swift: http://www.piterwilson.com/blog/2014/10/18/minimal-tesseact-ocr-setup-in-swift/
It works fine if I add the demo image to the project and call
tesseract.image = UIImage(named: "image_sample.jpg");
But if I use my camera code and take a picture of that same image and call
tesseract.image = self.image.blackAndWhite();
the result is either gibberish like
s I 5E251 :Ec ‘-. —7.//:E*髧 a g :_{:7 IC‘ J 7 iii—1553‘ : fizzle —‘;-—:
; ~:~./: -:-‘-
‘- :~£:': _-'~‘:
: 37%; §:‘—_
: ::::E 7,;. 1f:,:~ ——,
Or it crashes with an EXC_BAD_ACCESS error. I haven't been able to pin down what determines whether I get the crash or the gibberish. This is the code of my camera capture (photoTaken()) and the processing step (nextStepTapped()):
@IBAction func photoTaken(sender: UIButton) {
    let videoConnection = stillImageOutput.connectionWithMediaType(AVMediaTypeVideo)
    if videoConnection != nil {
        // Show next step button
        self.view.bringSubviewToFront(self.nextStep)
        self.nextStep.hidden = false

        // Capture the still image
        stillImageOutput.captureStillImageAsynchronouslyFromConnection(videoConnection) {
            (imageDataSampleBuffer, error) -> Void in
            let imageData = AVCaptureStillImageOutput.jpegStillImageNSDataRepresentation(imageDataSampleBuffer)
            self.image = UIImage(data: imageData)
            //var dataProvider = CGDataProviderCreateWithCFData(imageData)
            //var cgImageRef = CGImageCreateWithJPEGDataProvider(dataProvider, nil, true, kCGRenderingIntentDefault)
            //self.image = UIImage(CGImage: cgImageRef, scale: 1.0, orientation: UIImageOrientation.Right)
        }

        // Freeze camera preview
        captureSession.stopRunning()
    }
}
@IBAction func nextStepTapped(sender: UIButton) {
    // Save to camera roll & proceed
    //UIImageWriteToSavedPhotosAlbum(self.image.blackAndWhite(), nil, nil, nil)
    //UIImageWriteToSavedPhotosAlbum(self.image, nil, nil, nil)

    // OCR
    let tesseract = Tesseract()
    tesseract.language = "eng"
    tesseract.delegate = self
    tesseract.image = self.image.blackAndWhite()
    tesseract.recognize()
    NSLog("%@", tesseract.recognizedText)
}
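For reference, blackAndWhite() isn't shown above; it's a UIImage extension that converts the photo to grayscale, roughly along these lines (a sketch of the kind of Core Image filter I'm using, not necessarily my exact code):

extension UIImage {
    // Sketch only -- my actual grayscale helper may differ slightly
    func blackAndWhite() -> UIImage {
        let filter = CIFilter(name: "CIPhotoEffectNoir")
        filter.setValue(CIImage(image: self), forKey: kCIInputImageKey)
        let context = CIContext(options: nil)
        let cgImage = context.createCGImage(filter.outputImage, fromRect: filter.outputImage.extent())
        return UIImage(CGImage: cgImage)
    }
}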
If I uncomment the commented lines, the image saves to the Camera Roll and is completely legible, so I'm not sure why the OCR fails on it. Tesseract has no problem reading the text when the same image is added directly to the Xcode project as a bundled resource, but when I take a picture of that exact image on my screen, it can't read it.
Stumbled upon this tutorial: http://www.raywenderlich.com/93276/implementing-tesseract-ocr-ios
It happened to mention scaling the image before recognition; they cap the larger dimension at 640. I was taking my pictures at 640x480, so I figured I didn't need to scale them, but the important part seems to be that the function redraws the image through a graphics context. My guess is the redraw bakes the camera's UIImageOrientation metadata into the actual pixels, which Tesseract otherwise never sees. Either way, my photos now OCR fairly well. I still need to work on preprocessing for smaller text, but it works perfectly for large text. I run my image through the scaling function below and I'm good to go.
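The scaling function from that tutorial looks roughly like this, as a method on the view controller (treat it as a sketch; maxDimension is the cap on the longer side):

func scaleImage(image: UIImage, maxDimension: CGFloat) -> UIImage {
    // Fit the image inside a maxDimension x maxDimension box, preserving aspect ratio
    var scaledSize = CGSize(width: maxDimension, height: maxDimension)
    if image.size.width > image.size.height {
        let scaleFactor = image.size.height / image.size.width
        scaledSize.height = scaledSize.width * scaleFactor
    } else {
        let scaleFactor = image.size.width / image.size.height
        scaledSize.width = scaledSize.height * scaleFactor
    }

    // My guess is the redraw below is what actually fixes the photos:
    // drawInRect renders the image upright, baking the orientation
    // metadata into the bitmap that Tesseract sees
    UIGraphicsBeginImageContext(scaledSize)
    image.drawInRect(CGRect(x: 0, y: 0, width: scaledSize.width, height: scaledSize.height))
    let scaledImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    return scaledImage
}

So in nextStepTapped() the OCR input becomes:

tesseract.image = scaleImage(self.image.blackAndWhite(), maxDimension: 640)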