Tesseract OCR w/ iOS & Swift returns error or gibb

Posted 2019-04-01 22:00

Question:

I used this tutorial to get Tesseract OCR working with Swift: http://www.piterwilson.com/blog/2014/10/18/minimal-tesseact-ocr-setup-in-swift/

It works fine if I upload the demo image and call

 tesseract.image = UIImage(named: "image_sample.jpg");

But if I use my camera code and take a picture of that same image and call

 tesseract.image = self.image.blackAndWhite();

the result is either gibberish like

s I 5E251 :Ec ‘-. —7.//:E*髧 a g :_{:7 IC‘ J 7 iii—1553‘ : fizzle —‘;-—:

; ~:~./: -:-‘-

‘- :~£:': _-'~‘:

: 37%; §:‘—_

: ::::E 7,;. 1f:,:~ ——,

Or it returns an EXC_BAD_ACCESS error. I haven't been able to work out why it gives either the error or the gibberish. This is the code for my camera capture (photoTaken()) and the processing step (nextStepTapped()):

 @IBAction func photoTaken(sender: UIButton) {

    var videoConnection = stillImageOutput.connectionWithMediaType(AVMediaTypeVideo)

    if videoConnection != nil {

        // Show next step button
        self.view.bringSubviewToFront(self.nextStep)
        self.nextStep.hidden = false

        // Secure image
        stillImageOutput.captureStillImageAsynchronouslyFromConnection(videoConnection) {
            (imageDataSampleBuffer, error) -> Void in
                var imageData = AVCaptureStillImageOutput.jpegStillImageNSDataRepresentation(imageDataSampleBuffer)

                self.image = UIImage(data: imageData)

                //var dataProvider = CGDataProviderCreateWithCFData(imageData)
                //var cgImageRef = CGImageCreateWithJPEGDataProvider(dataProvider, nil, true, kCGRenderingIntentDefault)
                //self.image = UIImage(CGImage: cgImageRef, scale: 1.0, orientation: UIImageOrientation.Right)

        }

        // Freeze camera preview
        captureSession.stopRunning()

    }

}

@IBAction func nextStepTapped(sender: UIButton) {

    // Save to camera roll & proceed
    //UIImageWriteToSavedPhotosAlbum(self.image.blackAndWhite(), nil, nil, nil)
    //UIImageWriteToSavedPhotosAlbum(self.image, nil, nil, nil)

    // OCR

    var tesseract:Tesseract = Tesseract();
    tesseract.language = "eng";
    tesseract.delegate = self;
    tesseract.image = self.image.blackAndWhite();
    tesseract.recognize();

    NSLog("%@", tesseract.recognizedText);

}

The image saves to the Camera Roll and is completely legible if I uncomment the commented lines. Not sure why it won't work. It has no problem reading the text on the image if it's uploaded directly into Xcode as a supporting file, but if I take a picture of the exact same image on my screen then it can't read it.

Answer 1:

Stumbled upon this tutorial: http://www.raywenderlich.com/93276/implementing-tesseract-ocr-ios

It happened to mention scaling the image, with a maximum dimension of 640. I was already taking my pictures at 640x480, so I figured I didn't need to scale them, but I think this code essentially redraws the image. For some reason my photos now OCR fairly well. I still need to work on image processing for smaller text, but it works perfectly for large text. I run my image through this scaling function before OCR and I'm good to go.

  func scaleImage(image: UIImage, maxDimension: CGFloat) -> UIImage {

   var scaledSize = CGSize(width: maxDimension, height: maxDimension)
   var scaleFactor: CGFloat

   if image.size.width > image.size.height {
      scaleFactor = image.size.height / image.size.width
      scaledSize.width = maxDimension
      scaledSize.height = scaledSize.width * scaleFactor
   } else {
      scaleFactor = image.size.width / image.size.height
      scaledSize.height = maxDimension
      scaledSize.width = scaledSize.height * scaleFactor
   }

   UIGraphicsBeginImageContext(scaledSize)
   image.drawInRect(CGRectMake(0, 0, scaledSize.width, scaledSize.height))
   let scaledImage = UIGraphicsGetImageFromCurrentImageContext()
   UIGraphicsEndImageContext()

   return scaledImage
}
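
To check the observation that a 640x480 photo does not actually change size, the size arithmetic from scaleImage can be pulled out into a standalone function. This is only a sketch in current Swift syntax: `aspectFitSize` is a hypothetical helper name that does not appear in the thread or the tutorial, and it needs only Foundation, so it can be run outside an app. One possible explanation for the improvement, which is an assumption and not something the thread confirms, is that drawInRect applies the photo's orientation flag while drawing, so the redrawn image carries upright pixel data even when its dimensions stay the same.

```swift
import Foundation

// Hypothetical helper (not part of the thread): the size arithmetic from
// scaleImage, isolated so it can be checked without UIKit.
func aspectFitSize(for size: CGSize, maxDimension: CGFloat) -> CGSize {
    // Guard against empty images to avoid dividing by zero.
    guard size.width > 0, size.height > 0 else { return .zero }

    if size.width > size.height {
        // Landscape: the width becomes maxDimension, the height shrinks in proportion.
        let scaleFactor = size.height / size.width
        return CGSize(width: maxDimension, height: maxDimension * scaleFactor)
    } else {
        // Portrait or square: the height becomes maxDimension.
        let scaleFactor = size.width / size.height
        return CGSize(width: maxDimension * scaleFactor, height: maxDimension)
    }
}

// A 640x480 capture with maxDimension 640 keeps its dimensions,
// so any change in OCR quality must come from the redraw itself.
let unchanged = aspectFitSize(for: CGSize(width: 640, height: 480), maxDimension: 640)
print("\(unchanged.width) x \(unchanged.height)")

// A larger capture is reduced while keeping its 4:3 aspect ratio.
let reduced = aspectFitSize(for: CGSize(width: 1280, height: 960), maxDimension: 640)
print("\(reduced.width) x \(reduced.height)")
```

In the question's nextStepTapped(), the scaled image would presumably be passed on before the black-and-white step, along the lines of scaleImage(self.image, maxDimension: 640).blackAndWhite(); that ordering is an assumption based on the answer's description.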