Swift 3 - How do I improve image quality for Tesseract?


I am using Swift 3 to build a mobile app that allows the user to take a picture and run Tesseract OCR over the resulting image.

However, I've been trying to increase the quality of the scan, and it doesn't seem to be helping much. I've segmented the photo into a more "zoomed in" region that I want to recognize, and I've even tried making it black and white. Are there any strategies for "enhancing" or optimizing the picture quality/size so that Tesseract can recognize it better? Thanks!

tesseract.image = // the camera photo here
tesseract.recognize()
print(tesseract.recognizedText)

I got these errors and have no idea what to do:

Error in pixCreateHeader: depth must be {1, 2, 4, 8, 16, 24, 32}
Error in pixCreateNoInit: pixd not made
Error in pixCreate: pixd not made
Error in pixGetData: pix not defined
Error in pixGetWpl: pix not defined
2017-03-11 22:22:30.019717 ProjectName[34247:8754102] Cannot convert image to Pix with bpp = 64
Error in pixSetYRes: pix not defined
Error in pixGetDimensions: pix not defined
Error in pixGetColormap: pix not defined
Error in pixClone: pixs not defined
Error in pixGetDepth: pix not defined
Error in pixGetWpl: pix not defined
Error in pixGetYRes: pix not defined
Please call SetImage before attempting recognition.Please call SetImage before attempting recognition.2017-03-11 22:22:30.026605 EOB-Reader[34247:8754102] No recognized text. Check that -[Tesseract setImage:] is passed an image bigger than 0x0.
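
The key line in that log is "Cannot convert image to Pix with bpp = 64": Leptonica (which Tesseract uses internally) only accepts bit depths of 1, 2, 4, 8, 16, 24, or 32, so a 64 bits-per-pixel photo (for example 16 bits per channel RGBA) is rejected, setImage never succeeds, and the "Please call SetImage" / "image bigger than 0x0" messages follow. One possible workaround, sketched below under the assumption that the photo really is in a 16-bit-per-channel format (the helper name normalizedForTesseract and the photo variable are mine, not part of the library), is to redraw it into a standard 8-bit RGBA bitmap before handing it to Tesseract:

extension UIImage {

    // Sketch: redraw the photo into a standard 8-bit-per-channel RGBA bitmap
    // so Leptonica gets a bit depth it supports. Helper name is hypothetical.
    func normalizedForTesseract() -> UIImage? {
        UIGraphicsBeginImageContextWithOptions(self.size, true, 1.0)
        defer { UIGraphicsEndImageContext() }
        self.draw(in: CGRect(origin: .zero, size: self.size))
        return UIGraphicsGetImageFromCurrentImageContext()
    }
}

// Usage (photo = the camera image):
tesseract.image = photo.normalizedForTesseract()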

1 Answer

I've been using Tesseract fairly successfully in Swift 3 using the following:

func performImageRecognition(_ image: UIImage) {

    // Configure Tesseract for English, using the combined engine mode
    // and treating the photo as a single block of text.
    let tesseract = G8Tesseract(language: "eng")
    tesseract?.engineMode = .tesseractCubeCombined
    tesseract?.pageSegmentationMode = .singleBlock

    // Use the image passed into the function rather than reading imageView.image again.
    tesseract?.image = image
    tesseract?.recognize()

    let textFromImage = tesseract?.recognizedText
    print(textFromImage ?? "No text recognized")
}

I also found that pre-processing the image helped. I added the following extension to UIImage:

import UIKit
import CoreImage

extension UIImage {

    // Apply a noir (grayscale) Core Image filter to the image.
    func toGrayScale() -> UIImage {

        let context = CIContext(options: nil)
        let currentFilter = CIFilter(name: "CIPhotoEffectNoir")
        currentFilter!.setValue(CIImage(image: self), forKey: kCIInputImageKey)
        let output = currentFilter!.outputImage
        let cgimg = context.createCGImage(output!, from: output!.extent)

        return UIImage(cgImage: cgimg!)
    }

    // Render the image through a mono filter using a GPU-backed context.
    func binarise() -> UIImage {

        let glContext = EAGLContext(api: .openGLES2)!
        let ciContext = CIContext(eaglContext: glContext, options: [kCIContextOutputColorSpace: NSNull()])
        let filter = CIFilter(name: "CIPhotoEffectMono")
        filter!.setValue(CIImage(image: self), forKey: kCIInputImageKey)
        let outputImage = filter!.outputImage
        let cgimg = ciContext.createCGImage(outputImage!, from: outputImage!.extent)

        return UIImage(cgImage: cgimg!)
    }

    // Scale the image so its longest side is 640 points, preserving aspect ratio.
    func scaleImage() -> UIImage {

        let maxDimension: CGFloat = 640
        var scaledSize = CGSize(width: maxDimension, height: maxDimension)
        var scaleFactor: CGFloat

        if self.size.width > self.size.height {
            scaleFactor = self.size.height / self.size.width
            scaledSize.width = maxDimension
            scaledSize.height = scaledSize.width * scaleFactor
        } else {
            scaleFactor = self.size.width / self.size.height
            scaledSize.height = maxDimension
            scaledSize.width = scaledSize.height * scaleFactor
        }

        UIGraphicsBeginImageContext(scaledSize)
        self.draw(in: CGRect(x: 0, y: 0, width: scaledSize.width, height: scaledSize.height))
        let scaledImage = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()

        return scaledImage!
    }

    // Redraw the image so its pixel data matches the .up orientation.
    func orientate(img: UIImage) -> UIImage {

        if img.imageOrientation == UIImageOrientation.up {
            return img
        }

        UIGraphicsBeginImageContextWithOptions(img.size, false, img.scale)
        let rect = CGRect(x: 0, y: 0, width: img.size.width, height: img.size.height)
        img.draw(in: rect)

        let normalizedImage: UIImage = UIGraphicsGetImageFromCurrentImageContext()!
        UIGraphicsEndImageContext()

        return normalizedImage
    }

}

Then I called this before passing the image to performImageRecognition:

func processImage() {

    // Run the pre-processing steps in order: grayscale, binarise, then scale.
    self.imageView.image = self.imageView.image!.toGrayScale()
    self.imageView.image = self.imageView.image!.binarise()
    self.imageView.image = self.imageView.image!.scaleImage()
}
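
For completeness, here is one way the pieces above might be wired together. This is a sketch with some assumptions: the orientate step is defined in the extension but never called in processImage, so including it here is my own choice, and imageView and the action name scanTapped are placeholders for whatever the real app uses.

@IBAction func scanTapped(_ sender: Any) {
    guard let photo = imageView.image else { return }

    // Normalize the orientation first, then grayscale, binarise, and downscale
    // before passing the result to Tesseract.
    let prepared = photo.orientate(img: photo)
        .toGrayScale()
        .binarise()
        .scaleImage()

    imageView.image = prepared
    performImageRecognition(prepared)
}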

Hope this helps
