I am using Swift 3 to build a mobile app that allows the user to take a picture and run Tesseract OCR over the resulting image.
However, I've been trying to improve the scan quality without much success. I've cropped the photo to a more "zoomed in" region around the text I want to recognize, and I've even tried converting it to black and white. Are there any strategies for enhancing or optimizing the picture quality or size so that Tesseract can recognize it better? Thanks!
tesseract.image = // the camera photo here
tesseract.recognize()
print(tesseract.recognizedText)
I got these errors and have no idea what to do:
Error in pixCreateHeader: depth must be {1, 2, 4, 8, 16, 24, 32}
Error in pixCreateNoInit: pixd not made
Error in pixCreate: pixd not made
Error in pixGetData: pix not defined
Error in pixGetWpl: pix not defined
2017-03-11 22:22:30.019717 ProjectName[34247:8754102] Cannot convert image to Pix with bpp = 64
Error in pixSetYRes: pix not defined
Error in pixGetDimensions: pix not defined
Error in pixGetColormap: pix not defined
Error in pixClone: pixs not defined
Error in pixGetDepth: pix not defined
Error in pixGetWpl: pix not defined
Error in pixGetYRes: pix not defined
Please call SetImage before attempting recognition.
Please call SetImage before attempting recognition.
2017-03-11 22:22:30.026605 EOB-Reader[34247:8754102] No recognized text. Check that -[Tesseract setImage:] is passed an image bigger than 0x0.
I've been using Tesseract fairly successfully in Swift 3 with the following:
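func performImageRecognition(_ image: UIImage) {
    // The initializer yields an optional in Swift, so bail out if no engine was created.
    guard let tesseract = G8Tesseract(language: "eng") else { return }
    tesseract.engineMode = .tesseractCubeCombined
    tesseract.pageSegmentationMode = .singleBlock
    // Recognize the image that was passed in rather than whatever imageView currently shows.
    tesseract.image = image
    tesseract.recognize()
    // Only print when some text came back, instead of force-unwrapping.
    if let textFromImage = tesseract.recognizedText {
        print(textFromImage)
    }
}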
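The `Cannot convert image to Pix with bpp = 64` line in the question's log suggests the camera photo has 16 bits per channel, while the first error lists 32 as the largest accepted depth. One possible workaround is to redraw the photo into an 8-bit-per-channel bitmap before handing it to Tesseract. The sketch below is an assumption about what might help, not something taken from the Tesseract wrapper: `makeEightBitRGBA` and `eightBitCopy` are hypothetical names, and only standard Core Graphics and UIKit calls are used.

```swift
import UIKit

/// Hypothetical helper (not part of the Tesseract wrapper): redraws a CGImage
/// into an 8-bit-per-channel RGBA bitmap, i.e. 32 bits per pixel.
func makeEightBitRGBA(_ source: CGImage) -> CGImage? {
    guard let context = CGContext(
        data: nil,                // let Core Graphics allocate the pixel buffer
        width: source.width,
        height: source.height,
        bitsPerComponent: 8,
        bytesPerRow: 0,           // 0 lets Core Graphics choose the row stride
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
    ) else { return nil }
    context.draw(source, in: CGRect(x: 0, y: 0, width: source.width, height: source.height))
    return context.makeImage()
}

extension UIImage {
    /// Returns a copy backed by a 32 bpp bitmap, or nil if the image has no
    /// CGImage (e.g. it is CIImage-backed) or the context cannot be created.
    func eightBitCopy() -> UIImage? {
        guard let source = cgImage, let converted = makeEightBitRGBA(source) else { return nil }
        // Reuse the original scale and orientation so the copy displays the same way.
        return UIImage(cgImage: converted, scale: scale, orientation: imageOrientation)
    }
}
```

With this in place, something like `tesseract.image = photo.eightBitCopy() ?? photo` would pass the converted copy when the conversion succeeds, where `photo` stands for the captured image.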
I also found that pre-processing the image helped. I added the following extension to UIImage:
import UIKit
import CoreImage

extension UIImage {

    // Converts the image to greyscale with the "CIPhotoEffectNoir" filter.
    // Falls back to the original image if any Core Image step fails.
    func toGrayScale() -> UIImage {
        let context = CIContext(options: nil)
        guard let filter = CIFilter(name: "CIPhotoEffectNoir"),
              let input = CIImage(image: self) else { return self }
        filter.setValue(input, forKey: kCIInputImageKey)
        guard let output = filter.outputImage,
              let cgimg = context.createCGImage(output, from: output.extent) else { return self }
        return UIImage(cgImage: cgimg)
    }

    // Applies the "CIPhotoEffectMono" filter. Note that this produces a monochrome
    // (greyscale) image rather than a strictly two-level black-and-white one.
    func binarise() -> UIImage {
        guard let glContext = EAGLContext(api: .openGLES2) else { return self }
        // NSNull for the output colour space switches off colour management.
        let ciContext = CIContext(eaglContext: glContext, options: [kCIContextOutputColorSpace: NSNull()])
        guard let filter = CIFilter(name: "CIPhotoEffectMono"),
              let input = CIImage(image: self) else { return self }
        filter.setValue(input, forKey: kCIInputImageKey)
        guard let output = filter.outputImage,
              let cgimg = ciContext.createCGImage(output, from: output.extent) else { return self }
        return UIImage(cgImage: cgimg)
    }

    // Scales the image so that its longer side is 640, preserving the aspect ratio.
    func scaleImage() -> UIImage {
        let maxDimension: CGFloat = 640
        var scaledSize = CGSize(width: maxDimension, height: maxDimension)
        if size.width > size.height {
            scaledSize.height = maxDimension * (size.height / size.width)
        } else {
            scaledSize.width = maxDimension * (size.width / size.height)
        }
        UIGraphicsBeginImageContext(scaledSize)
        defer { UIGraphicsEndImageContext() }
        draw(in: CGRect(origin: .zero, size: scaledSize))
        return UIGraphicsGetImageFromCurrentImageContext() ?? self
    }

    // Redraws the given image so that its orientation becomes .up.
    func orientate(img: UIImage) -> UIImage {
        if img.imageOrientation == UIImageOrientation.up {
            return img
        }
        UIGraphicsBeginImageContextWithOptions(img.size, false, img.scale)
        defer { UIGraphicsEndImageContext() }
        img.draw(in: CGRect(origin: .zero, size: img.size))
        return UIGraphicsGetImageFromCurrentImageContext() ?? img
    }
}
I then call this before passing the image to performImageRecognition:
func processImage() {
    // Nothing to process if the image view is empty.
    guard let image = self.imageView.image else { return }
    // Same order as before: greyscale, then mono filter, then scale.
    self.imageView.image = image.toGrayScale().binarise().scaleImage()
}
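To make the effect of scaleImage() on a photo's dimensions concrete, the size arithmetic can be pulled out into a pure function and checked on its own. This is only an illustrative sketch: `scaledSize(for:maxDimension:)` is a hypothetical name, the 3024 x 4032 input is just an example size, and the zero-size guard is an addition that scaleImage() itself does not have.

```swift
import Foundation

/// Hypothetical helper mirroring the arithmetic in scaleImage(): the longer side
/// becomes maxDimension and the shorter side is scaled by the same ratio.
/// As in scaleImage(), images smaller than maxDimension are scaled up as well.
func scaledSize(for size: CGSize, maxDimension: CGFloat = 640) -> CGSize {
    // Guard against division by zero for empty images.
    guard size.width > 0, size.height > 0 else { return .zero }
    if size.width > size.height {
        // Landscape: pin the width, derive the height.
        return CGSize(width: maxDimension, height: maxDimension * size.height / size.width)
    } else {
        // Portrait or square: pin the height, derive the width.
        return CGSize(width: maxDimension * size.width / size.height, height: maxDimension)
    }
}

let portrait = scaledSize(for: CGSize(width: 3024, height: 4032))
print(portrait.width, portrait.height) // 480.0 640.0
```

A 640-point cap keeps recognition fast, but if small print comes out badly it may be worth trying a larger `maxDimension`, since downscaling also shrinks the characters Tesseract has to read.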
Hope this helps.