I'm trying to use MODI to OCR a program's window. It works fine for screenshots I grab programmatically using Win32 interop like this:
    public string SaveScreenShotToFile()
    {
        RECT rc;
        GetWindowRect(_hWnd, out rc);
        int width = rc.right - rc.left;
        int height = rc.bottom - rc.top;
        Bitmap bmp = new Bitmap(width, height);
        Graphics gfxBmp = Graphics.FromImage(bmp);
        IntPtr hdcBitmap = gfxBmp.GetHdc();
        PrintWindow(_hWnd, hdcBitmap, 0);
        gfxBmp.ReleaseHdc(hdcBitmap);
        gfxBmp.Dispose();
        string fileName = @"c:\temp\screenshots\" + Guid.NewGuid().ToString() + ".bmp";
        bmp.Save(fileName);
        return fileName;
    }
This image is then saved to a file and run through MODI like this:
    private string GetTextFromImage(string fileName)
    {
        MODI.Document doc = new MODI.DocumentClass();
        doc.Create(fileName);
        doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
        MODI.Image img = (MODI.Image)doc.Images[0];
        MODI.Layout layout = img.Layout;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < layout.Words.Count; i++)
        {
            MODI.Word word = (MODI.Word)layout.Words[i];
            sb.Append(word.Text);
            sb.Append(" ");
        }
        if (sb.Length > 1)
            sb.Length--;
        return sb.ToString();
    }
This part works fine. However, I don't want to OCR the entire screenshot, just portions of it. I tried cropping the image programmatically like this:
    private string SaveToCroppedImage(Bitmap original)
    {
        Bitmap result = original.Clone(new Rectangle(0, 0, 250, 250), original.PixelFormat);
        var fileName = "c:\\" + Guid.NewGuid().ToString() + ".bmp";
        result.Save(fileName, original.RawFormat);
        return fileName;
    }
I then OCR this smaller image, but MODI throws an exception, 'OCR running error', with error code -959967087.
Why can MODI handle the original bitmap but not the smaller version taken from it?
I had the same issue while using
    doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
on a TIFF file that was 2400x2496. Resizing it to 50% fixed the problem, and the method no longer threw the exception. However, it misrecognized some text, for example reading "relerence" instead of "reference" or "712017" instead of "712517". I kept trying different image sizes, but they all had the same issue until I changed the command to
    doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, false, false);
which means it neither detects the orientation nor fixes any skewing. Now the command works fine on all images, including the 2400x2496 TIFF.
Hope this helps people facing the same problem.
For me, MODI OCR only works with TIFF files. Try saving the image as a ".tif".
Sorry for my bad English.
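A minimal sketch of saving a cropped region as a TIFF, assuming System.Drawing is available. The `TiffSaver` class, the `SaveCropAsTiff` helper, and its parameters are hypothetical names introduced for illustration, and whether this avoids the MODI error in a given setup is untested here. Note that `Image.Save` does not choose an encoder from the file extension, so the format is passed explicitly.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

static class TiffSaver
{
    // Hypothetical helper: crops a region out of a bitmap and writes it
    // as a TIFF file with a matching .tif extension.
    public static string SaveCropAsTiff(Bitmap original, Rectangle region, string directory)
    {
        string fileName = Path.Combine(directory, Guid.NewGuid().ToString() + ".tif");
        using (Bitmap cropped = original.Clone(region, original.PixelFormat))
        {
            // Pass the format explicitly so the file contents match the extension.
            cropped.Save(fileName, ImageFormat.Tiff);
        }
        return fileName;
    }
}
```

The returned path can then be passed to `doc.Create(fileName)` as in the question.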
Looks as though the answer is in giving MODI a bigger canvas. I was also trying to take a screenshot of a control and OCR it and ran into the same problem. In the end I took the image of the control, copied the image into a larger bitmap and OCRed the larger bitmap.
Another issue I found was that you must have a proper extension for your image file. In other words, .tmp doesn't cut it.
I kept the work of creating a larger source inside my OCR method, which deals directly with Image objects.
I'm not sure exactly what the minimum size is, but it appears as though 1024 x 768 does the trick.
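A minimal sketch of the larger-canvas idea, assuming a white background and the 1024 x 768 size mentioned above as the target. The `OcrCanvas` class and `PadToMinimumSize` helper are hypothetical names, not the code from this answer, and the real minimum size MODI needs is not known.

```csharp
using System;
using System.Drawing;

static class OcrCanvas
{
    // Hypothetical helper: copies the source image into the top-left corner
    // of a canvas that is at least minWidth x minHeight pixels.
    public static Bitmap PadToMinimumSize(Image source, int minWidth = 1024, int minHeight = 768)
    {
        int width = Math.Max(source.Width, minWidth);
        int height = Math.Max(source.Height, minHeight);

        var canvas = new Bitmap(width, height);
        using (Graphics g = Graphics.FromImage(canvas))
        {
            // Assumption: a plain white background around the original pixels.
            g.Clear(Color.White);
            // Explicit width and height keep the source at its pixel size.
            g.DrawImage(source, 0, 0, source.Width, source.Height);
        }
        return canvas;
    }
}
```

The padded bitmap would then be saved with a proper image extension and handed to MODI in place of the small one.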
What solved it for me was opening the image in a photo editor (Paint.NET) and applying the sharpen effect at maximum.
I also used: doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, false, false);
This means it neither detects the orientation nor fixes any skewing. Now the command works fine on all images, including the 2400x2496 TIFF.
The image should be a .tif, though.
Hope this helps people facing the same problem.
I had the same problem ("OCR running problem") with some images. I rescaled the image (in my case to 50%), i.e. reduced its size, and voilà, it works!
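A minimal sketch of that rescaling step, assuming System.Drawing. The `OcrScaler` class and `Scale` helper are hypothetical names, and the 0.5 factor is just the value reported above, not a known requirement.

```csharp
using System;
using System.Drawing;

static class OcrScaler
{
    // Hypothetical helper: returns a copy of the image resized by the given
    // factor (0.5 halves both dimensions).
    public static Bitmap Scale(Image source, double factor)
    {
        int width = Math.Max(1, (int)Math.Round(source.Width * factor));
        int height = Math.Max(1, (int)Math.Round(source.Height * factor));
        // This Bitmap constructor draws the source scaled to the new size.
        return new Bitmap(source, width, height);
    }
}
```

For example, `OcrScaler.Scale(image, 0.5)` on a 2400x2496 image yields a 1200x1248 bitmap to save and pass to MODI.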