I am writing a program that, when given an image of a simple math problem (e.g. 98*13), should output the answer. The numbers would be black and the background white. It's not a captcha, just an image of a math problem.
The math problems would only have two numbers and one operator, and that operator would only be +, -, *, or /.
Obviously, I know how to do the calculating ;) I'm just not sure how to go about getting the text from the image.
A free library would be ideal, although if I have to write the code myself I could probably manage.
Try this post on using Google's C++ Tesseract OCR library from C#:
OCR with the Tesseract interface
Here is some useful sample code for C#:
Using Tesseract: Free open-source OCR application for the Windows Desktop - A modern GUI front-end for the Tesseract OCR engine. The application also includes support for reading and OCR'ing PDF files: https://github.com/A9T9/Free-Ocr-Windows-Desktop
Using Microsoft OCR: Free open-source OCR application for the Windows Store - A modern GUI front-end for the Microsoft OCR library. The application also includes support for reading and OCR'ing PDF files: https://github.com/A9T9/Free-OCR-Software
To extract words from an image, I use Tesseract, one of the most accurate open-source OCR engines. It is available here or directly as a NuGet package.
And this is my function in C#, which extracts the words from the image passed in sourceFilePath. Set EngineMode to TesseractAndCube; it detects more words than the other options. I hope that helps.
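The function from this answer did not survive in the thread, so here is a minimal sketch of what such a function might look like. It is not the original author's code. It assumes the Tesseract NuGet package (the .NET wrapper) and a tessdata folder containing eng.traineddata. The names MathOcr, ReadText, Evaluate and tessDataPath are made up for illustration. As far as I know, EngineMode.TesseractAndCube only exists in the older 3.x releases of the wrapper, so the sketch uses EngineMode.Default. The character whitelist may also be ignored by some engine versions.

```csharp
using System;
using System.Text.RegularExpressions;
using Tesseract; // assumption: the "Tesseract" NuGet package (.NET wrapper)

public static class MathOcr
{
    // Hypothetical helper: OCR a single line of text from the image at sourceFilePath.
    // tessDataPath is assumed to point at a folder containing eng.traineddata.
    public static string ReadText(string sourceFilePath, string tessDataPath)
    {
        // On 3.x releases of the wrapper, EngineMode.TesseractAndCube could be tried here instead.
        using (var engine = new TesseractEngine(tessDataPath, "eng", EngineMode.Default))
        {
            // Ask the engine to consider only digits and the four operators.
            engine.SetVariable("tessedit_char_whitelist", "0123456789+-*/");

            using (var img = Pix.LoadFromFile(sourceFilePath))
            using (var page = engine.Process(img, PageSegMode.SingleLine))
            {
                return page.GetText().Trim();
            }
        }
    }

    // Parse "<number> <operator> <number>" and compute the result.
    public static decimal Evaluate(string text)
    {
        var m = Regex.Match(text, @"^\s*([0-9]+)\s*([+\-*/])\s*([0-9]+)\s*$");
        if (!m.Success)
            throw new FormatException("Not a two-operand expression: " + text);

        decimal a = decimal.Parse(m.Groups[1].Value);
        decimal b = decimal.Parse(m.Groups[3].Value);

        switch (m.Groups[2].Value)
        {
            case "+": return a + b;
            case "-": return a - b;
            case "*": return a * b;
            default:  return a / b; // "/" case; throws DivideByZeroException when b is 0
        }
    }
}
```

You would call it along the lines of MathOcr.Evaluate(MathOcr.ReadText("problem.png", "./tessdata")), where both paths are placeholders. Keeping the parsing separate from the OCR call means the recognized string can be validated before anything is computed, so a misread image fails with a FormatException instead of producing a wrong answer.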
You need OCR. There is the free Tesseract library from Google, but it's C++ code. You could use it in a C++/CLI project and access it from .NET.
This article gives some information on recognizing numbers (for Sudoku, but your problem is similar):
http://sudokugrab.blogspot.com/2009/07/how-does-it-all-work.html
You can use Microsoft Office Document Imaging (Interop.MODI.dll) in Visual Studio to extract text from pictures.