I have a problem where I have to read the time of recording from the video recorded by a surveillance camera.
The time shows up in the top-left area of the video. Below is a link to a screen grab of the area that shows the time. Also, the digit color (white/black) keeps changing over the course of the video.
http://i55.tinypic.com/2j5gca8.png
Please guide me in the direction to approach this problem. I am a Java programmer so would prefer an approach through Java.
EDIT: Thanks, unhillbilly, for the comment. I had looked at the Ron Cemer OCR library, and its performance is well below our requirements.
Since the OCR performance is less than desired, I was planning to build a character set from screen grabs of all the digits, and to use an image/pixel comparison library to compare the time in each frame against that character set, giving a probabilistic result.
So I am looking for a good image comparison library (I would be OK with a non-Java library that I can run from the command line). Any advice on the above approach would also be really helpful.
Try Tesseract from Google; there are a couple of JNI wrappers available. Be sure to read the FAQ on how to pull only digits.
What format is the source in (VHS, DVD, stills)? It's possible that the timestamp is encoded in the data.
Update with more detail
While I completely understand the desire to have an automated end-to-end process (especially if you're selling this app as opposed to creating an in-house tool), it'd be more efficient to have someone manually enter the start time for each video (even if there are hundreds of them) than to spend weeks of coding getting this to work automatically.
What I'd do (failing a simple, very-fast-to-implement, super-accurate OCR solution which I don't believe exists):
Create a couple of database tables, like video_group and video.
The video table would be prepopulated with the video filenames by an import script. Initially assign everything a group_id of 1 (Unassigned).
Create a simple Winforms or WPF app (pardon my ASCII art):
A user (anybody could do this: a secretary, a janitor, even a recent CS graduate) only has to read the time from the preview frame, type it into the Start Time field, and click "Update" or "Next" to update the database and move on to the next video. Keep the Group selection from one video to the next unless the user changes it.
Assuming it takes the user 30 seconds to read, type, and click Next, they could complete 100-150 videos in an hour (call it 75 for a more realistic estimate). And interns are a lot cheaper than developer time.
If you really have "hundreds" of videos, it'll still be faster to do it this way than to screw around with OCR. Even if the OCR works for the most part, you'll most likely need someone to manually inspect everything to check that the results are correct, which raises the question: why bother with the OCR?
It doesn't seem like you need full-blown OCR here.
I presume that the numbers are always in the same position in the image. You only expect the digits 0-9 at each of the known positions (in either black or white).
Simple template matching at each position against each of the digits (you'll have 20 templates: the 10 digits in each of the two colors) is very fast (real-time) and should give you very accurate results.
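To make the template-matching idea concrete, here is a minimal Java sketch using only the standard library. It assumes you already know the top-left corner of each digit cell and have one template image per digit and color; the class name DigitTemplateMatcher, the method names, and the string labels are all made up for illustration. It scores each template by the sum of absolute gray-level differences, which is only one of several possible similarity measures.

```java
import java.awt.image.BufferedImage;
import java.util.Map;

/** Hypothetical helper: classifies one fixed-position digit cell by template matching. */
final class DigitTemplateMatcher {

    /** Gray level (0..255) of the pixel at (x, y), taken as a plain average of r, g, b. */
    static int gray(BufferedImage img, int x, int y) {
        int rgb = img.getRGB(x, y);
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (r + g + b) / 3;
    }

    /** Sum of absolute gray differences between a template and the frame region at (x0, y0). */
    static long difference(BufferedImage frame, int x0, int y0, BufferedImage template) {
        long sum = 0;
        for (int y = 0; y < template.getHeight(); y++) {
            for (int x = 0; x < template.getWidth(); x++) {
                sum += Math.abs(gray(frame, x0 + x, y0 + y) - gray(template, x, y));
            }
        }
        return sum;
    }

    /** Returns the label of the template that differs least from the region at (x0, y0). */
    static String bestMatch(BufferedImage frame, int x0, int y0,
                            Map<String, BufferedImage> templates) {
        String best = null;
        long bestScore = Long.MAX_VALUE;
        for (Map.Entry<String, BufferedImage> entry : templates.entrySet()) {
            long score = difference(frame, x0, y0, entry.getValue());
            if (score < bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

You would call bestMatch once per digit position, with a map holding all 20 templates (for example keyed "7-white" and "7-black"), and join the results into the timestamp. The lowest score can also be kept as a rough confidence value.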
JavaOCR will work perfectly for your situation (Ron Cemer here). All you need to do is remove the background image, or make it always be less than 50% white, so that the white characters will be white and the background will be black when the image is converted to monochrome.
Train JavaOCR on the font, extract that rectangular region from the image, remove the background and you're off and running.
I suggest an algorithm which looks at r, g, b and sets every pixel to black where r, g, b are not exactly the same value. That will leave only pixels which are perfect shades of gray. Since the image is color and the digits are monochrome, that will leave the digits and some dust.
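As a rough illustration of that filter, here is a minimal Java sketch using only java.awt.image.BufferedImage. The class and method names (GrayPixelFilter, keepPureGray) are hypothetical and not part of JavaOCR. It uses the exact-equality test described above; with compressed video the channels may differ slightly, so a small tolerance might be needed, which is an assumption to verify on your own footage.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper: blacks out every pixel that is not a pure shade of gray. */
final class GrayPixelFilter {

    /** Returns a copy of src in which pixels whose r, g, b are not all equal become black. */
    static BufferedImage keepPureGray(BufferedImage src) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Keep gray pixels unchanged; replace colored pixels with black.
                boolean isGray = (r == g) && (g == b);
                out.setRGB(x, y, isGray ? (rgb & 0xFFFFFF) : 0x000000);
            }
        }
        return out;
    }
}
```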
JavaOCR wants to see black characters on a white background, so once you've done the above, you'll also need to invert the monochrome image (white = black and vice-versa). Then run that through the JavaOCR library, passing it reference samples of all of the characters you expect it to recognize, and your problem should be (at least mostly) solved.
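The inversion step could look something like the following sketch, again in plain Java. It thresholds at 50% brightness and flips the result in one pass, so bright digits come out black on a white background. The names MonochromeInverter and toInvertedMonochrome are hypothetical, and the call into JavaOCR itself is deliberately left out rather than guessed at.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper: thresholds an image at 50% brightness and inverts it. */
final class MonochromeInverter {

    /** Bright pixels (gray level 128 or more) become black; all others become white. */
    static BufferedImage toInvertedMonochrome(BufferedImage src) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int gray = (r + g + b) / 3;
                out.setRGB(x, y, gray >= 128 ? 0x000000 : 0xFFFFFF);
            }
        }
        return out;
    }
}
```

The output of this step, cropped to the timestamp rectangle, is what you would hand to the OCR library along with the reference samples.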