Video Stabilization with OpenCV

Posted 2019-01-08 06:24

Question:

I have a video feed which is taken with a moving camera and contains moving objects. I would like to stabilize the video, so that all stationary objects will remain stationary in the video feed. How can I do this with OpenCV?

For example, if I have two images prev_frame and next_frame, how do I transform next_frame so the camera appears stationary?

Answer 1:

I can suggest one of the following solutions:

  1. Using local high-level features: OpenCV includes SURF, so: for each frame, extract SURF features. Then build a feature kd-tree (also in OpenCV) and match each pair of consecutive frames to find pairs of corresponding features. Feed those pairs into cvFindHomography to compute the homography between the frames, then warp the frames according to the (combined) homographies to stabilize. This is, to my knowledge, a very robust and sophisticated approach; however, SURF extraction and matching can be quite slow. (A Python sketch of this pipeline follows this list.)
  2. You can try the above with "less robust" features if you expect only minor movement between frames: e.g. use Harris corner detection, build pairs of corners closest to each other in both frames, and feed them to cvFindHomography as above. Probably faster, but less robust.
  3. If you restrict movement to translation, you might be able to replace cvFindHomography with something simpler that just estimates the translation between feature pairs (e.g. their average displacement).
  4. Use phase correlation (ref. http://en.wikipedia.org/wiki/Phase_correlation) if you expect only translation between frames. OpenCV includes DFT/FFT and IFFT; see the linked Wikipedia article for the formulas and an explanation.
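A minimal Python sketch of approach 1, using ORB as a freely available stand-in for SURF (SURF sits in opencv-contrib's non-free module in recent builds); the parameter values are illustrative, not tuned:

import cv2
import numpy as np

def stabilize_pair(prev_frame, next_frame):
    # Extract keypoints and binary descriptors from both frames.
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(prev_frame, None)
    kp2, des2 = orb.detectAndCompute(next_frame, None)

    # Hamming distance suits ORB's binary descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    # Corresponding point pairs; H maps next_frame onto prev_frame.
    src = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    h, w = prev_frame.shape[:2]
    return cv2.warpPerspective(next_frame, H, (w, h))

For approach 4, OpenCV exposes the DFT-based method directly as cv2.phaseCorrelate, which, given two single-channel grayscale frames prev_gray and next_gray, returns the (dx, dy) translation between them:

shift, response = cv2.phaseCorrelate(np.float32(prev_gray), np.float32(next_gray))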

EDIT: Three remarks I should mention explicitly, just in case:

  1. The homography-based approach is likely very exact, so stationary objects will remain stationary. However, homographies include perspective distortion and zoom as well, so the result might look a bit unusual (or even distorted for some fast movements). Although exact, this might be less visually pleasing; use it rather for further processing or, say, forensics. But you should try it out; it could be very pleasing for some scenes/movements as well.
  2. To my knowledge, several free video-stabilization tools use phase correlation. If you just want to "un-shake" the camera, this might be preferable.
  3. There is quite some research going on in this field. You'll find considerably more sophisticated approaches in papers (although they likely require more than just OpenCV).


Answer 2:

OpenCV has the functions estimateRigidTransform() and warpAffine(), which handle this sort of problem really well.

It's pretty much as simple as this:

Mat M = estimateRigidTransform(frame1, frame2, false);  // false = no full affine, see below
warpAffine(frame2, output, M, Size(640, 480), INTER_NEAREST | WARP_INVERSE_MAP);

Now output contains the contents of frame2 best aligned to frame1. For large shifts, M will be a zero matrix, or it might not be a valid matrix at all depending on the OpenCV version, so you'd have to filter those out and not apply them. I'm not sure how large "large" is; maybe half the frame width, maybe more.

The third parameter to estimateRigidTransform is a boolean that tells it whether to estimate an arbitrary affine matrix or restrict it to translation/rotation/scaling. For stabilizing an image from a camera you probably just want the latter. In fact, for camera stabilization you might also want to remove any scaling from the returned matrix by normalizing it to rotation and translation only.

Also, for a moving camera, you'd probably want to sample M through time and calculate a mean.
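A Python sketch of the scale-removal idea mentioned above (note that estimateRigidTransform was deprecated in later OpenCV releases; estimateAffinePartial2D is the closest modern equivalent), assuming M is the 2x3 partial-affine matrix returned by either function:

import numpy as np

def rotation_translation_only(M):
    # The 2x2 block of a partial-affine matrix is s * [[cos t, -sin t],
    # [sin t, cos t]]; dividing by s leaves pure rotation + translation.
    s = np.hypot(M[0, 0], M[1, 0])
    R = M.copy()
    R[:, :2] /= s
    return R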

See the OpenCV documentation for more info on estimateRigidTransform() and warpAffine().



Answer 3:

OpenCV now has a video stabilization module: http://docs.opencv.org/trunk/d5/d50/group__videostab.html



Answer 4:

I pasted my answer from this question: How to stabilize Webcam video?


Yesterday I did some work (in Python) on this subject; the main steps are:

  1. use cv2.goodFeaturesToTrack to find good corners.
  2. use cv2.calcOpticalFlowPyrLK to track the corners.
  3. use cv2.findHomography to compute the homography matrix.
  4. use cv2.warpPerspective to transform video frame.

But the result is not ideal yet; maybe I should choose SIFT keypoints instead of goodFeatures corners.
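A minimal sketch of those four steps for a single frame pair (parameter values are illustrative, not tuned):

import cv2
import numpy as np

def stabilize(prev_gray, next_gray, next_frame):
    # 1. Find good corners in the previous frame.
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=30)
    # 2. Track them into the next frame with pyramidal Lucas-Kanade.
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None)
    good0 = p0[status.ravel() == 1]
    good1 = p1[status.ravel() == 1]
    # 3. Fit a homography from the tracked pairs (RANSAC drops outliers).
    H, _ = cv2.findHomography(good1, good0, cv2.RANSAC, 3.0)
    # 4. Warp the new frame back onto the previous frame's viewpoint.
    h, w = prev_gray.shape
    return cv2.warpPerspective(next_frame, H, (w, h))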





Answer 5:

This is a tricky problem, but I can suggest a somewhat simple solution off the top of my head.

  1. Shift/rotate next_frame by an arbitrary amount
  2. Use background subtraction, threshold(abs(prev_frame - next_frame_rotated)), to find the static elements. You'll have to play around with the threshold value to use.
  3. Find min(template_match(prev_frame_background, next_frame_rotated_background))
  4. Record the shift/rotation of the closest match and apply it to next_frame

This won't work well over multiple frames, so you'll want to look into using a background accumulator so that the background the algorithm looks for stays similar over time.
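A brute-force sketch of the search idea, simplified to pure translation and a thresholded pixel-difference score instead of a full template match (the ranges and threshold are illustrative):

import cv2
import numpy as np

def best_shift(prev_frame, next_frame, max_shift=16, step=2):
    # Try a grid of candidate shifts; keep the one where the thresholded
    # frame difference (i.e. the set of non-static pixels) is smallest.
    best, best_score = (0, 0), np.inf
    h, w = prev_frame.shape[:2]
    for dx in range(-max_shift, max_shift + 1, step):
        for dy in range(-max_shift, max_shift + 1, step):
            M = np.float32([[1, 0, dx], [0, 1, dy]])
            shifted = cv2.warpAffine(next_frame, M, (w, h))
            diff = cv2.absdiff(prev_frame, shifted)
            score = np.count_nonzero(diff > 25)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best  # apply this shift to next_frame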



Answer 6:

I should add the following remarks to complement zerm's answer. Choosing one stationary object and then working with zerm's approach (1) on that single object will simplify your problem. If you find a stationary object and apply the correction to it, I think it is safe to assume the other stationary objects will also look stable.

Although this approach is certainly valid for your tough problem, it comes with the following issues:

  • Detection and homography estimation will sometimes fail for various reasons: occlusions, sudden moves, motion blur, severe lighting differences. You will have to find ways to handle these failures.

  • Your target object(s) might be occluded, meaning detection will fail on those frames; handling occlusion is a whole research topic in itself.

  • Depending on your hardware and the complexity of your solution, you might have trouble achieving real-time results using SURF. You might try OpenCV's GPU implementation or other, faster feature detectors like ORB, BRIEF, or FREAK.
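A quick sketch of swapping in a faster detector (ORB ships with core OpenCV; BRIEF and FREAK live in the opencv-contrib xfeatures2d module, so their availability depends on your build; the noise image is just a stand-in for a real frame):

import cv2
import numpy as np

frame_gray = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # stand-in frame
orb = cv2.ORB_create(nfeatures=500)  # binary descriptors, much faster than SURF
keypoints, descriptors = orb.detectAndCompute(frame_gray, None)
print(len(keypoints), "ORB keypoints")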



Answer 7:

There are already good answers here, but they use somewhat older algorithms; I developed a program to solve a similar problem, so I'm adding an additional answer.

  1. First, you should extract features from the images using a feature extractor like the SIFT or SURF algorithm. In my case, the FAST+ORB combination worked best. If you want more information, see this paper.
  2. After you get the features, you should find matching features between the images. There are several matchers, and the brute-force matcher is not bad. If brute force is too slow on your system, you should use an algorithm like a KD-tree.
  3. Last, you should get the geometric transformation matrix that minimizes the error of the transformed points. You can use the RANSAC algorithm in this process. You can develop this whole process using OpenCV, and I have already developed it for mobile devices. See this repository.
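A minimal sketch of those three steps (FAST+ORB features, brute-force Hamming matching, RANSAC-fitted transform; the thresholds are illustrative):

import cv2
import numpy as np

def align(prev_gray, next_gray):
    # 1. FAST finds corners; ORB computes binary descriptors on them.
    fast = cv2.FastFeatureDetector_create(threshold=25)
    orb = cv2.ORB_create()
    kp1, des1 = orb.compute(prev_gray, fast.detect(prev_gray, None))
    kp2, des2 = orb.compute(next_gray, fast.detect(next_gray, None))

    # 2. Brute-force Hamming matching with cross-checking.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    # 3. RANSAC-estimated partial-affine transform mapping next onto prev.
    src = np.float32([kp2[m.trainIdx].pt for m in matches])
    dst = np.float32([kp1[m.queryIdx].pt for m in matches])
    M, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    return M  # pass to cv2.warpAffine to stabilize the frame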