Which computer vision library & algorithm(s), for

Posted 2020-08-05 10:42

Question:

Objective: Detect / determine human actions, such as picking up or lifting an item to read its label and putting it back on the rack (in a store), or sitting on, mounting, or climbing atop objects such as a chair, bench, or ladder.

Environment: Store / shop, which is mostly well lit. Cameras are fixed (i.e. not PTZ), with resolutions from VGA up to 1 MP.

Constraints:

  1. Presence of known and unknown human beings.
  2. Possible rearrangement of objects (items for sale) in the store, over a period of time.
  3. Possible changes in lighting over time. For example, the front of the store might get ample sunlight during the day, which gives way to artificial light at night. Also, more lights may be switched on at night.

Question:

  1. While I understand that OpenCV is great for face detection and usable for face recognition, can it be used for analyzing "actions", such as the act of sitting or the act of lifting an object off the shelf? If so, which algorithms should I dig deeper into?

  2. Since cameras in stores are mostly at ceiling height, they generally do not have a frontal view of faces, but mostly a top-down view. I understand that Haar Cascade (PCA) isn't quite usable here, and that one needs other methods such as 3D head geometry determination. Are there libraries other than OpenCV that need to be used for such tasks? Are there open-source libraries for the same?

Answer 1:

From time to time some people come here and ask for help (or better, code) to solve some of the most difficult research problems in computer vision. Problems that were not solved by the most regarded academics and scientists. Sometimes, they ask for algorithms they've seen in SF movies. Then they leave frustrated, because OpenCV is "not friendly enough".

Now, seriously, if you are a team of PhDs in image processing, working on some genius project, you don't need advice from here. And if you aren't, the chances of pulling it off are really low.

What you can do with reasonable resources and accuracy is track people in the store: use a moving-average background subtractor (available in OpenCV) to learn what the empty store looks like, and subtract that background from each frame to see objects that appear and disappear. You can extract them with a blob analysis library. A Kalman filter (or a simpler tracker) will help you keep track of the moving objects.
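
The pipeline described above can be sketched roughly as follows. This is a minimal NumPy-only illustration on synthetic frames, not production code: the function and class names, `alpha`, `threshold`, `min_area`, and the noise variances are all assumptions made for the example. In OpenCV, `cv2.accumulateWeighted`, `cv2.connectedComponentsWithStats`, and `cv2.KalmanFilter` cover the same three steps.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Moving (exponential) average of frames; alpha is an assumed learning rate."""
    return (1.0 - alpha) * background + alpha * frame

def foreground_mask(background, frame, threshold=30.0):
    """Pixels that differ from the background by more than an assumed threshold."""
    return np.abs(frame - background) > threshold

def find_blobs(mask, min_area=4):
    """4-connected flood fill; returns (row, col) centroids of blobs >= min_area."""
    visited = np.zeros(mask.shape, dtype=bool)
    rows, cols = mask.shape
    centroids = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if visited[r0, c0]:
            continue
        stack, pixels = [(r0, c0)], []
        visited[r0, c0] = True
        while stack:
            r, c = stack.pop()
            pixels.append((r, c))
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < rows and 0 <= nc < cols and mask[nr, nc] and not visited[nr, nc]:
                    visited[nr, nc] = True
                    stack.append((nr, nc))
        if len(pixels) >= min_area:
            cy, cx = np.mean(pixels, axis=0)
            centroids.append((float(cy), float(cx)))
    return centroids

class ConstantVelocityKalman:
    """Tracks one centroid; state is [row, col, row_velocity, col_velocity]."""
    def __init__(self, row, col, process_var=1e-2, measurement_var=1.0):
        self.x = np.array([row, col, 0.0, 0.0])
        self.P = np.eye(4) * 10.0             # initial uncertainty (assumed)
        self.F = np.eye(4)                    # position += velocity each frame
        self.F[0, 2] = self.F[1, 3] = 1.0
        self.H = np.eye(2, 4)                 # only the position is measured
        self.Q = np.eye(4) * process_var
        self.R = np.eye(2) * measurement_var

    def step(self, measurement):
        # Predict the next state, then correct it with the measured centroid.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        innovation = np.asarray(measurement, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

# Synthetic demo: learn the empty store, then follow a 4x4 bright "person".
empty = np.full((32, 48), 100.0)
background = empty.copy()
for _ in range(20):
    background = update_background(background, empty)

tracker = None
for k in range(15):
    frame = empty.copy()
    frame[10:14, 2 * k:2 * k + 4] = 200.0     # moves 2 pixels right per frame
    blobs = find_blobs(foreground_mask(background, frame))
    row, col = blobs[0]
    if tracker is None:
        tracker = ConstantVelocityKalman(row, col)
    else:
        tracker.step((row, col))

print("estimated position:", tracker.x[:2], "velocity:", tracker.x[2:])
```

For brevity the demo freezes the background while tracking; in a real store you would keep calling the update on every frame so that slow lighting changes get absorbed into the model.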

Good luck!



回答2:

This is a pretty hard problem. From my private conversations with these guys http://www.picar.us/ I reckon they have some routines that detect human actions in video, such as dancing or skateboarding. This stuff is not included in their open-source library, but they might help you if you ask them nicely.



回答3:

The human action recognition problem is usually tackled with a bag-of-words representation of each video, followed by a linear (or non-linear) supervised classifier trained on hundreds of labelled examples.
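
To make the bag-of-words idea concrete, here is a minimal NumPy sketch under heavy assumptions. The "descriptors" are synthetic 2-D points standing in for real local video features, `fake_clip` and `PROTOTYPES` are hypothetical helpers, the two class names in the comments are only illustrative, and a least-squares linear classifier is used as a simple stand-in for the linear SVM one would more typically train.

```python
import numpy as np

rng = np.random.default_rng(0)
PROTOTYPES = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])

def fake_clip(label, n=50):
    """Hypothetical stand-in for the local descriptors extracted from one clip."""
    # Class 0 (say "sitting") favours prototypes 0 and 1, class 1 ("lifting") 2 and 3.
    probs = [0.4, 0.4, 0.1, 0.1] if label == 0 else [0.1, 0.1, 0.4, 0.4]
    idx = rng.choice(4, size=n, p=probs)
    return PROTOTYPES[idx] + rng.normal(scale=0.5, size=(n, 2))

def quantise(descriptors, centres):
    """Index of the nearest visual word for every descriptor."""
    d = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def build_codebook(descriptors, k, iterations=10):
    """Plain k-means with farthest-point initialisation; returns k visual words."""
    centres = [descriptors[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(descriptors - c, axis=1) for c in centres], axis=0)
        centres.append(descriptors[np.argmax(dist)])
    centres = np.array(centres, dtype=float)
    for _ in range(iterations):
        labels = quantise(descriptors, centres)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = descriptors[labels == j].mean(axis=0)
    return centres

def bow_histogram(descriptors, centres):
    """Normalised count of visual words: the clip's bag-of-words vector."""
    counts = np.bincount(quantise(descriptors, centres), minlength=len(centres))
    return counts / counts.sum()

def add_bias(features):
    return np.hstack([features, np.ones((len(features), 1))])

def train_linear(features, targets):
    """Least-squares linear classifier on +1/-1 targets (stand-in for a linear SVM)."""
    w, *_ = np.linalg.lstsq(add_bias(features), targets, rcond=None)
    return w

def predict(w, features):
    return (add_bias(features) @ w > 0).astype(int)

# Train on 40 labelled synthetic clips, then evaluate on 20 unseen ones.
train_labels = np.array([0, 1] * 20)
train_clips = [fake_clip(y) for y in train_labels]
codebook = build_codebook(np.vstack(train_clips), k=4)
train_X = np.array([bow_histogram(c, codebook) for c in train_clips])
w = train_linear(train_X, 2.0 * train_labels - 1.0)

test_labels = np.array([0, 1] * 10)
test_X = np.array([bow_histogram(fake_clip(y), codebook) for y in test_labels])
accuracy = float(np.mean(predict(w, test_X) == test_labels))
print(f"held-out accuracy: {accuracy:.2f}")
```

The toy data is deliberately easy to separate. With real footage the hard parts are extracting good descriptors and collecting enough labelled clips, which the sketch does not address.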