I would like to perform face detection / tracking on a video file (e.g. an MP4 from the user's gallery) using the Android Vision FaceDetector API. I can see many examples of using the CameraSource class to perform face tracking on the stream coming directly from the camera (e.g. on the android-vision github), but nothing on video files.
I tried looking at the source code for CameraSource through Android Studio, but it is obfuscated, and I couldn't find the original online. I imagine there are many commonalities between using the camera and using a file. Presumably I just play the video file on a Surface and then pass that to a pipeline.
Alternatively, I can see that Frame.Builder has the functions setImageData and setTimestampMillis. If I were able to read in the video as a ByteBuffer, how would I pass that to the FaceDetector API? I guess this question is similar, but it has no answers. Similarly, I could decode the video into Bitmap frames and pass those to setBitmap.
Ideally I don't want to render the video to the screen, and the processing should happen as fast as the FaceDetector API is capable of.
Alternatively, I can see that Frame.Builder has the functions setImageData and setTimestampMillis. If I were able to read in the video as a ByteBuffer, how would I pass that to the FaceDetector API?
Simply call SparseArray<Face> faces = detector.detect(frame); where detector has to be created like this:
FaceDetector detector = new FaceDetector.Builder(context)
        .setProminentFaceOnly(true)
        .build();
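To answer the ByteBuffer part directly: Frame.Builder.setImageData takes the raw pixel buffer plus its dimensions and pixel format (the format must be NV16, NV21, or YV12), so you can skip the Bitmap step entirely. A minimal sketch, assuming nv21Buffer, width, height, and timestampMs come from your own decoding code (those names are illustrative, not part of any API):

import android.graphics.ImageFormat;
import android.util.SparseArray;
import com.google.android.gms.vision.Frame;
import com.google.android.gms.vision.face.Face;

Frame frame = new Frame.Builder()
        .setImageData(nv21Buffer, width, height, ImageFormat.NV21)
        .setTimestampMillis(timestampMs) // keeps frames ordered if tracking is enabled
        .build();
if (detector.isOperational()) { // false while the native face library is still downloading
    SparseArray<Face> faces = detector.detect(frame);
}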
If processing time is not an issue, using MediaMetadataRetriever.getFrameAtTime solves the question. As Anton suggested, you can also use FaceDetector.detect:
Bitmap bitmap;
Frame frame;
SparseArray<Face> faces;
MediaMetadataRetriever mMMR = new MediaMetadataRetriever();
mMMR.setDataSource(videoPath);
String durationMs = mMMR.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION); // video duration, in ms
long totalVideoTime = 1000 * Long.parseLong(durationMs); // total video time, in us (long avoids int overflow for videos longer than ~35 min)
int fps = 4; // desired number of frames to sample per second (see note below)
long deltaT = 1000000 / fps; // microseconds between sampled frames
for (long time_us = 0; time_us < totalVideoTime; time_us += deltaT) {
    bitmap = mMMR.getFrameAtTime(time_us, MediaMetadataRetriever.OPTION_CLOSEST_SYNC); // extract a bitmap from the key frame closest to time_us
    if (bitmap == null) break;
    frame = new Frame.Builder().setBitmap(bitmap).build(); // wrap the bitmap in a "Frame" object, which can be fed to a face detector
    faces = detector.detect(frame); // detect the faces (detector is a FaceDetector)
    // TODO ... do something with "faces"
}
where deltaT = 1000000/fps, and fps is the desired number of frames to sample per second. For example, if you want to extract 4 frames every second, deltaT = 250000. (Note that faces will be overwritten on every iteration, so you should do something with the results (store/report them) inside the loop.)
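As an illustration of the TODO above (a sketch only; the Log tag and message format are arbitrary), the loop body could report each face's bounding box, and both native objects should be released when you are done:

// inside the loop, after detector.detect(frame):
for (int i = 0; i < faces.size(); i++) {
    Face face = faces.valueAt(i);
    // getPosition() is the top-left corner of the face in frame coordinates
    Log.d("FaceDetect", "t=" + time_us + "us: face at ("
            + face.getPosition().x + ", " + face.getPosition().y + "), size "
            + face.getWidth() + "x" + face.getHeight());
}

// after the loop: free the native resources
mMMR.release();
detector.release();

This continues the snippet above, so it uses its variables (faces, time_us, mMMR, detector) and additionally needs android.util.Log and com.google.android.gms.vision.face.Face imported.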