OpenCL Video Processing

Published 2019-08-01 07:52

Question:

I'm about to write stacking software. For that I want to extract the frames of one or more video files into an OpenCL buffer and then process them with an OpenCL kernel.

But I don't know how to load the video frames, as I have never worked with video. Since I'm using OpenCL, my main focus is obviously high performance!

I know there are libraries like FFmpeg, OpenCV and more, but as I'm not familiar with them I don't know which fits my needs best.

So can you give me advice on which library/function works best (fastest) in conjunction with OpenCL?

I haven't found anything useful about this yet. Where could I start? (Something like short documentation or a tutorial would be kind.)

Thanks in advance!

I'm working under Linux (cross-platform is not a requirement) with an NVIDIA card, and my preferred programming language is C++. I prefer H.264 as the video format, but AVI, MOV, MP4, ... are also possible.

Answer 1:

A friend of mine has been happy using ffmpeg in an image processing framework with OpenGL, so there shouldn't be any problem with OpenCL either. I'd choose that over a vendor-specific library. If you use OpenCV, keep in mind that your application may have to ship with the OpenCV shared library even if it doesn't need all the extra stuff, i.e. wasting HDD space on a user's computer. I found ffmpeg easy to use about 2 years back.

The only reason for using OpenCV for reading in frames is if you also need some of its image processing functions. If not, then I'd use ffmpeg.
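
For illustration, here is a minimal sketch of what reading frames with OpenCV looks like. The file name "input.mp4" is just a placeholder, and the upload into an OpenCL buffer is left as a comment, since that part looks the same as in the FFmpeg-based answer below.

// Sketch: grabbing decoded frames with OpenCV's cv::VideoCapture.
// OpenCV hands back BGR cv::Mat frames; copying them into an OpenCL
// buffer (e.g. clCreateBuffer with CL_MEM_COPY_HOST_PTR) is up to you.
#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture cap("input.mp4");   // any container/codec the backend supports
    if (!cap.isOpened())
        return 1;

    cv::Mat frame;
    while (cap.read(frame)) {            // decodes the next frame into 'frame'
        // frame.data points to interleaved BGR pixels,
        // frame.total() * frame.elemSize() bytes in total.
        // Upload to an OpenCL buffer here and run your stacking kernel.
    }
    return 0;
}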



Answer 2:

If you were on Windows and using AMD GPUs, you could try the AMD Media SDK.

From the SemiAccurate website, http://semiaccurate.com/2012/06/18/amd-media-sdk-announced-at-afds/:

'AMD’s Media SDK. What this SDK aims to do is enable the use of AMD’s fixed function hardware blocks and GPU acceleration abilities by exposing them through APIs and code samples. In the larger context of the competitive market place, AMD needs developers to take advantage of the GPU based capabilities in its APUs in order for APUs to offer tangible benefits for general compute work loads. To this end AMD is preparing example applications, creating APIs for developers to use in their applications, and documenting everything with guides and tutorials, as part of their effort to create this Media SDK.'

http://developer.amd.com/tools-and-sdks/heterogeneous-computing/media-sdk/

I think it's still in beta, but it has a set of examples:

http://amd.wpengine.com/app-sdk/codelisting.php?q=Media



Answer 3:

FFmpeg can be the right choice; it has many low-level optimizations and runs really fast.

The simplest decoding application can be found in the ffmpeg examples: http://www.ffmpeg.org/doxygen/trunk/doc_2examples_2decoding_encoding_8c-example.html

Take a look at the function decode_write_frame(). The decoded picture is stored in an AVFrame structure. (I added one argument, an OpenCL context, so the function can allocate memory objects.)

static int decode_write_frame(
    const char     *outfilename,
    AVCodecContext *avctx,
    AVFrame        *frame,
    int            *frame_count,
    AVPacket       *pkt,
    int            last,
    cl_context     context)
{
    int len, got_frame;
    len = avcodec_decode_video2(avctx, frame, &got_frame, pkt);

    if (len < 0) {
        fprintf(stderr, "Error while decoding frame %d\n", *frame_count);
        return len;
    }

    if (got_frame) {
        cl_int ret_code;

        //frame->data[0] is the Y plane
        //(note: frame->linesize[0] may be larger than frame->width because of
        //row padding; copy row by row if they differ)
        cl_mem y_plane = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR,
            frame->width * frame->height, frame->data[0], &ret_code);

        if (ret_code != CL_SUCCESS) {
            fprintf(stderr, "Error %d occurred.\n", ret_code);
        }

        //frame->data[1] is the Cb plane
        //frame->data[2] is the Cr plane
        //Remember that video is usually encoded as YCbCr 4:2:0, which means the
        //Cb & Cr planes are half the size of the Y plane in each dimension
    }

    if (pkt->data) {
        pkt->size -= len;
        pkt->data += len;
    }
    return 0;
}
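
For context, here is a rough sketch of the demuxing loop that would call decode_write_frame() above. It sticks to the same-era (now deprecated) FFmpeg API used in the snippet, error handling is trimmed, and the cl_context is assumed to have been created elsewhere.

// Sketch of the driver loop: demux with libavformat, feed each video packet
// to decode_write_frame() from the snippet above.
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
}
#include <CL/cl.h>

static void decode_video(const char *filename, cl_context context)
{
    av_register_all();  // no longer required in newer FFmpeg versions

    AVFormatContext *fmt = nullptr;
    avformat_open_input(&fmt, filename, nullptr, nullptr);
    avformat_find_stream_info(fmt, nullptr);

    // pick the best video stream and open its decoder (old pre-codecpar API,
    // to stay consistent with avcodec_decode_video2 used above)
    int stream = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);
    AVCodecContext *avctx = fmt->streams[stream]->codec;
    avcodec_open2(avctx, avcodec_find_decoder(avctx->codec_id), nullptr);

    AVFrame *frame = av_frame_alloc();
    AVPacket pkt;
    int frame_count = 0;

    while (av_read_frame(fmt, &pkt) >= 0) {
        if (pkt.stream_index == stream)
            decode_write_frame(nullptr, avctx, frame, &frame_count, &pkt, 0, context);
        av_packet_unref(&pkt);
    }

    // flush the decoder with an empty packet (a real application would keep
    // flushing until no more frames come out)
    av_init_packet(&pkt);
    pkt.data = nullptr;
    pkt.size = 0;
    decode_write_frame(nullptr, avctx, frame, &frame_count, &pkt, 1, context);

    av_frame_free(&frame);
    avcodec_close(avctx);
    avformat_close_input(&fmt);
}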

P.S. Don't confuse codecs and containers: AVI or MOV containers can store a bitstream encoded with MPEG-4, MPEG-2, or other codecs.