Using ffmpeg to capture frames from webcam and aud

Posted 2020-02-26 09:50

Question:

For the past few weeks I've been struggling with the FFmpeg API. I cannot find clear documentation, and it is hard to search for help because nearly every solution I find online uses the ffmpeg command-line program rather than the C API. I am writing a program that needs to capture video from a webcam along with audio, show the frames on screen, and record both the audio and the frames to a video file. I am using Qt as the framework for this project.

I've been able to show the frames on screen and even record them, but I cannot record the audio and video together. To isolate the problem I wrote a simpler test program, based on the remuxing.c example from the FFmpeg documentation, that only saves the streams to a file without displaying the frames. My code is as follows:

// Member variables declared in the .h
AVOutputFormat *ofmt;
AVFormatContext *ifmt_ctx, *ofmt_ctx;

QString cDeviceName;
QString aDeviceName;

int audioStream, videoStream;
bool done;

//The .cpp
#include "cameratest.h"
#include <QtConcurrent/QtConcurrent>
#include <QDebug>

CameraTest::CameraTest(QString cDeviceName, QString aDeviceName, QObject *parent) :
    QObject(parent)
{
    done = false;
    this->cDeviceName = cDeviceName;
    this->aDeviceName = aDeviceName;
    av_register_all();
    avdevice_register_all();
}

void CameraTest::toggleDone() {
    done = !done;
}

int CameraTest::init() {
    ofmt = NULL;
    ifmt_ctx = NULL;
    ofmt_ctx = NULL;

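    // Build the dshow device string without modifying the members:
    // QString::prepend() changes the string in place, so a second call to
    // init() would otherwise produce "video=video=...".
    QString fullDName = QString("video=") + cDeviceName + ":audio=" + aDeviceName;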
    qDebug() << fullDName;
    AVInputFormat *fmt = av_find_input_format("dshow");

    int ret, i;

    if ((ret = avformat_open_input(&ifmt_ctx, fullDName.toUtf8().data(), fmt, NULL)) < 0) {
       fprintf(stderr, "Could not open input file '%s'\n", fullDName.toUtf8().data());
       return -1;
    }
    if ((ret = avformat_find_stream_info(ifmt_ctx, 0)) < 0) {
       fprintf(stderr, "Failed to retrieve input stream information\n");
       return -1;
    }
    av_dump_format(ifmt_ctx, 0, fullDName.toUtf8().data(), 0);
    avformat_alloc_output_context2(&ofmt_ctx, NULL, NULL, "test.avi");
    if (!ofmt_ctx) {
       fprintf(stderr, "Could not create output context\n");
       ret = AVERROR_UNKNOWN;
       return -1;
    }
    ofmt = ofmt_ctx->oformat;

    for (i = 0; i < ifmt_ctx->nb_streams; i++) {
       AVStream *in_stream = ifmt_ctx->streams[i];
       AVStream *out_stream = avformat_new_stream(ofmt_ctx, in_stream->codec->codec);

       if (ifmt_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
           videoStream = i;
       }
       else if (ifmt_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
           audioStream = i;
       }

       if (!out_stream) {
           fprintf(stderr, "Failed allocating output stream\n");
           ret = AVERROR_UNKNOWN;
           return -1;
       }
       ret = avcodec_copy_context(out_stream->codec, in_stream->codec);
       if (ret < 0) {
           fprintf(stderr, "Failed to copy context from input to output stream codec context\n");
           return -1;
       }
       out_stream->codec->codec_tag = 0;
       if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER)
           out_stream->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
    }
    av_dump_format(ofmt_ctx, 0, "test.avi", 1);
    if (!(ofmt->flags & AVFMT_NOFILE)) {
       ret = avio_open(&ofmt_ctx->pb, "test.avi", AVIO_FLAG_WRITE);
       if (ret < 0) {
           fprintf(stderr, "Could not open output file '%s'\n", "test.avi");
           return -1;
       }
    }
    ret = avformat_write_header(ofmt_ctx, NULL);
    if (ret < 0) {
       fprintf(stderr, "Error occurred when opening output file\n");
       return -1;
    }
    QtConcurrent::run(this, &CameraTest::grabFrames);
    return 0;
}

void CameraTest::grabFrames() {
    AVPacket pkt;
    int ret = 0; // initialized because it is read after the loop
    while (av_read_frame(ifmt_ctx, &pkt) >= 0) {
        AVStream *in_stream, *out_stream;
        in_stream  = ifmt_ctx->streams[pkt.stream_index];
        out_stream = ofmt_ctx->streams[pkt.stream_index];
        /* copy packet */
        pkt.pts = av_rescale_q_rnd(pkt.pts, in_stream->time_base, out_stream->time_base, (AVRounding) (AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX));
        pkt.dts = av_rescale_q_rnd(pkt.dts, in_stream->time_base, out_stream->time_base, (AVRounding) (AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX));
        pkt.duration = av_rescale_q(pkt.duration, in_stream->time_base, out_stream->time_base);
        pkt.pos = -1;
        // Assign to the function-level ret instead of declaring a new one that shadows it
        ret = av_interleaved_write_frame(ofmt_ctx, &pkt);
        if (ret < 0) {
           qDebug() << "Error muxing packet";
           //break;
        }
        av_free_packet(&pkt);

        if(done) break;
    }
    av_write_trailer(ofmt_ctx);

    avformat_close_input(&ifmt_ctx);
    /* close output */
    if (ofmt_ctx && !(ofmt->flags & AVFMT_NOFILE))
       avio_close(ofmt_ctx->pb);
    avformat_free_context(ofmt_ctx);
    if (ret < 0 && ret != AVERROR_EOF) {
        //return -1;
       //fprintf(stderr, "Error occurred: %s\n", av_err2str(ret));
    }
}

av_interleaved_write_frame returns an error for the video packets. The resulting file shows only the first frame, but the audio seems to be fine.

This is what is printed on the console:

Input #0, dshow, from 'video=Integrated Camera:audio=Microfone interno (Conexant 206':
  Duration: N/A, start: 146544.738000, bitrate: 1411 kb/s
    Stream #0:0: Video: rawvideo, bgr24, 640x480, 30 tbr, 10000k tbn, 30 tbc
    Stream #0:1: Audio: pcm_s16le, 44100 Hz, 2 channels, s16, 1411 kb/s
Output #0, avi, to 'test.avi':
    Stream #0:0: Video: rawvideo, bgr24, 640x480, q=2-31, 30 tbc
    Stream #0:1: Audio: pcm_s16le, 44100 Hz, 2 channels, s16, 1411 kb/s

[avi @ 0089f660] Using AVStream.codec.time_base as a timebase hint to the muxer is deprecated. Set AVStream.time_base instead.
[avi @ 0089f660] Using AVStream.codec.time_base as a timebase hint to the muxer is deprecated. Set AVStream.time_base instead.
[avi @ 0089f660] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 4396365 >= 4396365
[avi @ 0089f660] Too large number of skipped frames 4396359 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396360 > 60000
[avi @ 0089f660] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 4396390 >= 4396390
[avi @ 0089f660] Too large number of skipped frames 4396361 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396362 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396364 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396365 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396366 > 60000
[avi @ 0089f660] Too large number of skipped frames 4396367 > 60000

This seems to me like a simple problem to solve, but I am mostly clueless about the FFmpeg API. If someone could point me in the right direction, that would be great!

Thanks!

Answer 1:

Your problem seems to be somewhat specific to DirectShow. Unfortunately I don't have access to a system with DirectShow, but from the symptoms it looks like the capture is not the problem; the muxing is. Maybe the format of the video packets is not directly supported in AVI, or maybe the timestamps on the packets are broken.
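
To illustrate the timestamp suspicion: the input dump in the question reports start: 146544.738000, and at the 30 tbc shown for the output stream that is roughly 4,396,342 ticks, the same magnitude as the dts values and "skipped frames" counts in the log. The following is a minimal sketch, not FFmpeg code, of rebasing a stream's timestamps so they start at zero and forcing them to increase strictly before a packet is handed to the muxer. TimeBase, rescale and TimestampRebaser are hypothetical names invented for this illustration, and the 1/30 output time base is an assumption; in the real program the conversion would be done by av_rescale_q on pkt.pts and pkt.dts. Whether this alone is enough for the dshow input is untested.

```cpp
#include <cstdint>

// Hypothetical stand-in for AVRational: one tick lasts num/den seconds.
struct TimeBase {
    std::int64_t num;
    std::int64_t den;
};

// Converts a tick count from one time base to another, rounding to nearest.
// Assumes values small enough that the products do not overflow; FFmpeg's
// av_rescale_q performs the same conversion with overflow protection.
std::int64_t rescale(std::int64_t ticks, TimeBase from, TimeBase to) {
    const std::int64_t numerator = ticks * from.num * to.den;
    const std::int64_t denominator = from.den * to.num;
    return (numerator + denominator / 2) / denominator;
}

// Per-stream state: create one instance for each output stream.
class TimestampRebaser {
public:
    TimestampRebaser(TimeBase in, TimeBase out) : in_(in), out_(out) {}

    // Takes a capture timestamp in the input time base and returns a dts in
    // the output time base that starts at zero and strictly increases.
    std::int64_t next(std::int64_t captureTs) {
        if (!haveFirst_) {
            first_ = captureTs;  // remember the device clock offset
            haveFirst_ = true;
        }
        std::int64_t dts = rescale(captureTs - first_, in_, out_);
        if (haveLast_ && dts <= last_) {
            dts = last_ + 1;  // two inputs rounded to the same tick: bump
        }
        last_ = dts;
        haveLast_ = true;
        return dts;
    }

private:
    TimeBase in_;
    TimeBase out_;
    std::int64_t first_ = 0;
    std::int64_t last_ = 0;
    bool haveFirst_ = false;
    bool haveLast_ = false;
};
```

In grabFrames, one such object per stream would be applied to pkt.dts and pkt.pts in place of the plain rescaling, just before the call to av_interleaved_write_frame.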

I recommend a few things to try, one at a time:

  • Try using av_write_frame instead of av_interleaved_write_frame.
  • Use a better container, like MP4 or MKV.
  • Do not try to mux the input packets into an AVI file. In grabFrames, take the raw video packets and dump them into a file. That should give you a file that is playable by ffplay. (You will probably have to specify the resolution, pixel format and format in your ffplay command.)
  • Did the above result in a playable video file? If yes, then I'd recommend that you decode the individual video packets, convert the colorspace and encode them using a common codec. (I recommend yuv420p in h264.) The FFmpeg code base has two examples that should be useful: demuxing_decoding.c and decoding_encoding.c. That should give you a proper video file. (Playable in most players.)

I don't know anything about DirectShow, and I don't know your use case, so my recommendations focus on the FFmpeg API. Some of them may be overkill or may not do what you want.
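
To make the "convert the colorspace" step concrete, here is a hand-written sketch of what a bgr24 to yuv420p conversion does. In an FFmpeg program this would normally be left to libswscale (sws_getContext and sws_scale) rather than written by hand. The names below are hypothetical, and the code assumes BT.601 limited-range coefficients, tightly packed rows with the top row first, and even width and height.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar YUV 4:2:0 image: full-resolution luma, chroma subsampled 2x2.
struct Yuv420pFrame {
    int width;
    int height;
    std::vector<std::uint8_t> y;  // width * height samples
    std::vector<std::uint8_t> u;  // (width / 2) * (height / 2) samples
    std::vector<std::uint8_t> v;  // (width / 2) * (height / 2) samples
};

// BT.601 limited-range integer approximations. The chroma formulas add
// 128 << 8 before shifting so the intermediate value is never negative.
inline std::uint8_t lumaOf(int r, int g, int b) {
    return static_cast<std::uint8_t>(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
}
inline std::uint8_t cbOf(int r, int g, int b) {
    return static_cast<std::uint8_t>((-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
}
inline std::uint8_t crOf(int r, int g, int b) {
    return static_cast<std::uint8_t>((112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8);
}

// Converts a packed BGR24 image to planar YUV 4:2:0.
// Luma is computed per pixel; chroma is computed once per 2x2 block
// from the rounded average colour of that block.
Yuv420pFrame bgr24ToYuv420p(const std::uint8_t *bgr, int width, int height) {
    const std::size_t lumaCount = static_cast<std::size_t>(width) * height;
    Yuv420pFrame out{width, height,
                     std::vector<std::uint8_t>(lumaCount),
                     std::vector<std::uint8_t>(lumaCount / 4),
                     std::vector<std::uint8_t>(lumaCount / 4)};
    for (int row = 0; row < height; row += 2) {
        for (int col = 0; col < width; col += 2) {
            int sumR = 0, sumG = 0, sumB = 0;
            for (int dy = 0; dy < 2; ++dy) {
                for (int dx = 0; dx < 2; ++dx) {
                    const std::size_t pixel =
                        static_cast<std::size_t>(row + dy) * width + (col + dx);
                    const int b = bgr[pixel * 3 + 0];
                    const int g = bgr[pixel * 3 + 1];
                    const int r = bgr[pixel * 3 + 2];
                    out.y[pixel] = lumaOf(r, g, b);
                    sumR += r;
                    sumG += g;
                    sumB += b;
                }
            }
            const std::size_t chroma =
                static_cast<std::size_t>(row / 2) * (width / 2) + col / 2;
            const int r = (sumR + 2) / 4;
            const int g = (sumG + 2) / 4;
            const int b = (sumB + 2) / 4;
            out.u[chroma] = cbOf(r, g, b);
            out.v[chroma] = crOf(r, g, b);
        }
    }
    return out;
}
```

The three planes are what an H.264 encoder expects as input for yuv420p; the encoding step itself is what decoding_encoding.c demonstrates.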



Tags: c++ c qt video ffmpeg