I'm developing an app that records video from a webcam and audio from a microphone. I've been using Qt, but unfortunately the camera module does not work on Windows, which led me to use FFmpeg to record the video/audio.
My camera module is now working well apart from a slight syncing problem. The audio and video sometimes end up out of sync by a small amount (less than a second, I'd say, although it might be worse with longer recordings).
When I encode the frames, I set the PTS in the following way, which I took from the muxing.c example (sketched below):
- For the video frames I increment the PTS by one per frame (starting at 0).
- For the audio frames I increment the PTS by the nb_samples of the audio frame (starting at 0).
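In other words, the fixed-increment scheme looks roughly like this (a simplified sketch; nextVideoPts and nextAudioPts stand in for my own counters):

// Video frame: the PTS counts frames, in the video codec time base (1/25 at 25 fps).
outFrame->pts = nextVideoPts++;

// Audio frame: the PTS counts samples, in the audio codec time base (1/sample_rate).
outFrame->pts = nextAudioPts;
nextAudioPts += outFrame->nb_samples;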
I am saving the file at 25 fps and asking the camera to give me 25 fps (which it can). I am also converting the video frames to the YUV420P format. For the audio frames I need an AVAudioFifo, because the microphone delivers frames with more samples than the MP4 stream's encoder accepts, so I have to split them into chunks. I used the transcode.c example for this.
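Roughly, the FIFO part looks like this (a simplified sketch following transcode.c; fifo, audioCodecContext and encodeAudioFrame stand in for my own members, and error handling is omitted):

// Queue the (converted) microphone samples.
av_audio_fifo_realloc(fifo, av_audio_fifo_size(fifo) + inFrame->nb_samples);
av_audio_fifo_write(fifo, (void **)inFrame->data, inFrame->nb_samples);

// Drain encoder-sized chunks (frame_size samples each).
while (av_audio_fifo_size(fifo) >= audioCodecContext->frame_size) {
    AVFrame *outFrame = av_frame_alloc();
    outFrame->nb_samples = audioCodecContext->frame_size;
    outFrame->channel_layout = audioCodecContext->channel_layout;
    outFrame->format = audioCodecContext->sample_fmt;
    outFrame->sample_rate = audioCodecContext->sample_rate;
    av_frame_get_buffer(outFrame, 0);
    av_audio_fifo_read(fifo, (void **)outFrame->data, outFrame->nb_samples);
    encodeAudioFrame(outFrame); // sets the PTS and sends the chunk to the encoder
    av_frame_free(&outFrame);
}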
I am out of ideas as to what I should do to sync the audio and video. Do I need to use a clock or something to correctly sync up the two streams?
The full code is too big to post here, but if necessary I can put it on GitHub, for example.
Here is the code for writing a frame:
int FFCapture::writeFrame(const AVRational *time_base, AVStream *stream, AVPacket *pkt) {
    /* Rescale output packet timestamp values from codec to stream time base. */
    av_packet_rescale_ts(pkt, *time_base, stream->time_base);
    pkt->stream_index = stream->index;
    /* Write the compressed frame to the media file. */
    return av_interleaved_write_frame(oFormatContext, pkt);
}
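For context, writeFrame() gets called from my encode loop, which looks roughly like this (simplified to the send/receive API; encCtx and stream stand in for my actual members):

AVPacket *pkt = av_packet_alloc();
if (avcodec_send_frame(encCtx, outFrame) == 0) {
    while (avcodec_receive_packet(encCtx, pkt) == 0) {
        writeFrame(&encCtx->time_base, stream, pkt);
        av_packet_unref(pkt);
    }
}
av_packet_free(&pkt);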
Code for getting the elapsed time (in milliseconds):
qint64 FFCapture::getElapsedTime(qint64 *previousTime) {
    qint64 newTime = timer.elapsed();
    /* Returns -1 when the clock has not advanced, to avoid handing out a duplicate timestamp. */
    if (newTime > *previousTime) {
        *previousTime = newTime;
        return newTime;
    }
    return -1;
}
Code for adding the PTS (video and audio stream, respectively):
// Video stream:
qint64 time = getElapsedTime(&previousVideoTime);
if (time >= 0) outFrame->pts = time;
//if (time >= 0) outFrame->pts = av_rescale_q(time, outStream.videoStream->codec->time_base, outStream.videoStream->time_base);

// Audio stream:
qint64 time = getElapsedTime(&previousAudioTime);
if (time >= 0) {
    AVRational aux;
    aux.num = 1;
    aux.den = 1000; // the elapsed time is in milliseconds
    outFrame->pts = time;
    //outFrame->pts = av_rescale_q(time, aux, outStream.audioStream->time_base);
}
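For reference, this is what I understand the commented-out rescaling above is meant to do, given that timer.elapsed() reports milliseconds (a 1/1000 time base); the audio case would be analogous:

// Express the elapsed milliseconds in the codec time base;
// writeFrame() later rescales from codec to stream time base.
AVRational msTimeBase = {1, 1000};
outFrame->pts = av_rescale_q(time, msTimeBase, outStream.videoStream->codec->time_base);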