I have MPEG-TS files on the device. I would like to cut a fairly exact amount of time off the start of the files, on-device.
Using FFmpegWrapper as a base, I'm hoping to achieve this.
I'm a little lost on the C API of ffmpeg, however. Where do I start?
I tried just dropping all packets prior to a start PTS I was looking for, but this broke the video stream.
packet->pts = av_rescale_q(packet->pts, inputStream.stream->time_base, outputStream.stream->time_base);
packet->dts = av_rescale_q(packet->dts, inputStream.stream->time_base, outputStream.stream->time_base);

if (startPts == 0) {
    startPts = packet->pts;
}

if (packet->pts < cutTimeStartPts + startPts) {
    av_free_packet(packet);
    continue;
}
How do I cut off part of the start of the input file without destroying the video stream? When played back to back, I want two cut segments to run together seamlessly.
ffmpeg -i time.ts -c:v libx264 -c:a copy -ss $CUT_POINT -map 0 -y after.ts
ffmpeg -i time.ts -c:v libx264 -c:a copy -to $CUT_POINT -map 0 -y before.ts
That seems to be what I need. I think the re-encode is needed so the video can start at an arbitrary point rather than at an existing keyframe. If there's a more efficient solution, that's great. If not, this is good enough.
EDIT: Here's my attempt. I'm cobbling together various pieces, copied from here, that I don't fully understand. I'm leaving off the "cutting" piece for now, to try to get audio + video encoded and written without layering on complexity. I get EXC_BAD_ACCESS on avcodec_encode_video2(...)
- (void)convertInputPath:(NSString *)inputPath outputPath:(NSString *)outputPath
                 options:(NSDictionary *)options progressBlock:(FFmpegWrapperProgressBlock)progressBlock
         completionBlock:(FFmpegWrapperCompletionBlock)completionBlock {
    dispatch_async(conversionQueue, ^{
        FFInputFile *inputFile = nil;
        FFOutputFile *outputFile = nil;
        NSError *error = nil;

        inputFile = [[FFInputFile alloc] initWithPath:inputPath options:options];
        outputFile = [[FFOutputFile alloc] initWithPath:outputPath options:options];

        [self setupDirectStreamCopyFromInputFile:inputFile outputFile:outputFile];
        if (![outputFile openFileForWritingWithError:&error]) {
            [self finishWithSuccess:NO error:error completionBlock:completionBlock];
            return;
        }
        if (![outputFile writeHeaderWithError:&error]) {
            [self finishWithSuccess:NO error:error completionBlock:completionBlock];
            return;
        }

        AVRational default_timebase;
        default_timebase.num = 1;
        default_timebase.den = AV_TIME_BASE;
        FFStream *outputVideoStream = outputFile.streams[0];
        FFStream *inputVideoStream = inputFile.streams[0];

        AVFrame *frame;
        AVPacket inPacket, outPacket;

        frame = avcodec_alloc_frame();
        av_init_packet(&inPacket);
        while (av_read_frame(inputFile.formatContext, &inPacket) >= 0) {
            if (inPacket.stream_index == 0) {
                int frameFinished;
                avcodec_decode_video2(inputVideoStream.stream->codec, frame, &frameFinished, &inPacket);
                // if (frameFinished && frame->pkt_pts >= starttime_int64 && frame->pkt_pts <= endtime_int64) {
                if (frameFinished) {
                    av_init_packet(&outPacket);
                    int output;
                    avcodec_encode_video2(outputVideoStream.stream->codec, &outPacket, frame, &output);
                    if (output) {
                        if (av_write_frame(outputFile.formatContext, &outPacket) != 0) {
                            fprintf(stderr, "convert(): error while writing video frame\n");
                            [self finishWithSuccess:NO error:nil completionBlock:completionBlock];
                        }
                    }
                    av_free_packet(&outPacket);
                }
                if (frame->pkt_pts > endtime_int64) {
                    break;
                }
            }
        }
        av_free_packet(&inPacket);

        if (![outputFile writeTrailerWithError:&error]) {
            [self finishWithSuccess:NO error:error completionBlock:completionBlock];
            return;
        }

        [self finishWithSuccess:YES error:nil completionBlock:completionBlock];
    });
}
The FFmpeg (libavformat/libavcodec, in this case) API maps the ffmpeg.exe command-line arguments pretty closely. To open a file, use avformat_open_input(). The last two arguments can be NULL. This fills in the AVFormatContext for you. Now you start reading frames using av_read_frame() in a loop. pkt.stream_index will tell you which stream each packet belongs to, and the AVFormatContext's streams[pkt.stream_index] is the accompanying stream information, which tells you what codec it uses, whether it's video/audio, etc. Use avformat_close_input() to shut down.
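A minimal sketch of that demuxing loop (my own variable names; most error handling omitted):

AVFormatContext *fmt = NULL;
if (avformat_open_input(&fmt, "input.ts", NULL, NULL) < 0)
    return -1;
avformat_find_stream_info(fmt, NULL);

AVPacket pkt;
av_init_packet(&pkt);
while (av_read_frame(fmt, &pkt) >= 0) {
    AVStream *st = fmt->streams[pkt.stream_index];
    // st->codec->codec_type is AVMEDIA_TYPE_VIDEO, AVMEDIA_TYPE_AUDIO, ...
    av_free_packet(&pkt);
}
avformat_close_input(&fmt);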
For muxing, you use the inverse; see the muxing example for details. Basically it's: allocate the context, avio_open2(), add streams for each existing stream in the input file (basically context->streams[]), avformat_write_header(), av_interleaved_write_frame() in a loop, and av_write_trailer() to shut down (and free the allocated context at the end).
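A rough sketch of that sequence (assuming in is your already-opened input AVFormatContext; real code also needs to copy the stream parameters across):

AVFormatContext *out = avformat_alloc_context();
out->oformat = av_guess_format(NULL, "output.ts", NULL);
avio_open2(&out->pb, "output.ts", AVIO_FLAG_WRITE, NULL, NULL);
for (int n = 0; n < in->nb_streams; n++)
    avformat_new_stream(out, in->streams[n]->codec->codec);
avformat_write_header(out, NULL);
// ... av_interleaved_write_frame(out, &pkt) for each packet ...
av_write_trailer(out);
avformat_free_context(out);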
Encoding/decoding of the video stream(s) is done using libavcodec. For each AVPacket you get from the demuxer, use avcodec_decode_video2(). Use avcodec_encode_video2() for encoding of the output AVFrame. Note that both will introduce delay, so the first few calls to each function will not return any data, and you need to flush that cached data by calling each function with NULL input data to get the tail packets/frames out of it. av_interleaved_write_frame() will interleave packets correctly so the video/audio streams will not desync (as in: video packets with a given timestamp occurring megabytes after audio packets in the .ts file).
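A minimal sketch of that flushing pattern at end-of-input (assuming decoderCtx and frame are your already-allocated decoder context and frame; the encoder flush is analogous, passing NULL as the frame):

// flushing the decoder at EOF: feed it an empty packet until it
// stops returning frames
AVPacket flushPkt;
av_init_packet(&flushPkt);
flushPkt.data = NULL;
flushPkt.size = 0;
int gotFrame;
do {
    avcodec_decode_video2(decoderCtx, frame, &gotFrame, &flushPkt);
    // ... if gotFrame, send the frame on to the encoder ...
} while (gotFrame);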
If you need more detailed examples for avcodec_decode_video2, avcodec_encode_video2, av_read_frame or av_interleaved_write_frame, just Google "$function example" and you'll see full-fledged examples showing how to use them correctly. For x264 encoding, set some default parameters in the AVCodecContext when calling avcodec_open2 for encoding quality settings. In the C API, you do that using AVDictionary, e.g.:
AVDictionary *opts = NULL;
av_dict_set(&opts, "preset", "veryslow", 0);
// use either crf or b, not both! See the link above on H264 encoding options
av_dict_set_int(&opts, "b", 1000, 0);
av_dict_set_int(&opts, "crf", 10, 0);
[edit] Oh, I forgot one part: the timestamping. Each AVPacket and AVFrame has a pts variable in its struct, and you can use that to decide whether to include the packet/frame in the output stream. So for audio, you'd use AVPacket.pts from the demuxing step as a delimiter, and for video, you'd use AVFrame.pts from the decoding step as a delimiter. Their respective documentation tells you what unit they are in.
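To make that concrete, here's a sketch of a cut test inside the decode loop (cutStartSeconds and inputStream are my own, assumed names): convert the cut point into the stream's time base with av_rescale_q(), then skip anything earlier.

// assumed: cutStartSeconds is the desired cut point in seconds and
// inputStream is the AVStream the packet came from
int64_t cutPts = av_rescale_q((int64_t)(cutStartSeconds * AV_TIME_BASE),
                              AV_TIME_BASE_Q, inputStream->time_base);
if (frame->pkt_pts != AV_NOPTS_VALUE && frame->pkt_pts < cutPts)
    continue; // decoded for reference, but not encoded into the output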
[edit2] I see you're still having some issues without actual code, so here's a real (working) transcoder which re-codes video and re-muxes audio. It probably has tons of bugs and leaks and lacks proper error reporting; it also doesn't deal with timestamps (I'm leaving that to you as an exercise), but it does the basic things that you asked for:
#include <stdio.h>
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
static AVFormatContext *inctx, *outctx;
#define MAX_STREAMS 16
static AVCodecContext *inavctx[MAX_STREAMS];
static AVCodecContext *outavctx[MAX_STREAMS];
static int openInputFile(const char *file) {
    int res;

    inctx = NULL;
    res = avformat_open_input(&inctx, file, NULL, NULL);
    if (res != 0)
        return res;
    res = avformat_find_stream_info(inctx, NULL);
    if (res < 0)
        return res;

    return 0;
}
static void closeInputFile(void) {
    int n;

    for (n = 0; n < inctx->nb_streams; n++)
        if (inavctx[n]) {
            avcodec_close(inavctx[n]);
            avcodec_free_context(&inavctx[n]);
        }
    avformat_close_input(&inctx);
}
static int openOutputFile(const char *file) {
    int res, n;

    outctx = avformat_alloc_context();
    outctx->oformat = av_guess_format(NULL, file, NULL);
    if ((res = avio_open2(&outctx->pb, file, AVIO_FLAG_WRITE, NULL, NULL)) < 0)
        return res;

    for (n = 0; n < inctx->nb_streams; n++) {
        AVStream *inst = inctx->streams[n];
        AVCodecContext *inc = inst->codec;

        if (inc->codec_type == AVMEDIA_TYPE_VIDEO) {
            // video decoder
            inavctx[n] = avcodec_alloc_context3(inc->codec);
            avcodec_copy_context(inavctx[n], inc);
            if ((res = avcodec_open2(inavctx[n], avcodec_find_decoder(inc->codec_id), NULL)) < 0)
                return res;

            // video encoder
            AVCodec *encoder = avcodec_find_encoder_by_name("libx264");
            AVStream *outst = avformat_new_stream(outctx, encoder);
            outst->codec->width = inavctx[n]->width;
            outst->codec->height = inavctx[n]->height;
            outst->codec->pix_fmt = inavctx[n]->pix_fmt;
            AVDictionary *dict = NULL;
            av_dict_set(&dict, "preset", "veryslow", 0);
            av_dict_set_int(&dict, "crf", 10, 0);
            outavctx[n] = avcodec_alloc_context3(encoder);
            avcodec_copy_context(outavctx[n], outst->codec);
            if ((res = avcodec_open2(outavctx[n], encoder, &dict)) < 0)
                return res;
        } else if (inc->codec_type == AVMEDIA_TYPE_AUDIO) {
            avformat_new_stream(outctx, inc->codec);
            inavctx[n] = outavctx[n] = NULL;
        } else {
            fprintf(stderr, "Don't know what to do with stream %d\n", n);
            return -1;
        }
    }

    if ((res = avformat_write_header(outctx, NULL)) < 0)
        return res;

    return 0;
}
static void closeOutputFile(void) {
    int n;

    av_write_trailer(outctx);
    for (n = 0; n < outctx->nb_streams; n++)
        if (outctx->streams[n]->codec)
            avcodec_close(outctx->streams[n]->codec);
    avformat_free_context(outctx);
}
static int encodeFrame(int stream_index, AVFrame *frame, int *gotOutput) {
    AVPacket outPacket;
    int res;

    av_init_packet(&outPacket);
    if ((res = avcodec_encode_video2(outavctx[stream_index], &outPacket, frame, gotOutput)) < 0) {
        fprintf(stderr, "Failed to encode frame\n");
        return res;
    }
    if (*gotOutput) {
        outPacket.stream_index = stream_index;
        if ((res = av_interleaved_write_frame(outctx, &outPacket)) < 0) {
            fprintf(stderr, "Failed to write packet\n");
            return res;
        }
    }
    av_free_packet(&outPacket);

    return 0;
}
static int decodePacket(int stream_index, AVPacket *pkt, AVFrame *frame, int *frameFinished) {
    int res;

    if ((res = avcodec_decode_video2(inavctx[stream_index], frame,
                                     frameFinished, pkt)) < 0) {
        fprintf(stderr, "Failed to decode frame\n");
        return res;
    }
    if (*frameFinished) {
        int hasOutput;

        frame->pts = frame->pkt_pts;
        return encodeFrame(stream_index, frame, &hasOutput);
    } else {
        return 0;
    }
}
int main(int argc, char *argv[]) {
    char *input = argv[1];
    char *output = argv[2];
    int res, n;

    printf("Converting %s to %s\n", input, output);
    av_register_all();
    if ((res = openInputFile(input)) < 0) {
        fprintf(stderr, "Failed to open input file %s\n", input);
        return res;
    }
    if ((res = openOutputFile(output)) < 0) {
        fprintf(stderr, "Failed to open output file %s\n", output);
        return res;
    }

    AVFrame *frame = av_frame_alloc();
    AVPacket inPacket;

    av_init_packet(&inPacket);
    while (av_read_frame(inctx, &inPacket) >= 0) {
        if (inavctx[inPacket.stream_index] != NULL) {
            // video: decode, then re-encode
            int frameFinished;
            if ((res = decodePacket(inPacket.stream_index, &inPacket, frame, &frameFinished)) < 0) {
                return res;
            }
        } else {
            // audio: remux the packet as-is
            if ((res = av_interleaved_write_frame(outctx, &inPacket)) < 0) {
                fprintf(stderr, "Failed to write packet\n");
                return res;
            }
        }
    }

    for (n = 0; n < inctx->nb_streams; n++) {
        if (inavctx[n]) {
            // flush decoder
            int frameFinished;
            do {
                inPacket.data = NULL;
                inPacket.size = 0;
                if ((res = decodePacket(n, &inPacket, frame, &frameFinished)) < 0)
                    return res;
            } while (frameFinished);

            // flush encoder
            int gotOutput;
            do {
                if ((res = encodeFrame(n, NULL, &gotOutput)) < 0)
                    return res;
            } while (gotOutput);
        }
    }
    av_free_packet(&inPacket);

    closeInputFile();
    closeOutputFile();

    return 0;
}
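To build it standalone, something along these lines should work (assuming pkg-config can find your FFmpeg install, and that you saved the file as transcode.c):

gcc transcode.c -o transcode $(pkg-config --cflags --libs libavformat libavcodec libavutil)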
Check out the accepted answer of this question.
In short, you can use:
ffmpeg -i time.ts -c:v libx264 -c:a copy -ss $CUT_POINT -map 0 -y after.ts
ffmpeg -i time.ts -c:v libx264 -c:a copy -to $CUT_POINT -map 0 -y before.ts
Just for reference, the accepted answer of that question is:
How do I split and join files using ffmpeg while retaining all audio tracks?
As you have discovered, a bitstream copy will select only one (audio) track, as per the stream specification documentation:
By default, ffmpeg includes only one stream of each type (video, audio, subtitle) present in the input files and adds them to each output file. It picks the "best" of each based upon the following criteria: for video, it is the stream with the highest resolution; for audio, it is the stream with the most channels; for subtitles, it is the first subtitle stream. In the case where several streams of the same type rate equally, the stream with the lowest index is chosen.
To select all audio tracks:
ffmpeg -i InputFile.ts -c copy -ss 00:12:34.567 -t 00:34:56.789 -map 0:v -map 0:a FirstFile.ts
To select the third audio track:
ffmpeg -i InputFile.ts -c copy -ss 00:12:34.567 -t 00:34:56.789 -map 0:v -map 0:a:2 FirstFile.ts
You can read more about, and see other examples of, stream selection in the advanced options section of the ffmpeg documentation.
I would also combine -vcodec copy -acodec copy from your original command into -c copy, as above, for compactness of expression.
Split:
So, combining those with what you want to achieve for the two files, in terms of splitting for later re-joining:
ffmpeg -i InputOne.ts -ss 00:02:00.0 -c copy -map 0:v -map 0:a OutputOne.ts
ffmpeg -i InputTwo.ts -c copy -t 00:03:05.0 -map 0:v -map 0:a OutputTwo.ts
will give you:
OutputOne.ts, which is everything after the first two minutes of the first input file
OutputTwo.ts, which is the first 3 minutes and 5 seconds of the second input file
Join:
ffmpeg
supports concatenation of files without re-encoding, described extensively in its concatenation documentation.
Create your listing of files to be joined (e.g. join.txt):
file '/path/to/files/OutputOne.ts'
file '/path/to/files/OutputTwo.ts'
Then your ffmpeg command can use the concat demuxer:
ffmpeg -f concat -i join.txt -c copy FinalOutput.ts
Since you are working with MPEG transport streams (.ts), you should be able to use the concat protocol as well:
ffmpeg -i "concat:OutputOne.ts|OutputTwo.ts" -c copy -bsf:a aac_adtstoasc output.mp4
That follows the example on the concat page linked above; I'll leave that for you to experiment with.