Folks, I have the following ffmpeg command:
ffmpeg
-i video1a -i video2a -i video3a -i video4a
-i video1b -i video2b -i video3b -i video4b
-i video1c
-filter_complex "
nullsrc=size=640x480 [base];
[0:v] setpts=PTS-STARTPTS+ 0/TB, scale=320x240 [1a];
[1:v] setpts=PTS-STARTPTS+ 300/TB, scale=320x240 [2a];
[2:v] setpts=PTS-STARTPTS+ 400/TB, scale=320x240 [3a];
[3:v] setpts=PTS-STARTPTS+ 400/TB, scale=320x240 [4a];
[4:v] setpts=PTS-STARTPTS+2500/TB, scale=320x240 [1b];
[5:v] setpts=PTS-STARTPTS+ 800/TB, scale=320x240 [2b];
[6:v] setpts=PTS-STARTPTS+ 700/TB, scale=320x240 [3b];
[7:v] setpts=PTS-STARTPTS+ 800/TB, scale=320x240 [4b];
[8:v] setpts=PTS-STARTPTS+3000/TB, scale=320x240 [1c];
[base][1a] overlay=eof_action=pass [o1];
[o1][1b] overlay=eof_action=pass [o1];
[o1][1c] overlay=eof_action=pass:shortest=1 [o1];
[o1][2a] overlay=eof_action=pass:x=320 [o2];
[o2][2b] overlay=eof_action=pass:x=320 [o2];
[o2][3a] overlay=eof_action=pass:y=240 [o3];
[o3][3b] overlay=eof_action=pass:y=240 [o3];
[o3][4a] overlay=eof_action=pass:x=320:y=240[o4];
[o4][4b] overlay=eof_action=pass:x=320:y=240"
-c:v libx264 output.mp4
I have just found out something about the files I will be processing with the above command: some mp4 files contain both video and audio, some contain audio only, and some contain video only. I can already determine which is which using ffprobe. My question is how to modify the command above to account for what each file contains (video/audio/both).
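Since ffprobe is already in the picture, here is one way to turn its output into that classification (a sketch; the `classify_streams` helper is an illustrative name of my own, not an ffmpeg tool):

```shell
# Hedged sketch: classify a file as video/audio/both from its stream types.
# classify_streams takes the codec_type lines that
#   ffprobe -v error -show_entries stream=codec_type -of csv=p=0 FILE
# would print (one line per stream, e.g. "video" then "audio").
classify_streams() {
  types="$1"
  has_v=0; has_a=0
  case "$types" in *video*) has_v=1;; esac
  case "$types" in *audio*) has_a=1;; esac
  if [ "$has_v" -eq 1 ] && [ "$has_a" -eq 1 ]; then echo both
  elif [ "$has_v" -eq 1 ]; then echo video
  elif [ "$has_a" -eq 1 ]; then echo audio
  else echo none
  fi
}
```

In use you would feed it the real probe output, e.g. `classify_streams "$(ffprobe -v error -show_entries stream=codec_type -of csv=p=0 video1a.mp4)"`.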
This is the scenario of which file has video/audio/both:
video    contains
=======  ========
Area 1:
video1a audio
video1b both
video1c video
Area 2:
video2a video
video2b audio
Area 3:
video3a video
video3b audio
Area 4:
video4a video
video4b both
My question is how to correctly modify the command above to specify what each file contains (audio/video/both). Thank you.
Update #1
I ran a test as follows:
-i "video1a.flv"
-i "video1b.flv"
-i "video1c.flv"
-i "video2a.flv"
-i "video3a.flv"
-i "video4a.flv"
-i "video4b.flv"
-i "video4c.flv"
-i "video4d.flv"
-i "video4e.flv"
-filter_complex
nullsrc=size=640x480[base];
[0:v]setpts=PTS-STARTPTS+120/TB,scale=320x240[1a];
[1:v]setpts=PTS-STARTPTS+3469115/TB,scale=320x240[1b];
[2:v]setpts=PTS-STARTPTS+7739299/TB,scale=320x240[1c];
[5:v]setpts=PTS-STARTPTS+4390466/TB,scale=320x240[4a];
[6:v]setpts=PTS-STARTPTS+6803937/TB,scale=320x240[4b];
[7:v]setpts=PTS-STARTPTS+8242005/TB,scale=320x240[4c];
[8:v]setpts=PTS-STARTPTS+9811577/TB,scale=320x240[4d];
[9:v]setpts=PTS-STARTPTS+10765190/TB,scale=320x240[4e];
[base][1a]overlay=eof_action=pass[o1];
[o1][1b]overlay=eof_action=pass[o1];
[o1][1c]overlay=eof_action=pass:shortest=1[o1];
[o1][4a]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4b]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4c]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4d]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4e]overlay=eof_action=pass:x=320:y=240;
[0:a]asetpts=PTS-STARTPTS+120/TB,aresample=async=1,apad[a1a];
[1:a]asetpts=PTS-STARTPTS+3469115/TB,aresample=async=1,apad[a1b];
[2:a]asetpts=PTS-STARTPTS+7739299/TB,aresample=async=1[a1c];
[3:a]asetpts=PTS-STARTPTS+82550/TB,aresample=async=1,apad[a2a];
[4:a]asetpts=PTS-STARTPTS+2687265/TB,aresample=async=1,apad[a3a];
[a1a][a1b][a1c][a2a][a3a]amerge=inputs=5
-c:v libx264 -c:a aac -ac 2 output.mp4
This is the stream data from ffmpeg:
Input #0
Stream #0:0: Video: vp6f, yuv420p, 160x128, 1k tbr, 1k tbn
Stream #0:1: Audio: nellymoser, 11025 Hz, mono, flt
Input #1
Stream #1:0: Audio: nellymoser, 11025 Hz, mono, flt
Stream #1:1: Video: vp6f, yuv420p, 160x128, 1k tbr, 1k tbn
Input #2
Stream #2:0: Audio: nellymoser, 11025 Hz, mono, flt
Stream #2:1: Video: vp6f, yuv420p, 160x128, 1k tbr, 1k tbn
Input #3
Stream #3:0: Audio: nellymoser, 11025 Hz, mono, flt
Input #4
Stream #4:0: Audio: nellymoser, 11025 Hz, mono, flt
Input #5
Stream #5:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn
Input #6
Stream #6:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn
Input #7
Stream #7:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn
Input #8
Stream #8:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn
Input #9
Stream #9:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn
This is the error:
Stream mapping:
Stream #0:0 (vp6f) -> setpts
Stream #0:1 (nellymoser) -> asetpts
Stream #1:0 (nellymoser) -> asetpts
Stream #1:1 (vp6f) -> setpts
Stream #2:0 (nellymoser) -> asetpts
Stream #2:1 (vp6f) -> setpts
Stream #3:0 (nellymoser) -> asetpts
Stream #4:0 (nellymoser) -> asetpts
Stream #5:0 (vp6f) -> setpts
Stream #6:0 (vp6f) -> setpts
Stream #7:0 (vp6f) -> setpts
Stream #8:0 (vp6f) -> setpts
Stream #9:0 (vp6f) -> setpts
overlay -> Stream #0:0 (libx264)
amerge -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
Enter command: <target>|all <time>|-1 <command>[ <argument>]
Parse error, at least 3 arguments were expected, only 1 given in string 'ho Oscar'
[Parsed_amerge_39 @ 0aa147c0] No channel layout for input 1
Last message repeated 1 times
[AVFilterGraph @ 05e01900] The following filters could not choose their formats: Parsed_amerge_39
Consider inserting the (a)format filter near their input or output.
Error reinitializing filters!
Failed to inject frame into filter network: I/O error
Error while processing the decoded data for stream #4:0
Conversion failed!
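The key failure here is `No channel layout for input 1`: the mono nellymoser streams carry no declared channel layout, so amerge cannot choose its formats. As the log itself suggests, one fix (a sketch, shown only on the `[1:a]` chain; the same change would be repeated on each audio chain) is to insert aformat before apad/amerge:

```
[1:a]asetpts=PTS-STARTPTS+3469115/TB,aresample=async=1,aformat=channel_layouts=mono,apad[a1b];
```

The `pan=1c|c0=c0` mapping tried in Update #2 achieves the same effect by forcing a one-channel output with a defined layout.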
Update #2
Would it be like this?
-i "video1a.flv"
-i "video1b.flv"
-i "video1c.flv"
-i "video2a.flv"
-i "video3a.flv"
-i "video4a.flv"
-i "video4b.flv"
-i "video4c.flv"
-i "video4d.flv"
-i "video4e.flv"
-filter_complex
nullsrc=size=640x480[base];
[0:v]setpts=PTS-STARTPTS+120/TB,scale=320x240[1a];
[1:v]setpts=PTS-STARTPTS+3469115/TB,scale=320x240[1b];
[2:v]setpts=PTS-STARTPTS+7739299/TB,scale=320x240[1c];
[5:v]setpts=PTS-STARTPTS+4390466/TB,scale=320x240[4a];
[6:v]setpts=PTS-STARTPTS+6803937/TB,scale=320x240[4b];
[7:v]setpts=PTS-STARTPTS+8242005/TB,scale=320x240[4c];
[8:v]setpts=PTS-STARTPTS+9811577/TB,scale=320x240[4d];
[9:v]setpts=PTS-STARTPTS+10765190/TB,scale=320x240[4e];
[base][1a]overlay=eof_action=pass[o1];
[o1][1b]overlay=eof_action=pass[o1];
[o1][1c]overlay=eof_action=pass:shortest=1[o1];
[o1][4a]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4b]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4c]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4d]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4e]overlay=eof_action=pass:x=320:y=240;
[0:a]asetpts=PTS-STARTPTS+120/TB,aresample=async=1,pan=1c|c0=c0,apad[a1a];
[1:a]asetpts=PTS-STARTPTS+3469115/TB,aresample=async=1,pan=1c|c0=c0,apad[a1b];
[2:a]asetpts=PTS-STARTPTS+7739299/TB,aresample=async=1,pan=1c|c0=c0[a1c];
[3:a]asetpts=PTS-STARTPTS+82550/TB,aresample=async=1,pan=1c|c0=c0,apad[a2a];
[4:a]asetpts=PTS-STARTPTS+2687265/TB,aresample=async=1,pan=1c|c0=c0,apad[a3a];
[a1a][a1b][a1c][a2a][a3a]amerge=inputs=5
-c:v libx264 -c:a aac -ac 2 output.mp4
Update #3
Now I am getting this error:
Stream mapping:
Stream #0:0 (vp6f) -> setpts
Stream #0:1 (nellymoser) -> asetpts
Stream #1:0 (nellymoser) -> asetpts
Stream #1:1 (vp6f) -> setpts
Stream #2:0 (nellymoser) -> asetpts
Stream #2:1 (vp6f) -> setpts
Stream #3:0 (nellymoser) -> asetpts
Stream #4:0 (nellymoser) -> asetpts
Stream #5:0 (vp6f) -> setpts
Stream #6:0 (vp6f) -> setpts
Stream #7:0 (vp6f) -> setpts
Stream #8:0 (vp6f) -> setpts
Stream #9:0 (vp6f) -> setpts
overlay -> Stream #0:0 (libx264)
amerge -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
Enter command: <target>|all <time>|-1 <command>[ <argument>]
Parse error, at least 3 arguments were expected, only 1 given in string 'ho Oscar'
[Parsed_amerge_44 @ 0a9808c0] No channel layout for input 1
[Parsed_amerge_44 @ 0a9808c0] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
[Parsed_pan_27 @ 07694800] Pure channel mapping detected: 0
[Parsed_pan_31 @ 07694a80] Pure channel mapping detected: 0
[Parsed_pan_35 @ 0a980300] Pure channel mapping detected: 0
[Parsed_pan_38 @ 0a980500] Pure channel mapping detected: 0
[Parsed_pan_42 @ 0a980780] Pure channel mapping detected: 0
[libx264 @ 06ad78c0] using SAR=1/1
[libx264 @ 06ad78c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 06ad78c0] profile High, level 3.0
[libx264 @ 06ad78c0] 264 - core 155 r2901 7d0ff22 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=15 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
Metadata:
canSeekToEnd : false
encoder : Lavf58.16.100
Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 640x480 [SAR 1:1 DAR 4:3], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
Metadata:
encoder : Lavc58.19.102 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 11025 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
encoder : Lavc58.19.102 aac
...
...
Error while processing the decoded data for stream #1:1
[libx264 @ 06ad78c0] frame I:133 Avg QP: 8.58 size: 6481
[libx264 @ 06ad78c0] frame P:8358 Avg QP:17.54 size: 1386
[libx264 @ 06ad78c0] frame B:24582 Avg QP:24.27 size: 105
[libx264 @ 06ad78c0] consecutive B-frames: 0.6% 0.5% 0.7% 98.1%
[libx264 @ 06ad78c0] mb I I16..4: 78.3% 16.1% 5.6%
[libx264 @ 06ad78c0] mb P I16..4: 0.3% 0.7% 0.1% P16..4: 9.6% 3.0% 1.4% 0.0% 0.0% skip:84.9%
[libx264 @ 06ad78c0] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 2.9% 0.1% 0.0% direct: 0.2% skip:96.8% L0:47.0% L1:49.0% BI: 4.0%
[libx264 @ 06ad78c0] 8x8 transform intra:35.0% inter:70.1%
[libx264 @ 06ad78c0] coded y,uvDC,uvAC intra: 36.8% 43.7% 27.3% inter: 1.6% 3.0% 0.1%
[libx264 @ 06ad78c0] i16 v,h,dc,p: 79% 8% 4% 9%
[libx264 @ 06ad78c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 32% 20% 12% 3% 6% 8% 6% 5% 7%
[libx264 @ 06ad78c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 38% 22% 9% 4% 6% 7% 5% 5% 4%
[libx264 @ 06ad78c0] i8c dc,h,v,p: 62% 15% 16% 7%
[libx264 @ 06ad78c0] Weighted P-Frames: Y:0.6% UV:0.5%
[libx264 @ 06ad78c0] ref P L0: 65.4% 12.3% 14.3% 7.9% 0.0%
[libx264 @ 06ad78c0] ref B L0: 90.2% 7.5% 2.3%
[libx264 @ 06ad78c0] ref B L1: 96.3% 3.7%
[libx264 @ 06ad78c0] kb/s:90.81
[aac @ 06ad8480] Qavg: 65519.970
[aac @ 06ad8480] 2 frames left in the queue on closing
Conversion failed!
Use this approach:
For each video stream, apply the timestamp (setpts) and scale filters, then overlay it onto the base.
For each audio stream, apply asetpts for the time offset, then aresample with async=1 to insert silence up to the start time, then apad to extend the end of the audio with silence. Skip the apad for the audio stream that ends last: amerge joins all processed audio streams, and its output ends when that last audio stream ends.
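Applied to the original Area 1-4 scenario, that gives something like the following (a sketch, not a tested command: the offsets are carried over from the first command; `aformat=channel_layouts=mono` is added defensively for streams with undeclared layouts such as the nellymoser ones; video1a, video2b and video3b no longer feed video chains and video1c has no audio chain; and I am assuming video1b's audio ends last, so its apad is skipped — move the skip to whichever audio stream actually ends last):

```shell
ffmpeg \
-i video1a -i video2a -i video3a -i video4a \
-i video1b -i video2b -i video3b -i video4b \
-i video1c \
-filter_complex "
nullsrc=size=640x480 [base];
[1:v] setpts=PTS-STARTPTS+ 300/TB, scale=320x240 [2a];
[2:v] setpts=PTS-STARTPTS+ 400/TB, scale=320x240 [3a];
[3:v] setpts=PTS-STARTPTS+ 400/TB, scale=320x240 [4a];
[4:v] setpts=PTS-STARTPTS+2500/TB, scale=320x240 [1b];
[7:v] setpts=PTS-STARTPTS+ 800/TB, scale=320x240 [4b];
[8:v] setpts=PTS-STARTPTS+3000/TB, scale=320x240 [1c];
[base][1b] overlay=eof_action=pass [o1];
[o1][1c] overlay=eof_action=pass:shortest=1 [o1];
[o1][2a] overlay=eof_action=pass:x=320 [o2];
[o2][3a] overlay=eof_action=pass:y=240 [o3];
[o3][4a] overlay=eof_action=pass:x=320:y=240 [o4];
[o4][4b] overlay=eof_action=pass:x=320:y=240 [v];
[0:a] asetpts=PTS-STARTPTS+   0/TB, aresample=async=1, aformat=channel_layouts=mono, apad [a1a];
[4:a] asetpts=PTS-STARTPTS+2500/TB, aresample=async=1, aformat=channel_layouts=mono [a1b];
[5:a] asetpts=PTS-STARTPTS+ 800/TB, aresample=async=1, aformat=channel_layouts=mono, apad [a2b];
[6:a] asetpts=PTS-STARTPTS+ 700/TB, aresample=async=1, aformat=channel_layouts=mono, apad [a3b];
[7:a] asetpts=PTS-STARTPTS+ 800/TB, aresample=async=1, aformat=channel_layouts=mono, apad [a4b];
[a1a][a1b][a2b][a3b][a4b] amerge=inputs=5 [a]" \
-map "[v]" -map "[a]" \
-c:v libx264 -c:a aac -ac 2 output.mp4
```

Labeling the final overlay and amerge outputs and selecting them with `-map` keeps the output streams explicit, which is easier to debug than relying on default stream selection when inputs have mixed stream types.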