Specifying audio/video for a multiple stream/multi-input ffmpeg command

Posted 2019-08-27 13:27

Folks, I have the following ffmpeg command:

ffmpeg
    -i video1a -i video2a -i video3a -i video4a
    -i video1b -i video2b -i video3b -i video4b
    -i video1c
    -filter_complex "
        nullsrc=size=640x480 [base];
        [0:v] setpts=PTS-STARTPTS+   0/TB, scale=320x240 [1a];
        [1:v] setpts=PTS-STARTPTS+ 300/TB, scale=320x240 [2a];
        [2:v] setpts=PTS-STARTPTS+ 400/TB, scale=320x240 [3a];
        [3:v] setpts=PTS-STARTPTS+ 400/TB, scale=320x240 [4a];
        [4:v] setpts=PTS-STARTPTS+2500/TB, scale=320x240 [1b];
        [5:v] setpts=PTS-STARTPTS+ 800/TB, scale=320x240 [2b];
        [6:v] setpts=PTS-STARTPTS+ 700/TB, scale=320x240 [3b];
        [7:v] setpts=PTS-STARTPTS+ 800/TB, scale=320x240 [4b];
        [8:v] setpts=PTS-STARTPTS+3000/TB, scale=320x240 [1c];
        [base][1a] overlay=eof_action=pass [o1];
        [o1][1b] overlay=eof_action=pass [o1];
        [o1][1c] overlay=eof_action=pass:shortest=1 [o1];
        [o1][2a] overlay=eof_action=pass:x=320 [o2];
        [o2][2b] overlay=eof_action=pass:x=320 [o2];
        [o2][3a] overlay=eof_action=pass:y=240 [o3];
        [o3][3b] overlay=eof_action=pass:y=240 [o3];
        [o3][4a] overlay=eof_action=pass:x=320:y=240[o4];
        [o4][4b] overlay=eof_action=pass:x=320:y=240"
    -c:v libx264 output.mp4

I have just found out something about the files I will be processing with the above command: some of the mp4 files contain both video and audio, some contain audio only, and some contain video only. I can already determine which streams each file has using ffprobe. My question is how to modify the above command to account for what each file contains (video, audio, or both).
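For reference, this is roughly how I do the check with ffprobe (a sketch; `classify_types` is a helper name made up here, not part of ffmpeg):

```shell
# Classify a file as audio / video / both from its stream types.
# Assumes ffprobe is on PATH.
classify_types() {
  # Reads ffprobe's codec_type lines on stdin, prints the category.
  # sort -u makes the audio/video order deterministic for the case match.
  case "$(sort -u | tr '\n' ' ')" in
    *audio*video*) echo both ;;
    *audio*)       echo audio ;;
    *video*)       echo video ;;
    *)             echo none ;;
  esac
}

# Usage (commented out so the sketch stands alone):
# ffprobe -v error -show_entries stream=codec_type -of csv=p=0 video1a.mp4 \
#   | classify_types
```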

This is which streams each file contains:

file    streams
======= =========
Area 1:
video1a    audio
video1b     both
video1c    video

Area 2:
video2a    video
video2b    audio

Area 3:
video3a    video
video3b    audio

Area 4:
video4a    video
video4b    both

Again, my question is how to correctly modify the command above to account for which streams each file has (audio, video, or both). Thank you.

Update #1

I ran a test as follows:

-i "video1a.flv"
-i "video1b.flv"
-i "video1c.flv"
-i "video2a.flv"
-i "video3a.flv"
-i "video4a.flv"
-i "video4b.flv"
-i "video4c.flv"
-i "video4d.flv"
-i "video4e.flv"

-filter_complex 

nullsrc=size=640x480[base];
[0:v]setpts=PTS-STARTPTS+120/TB,scale=320x240[1a];
[1:v]setpts=PTS-STARTPTS+3469115/TB,scale=320x240[1b];
[2:v]setpts=PTS-STARTPTS+7739299/TB,scale=320x240[1c];
[5:v]setpts=PTS-STARTPTS+4390466/TB,scale=320x240[4a];
[6:v]setpts=PTS-STARTPTS+6803937/TB,scale=320x240[4b];
[7:v]setpts=PTS-STARTPTS+8242005/TB,scale=320x240[4c];
[8:v]setpts=PTS-STARTPTS+9811577/TB,scale=320x240[4d];
[9:v]setpts=PTS-STARTPTS+10765190/TB,scale=320x240[4e];
[base][1a]overlay=eof_action=pass[o1];
[o1][1b]overlay=eof_action=pass[o1];
[o1][1c]overlay=eof_action=pass:shortest=1[o1];
[o1][4a]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4b]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4c]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4d]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4e]overlay=eof_action=pass:x=320:y=240;
[0:a]asetpts=PTS-STARTPTS+120/TB,aresample=async=1,apad[a1a];
[1:a]asetpts=PTS-STARTPTS+3469115/TB,aresample=async=1,apad[a1b];
[2:a]asetpts=PTS-STARTPTS+7739299/TB,aresample=async=1[a1c];
[3:a]asetpts=PTS-STARTPTS+82550/TB,aresample=async=1,apad[a2a];
[4:a]asetpts=PTS-STARTPTS+2687265/TB,aresample=async=1,apad[a3a];
[a1a][a1b][a1c][a2a][a3a]amerge=inputs=5

-c:v libx264 -c:a aac -ac 2 output.mp4

This is the stream data from ffmpeg:

Input #0
  Stream #0:0: Video: vp6f, yuv420p, 160x128, 1k tbr, 1k tbn
  Stream #0:1: Audio: nellymoser, 11025 Hz, mono, flt

Input #1
  Stream #1:0: Audio: nellymoser, 11025 Hz, mono, flt
  Stream #1:1: Video: vp6f, yuv420p, 160x128, 1k tbr, 1k tbn

Input #2
  Stream #2:0: Audio: nellymoser, 11025 Hz, mono, flt
  Stream #2:1: Video: vp6f, yuv420p, 160x128, 1k tbr, 1k tbn

Input #3
  Stream #3:0: Audio: nellymoser, 11025 Hz, mono, flt

Input #4
  Stream #4:0: Audio: nellymoser, 11025 Hz, mono, flt

Input #5
  Stream #5:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn

Input #6
  Stream #6:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn

Input #7
  Stream #7:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn

Input #8
  Stream #8:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn

Input #9
  Stream #9:0: Video: vp6f, yuv420p, 1680x1056, 1k tbr, 1k tbn

This is the error:

Stream mapping:
  Stream #0:0 (vp6f) -> setpts
  Stream #0:1 (nellymoser) -> asetpts

  Stream #1:0 (nellymoser) -> asetpts
  Stream #1:1 (vp6f) -> setpts

  Stream #2:0 (nellymoser) -> asetpts
  Stream #2:1 (vp6f) -> setpts

  Stream #3:0 (nellymoser) -> asetpts

  Stream #4:0 (nellymoser) -> asetpts

  Stream #5:0 (vp6f) -> setpts

  Stream #6:0 (vp6f) -> setpts

  Stream #7:0 (vp6f) -> setpts

  Stream #8:0 (vp6f) -> setpts

  Stream #9:0 (vp6f) -> setpts

  overlay -> Stream #0:0 (libx264)
  amerge -> Stream #0:1 (aac)
Press [q] to stop, [?] for help

Enter command: <target>|all <time>|-1 <command>[ <argument>]

Parse error, at least 3 arguments were expected, only 1 given in string 'ho Oscar'
[Parsed_amerge_39 @ 0aa147c0] No channel layout for input 1
    Last message repeated 1 times
[AVFilterGraph @ 05e01900] The following filters could not choose their formats: Parsed_amerge_39
Consider inserting the (a)format filter near their input or output.
Error reinitializing filters!
Failed to inject frame into filter network: I/O error
Error while processing the decoded data for stream #4:0
Conversion failed!
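The log's own hint ("Consider inserting the (a)format filter near their input or output") suggests giving each mono audio chain an explicit channel layout before amerge. A minimal two-input sketch of that idea (assuming aformat=channel_layouts=mono; offsets copied from the command above):

```
[0:a]asetpts=PTS-STARTPTS+120/TB,aresample=async=1,aformat=channel_layouts=mono,apad[a1a];
[2:a]asetpts=PTS-STARTPTS+7739299/TB,aresample=async=1,aformat=channel_layouts=mono[a1c];
[a1a][a1c]amerge=inputs=2
```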

Update #2

Would it be like this:

-i "video1a.flv"
-i "video1b.flv"
-i "video1c.flv"
-i "video2a.flv"
-i "video3a.flv"
-i "video4a.flv"
-i "video4b.flv"
-i "video4c.flv"
-i "video4d.flv"
-i "video4e.flv"

-filter_complex 

nullsrc=size=640x480[base];
[0:v]setpts=PTS-STARTPTS+120/TB,scale=320x240[1a];
[1:v]setpts=PTS-STARTPTS+3469115/TB,scale=320x240[1b];
[2:v]setpts=PTS-STARTPTS+7739299/TB,scale=320x240[1c];
[5:v]setpts=PTS-STARTPTS+4390466/TB,scale=320x240[4a];
[6:v]setpts=PTS-STARTPTS+6803937/TB,scale=320x240[4b];
[7:v]setpts=PTS-STARTPTS+8242005/TB,scale=320x240[4c];
[8:v]setpts=PTS-STARTPTS+9811577/TB,scale=320x240[4d];
[9:v]setpts=PTS-STARTPTS+10765190/TB,scale=320x240[4e];
[base][1a]overlay=eof_action=pass[o1];
[o1][1b]overlay=eof_action=pass[o1];
[o1][1c]overlay=eof_action=pass:shortest=1[o1];
[o1][4a]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4b]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4c]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4d]overlay=eof_action=pass:x=320:y=240[o4];
[o4][4e]overlay=eof_action=pass:x=320:y=240;
[0:a]asetpts=PTS-STARTPTS+120/TB,aresample=async=1,pan=1c|c0=c0,apad[a1a];
[1:a]asetpts=PTS-STARTPTS+3469115/TB,aresample=async=1,pan=1c|c0=c0,apad[a1b];
[2:a]asetpts=PTS-STARTPTS+7739299/TB,aresample=async=1,pan=1c|c0=c0[a1c];
[3:a]asetpts=PTS-STARTPTS+82550/TB,aresample=async=1,pan=1c|c0=c0,apad[a2a];
[4:a]asetpts=PTS-STARTPTS+2687265/TB,aresample=async=1,pan=1c|c0=c0,apad[a3a];
[a1a][a1b][a1c][a2a][a3a]amerge=inputs=5

-c:v libx264 -c:a aac -ac 2 output.mp4

Update #3

Now getting this error:

Stream mapping:
  Stream #0:0 (vp6f) -> setpts
  Stream #0:1 (nellymoser) -> asetpts
  Stream #1:0 (nellymoser) -> asetpts
  Stream #1:1 (vp6f) -> setpts
  Stream #2:0 (nellymoser) -> asetpts
  Stream #2:1 (vp6f) -> setpts
  Stream #3:0 (nellymoser) -> asetpts
  Stream #4:0 (nellymoser) -> asetpts
  Stream #5:0 (vp6f) -> setpts
  Stream #6:0 (vp6f) -> setpts
  Stream #7:0 (vp6f) -> setpts
  Stream #8:0 (vp6f) -> setpts
  Stream #9:0 (vp6f) -> setpts
  overlay -> Stream #0:0 (libx264)
  amerge -> Stream #0:1 (aac)
Press [q] to stop, [?] for help

Enter command: <target>|all <time>|-1 <command>[ <argument>]

Parse error, at least 3 arguments were expected, only 1 given in string 'ho Oscar'
[Parsed_amerge_44 @ 0a9808c0] No channel layout for input 1
[Parsed_amerge_44 @ 0a9808c0] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
[Parsed_pan_27 @ 07694800] Pure channel mapping detected: 0
[Parsed_pan_31 @ 07694a80] Pure channel mapping detected: 0
[Parsed_pan_35 @ 0a980300] Pure channel mapping detected: 0
[Parsed_pan_38 @ 0a980500] Pure channel mapping detected: 0
[Parsed_pan_42 @ 0a980780] Pure channel mapping detected: 0
[libx264 @ 06ad78c0] using SAR=1/1
[libx264 @ 06ad78c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 06ad78c0] profile High, level 3.0
[libx264 @ 06ad78c0] 264 - core 155 r2901 7d0ff22 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=15 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
  Metadata:
    canSeekToEnd    : false
    encoder         : Lavf58.16.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 640x480 [SAR 1:1 DAR 4:3], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.19.102 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 11025 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      encoder         : Lavc58.19.102 aac
...
...
Error while processing the decoded data for stream #1:1
[libx264 @ 06ad78c0] frame I:133   Avg QP: 8.58  size:  6481
[libx264 @ 06ad78c0] frame P:8358  Avg QP:17.54  size:  1386
[libx264 @ 06ad78c0] frame B:24582 Avg QP:24.27  size:   105
[libx264 @ 06ad78c0] consecutive B-frames:  0.6%  0.5%  0.7% 98.1%
[libx264 @ 06ad78c0] mb I  I16..4: 78.3% 16.1%  5.6%
[libx264 @ 06ad78c0] mb P  I16..4:  0.3%  0.7%  0.1%  P16..4:  9.6%  3.0%  1.4%  0.0%  0.0%    skip:84.9%
[libx264 @ 06ad78c0] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8:  2.9%  0.1%  0.0%  direct: 0.2%  skip:96.8%  L0:47.0% L1:49.0% BI: 4.0%
[libx264 @ 06ad78c0] 8x8 transform intra:35.0% inter:70.1%
[libx264 @ 06ad78c0] coded y,uvDC,uvAC intra: 36.8% 43.7% 27.3% inter: 1.6% 3.0% 0.1%
[libx264 @ 06ad78c0] i16 v,h,dc,p: 79%  8%  4%  9%
[libx264 @ 06ad78c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 32% 20% 12%  3%  6%  8%  6%  5%  7%
[libx264 @ 06ad78c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 38% 22%  9%  4%  6%  7%  5%  5%  4%
[libx264 @ 06ad78c0] i8c dc,h,v,p: 62% 15% 16%  7%
[libx264 @ 06ad78c0] Weighted P-Frames: Y:0.6% UV:0.5%
[libx264 @ 06ad78c0] ref P L0: 65.4% 12.3% 14.3%  7.9%  0.0%
[libx264 @ 06ad78c0] ref B L0: 90.2%  7.5%  2.3%
[libx264 @ 06ad78c0] ref B L1: 96.3%  3.7%
[libx264 @ 06ad78c0] kb/s:90.81
[aac @ 06ad8480] Qavg: 65519.970
[aac @ 06ad8480] 2 frames left in the queue on closing
Conversion failed!

Tags: ffmpeg
1 answer
甜甜的少女心
Answered 2019-08-27 13:56

Use

ffmpeg
    -i video1a -i video2a -i video3a -i video4a
    -i video1b -i video2b -i video3b -i video4b
    -i video1c
    -filter_complex "
        nullsrc=size=640x480 [base];
        [1:v] setpts=PTS-STARTPTS+ 300/TB, scale=320x240 [2a];
        [2:v] setpts=PTS-STARTPTS+ 400/TB, scale=320x240 [3a];
        [3:v] setpts=PTS-STARTPTS+ 400/TB, scale=320x240 [4a];
        [4:v] setpts=PTS-STARTPTS+2500/TB, scale=320x240 [1b];
        [7:v] setpts=PTS-STARTPTS+2500/TB, scale=320x240 [4b];
        [8:v] setpts=PTS-STARTPTS+3000/TB, scale=320x240 [1c];
        [base][1b] overlay=eof_action=pass [o1];
        [o1][1c] overlay=eof_action=pass:shortest=1 [o1];
        [o1][2a] overlay=eof_action=pass:x=320 [o2];
        [o2][3a] overlay=eof_action=pass:y=240 [o3];
        [o3][4a] overlay=eof_action=pass:x=320:y=240[o4];
        [o4][4b] overlay=eof_action=pass:x=320:y=240;
        [0:a] asetpts=PTS-STARTPTS+   0/TB, aresample=async=1, apad [a1a];
        [4:a] asetpts=PTS-STARTPTS+2500/TB, aresample=async=1 [a1b];
        [5:a] asetpts=PTS-STARTPTS+ 800/TB, aresample=async=1, apad [a2b];
        [6:a] asetpts=PTS-STARTPTS+ 700/TB, aresample=async=1, apad [a3b];
        [7:a] asetpts=PTS-STARTPTS+ 800/TB, aresample=async=1, apad [a4b];
        [a1a][a1b][a2b][a3b][a4b]amerge=inputs=5"
    -c:v libx264 -c:a aac -ac 2 output.mp4

For each video stream, apply the timestamp and scale filters, then overlay the results.

For each audio stream, apply the timestamp filter for the time offset, then aresample=async=1 to insert silence up to the start time, then apad to extend the end of the audio with silence. Skip apad for the audio stream that ends last, so the merged audio ends when that stream ends. Finally, amerge joins all the processed audio streams.
