current ffmpeg creates shortened audio stream when filter amix

S Andreason
I am getting a shortened audio stream when including the audio filters
aresample and amix, which later makes it impossible to concat the clips,
because the different stream lengths lose sync between audio and video,
with errors:
Invalid audio PTS

First, here is the output from the latest ffmpeg in the Debian package,
which works correctly:

$ ffmpeg-3.2.14-1~deb9u1 -i 20190922_1532_3Kf-pan-right_3969_c2t14.MOV
-i Voice_20190922-1315_voiceOverForEMR-outroClip_c108t8.m4a
-filter_complex
"[0]crop=x=128:y=0:w=1024:h=720,pad=1024:768:0:24,drawtext='fontsize=32:fontcolor=0xa73450:bordercolor=white:shadowcolor=black:fontfile=/usr/share/fonts/TrueType/SF-Foxboro-Script-Bold.ttf:x=(w-text_w-20):y=(h-text_h-36):shadowx=2:shadowy=2:borderw=1:text=seahorseCorral.org'"
-filter_complex "aresample=48000,amix" -s 1024x768 -c:v h264 -b:v 4700k
-r 30 20190922_1532_ch5.1e-3.mov
ffmpeg version 3.2.14-1~deb9u1 Copyright (c) 2000-2019 the FFmpeg developers
   built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
   configuration: --prefix=/usr --extra-version='1~deb9u1'
--toolchain=hardened --libdir=/usr/lib/i386-linux-gnu
--incdir=/usr/include/i386-linux-gnu --enable-gpl --disable-stripping
--enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa
--enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca
--enable-libcdio --enable-libebur128 --enable-libflite
--enable-libfontconfig --enable-libfreetype --enable-libfribidi
--enable-libgme --enable-libgsm --enable-libmp3lame --enable-libopenjpeg
--enable-libopenmpt --enable-libopus --enable-libpulse
--enable-librubberband --enable-libshine --enable-libsnappy
--enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora
--enable-libtwolame --enable-libvorbis --enable-libvpx
--enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid
--enable-libzmq --enable-libzvbi --enable-omx --enable-openal
--enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libiec61883
--enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264
--enable-shared
   libavutil      55. 34.101 / 55. 34.101
   libavcodec     57. 64.101 / 57. 64.101
   libavformat    57. 56.101 / 57. 56.101
   libavdevice    57.  1.100 / 57.  1.100
   libavfilter     6. 65.100 /  6. 65.100
   libavresample   3.  1.  0 /  3.  1.  0
   libswscale      4.  2.100 /  4.  2.100
   libswresample   2.  3.100 /  2.  3.100
   libpostproc    54.  1.100 / 54.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from
'20190922_1532_3Kf-pan-right_3969_c2t14.MOV':
   Metadata:
     major_brand     : qt
     minor_version   : 512
     compatible_brands: qt
     encoder         : Lavf57.56.101
   Duration: 00:00:14.01, start: 0.002000, bitrate: 25128 kb/s
     Stream #0:0(eng): Video: h264 (Constrained Baseline) (avc1 /
0x31637661), yuvj420p(pc, bt709), 1280x720, 23587 kb/s, 29.97 fps, 29.97
tbr, 30k tbn, 60k tbc (default)
     Metadata:
       handler_name    : DataHandler
     Stream #0:1(eng): Audio: pcm_s16le (sowt / 0x74776F73), 48000 Hz,
stereo, s16, 1536 kb/s (default)
     Metadata:
       handler_name    : DataHandler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from
'Voice_20190922-1315_voiceOverForEMR-outroClip_c108t8.m4a':
   Metadata:
     major_brand     : M4A
     minor_version   : 512
     compatible_brands: isomiso2
     encoder         : Lavf57.56.101
   Duration: 00:00:08.02, start: 0.000000, bitrate: 220 kb/s
     Stream #1:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz,
mono, fltp, 218 kb/s (default)
     Metadata:
       handler_name    : SoundHandler
No pixel format specified, yuvj420p for H.264 encoding chosen.
Use -pix_fmt yuv420p for compatibility with outdated media players.
[libx264 @ 0x170dc20] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
AVX LZCNT BMI1 SlowPshufb
[libx264 @ 0x170dc20] profile High, level 3.1
[libx264 @ 0x170dc20] 264 - core 148 r2748 97eaef2 - H.264/MPEG-4 AVC
codec - Copyleft 2003-2016 - http://www.videolan.org/x264.html -
options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7
psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1
8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6
lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0
bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1
b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250
keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=abr
mbtree=1 bitrate=4700 ratetol=1.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4
ip_ratio=1.40 aq=1:1.00
Output #0, mov, to '20190922_1532_ch5.1e-3.mov':
   Metadata:
     major_brand     : qt
     minor_version   : 512
     compatible_brands: qt
     encoder         : Lavf57.56.101
     Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661),
yuvj420p(pc), 1024x768, q=-1--1, 4700 kb/s, 30 fps, 15360 tbn, 30 tbc
(default)
     Metadata:
       encoder         : Lavc57.64.101 libx264
     Side data:
       cpb: bitrate max/min/avg: 0/0/4700000 buffer size: 0 vbv_delay: -1
     Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo,
fltp, 128 kb/s (default)
     Metadata:
       encoder         : Lavc57.64.101 aac
Stream mapping:
   Stream #0:0 (h264) -> crop (graph 0)
   Stream #0:1 (pcm_s16le) -> aresample (graph 1)
   Stream #1:0 (aac) -> amix:input1 (graph 1)
   drawtext (graph 0) -> Stream #0:0 (libx264)
   amix (graph 1) -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
frame=  420 fps= 12 q=-1.0 Lsize=    8303kB time=00:00:14.01
bitrate=4853.1kbits/s speed=0.417x
video:8063kB audio:224kB subtitle:0kB other streams:0kB global
headers:0kB muxing overhead: 0.198401%
[libx264 @ 0x170dc20] frame I:2     Avg QP:14.83  size:195688
[libx264 @ 0x170dc20] frame P:106   Avg QP:19.65  size: 59553
[libx264 @ 0x170dc20] frame B:312   Avg QP:25.61  size:  4972
[libx264 @ 0x170dc20] consecutive B-frames:  1.0%  0.0%  0.0% 99.0%
[libx264 @ 0x170dc20] mb I  I16..4: 27.6% 29.0% 43.4%
[libx264 @ 0x170dc20] mb P  I16..4:  1.1%  1.3%  0.6%  P16..4: 30.5%
31.5% 22.6%  0.0%  0.0%    skip:12.4%
[libx264 @ 0x170dc20] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8: 36.1% 
7.7%  1.3%  direct: 4.2%  skip:50.7%  L0:37.5% L1:38.6% BI:23.9%
[libx264 @ 0x170dc20] final ratefactor: 18.99
[libx264 @ 0x170dc20] 8x8 transform intra:37.9% inter:54.5%
[libx264 @ 0x170dc20] coded y,uvDC,uvAC intra: 56.2% 64.1% 51.9% inter:
27.2% 19.7% 1.0%
[libx264 @ 0x170dc20] i16 v,h,dc,p: 73%  9% 14%  4%
[libx264 @ 0x170dc20] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 12% 11% 40% 4%  6% 
6%  6%  5% 10%
[libx264 @ 0x170dc20] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 18% 15% 6%  9% 
9%  9%  8% 11%
[libx264 @ 0x170dc20] i8c dc,h,v,p: 54% 25% 17%  5%
[libx264 @ 0x170dc20] Weighted P-Frames: Y:9.4% UV:0.0%
[libx264 @ 0x170dc20] ref P L0: 41.9% 11.1% 40.8%  6.0%  0.2%
[libx264 @ 0x170dc20] ref B L0: 93.5%  5.9%  0.6%
[libx264 @ 0x170dc20] ref B L1: 99.4%  0.6%
[libx264 @ 0x170dc20] kb/s:4717.36
[aac @ 0x170fac0] Qavg: 582.581

Next ffprobe shows the video length:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '20190922_1532_ch5.1e-3.mov':
     encoder         : Lavf57.56.101
   Duration: 00:00:14.03, start: 0.000000, bitrate: 4849 kb/s
     Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661),
yuvj420p(pc), 1024x768, 4717 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (defaul
       encoder         : Lavc57.64.101 libx264
     Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz,
stereo, fltp, 131 kb/s (default)

And to get the ACTUAL audio length, I split the audio stream to its own
file.m4a using ffmpeg, then ffprobe:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '20190922_1532_ch5.1e-3.m4a':
     encoder         : Lavf58.33.100
   Duration: 00:00:14.03, start: 0.000000, bitrate: 133 kb/s
     Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz,
stereo, fltp, 131 kb/s (default)
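
For anyone reproducing this measurement, a rough sketch follows. It
assumes ffmpeg and ffprobe are on the PATH; the function names
extract_audio and audio_duration are made up for illustration and do not
come from the original post. The second function asks ffprobe for the
duration of the first audio stream directly, so it can also be pointed
at the .mov file itself.

```shell
# Copy the first audio stream into its own file without re-encoding.
# $1 = input file (e.g. the .mov), $2 = output file (e.g. an .m4a)
extract_audio() {
    ffmpeg -v error -i "$1" -vn -c:a copy "$2"
}

# Print the duration, in seconds, of the first audio stream of $1.
audio_duration() {
    ffprobe -v error -select_streams a:0 \
        -show_entries stream=duration -of csv=p=0 "$1"
}

# Example use (file names taken from the post):
#   extract_audio 20190922_1532_ch5.1e-g.mov 20190922_1532_ch5.1e-g.m4a
#   audio_duration 20190922_1532_ch5.1e-g.m4a
```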

Then I repeat the above, changing only the ffmpeg binary to a current
build from git:

$ ffmpeg -i 20190922_1532_3Kf-pan-right_3969_c2t14.MOV -i
Voice_20190922-1315_voiceOverForEMR-outroClip_c108t8.m4a -filter_complex
"[0]crop=x=128:y=0:w=1024:h=720,pad=1024:768:0:24,drawtext='fontsize=32:fontcolor=0xa73450:bordercolor=white:shadowcolor=black:fontfile=/usr/share/fonts/TrueType/SF-Foxboro-Script-Bold.ttf:x=(w-text_w-20):y=(h-text_h-36):shadowx=2:shadowy=2:borderw=1:text=seahorseCorral.org'"
-filter_complex "aresample=48000,amix" -s 1024x768 -c:v h264 -b:v 4700k
-r 30 20190922_1532_ch5.1e-g.mov
ffmpeg version N-95129-g04858650b1 Copyright (c) 2000-2019 the FFmpeg
developers
   built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
   configuration: --prefix=/usr/local --enable-gpl --enable-libmp3lame
--enable-libvorbis --enable-libx264 --enable-libopenjpeg
--enable-libfreetype --disable-doc --disable-htmlpages
--disable-podpages --enable-shared --enable-libvpx
--extra-cflags=-I/usr/include --extra-ldflags=-L/usr/lib/i386-linux-gnu
--enable-libass --enable-libtesseract
   libavutil      56. 35.100 / 56. 35.100
   libavcodec     58. 59.101 / 58. 59.101
   libavformat    58. 33.100 / 58. 33.100
   libavdevice    58.  9.100 / 58.  9.100
   libavfilter     7. 59.100 /  7. 59.100
   libswscale      5.  6.100 /  5.  6.100
   libswresample   3.  6.100 /  3.  6.100
   libpostproc    55.  6.100 / 55.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from
'20190922_1532_3Kf-pan-right_3969_c2t14.MOV':
   Metadata:
     major_brand     : qt
     minor_version   : 512
     compatible_brands: qt
     encoder         : Lavf57.56.101
   Duration: 00:00:14.01, start: 0.002000, bitrate: 25128 kb/s
     Stream #0:0(eng): Video: h264 (Constrained Baseline) (avc1 /
0x31637661), yuvj420p(pc, bt709), 1280x720, 23587 kb/s, 29.97 fps, 29.97
tbr, 30k tbn, 60k tbc (default)
     Metadata:
       handler_name    : VideoHandler
     Stream #0:1(eng): Audio: pcm_s16le (sowt / 0x74776F73), 48000 Hz,
stereo, s16, 1536 kb/s (default)
     Metadata:
       handler_name    : SoundHandler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from
'Voice_20190922-1315_voiceOverForEMR-outroClip_c108t8.m4a':
   Metadata:
     major_brand     : M4A
     minor_version   : 512
     compatible_brands: isomiso2
     encoder         : Lavf57.56.101
   Duration: 00:00:08.02, start: 0.000000, bitrate: 220 kb/s
     Stream #1:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz,
mono, fltp, 218 kb/s (default)
     Metadata:
       handler_name    : SoundHandler
Stream mapping:
   Stream #0:0 (h264) -> crop (graph 0)
   Stream #0:1 (pcm_s16le) -> aresample (graph 1)
   Stream #1:0 (aac) -> amix:input1 (graph 1)
   drawtext (graph 0) -> Stream #0:0 (libx264)
   amix (graph 1) -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[libx264 @ 0x142f2c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
AVX LZCNT BMI1 SlowPshufb
[libx264 @ 0x142f2c0] profile High, level 3.1
[libx264 @ 0x142f2c0] 264 - core 148 r2748 97eaef2 - H.264/MPEG-4 AVC
codec - Copyleft 2003-2016 - http://www.videolan.org/x264.html -
options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7
psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1
8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6
lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0
bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1
b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250
keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=abr
mbtree=1 bitrate=4700 ratetol=1.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4
ip_ratio=1.40 aq=1:1.00
Output #0, mov, to '20190922_1532_ch5.1e-g.mov':
   Metadata:
     major_brand     : qt
     minor_version   : 512
     compatible_brands: qt
     encoder         : Lavf58.33.100
     Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661),
yuvj420p(pc, progressive), 1024x768, q=-1--1, 4700 kb/s, 30 fps, 15360
tbn, 30 tbc (default)
     Metadata:
       encoder         : Lavc58.59.101 libx264
     Side data:
       cpb: bitrate max/min/avg: 0/0/4700000 buffer size: 0 vbv_delay: N/A
     Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo,
fltp, 128 kb/s (default)
     Metadata:
       encoder         : Lavc58.59.101 aac
frame=  420 fps= 14 q=-1.0 Lsize=    8270kB time=00:00:13.90
bitrate=4873.7kbits/s speed=0.45x
video:8061kB audio:193kB subtitle:0kB other streams:0kB global
headers:0kB muxing overhead: 0.185768%
[libx264 @ 0x142f2c0] frame I:2     Avg QP:14.84  size:195655
[libx264 @ 0x142f2c0] frame P:106   Avg QP:19.64  size: 59577
[libx264 @ 0x142f2c0] frame B:312   Avg QP:25.62  size:  4960
[libx264 @ 0x142f2c0] consecutive B-frames:  1.0%  0.0%  0.0% 99.0%
[libx264 @ 0x142f2c0] mb I  I16..4: 27.8% 28.6% 43.6%
[libx264 @ 0x142f2c0] mb P  I16..4:  1.2%  1.3%  0.6%  P16..4: 30.5%
31.4% 22.6%  0.0%  0.0%    skip:12.5%
[libx264 @ 0x142f2c0] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8: 36.0% 
7.7%  1.3%  direct: 4.2%  skip:50.8%  L0:37.6% L1:38.7% BI:23.8%
[libx264 @ 0x142f2c0] final ratefactor: 18.99
[libx264 @ 0x142f2c0] 8x8 transform intra:36.9% inter:54.6%
[libx264 @ 0x142f2c0] coded y,uvDC,uvAC intra: 56.3% 63.9% 51.8% inter:
27.2% 19.7% 1.0%
[libx264 @ 0x142f2c0] i16 v,h,dc,p: 73%  9% 14%  4%
[libx264 @ 0x142f2c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 13% 39% 4%  6% 
6%  6%  5% 10%
[libx264 @ 0x142f2c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 18% 14% 6%  9% 
9%  9%  8% 11%
[libx264 @ 0x142f2c0] i8c dc,h,v,p: 54% 24% 17%  5%
[libx264 @ 0x142f2c0] Weighted P-Frames: Y:9.4% UV:0.0%
[libx264 @ 0x142f2c0] ref P L0: 41.5% 11.5% 40.8%  6.0%  0.2%
[libx264 @ 0x142f2c0] ref B L0: 93.4%  6.0%  0.6%
[libx264 @ 0x142f2c0] ref B L1: 99.4%  0.6%
[libx264 @ 0x142f2c0] kb/s:4716.52
[aac @ 0x142d800] Qavg: 297.740

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '20190922_1532_ch5.1e-g.mov':
     encoder         : Lavf58.33.100
   Duration: 00:00:14.00, start: 0.000000, bitrate: 4838 kb/s
     Stream #0:0: Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc),
1024x768, 4716 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
       encoder         : Lavc58.59.101 libx264
     Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo,
fltp, 128 kb/s (default)

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '20190922_1532_ch5.1e-g.m4a':
     encoder         : Lavf58.33.100
   Duration: 00:00:12.33, start: 0.000000, bitrate: 130 kb/s
     Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz,
stereo, fltp, 128 kb/s (default)

The audio is 1.70 seconds shorter, always. Different video input lengths
and different audio lengths result in the same 1.70 seconds lost.
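
The 1.70 second figure is the difference between the two extracted audio
durations reported by ffprobe above (00:00:14.03 from the 3.2.14 run and
00:00:12.33 from the git run). A small sketch of that arithmetic, with
hypothetical helper names to_seconds and duration_gap:

```shell
# Convert an ffprobe-style HH:MM:SS.xx duration to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Print how many seconds shorter the second duration is than the first.
duration_gap() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a - b }'
}

# Audio length from the 3.2.14 run vs. audio length from the git run.
duration_gap 00:00:14.03 00:00:12.33   # prints 1.70
```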

If I don't have any voice input and audio filter then the output streams
match length, since they are from the same input video.

I've also tried resampling the voice-over audio to 48000 Hz stereo
first, then removing the aresample filter, leaving only the amix.
Still bad audio.
Since the next step would be to mix the audio in audacity and remux it
back together, I'll stop testing now and see what you think.

Stewart

_______________________________________________
ffmpeg-user mailing list
[hidden email]
https://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[hidden email] with subject "unsubscribe".

Re: current ffmpeg creates shortened audio stream when filter amix

Paul B Mahol
On 9/29/19, S Andreason <[hidden email]> wrote:

> I am getting a shortened audio stream when including the audio filters
> aresample and amix, which later makes it impossible to concat the clips,
> because the different stream lengths lose sync between audio and video,
> with errors:
> Invalid audio PTS
>
> [...]
>
> $ ffmpeg -i 20190922_1532_3Kf-pan-right_3969_c2t14.MOV -i
> Voice_20190922-1315_voiceOverForEMR-outroClip_c108t8.m4a -filter_complex
> "[0]crop=x=128:y=0:w=1024:h=720,pad=1024:768:0:24,drawtext='fontsize=32:fontcolor=0xa73450:bordercolor=white:shadowcolor=black:fontfile=/usr/share/fonts/TrueType/SF-Foxboro-Script-Bold.ttf:x=(w-text_w-20):y=(h-text_h-36):shadowx=2:shadowy=2:borderw=1:text=seahorseCorral.org'"
> -filter_complex "aresample=48000,amix" -s 1024x768 -c:v h264 -b:v 4700k
> -r 30 20190922_1532_ch5.1e-g.mov
> [...]
>
> The audio is 1.70 seconds shorter, always. Different video input lengths
> and different audio lengths result in the same 1.70 seconds lost.
>
> If I don't have any voice input and audio filter then the output streams
> match length, since they are from the same input video.
>
> I've also tried first resampling the voice-over audio to 48000 and
> stereo first, then removing the aresample filter, leaving only the amix.
> Still bad audio.
> Since the next step would be to mix the audio in audacity and remux it
> back together, I'll stop testing now and see what you think.


There are numerous issues with your report. First, how is this supposed
to work at all if one uses two -filter_complex options at once? Second,
amix gets only one input in your command, while with no options given it
actually expects two inputs.

So your commands should not work at all.


Re: current ffmpeg creates shortened audio stream when filter amix

Moritz Barsnick
On Sun, Sep 29, 2019 at 10:38:33 +0200, Paul B Mahol wrote:

> On 9/29/19, S Andreason <[hidden email]> wrote:
> > I am getting a shortened audio stream when including the audio filters
> > aresample and amix, which later makes it impossible to concat the clips,
> > because the different stream lengths lose sync between audio and video,
> > with errors:
> > Invalid audio PTS
> There are numerous issues with your report. First how is this
> supposed to work at all if one use two filter-complex at once? Second
> amix gets only one input in your command, and with no options given
> it accepts actually two inputs.

While I wanted to agree with this, ffmpeg seems to disagree:

> > $ ffmpeg -i 20190922_1532_3Kf-pan-right_3969_c2t14.MOV -i Voice_20190922-1315_voiceOverForEMR-outroClip_c108t8.m4a -filter_complex "[0]crop=x=128:y=0:w=1024:h=720,pad=1024:768:0:24,drawtext='fontsize=32:fontcolor=0xa73450:bordercolor=white:shadowcolor=black:fontfile=/usr/share/fonts/TrueType/SF-Foxboro-Script-Bold.ttf:x=(w-text_w-20):y=(h-text_h-36):shadowx=2:shadowy=2:borderw=1:text=seahorseCorral.org'" -filter_complex "aresample=48000,amix" -s 1024x768 -c:v h264 -b:v 4700k -r 30 20190922_1532_ch5.1e-g.mov
[...]
> > Stream mapping:
> >    Stream #0:0 (h264) -> crop (graph 0)
> >    Stream #0:1 (pcm_s16le) -> aresample (graph 1)
> >    Stream #1:0 (aac) -> amix:input1 (graph 1)
> >    drawtext (graph 0) -> Stream #0:0 (libx264)
> >    amix (graph 1) -> Stream #0:1 (aac)

Obviously, both filter_complex expressions are taken into
consideration, and amix seems to be able to grab two inputs.

OTOH, this is probably not what Stewart desired. 0:1 is being
aresample'd, and then amix'd with 1:0 (which is still at 44100). That
is bound to be undefined behavior(?).

Suggestion:
  -filter_complex "[0:v]crop=x=128:y=0:w=1024:h=720,pad=1024:768:0:24,drawtext='fontsize=32:fontcolor=0xa73450:bordercolor=white:shadowcolor=black:fontfile=/usr/share/fonts/TrueType/SF-Foxboro-Script-Bold.ttf:x=(w-text_w-20):y=(h-text_h-36):shadowx=2:shadowy=2:borderw=1:text=seahorseCorral.org'[v]; [1:a]aresample=48000[a1]; [0:a][a1]amix[a]" -map "[v]" -map "[a]"

I'm not saying this will fix the observed issue (as Stewart mentioned
he tried resampling externally), but it will make the behavior more
well-defined.
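
Stripped of the video filters, the same idea can be sketched as follows. This is only an illustration under assumptions: the function name mix_voice and the file names in the usage note are hypothetical, and the video is stream-copied here rather than cropped and re-encoded as in the original command.

```shell
# Explicitly labelled audio graph:
#   [1:a]  voice-over -> resample to 48 kHz -> [a1]
#   [0:a]  clip audio, mixed with [a1] by amix (two inputs) -> [a]
graph='[1:a]aresample=48000[a1];[0:a][a1]amix=inputs=2[a]'

# $1 = video clip, $2 = voice-over, $3 = output file
mix_voice() {
  ffmpeg -i "$1" -i "$2" -filter_complex "$graph" \
    -map 0:v -map '[a]' -c:v copy "$3"
}
```

Usage: mix_voice clip.mov voice.m4a mixed.mov. Because every pad is labelled, ffmpeg does not have to guess which input stream feeds which filter.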

Stewart, it helps if you leave out the parts of the video filter which
aren't relevant to the issue (assuming you tried leaving them out on
your side first), to make the command line easier to read.

Also, it helps to explicitly mark inputs and outputs in the
filter_complex chains, to make sure they do what you intend them to do.
(Implicit is for heroes. I'm not one of them. ;-))

Cheers,
Moritz

Re: current ffmpeg creates shortened audio stream when filter amix

S Andreason
Moritz Barsnick wrote:

> Hi Stewart,
>
> On Mon, Sep 30, 2019 at 12:48:31 -0700, S Andreason wrote:
>> Moritz Barsnick wrote:
>>> On Sun, Sep 29, 2019 at 10:38:33 +0200, Paul B Mahol wrote:
>>> Stewart, it helps if you leave out the parts of the video filter which
>>> aren't relevant to the issue (assuming you tried leaving them out on
>>> your side first), to make the command line easier to read.
>>>
>>> Also, it helps to explicitly mark inputs and outputs in the
>>> filter_complex chains, to make sure they do what you intend them to do.
>>> (Implicit is for heroes. I'm not one of them. ;-))
>> Yes I removed parts one or two at a time and tried only one or the other
>> of the audio filters to narrow it down.
>> I left out 90-95% of all the other inputs and drawtext, but left one in
>> so the crop-size flow was unchanged.
>> I didn't realize I left in a font reference until Paul asked for the
>> input files, then I had to choose whether to rewrite the reported
>> command line, or just provide the files.
>>
>> Thank you very much for showing the right way to mark all inputs and
>> outputs. Somehow all my googling and reading of manuals failed to cover
>> that bit, or I didn't make the leap to understanding it. I didn't
>> realize [0:a] was the solution to [0:1] for example.
> Nice. Does this mean it solved your issue though? I wasn't sure that
> would be the case.

Yes, the audio question is solved. I've learned how to use [0:a] instead
of [0:1] or leaving it blank.
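
To illustrate the difference: [0:1] addresses stream index 1 of the first input whatever its type, while [0:a] selects that input's audio by type. A small sketch for checking which index the audio actually has follows; the helper names list_streams and first_audio_index are made up for this example.

```shell
# List "index,codec_type" for every stream of a file, e.g. "0,video".
list_streams() {
  ffprobe -v error -show_entries stream=index,codec_type -of csv=p=0 "$1"
}

# Read such lines on stdin and print the index of the first audio stream.
first_audio_index() {
  awk -F, '$2 == "audio" { print $1; exit }'
}
```

Usage, with a hypothetical file name: list_streams clip.mov | first_audio_index. In the stream mapping quoted earlier in this thread, Stream #0:1 is the audio stream, so both forms happen to point at the same stream there; [0:a] keeps working if the stream order differs.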

> Feel free to answer to the list instead of myself, in case you want
> more remarks or corrections (may not have been the case here). ;-)
>
>
>
Oh no. None of my replies went to the list...
I just hit reply and forgot to check that field.
That was not my intention.

When Paul said it shouldn't work, I replied:
When I started with one filter-complex, I could not guess how to get it
working, and splitting the audio out to be separate actually solved it,
and it worked, on the older version of ffmpeg.

This month-long video project has finally finished rendering and has
been uploaded to YouTube.

Stewart
seahorsecorral.org