amix with atempo: Inconsistent behaviour creating m4a with aac

Jonathan Girven
I am trying to take the audio from two mp4 files, merge them together one
after the other (perhaps with some other effects such as a fade transition:
not shown), and apply atempo to the result.

I am working with the current FFmpeg git master branch. I have attached an
example file that reproduces the issue: BigBuckBunny_320x180_t10.mp4
<http://www.ffmpeg-archive.org/file/t377720/BigBuckBunny_320x180_t10.mp4>.
Here is the command and its output:

$ ffmpeg -y \
  -i BigBuckBunny_320x180_t10.mp4 \
  -i BigBuckBunny_320x180_t10.mp4 \
  -filter_complex "
aevalsrc=0:d=10[na1];[na1][1:a]concat=n=2:v=0:a=1[a1];
[0:a][a1]amix=inputs=2[mix_audio];
[mix_audio]asplit=3[mix_audio0][mix_audio1][mix_audio2];
[mix_audio0]atrim=0:5,asetpts=expr=PTS-STARTPTS[a_trim0];
[mix_audio1]atrim=5:10,asetpts=expr=PTS-STARTPTS[a_trim1];
[mix_audio2]atrim=10:20,asetpts=expr=PTS-STARTPTS[a_trim2];
[a_trim1]atempo=0.5[a_slomo1];
[a_trim0][a_slomo1][a_trim2]concat=n=3:v=0:a=1[com_a_slomo]
" \
  -map [com_a_slomo] \
  -c:a aac \
  output.m4a
ffmpeg version N-87286-g6ce4a63 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 4.9.2 (Debian 4.9.2-10)
  configuration: --prefix=/usr/local --extra-cflags=-I/usr/local/include --extra-ldflags=-L/usr/local/lib --bindir=/usr/local/bin --disable-doc --disable-static --enable-shared --disable-ffplay --extra-libs=-ldl --enable-version3 --enable-libfreetype --enable-libx264 --enable-gpl --enable-openssl --enable-nonfree --disable-debug
  libavutil      55. 74.100 / 55. 74.100
  libavcodec     57.105.100 / 57.105.100
  libavformat    57. 82.100 / 57. 82.100
  libavdevice    57.  8.100 / 57.  8.100
  libavfilter     6.105.100 /  6.105.100
  libswscale      4.  7.103 /  4.  7.103
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'BigBuckBunny_320x180_t10.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    title           : Big Buck Bunny
    artist          : Blender Foundation
    composer        : Blender Foundation
    date            : 2008
    encoder         : Lavf57.68.100
  Duration: 00:00:10.01, start: 0.000000, bitrate: 277 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 320x180 [SAR 1:1 DAR 16:9], 143 kb/s, 24 fps, 24 tbr, 12288 tbn, 48 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 127 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from 'BigBuckBunny_320x180_t10.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    title           : Big Buck Bunny
    artist          : Blender Foundation
    composer        : Blender Foundation
    date            : 2008
    encoder         : Lavf57.68.100
  Duration: 00:00:10.01, start: 0.000000, bitrate: 277 kb/s
    Stream #1:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 320x180 [SAR 1:1 DAR 16:9], 143 kb/s, 24 fps, 24 tbr, 12288 tbn, 48 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #1:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 127 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:1 (aac) -> amix:input0
  Stream #1:1 (aac) -> concat:in1:a0
  concat -> Stream #0:0 (aac)
Press [q] to stop, [?] for help
Output #0, ipod, to 'output.m4a':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    title           : Big Buck Bunny
    artist          : Blender Foundation
    composer        : Blender Foundation
    date            : 2008
    encoder         : Lavf57.82.100
    Stream #0:0: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      encoder         : Lavc57.105.100 aac
[Parsed_aevalsrc_0 @ 0x244b680] EOF timestamp not reliable
[aac @ 0x2473880] Queue input is backward in time
    Last message repeated 1 times
[ipod @ 0x2474920] Non-monotonous DTS in output stream 0:0; previous: 717824, current: -442721857768310647; changing to 717825. This may result in incorrect timestamps in the output file.
[ipod @ 0x2474920] Non-monotonous DTS in output stream 0:0; previous: 717825, current: -442721857768309623; changing to 717826. This may result in incorrect timestamps in the output file.
[aac @ 0x2473880] Queue input is backward in time
... <repeated combination of above two lines many times> ...
size=     399kB time=00:00:14.98 bitrate= 218.0kbits/s speed=22.4x    
video:0kB audio:393kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.401464%
[aac @ 0x2473880] Qavg: 1475.414

I would expect output.m4a to be 25 seconds long: 5 seconds at normal speed,
10 seconds slowed down (the 5-10 s window at atempo=0.5), and 10 seconds at
normal speed. Instead it is 15 seconds long and is missing the final 10
seconds of normal playback.
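
The 25-second figure follows from the trim windows in the command above and the tempo factor. A minimal sketch of that arithmetic, assuming each trimmed segment lasts (end - start) / tempo seconds; the function name `expected_duration` is made up for illustration:

```shell
# Sketch: expected output duration for the trim/atempo/concat graph.
# Each input line is "start end tempo"; tempo 1 means no atempo is applied.
expected_duration() {
  # Sums (end - start) / tempo over all segments read from stdin.
  awk '{ total += ($2 - $1) / $3 } END { printf "%g\n", total }'
}

# Trim points and tempo taken from the command above.
segments="0 5 1
5 10 0.5
10 20 1"

printf '%s\n' "$segments" | expected_duration   # prints 25
```

Dropping the third segment from the list gives 15, which matches the length of the file that was actually produced.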




--
Sent from: http://www.ffmpeg-archive.org/
_______________________________________________
ffmpeg-user mailing list
[hidden email]
http://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[hidden email] with subject "unsubscribe".

Re: amix with atempo: Inconsistent behaviour creating m4a with aac

Jonathan Girven
> I would expect output.m4a to be 25 seconds long: 5 seconds normal playback,
> 10 seconds atempo=0.5, 10 seconds normal. Instead it is 15 seconds long and
> doesn't contain the last 10 seconds of normal playback.

I have experimented with some alternatives to achieve the same goal.
One option is to run the process in two stages: first mix the audio
tracks into an intermediate file, then apply atempo to that file:

$ ffmpeg -y \
  -i BigBuckBunny_320x180_t10.mp4 \
  -i BigBuckBunny_320x180_t10.mp4 \
  -filter_complex "
aevalsrc=0:d=10[na1];
[na1][1:a]concat=n=2:v=0:a=1[a1];
[0:a][a1]amix=inputs=2[mix_audio]
" \
  -map [mix_audio] \
  -c:a aac \
  intermediate.m4a

$ ffmpeg -y \
  -i intermediate.m4a \
  -filter_complex "
[0:a]asplit=3[a0][a1][a2];
[a0]atrim=0:5,asetpts=expr=PTS-STARTPTS[a_trim0];
[a1]atrim=5:10,asetpts=expr=PTS-STARTPTS[a_trim1];
[a2]atrim=10:20,asetpts=expr=PTS-STARTPTS[a_trim2];
[a_trim1]atempo=0.5[a_slomo1];
[a_trim0][a_slomo1][a_trim2]concat=n=3:v=0:a=1[com_a_slomo]
" \
  -map [com_a_slomo] \
  -c:a aac \
  output.m4a

This works as expected.

Alternatively, I can keep everything in one stage but first make both
audio tracks the same length (20 seconds) by appending silent audio to
the end of [0:a]:

$ ffmpeg -y \
  -i BigBuckBunny_320x180_t10.mp4 \
  -i BigBuckBunny_320x180_t10.mp4 \
  -filter_complex "
aevalsrc=0:d=10[na0];[0:a][na0]concat=n=2:v=0:a=1[a0];
aevalsrc=0:d=10[na1];[na1][1:a]concat=n=2:v=0:a=1[a1];
[a0][a1]amix=inputs=2[mix_audio];
[mix_audio]asplit=3[mix_audio0][mix_audio1][mix_audio2];
[mix_audio0]atrim=0:5,asetpts=expr=PTS-STARTPTS[a_trim0];
[mix_audio1]atrim=5:10,asetpts=expr=PTS-STARTPTS[a_trim1];
[mix_audio2]atrim=10:20,asetpts=expr=PTS-STARTPTS[a_trim2];
[a_trim1]atempo=0.5[a_slomo1];
[a_trim0][a_slomo1][a_trim2]concat=n=3:v=0:a=1[com_a_slomo]
" \
  -map [com_a_slomo] \
  -c:a aac \
  output.m4a

This works as expected too. So I wonder whether the [mix_audio] label
in the original command ends up with the wrong length. Saving to an
intermediate file fixes the mixed track at its full 20-second length
(which gives the expected 25-second output), and so does appending
silent audio so that both tracks are the full length.
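
One way to check the lengths involved is to compare each file's container duration with the expected value. This is only a sketch: it inspects the written files, not the [mix_audio] label inside the filter graph, the helper names are hypothetical, and the 0.1 s tolerance is an assumption to allow for encoder padding.

```shell
# Sketch: compare a file's container duration with an expected value.
probe_duration() {
  # Prints the container duration in seconds (requires ffprobe on PATH).
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

within_tolerance() {
  # Usage: within_tolerance ACTUAL EXPECTED TOLERANCE (all in seconds).
  # Exit status 0 if |ACTUAL - EXPECTED| <= TOLERANCE, 1 otherwise.
  awk -v a="$1" -v e="$2" -v t="$3" \
    'BEGIN { d = a - e; if (d < 0) d = -d; exit !(d <= t) }'
}

check_duration() {
  # Usage: check_duration FILE EXPECTED_SECONDS
  actual=$(probe_duration "$1") || return 2
  if within_tolerance "$actual" "$2" 0.1; then
    echo "$1: ${actual}s (ok, expected $2)"
  else
    echo "$1: ${actual}s (expected $2)"
    return 1
  fi
}

# Example, once the files from the commands above exist:
#   check_duration intermediate.m4a 20
#   check_duration output.m4a 25
```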

Re: amix with atempo: Inconsistent behaviour creating m4a with aac

Pavel Koshevoy
On Sep 13, 2017 4:32 AM, "Jonathan Girven" <[hidden email]> wrote:

> I am trying to take the audio from two mp4 files, merge them together one
> after the other (perhaps with some other effects such as a fade transition:
> not shown), and apply atempo to the result.
>
> I am working with the current FFmpeg git master branch. I have attached an
> example file to reproduce the issue with:  BigBuckBunny_320x180_t10.mp4
> <http://www.ffmpeg-archive.org/file/t377720/BigBuckBunny_320x180_t10.mp4>  .
> And here is the command and output:
>
> $ ffmpeg -y \
>   -i BigBuckBunny_320x180_t10.mp4 \
>   -i BigBuckBunny_320x180_t10.mp4 \
>   -filter_complex "
> aevalsrc=0:d=10[na1];[na1][1:a]concat=n=2:v=0:a=1[a1];
> [0:a][a1]amix=inputs=2[mix_audio];
> [mix_audio]asplit=3[mix_audio0][mix_audio1][mix_audio2];
> [mix_audio0]atrim=0:5,asetpts=expr=PTS-STARTPTS[a_trim0];
> [mix_audio1]atrim=5:10,asetpts=expr=PTS-STARTPTS[a_trim1];
> [mix_audio2]atrim=10:20,asetpts=expr=PTS-STARTPTS[a_trim2];
> [a_trim1]atempo=0.5[a_slomo1];
> [a_trim0][a_slomo1][a_trim2]concat=n=3:v=0:a=1[com_a_slomo]
> " \
>   -map [com_a_slomo] \
>   -c:a aac \
>   output.m4a

I haven't tested this use case, but I can tell you that atempo also changes
the output pts (it computes them from the number of samples output so far).
So shouldn't the asetpts expression for mix_audio2 reference the output of
atempo for STARTPTS?

    Pavel

Re: amix with atempo: Inconsistent behaviour creating m4a with aac

Jonathan Girven
> I haven't tested such use case, but I can tell you atempo also changes
> output pts (computes it based on the number of samples output so far).  So
> shouldn't mix_audio2 asetpts expression reference the output of atempo for
> STARTPTS?

My understanding is that the concat filter I use to join the final
sections together (com_a_slomo) simply places the sections one after
the other. Each section therefore needs to "start from time zero", for
lack of a better expression. If I were using amix to merge the three
sections, I would agree that my asetpts expressions would need to be
more complex. Am I wrong?
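
To make the "start from time zero" reasoning concrete, here is a small sketch of where each section would land in the joined output, assuming concat places every segment directly after the previous one and that each segment's own timestamps begin at 0. The helper name `concat_offsets` is hypothetical, and the durations are the expected ones (5 s, 10 s after atempo=0.5, 10 s).

```shell
# Sketch: start and end time of each segment after concatenation.
concat_offsets() {
  # Reads one segment duration (seconds) per line; prints "start end".
  awk '{ printf "%g %g\n", start, start + $1; start += $1 }'
}

printf '5\n10\n10\n' | concat_offsets
# 0 5
# 5 15
# 15 25
```

Under that assumption the third section should occupy 15 to 25 seconds of the output, which is the part missing from the 15-second file.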