5% of audio samples missing when capturing audio on a mac

5% of audio samples missing when capturing audio on a mac

Norbert Pozar
Hi,

I am attempting to capture a webcam with audio on a MacBook Pro (Catalina
10.15.6), but I am having trouble with the audio stream. The video part is
fine, but audio seems to be missing about 5% of the expected samples. This
simple command illustrates the problem:

ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 10 out.wav

The console output is below.

I expect to capture 10 s of audio from the built-in microphone. However, the
resulting audio is ~9.5 s long, with audible clicking that I think indicates
missing samples. (Note that if -t 100 is used, ~95 s is captured, so this is
not a warm-up issue.) The output says 413184 samples were decoded, but I would
expect closer to 441000 = 44100 Hz * 10 s. Indeed, if I add the -async 1 option,
silence is inserted, with messages "adding 4608 audio samples of silence"
throughout.
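
As a quick check of the arithmetic in the paragraph above, here is a minimal Python sketch. All constants are copied from this post and the log below; the variable names are only illustrative. For this particular run the shortfall works out to a little over 6%, slightly more than the rough 5% estimate. The packet-size observation at the end is just arithmetic on the logged numbers: it would be consistent with whole packets going missing, but the log alone does not prove that.

```python
# Figures quoted in this post and in the capture log below.
SAMPLE_RATE_HZ = 44_100
REQUESTED_SECONDS = 10
DECODED_SAMPLES = 413_184   # "807 frames decoded (413184 samples)"
PACKETS_READ = 807          # "807 packets read"
SILENCE_INSERT = 4_608      # "adding 4608 audio samples of silence"

expected_samples = SAMPLE_RATE_HZ * REQUESTED_SECONDS
missing_samples = expected_samples - DECODED_SAMPLES
missing_percent = 100 * missing_samples / expected_samples
captured_seconds = DECODED_SAMPLES / SAMPLE_RATE_HZ

# Every packet in this log carries the same number of samples.
samples_per_packet = DECODED_SAMPLES // PACKETS_READ

print(f"expected {expected_samples} samples, decoded {DECODED_SAMPLES}")
print(f"missing  {missing_samples} samples ({missing_percent:.1f}%)")
print(f"captured {captured_seconds:.2f} s of audio")
print(f"{samples_per_packet} samples per packet; one silence insert = "
      f"{SILENCE_INSERT / samples_per_packet:g} packets")
```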

I found a bug on the bug tracker about audio issues related to capturing
the screen:
https://trac.ffmpeg.org/ticket/4513

Could someone try to reproduce the issue or point me in the right
direction?

I have ffmpeg installed using Homebrew with the --HEAD option, matching the
latest commit on master. Recording using QuickTime or OBS works fine.

This is the log output of the above command:

$ ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 10 out.wav
ffmpeg version git-2020-09-12-1c09456 Copyright (c) 2000-2020 the FFmpeg
developers
  built with Apple clang version 11.0.0 (clang-1100.0.33.17)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-1c09456_1
--enable-shared --enable-pthreads --enable-version3 --enable-avresample
--cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls
--enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d
--enable-libmp3lame --enable-libopus --enable-librav1e
--enable-librubberband --enable-libsnappy --enable-libsrt
--enable-libtesseract --enable-libtheora --enable-libvidstab
--enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264
--enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma
--enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass
--enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
--enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox
--disable-libjack --disable-indev=jack
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.105.100 / 58.105.100
  libavformat    58. 54.100 / 58. 54.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with
argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging
level) with argument '99'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with
argument '1'.
Reading option '-f' ... matched as option 'f' (force format) with argument
'avfoundation'.
Reading option '-i' ... matched as input url with argument ':0'.
Reading option '-t' ... matched as option 't' (record or transcode
"duration" seconds of audio/video) with argument '10'.
Reading option 'out.wav' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url :0.
Applying option f (force format) with argument avfoundation.
Successfully parsed a group of options.
Opening an input file: :0.
[avfoundation @ 0x7ff5c080a600] audio device 'Built-in Microphone' opened
[avfoundation @ 0x7ff5c080a600] All info found
[avfoundation @ 0x7ff5c080a600] stream 0: start_time: 7242.35 duration:
NOPTS
[avfoundation @ 0x7ff5c080a600] format: start_time: 7242.35 duration: NOPTS
(estimate from bit rate) bitrate=2822 kb/s
Input #0, avfoundation, from ':0':
  Duration: N/A, start: 7242.345805, bitrate: 2822 kb/s
    Stream #0:0, 1, 1/1000000: Audio: pcm_f32le, 44100 Hz, stereo, flt,
2822 kb/s
Successfully opened the file.
Parsing a group of options: output url out.wav.
Applying option t (record or transcode "duration" seconds of audio/video)
with argument 10.
Successfully parsed a group of options.
Opening an output file: out.wav.
[file @ 0x7ff5bfe438c0] Setting default whitelist 'file,crypto,data'
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_f32le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if
it occurs once at the start per stream)
detected 4 logical cores
[graph_0_in_0_0 @ 0x7ff5bfe2ac80] Setting 'time_base' to value '1/44100'
[graph_0_in_0_0 @ 0x7ff5bfe2ac80] Setting 'sample_rate' to value '44100'
[graph_0_in_0_0 @ 0x7ff5bfe2ac80] Setting 'sample_fmt' to value 'flt'
[graph_0_in_0_0 @ 0x7ff5bfe2ac80] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 0x7ff5bfe2ac80] tb:1/44100 samplefmt:flt samplerate:44100
chlayout:0x3
[format_out_0_0 @ 0x7ff5bfe7c580] Setting 'sample_fmts' to value 's16'
[format_out_0_0 @ 0x7ff5bfe7c580] auto-inserting filter 'auto_resampler_0'
between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 0x7ff5bfe54e80] query_formats: 5 queried, 9 merged, 3
already done, 0 delayed
[auto_resampler_0 @ 0x7ff5bfe523c0] [SWR @ 0x7ff5bff69000] Using fltp
internally between filters
[auto_resampler_0 @ 0x7ff5bfe523c0] ch:2 chl:stereo fmt:flt r:44100Hz ->
ch:2 chl:stereo fmt:s16 r:44100Hz
Output #0, wav, to 'out.wav':
  Metadata:
    ISFT            : Lavf58.54.100
    Stream #0:0, 0, 1/44100: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc58.105.100 pcm_s16le
[out_0_0 @ 0x7ff5bfe7a280] EOF on sink link out_0_0:default.=   1x
No more output streams to write to, finishing.
size=    1611kB time=00:00:10.00 bitrate=1319.5kbits/s speed=0.998x
video:0kB audio:1611kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 0.004729%
Input file #0 (:0):
  Input stream #0:0 (audio): 807 packets read (3305472 bytes); 807 frames
decoded (413184 samples);
  Total: 807 packets (3305472 bytes) demuxed
Output file #0 (out.wav):
  Output stream #0:0 (audio): 806 frames encoded (412328 samples); 806
packets muxed (1649312 bytes);
  Total: 806 packets (1649312 bytes) muxed
807 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x7ff5bfe39880] Statistics: 4 seeks, 9 writeouts


Norbert
_______________________________________________
ffmpeg-user mailing list
[hidden email]
https://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[hidden email] with subject "unsubscribe".

Re: 5% of audio samples missing when capturing audio on a mac

Carl Eugen Hoyos-2
On Sat, 12 Sept 2020 at 10:48, Norbert Pozar
<[hidden email]> wrote:

> I am attempting to capture a webcam with audio on a MacBook pro (Catalina
> 10.15.6), but I am having trouble with the audio stream. The video part is
> fine, but audio seems to be missing about 5% of the expected samples. This
> simple command illustrates the problem:
>
> ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 10 out.wav

This is missing the console output of:
ffmpeg -i out.wav -f null -

Carl Eugen

Re: 5% of audio samples missing when capturing audio on a mac

kumowoon1025
In reply to this post by Norbert Pozar
Hi,

You know, it's confounding: I couldn't get avfoundation audio input to work at all, then tried a bunch of options and ended up with the weird issue you describe (which I thought might have something to do with the clock source being different). But then I went back to supplying no extraneous options, and it works pretty much without issue... In my case I was capturing from a turntable plugged into an external interface, and I'm now listening through my Mac with ffplay to see if it'll start stuttering again (???)

Some things I tried before it apparently started working on its own were explicitly setting the decoder to pcm_f32le and setting lower input sample rates. As expected, this messes with the pitch, but what I did not understand is that it wasn't consistent: it kept speeding up and slowing back down. (And I'm sure it wasn't the spindle on the turntable.)

I have no idea, other than that I think it's probably some other sound application changing the system sound server's configuration, unrelated to what I'm doing with ffmpeg...

Regards,
Ted Park


Re: 5% of audio samples missing when capturing audio on a mac

Norbert Pozar
In reply to this post by Carl Eugen Hoyos-2
> > I am attempting to capture a webcam with audio on a MacBook pro (Catalina
> > 10.15.6), but I am having trouble with the audio stream. The video part is
> > fine, but audio seems to be missing about 5% of the expected samples. This
> > simple command illustrates the problem:
> >
> > ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 10 out.wav
>
> This is missing the console output of:
> ffmpeg -i out.wav -f null -

Thanks for having a look. Sorry about that. Here is the console output of
both commands (the exact produced length changes on each run, depending on
how many samples are missing):

$ ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 10 out.wav
ffmpeg version git-2020-09-12-1c09456 Copyright (c) 2000-2020 the FFmpeg
developers
  built with Apple clang version 11.0.0 (clang-1100.0.33.17)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-1c09456_1
--enable-shared --enable-pthreads --enable-version3 --enable-avresample
--cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls
--enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d
--enable-libmp3lame --enable-libopus --enable-librav1e
--enable-librubberband --enable-libsnappy --enable-libsrt
--enable-libtesseract --enable-libtheora --enable-libvidstab
--enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264
--enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma
--enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass
--enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
--enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox
--disable-libjack --disable-indev=jack
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.105.100 / 58.105.100
  libavformat    58. 54.100 / 58. 54.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with
argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging
level) with argument '99'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with
argument '1'.
Reading option '-f' ... matched as option 'f' (force format) with argument
'avfoundation'.
Reading option '-i' ... matched as input url with argument ':0'.
Reading option '-t' ... matched as option 't' (record or transcode
"duration" seconds of audio/video) with argument '10'.
Reading option 'out.wav' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url :0.
Applying option f (force format) with argument avfoundation.
Successfully parsed a group of options.
Opening an input file: :0.
[avfoundation @ 0x7f872c814000] audio device 'Built-in Microphone' opened
[avfoundation @ 0x7f872c814000] All info found
[avfoundation @ 0x7f872c814000] stream 0: start_time: 38114.7 duration:
NOPTS
[avfoundation @ 0x7f872c814000] format: start_time: 38114.7 duration: NOPTS
(estimate from bit rate) bitrate=2822 kb/s
Input #0, avfoundation, from ':0':
  Duration: N/A, start: 38114.693333, bitrate: 2822 kb/s
    Stream #0:0, 1, 1/1000000: Audio: pcm_f32le, 44100 Hz, stereo, flt,
2822 kb/s
Successfully opened the file.
Parsing a group of options: output url out.wav.
Applying option t (record or transcode "duration" seconds of audio/video)
with argument 10.
Successfully parsed a group of options.
Opening an output file: out.wav.
[file @ 0x7f872be0e480] Setting default whitelist 'file,crypto,data'
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_f32le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if
it occurs once at the start per stream)
detected 4 logical cores
[graph_0_in_0_0 @ 0x7f872bc392c0] Setting 'time_base' to value '1/44100'
[graph_0_in_0_0 @ 0x7f872bc392c0] Setting 'sample_rate' to value '44100'
[graph_0_in_0_0 @ 0x7f872bc392c0] Setting 'sample_fmt' to value 'flt'
[graph_0_in_0_0 @ 0x7f872bc392c0] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 0x7f872bc392c0] tb:1/44100 samplefmt:flt samplerate:44100
chlayout:0x3
[format_out_0_0 @ 0x7f872bd3a840] Setting 'sample_fmts' to value 's16'
[format_out_0_0 @ 0x7f872bd3a840] auto-inserting filter 'auto_resampler_0'
between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 0x7f872be2a0c0] query_formats: 5 queried, 9 merged, 3
already done, 0 delayed
[auto_resampler_0 @ 0x7f872bd25080] [SWR @ 0x7f872bf69000] Using fltp
internally between filters
[auto_resampler_0 @ 0x7f872bd25080] ch:2 chl:stereo fmt:flt r:44100Hz ->
ch:2 chl:stereo fmt:s16 r:44100Hz
Output #0, wav, to 'out.wav':
  Metadata:
    ISFT            : Lavf58.54.100
    Stream #0:0, 0, 1/44100: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc58.105.100 pcm_s16le
[out_0_0 @ 0x7f872bc394c0] EOF on sink link out_0_0:default.=   1x
No more output streams to write to, finishing.
size=    1619kB time=00:00:10.00 bitrate=1326.1kbits/s speed=0.998x
video:0kB audio:1619kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 0.004706%
Input file #0 (:0):
  Input stream #0:0 (audio): 811 packets read (3321856 bytes); 811 frames
decoded (415232 samples);
  Total: 811 packets (3321856 bytes) demuxed
Output file #0 (out.wav):
  Output stream #0:0 (audio): 810 frames encoded (414376 samples); 810
packets muxed (1657504 bytes);
  Total: 810 packets (1657504 bytes) muxed
811 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x7f872be1db00] Statistics: 4 seeks, 9 writeouts




$ ffmpeg -i out.wav -f null -
ffmpeg version git-2020-09-12-1c09456 Copyright (c) 2000-2020 the FFmpeg
developers
  built with Apple clang version 11.0.0 (clang-1100.0.33.17)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-1c09456_1
--enable-shared --enable-pthreads --enable-version3 --enable-avresample
--cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls
--enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d
--enable-libmp3lame --enable-libopus --enable-librav1e
--enable-librubberband --enable-libsnappy --enable-libsrt
--enable-libtesseract --enable-libtheora --enable-libvidstab
--enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264
--enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma
--enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass
--enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
--enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox
--disable-libjack --disable-indev=jack
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.105.100 / 58.105.100
  libavformat    58. 54.100 / 58. 54.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'out.wav':
  Metadata:
    encoder         : Lavf58.54.100
  Duration: 00:00:09.40, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz,
stereo, s16, 1411 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf58.54.100
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc58.105.100 pcm_s16le
size=N/A time=00:00:09.39 bitrate=N/A speed=1.23e+03x
video:0kB audio:1619kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: unknown



Here I am also attaching the console output (only the relevant timings that
differ from the above) with -t 100 to illustrate that it does not seem to
be a warm-up problem.

$ ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 100 out.wav
...
size=   16163kB time=00:01:40.00 bitrate=1324.0kbits/s speed=   1x
video:0kB audio:16163kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 0.000471%
Input file #0 (:0):
  Input stream #0:0 (audio): 8083 packets read (33107968 bytes); 8083
frames decoded (4138496 samples);
  Total: 8083 packets (33107968 bytes) demuxed
Output file #0 (out.wav):
  Output stream #0:0 (audio): 8082 frames encoded (4137616 samples); 8082
packets muxed (16550464 bytes);
  Total: 8082 packets (16550464 bytes) muxed
8083 frames successfully decoded, 0 decoding errors

$ ffmpeg -i out.wav -f null -
...
  Metadata:
    encoder         : Lavf58.54.100
  Duration: 00:01:33.82, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz,
stereo, s16, 1411 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf58.54.100
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc58.105.100 pcm_s16le
size=N/A time=00:01:33.82 bitrate=N/A speed=2.21e+03x
video:0kB audio:16163kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: unknown
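
The figures in the logs above can be cross-checked with a short Python sketch. The sample counts are copied from the "frames encoded" lines of the two capture runs; the function and variable names are illustrative only. The computed durations should agree with the "Duration" lines that ffmpeg reports for out.wav, and comparing the missing fraction across the 10 s and 100 s runs is one way to see that the loss stays roughly proportional instead of being confined to start-up.

```python
SAMPLE_RATE_HZ = 44_100

# requested duration in seconds -> samples encoded into out.wav (from the logs)
runs = {
    10: 414_376,
    100: 4_137_616,
}

def summarize(requested_seconds, encoded_samples):
    """Return (seconds of audio actually written, percent of samples missing)."""
    expected = SAMPLE_RATE_HZ * requested_seconds
    duration = encoded_samples / SAMPLE_RATE_HZ
    missing_percent = 100 * (expected - encoded_samples) / expected
    return duration, missing_percent

for seconds, samples in runs.items():
    duration, missing = summarize(seconds, samples)
    print(f"-t {seconds}: {duration:.2f} s written, {missing:.1f}% missing")
```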

Re: 5% of audio samples missing when capturing audio on a mac

Norbert Pozar
I have had a chance to test the issue on friends' laptops, so here are two
more data points. I believe they only have version 4.3.1, not the latest
HEAD. Their built-in mic sample rate is 48000 Hz (I suppose more recent
laptops updated it?).

1) MacBook Pro (13-inch, 2018, Four Thunderbolt 3 Ports) Processor 2.3 GHz
Intel Core i5
No issues, captured audio sounds fine.

2) MacBook Pro (13-inch, 2019, Four Thunderbolt 3 Ports) Processor 2.4 GHz
Quad-Core Intel Core i5
Missing audio samples as in my case (even more so), and the captured audio
sounds even worse...

So unfortunately it looks like a problem with ffmpeg's avfoundation
implementation at this point...

Norbert

Re: 5% of audio samples missing when capturing audio on a mac

kumowoon1025
Hi,

Now that I try it, it works fine for some random number of seconds, then stops. Sometimes 3, sometimes 300.

Something that comes to mind is that the Mojave release notes had something about a new security model for mic input (to prevent a Mac version of "wiretap"-type malware). I checked, and I had given Terminal.app access at some point, but that might not be enough. I think if you enable DevToolsSecurity you can whitelist specific binaries to run, but I don't know if that is also possible for entitlements.

> I have had a chance to test the issue on friends' laptops so here are two
> more data points. They have only version 4.3.1 I believe, not the latest
> HEAD. Their built-in mic sample rate is 48000Hz (I suppose more recent
> laptops updated it?).

Yeah, I don't remember the details but apparently it's more efficient or something?
48000 is certainly a much nicer number when you compare it with the common video framerates (24, 30/1.001, etc. all divide cleanly)

> 1) MacBook Pro (13-inch, 2018, Four Thunderbolt 3 Ports) Processor  2,3 GHz
> Intel Core i5
> No issues, captured audio sounds fine.
>
> 2) MacBook Pro (13-inch, 2019, Four Thunderbolt 3 Ports) Processor 2,4 GHz
> Quad-Core Intel Core i5
> Missing audio samples as in my case (even more so), and the captured audio
> sounds even worse...

That's interesting, because I'm pretty sure that was the year they started marketing the "directional beamforming mic array" that looked like the same 3 mics as before. I wonder if they are interleaved/framed differently with a new chip?

> So unfortunately it looks like a problem with ffmpeg's avfoundation
> implementation at this point...

Well, you can look at it that way, but another way might be that Apple makes breaking changes to their system framework APIs all the time :p

> I am making some recorded lectures using the webcam output of ATEM mini and so the sound capture has to be flawless. FFmpeg is such a great tool for encoding and I hoped to use it to grab 1080p webcam instead of the too simple QuickTime, which offers only an Apple ProRes codec with 2GB/min of data... OBS seems to have too much overhead for 1080p on my 2-core laptop. Maybe I'll have to grab the audio using QuickTime and video using ffmpeg and sync them up in iMovie.


Okay, now I am not sure what the setup is. FFmpeg or QuickTime will record whatever input it gets if it can mux it, so I think you should take another look at the controller for your capture interface (software or hardware). Since they call it a "webcam", I really don't think ProRes would be the only output format; on hardware with that price tag, surely it has built-in H.264, especially over USB.

Are you using the same device to record audio? If you're not, I think doing so would be better, even if you have to add a seemingly unnecessary roundtrip.

Regards,
Ted Park


Re: 5% of audio samples missing when capturing audio on a mac

amindfv@mailbox.org
> 48000 is certainly a much nicer number when you compare it with the common video framerates (24, 30/1.001, etc. all divide cleanly)

Can you explain this? I'm trying to get (30/1.001) or the rounded 29.97 to divide 48k cleanly or be a clean ratio, but I don't see it. Maybe it's that with 30/1.001 the ratio has a denominator of 5, which is pretty small?

Tom


Re: 5% of audio samples missing when capturing audio on a mac

Carl Zwanzig
On 9/17/2020 7:28 PM, [hidden email] wrote:
> Can you explain this? I'm trying to get (30/1.001) or the rounded 29.97 to divide 48k cleanly or be a clean ratio but I don't see it.

It doesn't, but 48k samples per second is a nice number (and does divide
nicely by 24, 25, and 30). 30000/1001 is simply messy to deal with
(≈ 29.97002997002997002997002997003) and, unless you have specific timing
hardware, pretty much impossible to run at the absolutely correct rate.
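
To make the divisibility point concrete, here is a small Python sketch using exact fractions. The frame rates are the ones mentioned in this thread; the names are illustrative. For each rate it prints the number of 48 kHz samples per video frame, and the denominator of that fraction is the smallest number of frames that spans a whole number of samples.

```python
from fractions import Fraction

SAMPLE_RATE_HZ = 48_000

# Frame rates mentioned in the thread, kept exact.
frame_rates = {
    "24": Fraction(24),
    "25": Fraction(25),
    "30": Fraction(30),
    "30000/1001": Fraction(30_000, 1_001),
}

def samples_per_frame(frame_rate):
    """Exact number of audio samples spanned by one video frame."""
    return SAMPLE_RATE_HZ / frame_rate

for name, rate in frame_rates.items():
    spf = samples_per_frame(rate)
    print(f"{name} fps: {spf} samples per frame, "
          f"whole sample count every {spf.denominator} frame(s)")
```

For 30000/1001 the result is 8008/5, so the sample count only lines up with frame boundaries every 5 frames, which is presumably the "denominator of 5" asked about above.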

z!

Re: 5% of audio samples missing when capturing audio on a mac

kumowoon1025
In reply to this post by amindfv@mailbox.org
Hi,

>> 48000 is certainly a much nicer number when you compare it with the common video framerates (24, 30/1.001, etc. all divide cleanly)
>
> Can you explain this? I'm trying to get (30/1.001) or the rounded 29.97 to divide 48k cleanly or be a clean ratio but I don't see it. Maybe that with 30/1.001 it's got a denominator of 5, which is pretty small?

Compared to 44.1kHz? 48kHz is 48000 samples per second, and 29.97 (30/1.001) fps is, obviously, 30000/1001 (≈29.97) frames per second - flip that around and you get 1001/30000 seconds duration for each frame.

For each frame there are 1601.6 (1600 × 1.001) samples. For 59.94fps, 800.8; for film (24/1.001 fps), 2002 per frame. The 1.001 factor might seem a bit ugly, but that’s kind of why 48 whole kilohertz works much better.

If you think about an MPEG-TS system clock timebase of 1/90000, for example, common video or film framerates generally come out to an integer number of 1/90000-second “ticks.” A 29.97fps frame is 3003 “ticks”, which also matches the duration of 1601.6 samples. The fractional samples might make it look like the ratio is not easy to work with, but at 48kHz, one sample has a duration of 1.875 “ticks”, or 15/8 = 30/16.

If you replace 48000 with 44100, the numbers aren’t as nice. (Sometimes not even rational? Not sure what combo does that though)

I might be making up the history behind it, but 44.1kHz was basically just workable, with 20kHz assumed to be the “bandwidth” limit of sound intended for people to hear, 40kHz would be needed to encode sound signals that dense, and the extra 4.1kHz would help get rid of artifacts due to aliasing - and probably the biggest factor was the CD. I’m sure they could have pressed much more density into the medium, but the laser tech that was commercially viable at the time to put in players for the general consumer sort of made 44.1kHz a decent detent in the sampling frequency dial in an imaginary sample rate-to-cost estimating machine.

If you actually do the calculations with 44.1kHz, the ratios you get aren’t *too* bad; instead of factors like 2^3 or 3×5, you get something like 3×49.
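
The numbers in this post can be reproduced with exact fractions. The sketch below is only an illustration: the 90 kHz tick rate and the 30000/1001 frame rate come from the text above, and the function names are made up for the example. It also speaks to the question of whether a combination can be irrational: since every rate involved is a ratio of integers, each result is again a fraction, just with a less convenient denominator for 44.1 kHz.

```python
from fractions import Fraction

TICK_RATE_HZ = 90_000                      # MPEG-TS style timebase from the post
NTSC_FRAME_RATE = Fraction(30_000, 1_001)  # the "29.97" fps rate

def ticks_per_frame(frame_rate):
    """Duration of one video frame in 1/90000 s ticks."""
    return TICK_RATE_HZ / frame_rate

def ticks_per_sample(sample_rate_hz):
    """Duration of one audio sample in 1/90000 s ticks."""
    return Fraction(TICK_RATE_HZ, sample_rate_hz)

def samples_per_frame(sample_rate_hz, frame_rate):
    """Exact number of audio samples spanned by one video frame."""
    return sample_rate_hz / frame_rate

print("ticks per 29.97 fps frame:", ticks_per_frame(NTSC_FRAME_RATE))
for rate in (48_000, 44_100):
    print(f"{rate} Hz: {ticks_per_sample(rate)} ticks per sample, "
          f"{samples_per_frame(rate, NTSC_FRAME_RATE)} samples per frame")
```

At 48 kHz a sample lasts 15/8 ticks; at 44.1 kHz it lasts 100/49 ticks, and a 29.97 fps frame spans 147147/100 samples.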

Regards,
Ted Park


Re: 5% of audio samples missing when capturing audio on a mac

Carl Zwanzig
On 9/22/2020 8:29 AM, Edward Park wrote:
> I might be making up the history behind it, but 44.1kHz was basically
> just workable, with 20kHz assumed to be the “bandwidth” limit of sound
> intended for people to hear, 40kHz would be needed to encode sound
> signals that dense, and the extra 4.1kHz would help get rid of artifacts
> due to aliasing - and probably the biggest factor was the CD.

My recollection is that you're substantially correct: tradeoffs between the
number of bits on a CD and human hearing (most people can't actually hear up
to 20kHz), however....

"The official Philips history says this capacity was specified by Sony
executive Norio Ohga to be able to contain the entirety of Beethoven's Ninth
Symphony on one disc. This is a myth according to Kees Immink, as the
EFM code format had not yet been decided in December 1979, when the decision
to adopt the 120 mm was made. The adoption of EFM in June 1980 allowed 30
percent more playing time that would have resulted in 97 minutes for 120 mm
diameter or 74 minutes for a disc as small as 100 mm. Instead, however, the
information density was lowered by 30 percent to keep the playing time at 74
minutes"
(which some of the things I was recalling, too)


As for 30000/1001- that's an artifact of NTSC analog television trying to
fit color information into a b/w signal and then later applying SMPTE
timecode to the resulting frame rate. There's a good explanation at
https://en.wikipedia.org/wiki/SMPTE_timecode#Drop-frame_timecode

z!

Re: 5% of audio samples missing when capturing audio on a mac

Carl Zwanzig
On 9/22/2020 8:42 AM, Carl Zwanzig wrote:
> (which some of the things I was recalling, too)
which -corrects-

sigh, I shouldn't post before the coffee takes hold.

z!
