Trouble transcoding with cuda

Ray Randomnic
Hey folks,

I'm trying to transcode an HEVC (yuv420p10le) file to H.264 using NVENC on a
GTX 1650, and I'm having issues with what I assume are the pixel format
conversions on the hardware. My encode speed (in fps) is pretty low (see
below), far lower than I get when transcoding HEVC -> HEVC. The ffmpeg version
is N-94578-gd6bd902599-gcff309097a+3 (on Windows 10, though I don't think
that's relevant). For the purposes of this experiment, let's say I'm not
concerned about quality loss from format conversions.

I'd like to know what I'm doing wrong and what commands I can issue for the
following:
decode on GPU -> format conversion (if necessary) on GPU -> encode on GPU.
I might not be understanding a few concepts.

The options that I thought were available, and that I tried in various
combinations, are:
- decoder (I mostly left this blank for auto) and encoder (always
h264_nvenc)
- hwaccel
- hwaccel_output_format
- filters (vf):
  - format
  - scale_npp (for format conversion on gpu)

I have no idea what the pix_fmt option, or filters like colorspace, do when
hardware is involved (how is pix_fmt different from hwaccel_output_format?).
At this point I'm kind of stuck: I don't know how to convert formats on the
GPU (I assume the format conversion is currently happening on the CPU).
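
For reference, a minimal sketch of how the two options differ, assuming the
usual ffmpeg option semantics: -hwaccel_output_format is an input option
(placed before -i) that selects the format of the frames leaving the hardware
decoder, while -pix_fmt used as an output option (placed after -i) requests
the pixel format handed to the encoder, which ffmpeg typically satisfies with
an auto-inserted software scaler. The file names are placeholders, and the
snippet only prints the command unless both ffmpeg and input.mp4 are present.

```shell
#!/bin/sh
# Sketch only; input.mp4 and output.mp4 are placeholder names.

# Input side (before -i): decode with the CUDA hwaccel. Without
# -hwaccel_output_format, decoded frames are copied back to system memory
# (p010le for this 10-bit source, per the logs in this post).
input_opts="-hwaccel cuda"

# Output side (after -i): ask for 8-bit 4:2:0 frames at the encoder input.
# This conversion runs in software on the CPU, much like -vf format=yuv420p.
output_opts="-pix_fmt yuv420p -c:v h264_nvenc"

cmd="ffmpeg $input_opts -i input.mp4 $output_opts output.mp4"
echo "$cmd"

# Run it only when the tool and the sample file are actually available.
if command -v ffmpeg >/dev/null 2>&1 && [ -f input.mp4 ]; then
    $cmd
fi
```

This should behave much like test 6 below: hardware decode and encode, with
the 10-bit to 8-bit conversion done on the CPU in between.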

Input details:
ffprobe input.mp4

Stream #0:0(eng): Video: hevc (Main 10) (hvc1 / 0x31637668),
yuv420p10le(tv, bt2020nc/bt2020/smpte2084), 1920x1080, 24886 kb/s, SAR 1:1
DAR 16:9, 29.99 fps, ...
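
The same fields can be pulled out on their own, which makes the bit depth and
HDR signalling easier to spot. A minimal sketch, assuming ffprobe is on the
PATH and input.mp4 is a placeholder for the file above:

```shell
#!/bin/sh
# Fields to print for the first video stream (standard ffprobe stream entries).
fields="codec_name,profile,pix_fmt,color_range,color_space,color_transfer,color_primaries"

probe_cmd="ffprobe -v error -select_streams v:0 -show_entries stream=$fields -of default=noprint_wrappers=1 input.mp4"
echo "$probe_cmd"

# Only query the file when ffprobe and the sample are actually present.
if command -v ffprobe >/dev/null 2>&1 && [ -f input.mp4 ]; then
    $probe_cmd
fi
```

For the stream shown above, this should report the yuv420p10le pixel format
together with the bt2020nc / bt2020 / smpte2084 colour fields.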

Summary of various combinations (- indicates left blank):
test | hwaccel | hwaccel_output_format | filter (vf)              | encode fps | note
1    | cuda    | -                     | -                        | X          | Failed
2    | cuda    | cuda                  | -                        | X          | Failed
3    | cuda    | yuv420p               | -                        | 361        | Video messed up
4    | cuda    | cuda                  | format=yuv420p           | X          | Failed
5    | cuvid   | cuda                  | format=yuv420p           | 91         | Not using GPU decode
6    | cuda    | -                     | format=yuv420p           | 161        | Not using GPU format conversion
7    | cuvid   | -                     | format=yuv420p           | 91         | Not using GPU decode
8    | cuda    | -                     | scale_npp=format=yuv420p | X          | Failed
9    | cuda    | cuda                  | scale_npp=format=yuv420p | X          | Failed

I would expect a speed of around test 3 (without the screwed up video). Is
there any way to convert the pixel formats on the hardware without screwing
up the video? On a similar note, I'd love for someone to explain the
failing encodes.
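
To make rows like these easier to reproduce, here is a hypothetical helper
(build_case is not part of ffmpeg) that assembles the command for one row of
the table. It is a sketch under assumptions: it drops audio with -an and
discards the encoded video with -f null - so that disk writes do not affect
the fps figure, which differs slightly from the commands below that write an
output file.

```shell
#!/bin/sh
# build_case HWACCEL OUTPUT_FORMAT FILTER   ("-" means "leave blank")
# Hypothetical helper: prints the ffmpeg command for one row of the table.
build_case() {
    cmd="ffmpeg -hide_banner -benchmark -hwaccel $1"
    if [ "$2" != "-" ]; then
        cmd="$cmd -hwaccel_output_format $2"
    fi
    cmd="$cmd -i input.mp4"
    if [ "$3" != "-" ]; then
        cmd="$cmd -vf $3"
    fi
    # -an drops audio; "-f null -" discards the encoded video.
    echo "$cmd -an -c:v h264_nvenc -f null -"
}

# Rows 3, 6 and 9 of the table.
build_case cuda yuv420p -
build_case cuda - format=yuv420p
build_case cuda cuda scale_npp=format=yuv420p
```

Each printed line can be pasted into a shell; ffmpeg's progress line then
shows the fps for that combination.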

Here are the details for corresponding encodes:

   1. ffmpeg -loglevel verbose -hwaccel cuda -i input.mp4 -c:v h264_nvenc
   output.mp4

   Fails with the following:

   [graph_1_in_0_1 @ 000001cc9670e4c0] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 000001cc8740fc00] NVDEC capabilities:
   [hevc @ 000001cc8740fc00] format supported: yes, max_mb_count: 262144
   [hevc @ 000001cc8740fc00] min_width: 144, max_width: 8192
   [hevc @ 000001cc8740fc00] min_height: 144, max_height: 8192
   [graph 0 input from stream 0:0 @ 000001cc87420840] w:1920 h:1080
   pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [h264_nvenc @ 000001cc8747fbc0] Loaded Nvenc version 9.0
   [h264_nvenc @ 000001cc8747fbc0] Nvenc initialized successfully
   [h264_nvenc @ 000001cc8747fbc0] 1 CUDA capable devices found
   [h264_nvenc @ 000001cc8747fbc0] [ GPU #0 - < GeForce GTX 1650 > has
   Compute SM 7.5 ]
   [h264_nvenc @ 000001cc8747fbc0] 10 bit encode not supported
   [h264_nvenc @ 000001cc8747fbc0] No NVENC capable devices found
   [h264_nvenc @ 000001cc8747fbc0] Nvenc unloaded
   Error initializing output stream 0:0 -- Error while opening encoder for
   output stream #0:0 - maybe incorrect parameters such as bit_rate, rate,
   width or height

   2. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format cuda -i
   input.mp4 -c:v h264_nvenc output.mp4

   Fails with the following:

   [graph_1_in_0_1 @ 00000240b7932340] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 00000240b79e37c0] NVDEC capabilities:
   [hevc @ 00000240b79e37c0] format supported: yes, max_mb_count: 262144
   [hevc @ 00000240b79e37c0] min_width: 144, max_width: 8192
   [hevc @ 00000240b79e37c0] min_height: 144, max_height: 8192
   [graph 0 input from stream 0:0 @ 00000240b7937e00] w:1920 h:1080
   pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [h264_nvenc @ 00000240b7483700] Loaded Nvenc version 9.0
   [h264_nvenc @ 00000240b7483700] Nvenc initialized successfully
   [h264_nvenc @ 00000240b7483700] 10 bit encode not supported
   [h264_nvenc @ 00000240b7483700] Provided device doesn't support required
   NVENC features
   [h264_nvenc @ 00000240b7483700] Nvenc unloaded
   Error initializing output stream 0:0 -- Error while opening encoder for
   output stream #0:0 - maybe incorrect parameters such as bit_rate, rate,
   width or height

   Alright, so it seems that the hardware H.264 encoder doesn't support
   10-bit encoding (the 10-bit frames are coming from the decoder). So let's
   try changing the format:


   3. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format yuv420p
   -i input.mp4 -c:v h264_nvenc output.mp4

   Pretty decent encode at ~ 360 fps. Alas, the video is screwed up. Colors
   are weird:

   [graph_1_in_0_1 @ 00000256c9ac7b40] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 00000256cbb737c0] NVDEC capabilities:
   [hevc @ 00000256cbb737c0] format supported: yes, max_mb_count: 262144
   [hevc @ 00000256cbb737c0] min_width: 144, max_width: 8192
   [hevc @ 00000256cbb737c0] min_height: 144, max_height: 8192
   [graph 0 input from stream 0:0 @ 00000256cbac7e00] w:1920 h:1080
   pixfmt:yuv420p tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [h264_nvenc @ 00000256cb693700] Loaded Nvenc version 9.0
   [h264_nvenc @ 00000256cb693700] Nvenc initialized successfully
   [h264_nvenc @ 00000256cb693700] 1 CUDA capable devices found
   [h264_nvenc @ 00000256cb693700] [ GPU #0 - < GeForce GTX 1650 > has
   Compute SM 7.5 ]
   [h264_nvenc @ 00000256cb693700] supports NVENC

   Let's use a format filter to change format:

   4. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format cuda -i
   input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4

   Fails with the following:

   [graph_1_in_0_1 @ 0000019390de5c80] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 00000193908675c0] NVDEC capabilities:
   [hevc @ 00000193908675c0] format supported: yes, max_mb_count: 262144
   [hevc @ 00000193908675c0] min_width: 144, max_width: 8192
   [hevc @ 00000193908675c0] min_height: 144, max_height: 8192
   [graph 0 input from stream 0:0 @ 00000193a031ee80] w:1920 h:1080
   pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [auto_scaler_0 @ 00000193b7aee780] w:iw h:ih flags:'bicubic' interl:0
   [Parsed_format_0 @ 00000193908eee80] auto-inserting filter
   'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the
   filter 'Parsed_format_0'
   Impossible to convert between the formats supported by the filter 'graph
   0 input from stream 0:0' and the filter 'auto_scaler_0'
   Error reinitializing filters!
   Failed to inject frame into filter network: Function not implemented
   Error while processing the decoded data for stream #0:0

   5. ffmpeg -loglevel verbose -hwaccel cuvid -hwaccel_output_format cuda
   -i input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4

   Succeeds, but only encodes at around 91 fps, which I assume is because the
   GPU decoder isn't being used. What is the difference between the cuvid and
   cuda hwaccels (why did the previous command fail and this one succeed)?
   Here is the relevant output:

   [graph_1_in_0_1 @ 000002152cc3cc00] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 000002152ac33700] Initializing cuvid hwaccel
   [AVHWFramesContext @ 000002152cc3f0c0] Pixel format 'yuv420p10le' is not
   supported
   [hevc @ 000002152ac33700] Error initializing a CUDA frame pool
   cuvid hwaccel requested for input stream #0:0, but cannot be initialized.
   [hevc @ 000002152ac33700] Error parsing NAL unit #2.
   [hevc @ 000002152ac79180] Could not find ref with POC 0
   Error while decoding stream #0:0: Operation not permitted
   [graph 0 input from stream 0:0 @ 000002152d638b80] w:1920 h:1080
   pixfmt:yuv420p10le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [auto_scaler_0 @ 000002152ca176c0] w:iw h:ih flags:'bicubic' interl:0
   [Parsed_format_0 @ 000002152d3fee40] auto-inserting filter
   'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the
   filter 'Parsed_format_0'
   [auto_scaler_0 @ 000002152ca176c0] w:1920 h:1080 fmt:yuv420p10le sar:1/1
   -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
   [h264_nvenc @ 000002152ac31800] Loaded Nvenc version 9.0
   [h264_nvenc @ 000002152ac31800] Nvenc initialized successfully
   [h264_nvenc @ 000002152ac31800] 1 CUDA capable devices found
   [h264_nvenc @ 000002152ac31800] [ GPU #0 - < GeForce GTX 1650 > has
   Compute SM 7.5 ]
   [h264_nvenc @ 000002152ac31800] supports NVENC

   Take out hwaccel_output_format:

   6. ffmpeg -loglevel verbose -hwaccel cuda -i in.mp4 -vf format=yuv420p
   -c:v h264_nvenc out.mp4

   Succeeds and encodes at 161 fps. This uses both the hardware decoder and
   the hardware encoder, but I believe the format conversion is happening on
   the CPU between the two stages.

   [graph_1_in_0_1 @ 0000025491bf2b00] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 0000025491b84900] NVDEC capabilities:
   [hevc @ 0000025491b84900] format supported: yes, max_mb_count: 262144
   [hevc @ 0000025491b84900] min_width: 144, max_width: 8192
   [hevc @ 0000025491b84900] min_height: 144, max_height: 8192
   [graph 0 input from stream 0:0 @ 0000025491c0eec0] w:1920 h:1080
   pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [auto_scaler_0 @ 00000254b747cfc0] w:iw h:ih flags:'bicubic' interl:0
   [Parsed_format_0 @ 000002549203d840] auto-inserting filter
   'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the
   filter 'Parsed_format_0'
   [auto_scaler_0 @ 00000254b747cfc0] w:1920 h:1080 fmt:p010le sar:1/1 ->
   w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
   [h264_nvenc @ 00000254920a0f40] Loaded Nvenc version 9.0
   [h264_nvenc @ 00000254920a0f40] Nvenc initialized successfully
   [h264_nvenc @ 00000254920a0f40] 1 CUDA capable devices found
   [h264_nvenc @ 00000254920a0f40] [ GPU #0 - < GeForce GTX 1650 > has
   Compute SM 7.5 ]
   [h264_nvenc @ 00000254920a0f40] supports NVENC


   7. ffmpeg -loglevel verbose -hwaccel cuvid -i in.mp4 -vf format=yuv420p
   -c:v h264_nvenc out.mp4

   Only encoding on GPU, not decoding (91 fps).

   [graph_1_in_0_1 @ 000002163875b5c0] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 00000216380c3c00] Initializing cuvid hwaccel
   [AVHWFramesContext @ 00000216387fc300] Pixel format 'yuv420p10le' is not
   supported
   [hevc @ 00000216380c3c00] Error initializing a CUDA frame pool
   cuvid hwaccel requested for input stream #0:0, but cannot be initialized.
   [hevc @ 00000216380c3c00] Error parsing NAL unit #2.
   [hevc @ 000002163813d300] Could not find ref with POC 0
   Error while decoding stream #0:0: Operation not permitted
   [graph 0 input from stream 0:0 @ 00000216387594c0] w:1920 h:1080
   pixfmt:yuv420p10le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [auto_scaler_0 @ 000002164f8a0c40] w:iw h:ih flags:'bicubic' interl:0
   [Parsed_format_0 @ 00000216387593c0] auto-inserting filter
   'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the
   filter 'Parsed_format_0'
   [auto_scaler_0 @ 000002164f8a0c40] w:1920 h:1080 fmt:yuv420p10le sar:1/1
   -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
   [h264_nvenc @ 0000021638590f40] Loaded Nvenc version 9.0
   [h264_nvenc @ 0000021638590f40] Nvenc initialized successfully
   [h264_nvenc @ 0000021638590f40] 1 CUDA capable devices found
   [h264_nvenc @ 0000021638590f40] [ GPU #0 - < GeForce GTX 1650 > has
   Compute SM 7.5 ]
   [h264_nvenc @ 0000021638590f40] supports NVENC

   Let's see if I can do the format conversion on the GPU (instead of GPU ->
   CPU -> GPU) by using the scale_npp filter.

   8. ffmpeg -loglevel verbose -hwaccel cuda -i input.mp4 -vf
   scale_npp=format=yuv420p -c:v h264_nvenc output.mp4

   Fails:

   [graph_1_in_0_1 @ 0000022f3001e080] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 0000022f207d7f40] NVDEC capabilities:
   [hevc @ 0000022f207d7f40] format supported: yes, max_mb_count: 262144
   [hevc @ 0000022f207d7f40] min_width: 144, max_width: 8192
   [hevc @ 0000022f207d7f40] min_height: 144, max_height: 8192
   [graph 0 input from stream 0:0 @ 0000022f3034ee80] w:1920 h:1080
   pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [auto_scaler_0 @ 0000022f47b2d300] w:iw h:ih flags:'bicubic' interl:0
   [Parsed_scale_npp_0 @ 0000022f20c49b40] auto-inserting filter
   'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the
   filter 'Parsed_scale_npp_0'
   Impossible to convert between the formats supported by the filter 'graph
   0 input from stream 0:0' and the filter 'auto_scaler_0'
   Error reinitializing filters!
   Failed to inject frame into filter network: Function not implemented
   Error while processing the decoded data for stream #0:0


   9. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format cuda -i
   in.mp4 -vf scale_npp=format=yuv420p -c:v h264_nvenc out.mp4

   Fails:

   [graph_1_in_0_1 @ 00000200040adac0] tb:1/48000 samplefmt:fltp
   samplerate:48000 chlayout:0x3
   [hevc @ 00000200747b65c0] NVDEC capabilities:
   [hevc @ 00000200747b65c0] format supported: yes, max_mb_count: 262144
   [hevc @ 00000200747b65c0] min_width: 144, max_width: 8192
   [hevc @ 00000200747b65c0] min_height: 144, max_height: 8192
   [graph 0 input from stream 0:0 @ 00000200040aa8c0] w:1920 h:1080
   pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
   [Parsed_scale_npp_0 @ 0000020074c75b80] Unsupported input format: p010le
   [Parsed_scale_npp_0 @ 0000020074c75b80] Failed to configure output pad
   on Parsed_scale_npp_0
   Error reinitializing filters!
   Failed to inject frame into filter network: Function not implemented
   Error while processing the decoded data for stream #0:0
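
For comparison, here are two hedged sketches of how the filter chain can be
made explicit. Neither was run for this post. The first keeps decoded frames
in GPU memory and then copies them to system memory with hwdownload before the
software format filter, which is the step test 4 is missing; the conversion
itself still runs on the CPU. The second assumes a newer ffmpeg build whose
scale_cuda filter has a format option (the build used above may not), and
converts on the GPU. Neither does HDR-to-SDR tone mapping, so colours from
this bt2020/smpte2084 source may still look off.

```shell
#!/bin/sh
# Sketches only, not run for this post; input.mp4/output.mp4 are placeholders.
decode="-hwaccel cuda -hwaccel_output_format cuda"

# 1) Explicit download: GPU frames -> system memory (p010le) -> CPU conversion.
#    The first format= names the downloaded layout, the second converts to 8-bit.
vf_cpu="hwdownload,format=p010le,format=yuv420p"

# 2) ASSUMPTION: a newer build where scale_cuda accepts format=; frames stay
#    on the GPU all the way to the encoder.
vf_gpu="scale_cuda=format=yuv420p"

for vf in "$vf_cpu" "$vf_gpu"; do
    echo "ffmpeg $decode -i input.mp4 -vf $vf -c:v h264_nvenc output.mp4"
done

# Check whether the local build lists a "format" option for scale_cuda.
if command -v ffmpeg >/dev/null 2>&1; then
    ffmpeg -hide_banner -h filter=scale_cuda | grep -i format || true
fi
```

If the help output shows no format option, only the first chain is likely to
apply to that build.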


I'd appreciate any help or pointer in the right direction (even an
alternate mailing list).
_______________________________________________
ffmpeg-user mailing list
[hidden email]
https://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[hidden email] with subject "unsubscribe".

Re: Trouble transcoding with cuda

Brainiarc7
On Wed, 4 Sep 2019 at 04:32, Ray Randomnic <[hidden email]> wrote:

> [...]
> I'd appreciate any help or pointer in the right direction (even an
> alternate mailing list).


Hey there,

Could you kindly provide a download link to a sample of the input
file you're working with?
That way we can reproduce what you're seeing here. Thanks!

Re: Trouble transcoding with cuda

Ray Randomnic
Hey,

Sure, any video taken by a Samsung device (such as a Note, Galaxy S9, or
S10) with the HDR10+ setting will do. A sample is posted here:
http://awakeman.redirectme.net/web/testvideo/sample.mp4

Thanks.

On Tue, Sep 3, 2019 at 10:07 PM Dennis Mungai <[hidden email]> wrote:

> On Wed, 4 Sep 2019 at 04:32, Ray Randomnic <[hidden email]> wrote:
> > [...]
> > Here are the details for corresponding encodes:
> >
> >    1. ffmpeg -loglevel verbose -hwaccel cuda -i input.mp4 -c:v h264_nvenc
> >    output.mp4
> >
> >    Fails with the following:
> >
> >    [graph_1_in_0_1 @ 000001cc9670e4c0] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 000001cc8740fc00] NVDEC capabilities:
> >    [hevc @ 000001cc8740fc00] format supported: yes, max_mb_count: 262144
> >    [hevc @ 000001cc8740fc00] min_width: 144, max_width: 8192
> >    [hevc @ 000001cc8740fc00] min_height: 144, max_height: 8192
> >    [graph 0 input from stream 0:0 @ 000001cc87420840] w:1920 h:1080
> >    pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [h264_nvenc @ 000001cc8747fbc0] Loaded Nvenc version 9.0
> >    [h264_nvenc @ 000001cc8747fbc0] Nvenc initialized successfully
> >    [h264_nvenc @ 000001cc8747fbc0] 1 CUDA capable devices found
> >    [h264_nvenc @ 000001cc8747fbc0] [ GPU #0 - < GeForce GTX 1650 > has
> >    Compute SM 7.5 ]
> >    [h264_nvenc @ 000001cc8747fbc0] 10 bit encode not supported
> >    [h264_nvenc @ 000001cc8747fbc0] No NVENC capable devices found
> >    [h264_nvenc @ 000001cc8747fbc0] Nvenc unloaded
> >    Error initializing output stream 0:0 -- Error while opening encoder
> for
> >    output stream #0:0 - maybe incorrect parameters such as bit_rate,
> rate,
> >    width or height
> >
> >    2. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format cuda
> -i
> >    input.mp4 -c:v h264_nvenc output.mp4
> >
> >    Fails with the following:
> >
> >    [graph_1_in_0_1 @ 00000240b7932340] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 00000240b79e37c0] NVDEC capabilities:
> >    [hevc @ 00000240b79e37c0] format supported: yes, max_mb_count: 262144
> >    [hevc @ 00000240b79e37c0] min_width: 144, max_width: 8192
> >    [hevc @ 00000240b79e37c0] min_height: 144, max_height: 8192
> >    [graph 0 input from stream 0:0 @ 00000240b7937e00] w:1920 h:1080
> >    pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [h264_nvenc @ 00000240b7483700] Loaded Nvenc version 9.0
> >    [h264_nvenc @ 00000240b7483700] Nvenc initialized successfully
> >    [h264_nvenc @ 00000240b7483700] 10 bit encode not supported
> >    [h264_nvenc @ 00000240b7483700] Provided device doesn't support
> required
> >    NVENC features
> >    [h264_nvenc @ 00000240b7483700] Nvenc unloaded
> >    Error initializing output stream 0:0 -- Error while opening encoder
> for
> >    output stream #0:0 - maybe incorrect parameters such as bit_rate,
> rate,
> >    width or height
> >
> >    Alright, so it seems that the hardware h264 encoder doesn't support 10
> >    bit encodes (that's coming from the decoder). So lets try changing the
> >    format:
> >
> >
> >    3. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format
> yuv420p
> >    -i input.mp4 -c:v h264_nvenc output.mp4
> >
> >    Pretty decent encode at ~ 360 fps. Alas, the video is screwed up.
> Colors
> >    are weird:
> >
> >    [graph_1_in_0_1 @ 00000256c9ac7b40] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 00000256cbb737c0] NVDEC capabilities:
> >    [hevc @ 00000256cbb737c0] format supported: yes, max_mb_count: 262144
> >    [hevc @ 00000256cbb737c0] min_width: 144, max_width: 8192
> >    [hevc @ 00000256cbb737c0] min_height: 144, max_height: 8192
> >    [graph 0 input from stream 0:0 @ 00000256cbac7e00] w:1920 h:1080
> >    pixfmt:yuv420p tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [h264_nvenc @ 00000256cb693700] Loaded Nvenc version 9.0
> >    [h264_nvenc @ 00000256cb693700] Nvenc initialized successfully
> >    [h264_nvenc @ 00000256cb693700] 1 CUDA capable devices found
> >    [h264_nvenc @ 00000256cb693700] [ GPU #0 - < GeForce GTX 1650 > has
> >    Compute SM 7.5 ]
> >    [h264_nvenc @ 00000256cb693700] supports NVENC
> >
> >    Let's use a format filter to change format:
> >
> >    4. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format cuda
> -i
> >    input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4
> >
> >    Fails with the following:
> >
> >    [graph_1_in_0_1 @ 0000019390de5c80] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 00000193908675c0] NVDEC capabilities:
> >    [hevc @ 00000193908675c0] format supported: yes, max_mb_count: 262144
> >    [hevc @ 00000193908675c0] min_width: 144, max_width: 8192
> >    [hevc @ 00000193908675c0] min_height: 144, max_height: 8192
> >    [graph 0 input from stream 0:0 @ 00000193a031ee80] w:1920 h:1080
> >    pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [auto_scaler_0 @ 00000193b7aee780] w:iw h:ih flags:'bicubic' interl:0
> >    [Parsed_format_0 @ 00000193908eee80] auto-inserting filter
> >    'auto_scaler_0' between the filter 'graph 0 input from stream 0:0'
> and the
> >    filter 'Parsed_format_0'
> >    Impossible to convert between the formats supported by the filter
> 'graph
> >    0 input from stream 0:0' and the filter 'auto_scaler_0'
> >    Error reinitializing filters!
> >    Failed to inject frame into filter network: Function not implemented
> >    Error while processing the decoded data for stream #0:0
> >
> >    5. ffmpeg -loglevel verbose -hwaccel cuvid -hwaccel_output_format cuda
> >    -i input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4
> >
> >    Succeeds, but only encodes at around 91 fps, due to, I assume, not
> using
> >    GPU decoder. What is the difference between cuvid and cuda hwaccel
> (why did
> >    the previous fail and this succeed)? Here is the relevant output:
> >
> >    [graph_1_in_0_1 @ 000002152cc3cc00] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 000002152ac33700] Initializing cuvid hwaccel
> >    [AVHWFramesContext @ 000002152cc3f0c0] Pixel format 'yuv420p10le' is
> not
> >    supported
> >    [hevc @ 000002152ac33700] Error initializing a CUDA frame pool
> >    cuvid hwaccel requested for input stream #0:0, but cannot be
> initialized.
> >    [hevc @ 000002152ac33700] Error parsing NAL unit #2.
> >    [hevc @ 000002152ac79180] Could not find ref with POC 0
> >    Error while decoding stream #0:0: Operation not permitted
> >    [graph 0 input from stream 0:0 @ 000002152d638b80] w:1920 h:1080
> >    pixfmt:yuv420p10le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [auto_scaler_0 @ 000002152ca176c0] w:iw h:ih flags:'bicubic' interl:0
> >    [Parsed_format_0 @ 000002152d3fee40] auto-inserting filter
> >    'auto_scaler_0' between the filter 'graph 0 input from stream 0:0'
> and the
> >    filter 'Parsed_format_0'
> >    [auto_scaler_0 @ 000002152ca176c0] w:1920 h:1080 fmt:yuv420p10le
> sar:1/1
> >    -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> >    [h264_nvenc @ 000002152ac31800] Loaded Nvenc version 9.0
> >    [h264_nvenc @ 000002152ac31800] Nvenc initialized successfully
> >    [h264_nvenc @ 000002152ac31800] 1 CUDA capable devices found
> >    [h264_nvenc @ 000002152ac31800] [ GPU #0 - < GeForce GTX 1650 > has
> >    Compute SM 7.5 ]
> >    [h264_nvenc @ 000002152ac31800] supports NVENC
> >
> >    Take out hwaccel_output:
> >
> >    6. ffmpeg -loglevel verbose -hwaccel cuda -i in.mp4 -vf format=yuv420p
> >    -c:v h264_nvenc out.mp4
> >
> >    Succeeds, encodes at 161 fps (using both hardware GPU decoder and
> >    encoder, but I believe the changing of format is happening on the CPU
> >    between the two stages).
> >
> >    [graph_1_in_0_1 @ 0000025491bf2b00] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 0000025491b84900] NVDEC capabilities:
> >    [hevc @ 0000025491b84900] format supported: yes, max_mb_count: 262144
> >    [hevc @ 0000025491b84900] min_width: 144, max_width: 8192
> >    [hevc @ 0000025491b84900] min_height: 144, max_height: 8192
> >    [graph 0 input from stream 0:0 @ 0000025491c0eec0] w:1920 h:1080
> >    pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [auto_scaler_0 @ 00000254b747cfc0] w:iw h:ih flags:'bicubic' interl:0
> >    [Parsed_format_0 @ 000002549203d840] auto-inserting filter
> >    'auto_scaler_0' between the filter 'graph 0 input from stream 0:0'
> and the
> >    filter 'Parsed_format_0'
> >    [auto_scaler_0 @ 00000254b747cfc0] w:1920 h:1080 fmt:p010le sar:1/1 ->
> >    w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> >    [h264_nvenc @ 00000254920a0f40] Loaded Nvenc version 9.0
> >    [h264_nvenc @ 00000254920a0f40] Nvenc initialized successfully
> >    [h264_nvenc @ 00000254920a0f40] 1 CUDA capable devices found
> >    [h264_nvenc @ 00000254920a0f40] [ GPU #0 - < GeForce GTX 1650 > has
> >    Compute SM 7.5 ]
> >    [h264_nvenc @ 00000254920a0f40] supports NVENC
> >
> >
> >    7. ffmpeg -loglevel verbose -hwaccel cuvid -i in.mp4 -vf
> format=yuv420p
> >    -c:v h264_nvenc out.mp4
> >
> >    Only encoding on GPU, not decoding (91 fps).
> >
> >    [graph_1_in_0_1 @ 000002163875b5c0] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 00000216380c3c00] Initializing cuvid hwaccel
> >    [AVHWFramesContext @ 00000216387fc300] Pixel format 'yuv420p10le' is
> not
> >    supported
> >    [hevc @ 00000216380c3c00] Error initializing a CUDA frame pool
> >    cuvid hwaccel requested for input stream #0:0, but cannot be
> initialized.
> >    [hevc @ 00000216380c3c00] Error parsing NAL unit #2.
> >    [hevc @ 000002163813d300] Could not find ref with POC 0
> >    Error while decoding stream #0:0: Operation not permitted
> >    [graph 0 input from stream 0:0 @ 00000216387594c0] w:1920 h:1080
> >    pixfmt:yuv420p10le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [auto_scaler_0 @ 000002164f8a0c40] w:iw h:ih flags:'bicubic' interl:0
> >    [Parsed_format_0 @ 00000216387593c0] auto-inserting filter
> >    'auto_scaler_0' between the filter 'graph 0 input from stream 0:0'
> and the
> >    filter 'Parsed_format_0'
> >    [auto_scaler_0 @ 000002164f8a0c40] w:1920 h:1080 fmt:yuv420p10le
> sar:1/1
> >    -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> >    [h264_nvenc @ 0000021638590f40] Loaded Nvenc version 9.0
> >    [h264_nvenc @ 0000021638590f40] Nvenc initialized successfully
> >    [h264_nvenc @ 0000021638590f40] 1 CUDA capable devices found
> >    [h264_nvenc @ 0000021638590f40] [ GPU #0 - < GeForce GTX 1650 > has
> >    Compute SM 7.5 ]
> >    [h264_nvenc @ 0000021638590f40] supports NVENC
> >
> >    Lets see if I can do format conversion in the GPU (instead of GPU ->
> CPU
> >    -> GPU), by using the scale_npp filter.
> >
> >    8. ffmpeg -loglevel verbose -hwaccel cuda -i input.mp4 -vf
> >    scale_npp=format=yuv420p -c:v h264_nvenc output.mp4
> >
> >    Fails
> >
> >    [graph_1_in_0_1 @ 0000022f3001e080] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 0000022f207d7f40] NVDEC capabilities:
> >    [hevc @ 0000022f207d7f40] format supported: yes, max_mb_count: 262144
> >    [hevc @ 0000022f207d7f40] min_width: 144, max_width: 8192
> >    [hevc @ 0000022f207d7f40] min_height: 144, max_height: 8192
> >    [graph 0 input from stream 0:0 @ 0000022f3034ee80] w:1920 h:1080
> >    pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [auto_scaler_0 @ 0000022f47b2d300] w:iw h:ih flags:'bicubic' interl:0
> >    [Parsed_scale_npp_0 @ 0000022f20c49b40] auto-inserting filter
> >    'auto_scaler_0' between the filter 'graph 0 input from stream 0:0'
> and the
> >    filter 'Parsed_scale_npp_0'
> >    Impossible to convert between the formats supported by the filter
> 'graph
> >    0 input from stream 0:0' and the filter 'auto_scaler_0'
> >    Error reinitializing filters!
> >    Failed to inject frame into filter network: Function not implemented
> >    Error while processing the decoded data for stream #0:0
> >
> >
> >    9. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format cuda
> -i
> >    in.mp4 -vf scale_npp=format=yuv420p -c:v h264_nvenc out.mp4
> >
> >    Fails:
> >
> >    [graph_1_in_0_1 @ 00000200040adac0] tb:1/48000 samplefmt:fltp
> >    samplerate:48000 chlayout:0x3
> >    [hevc @ 00000200747b65c0] NVDEC capabilities:
> >    [hevc @ 00000200747b65c0] format supported: yes, max_mb_count: 262144
> >    [hevc @ 00000200747b65c0] min_width: 144, max_width: 8192
> >    [hevc @ 00000200747b65c0] min_height: 144, max_height: 8192
> >    [graph 0 input from stream 0:0 @ 00000200040aa8c0] w:1920 h:1080
> >    pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >    [Parsed_scale_npp_0 @ 0000020074c75b80] Unsupported input format:
> p010le
> >    [Parsed_scale_npp_0 @ 0000020074c75b80] Failed to configure output pad
> >    on Parsed_scale_npp_0
> >    Error reinitializing filters!
> >    Failed to inject frame into filter network: Function not implemented
> >    Error while processing the decoded data for stream #0:0
> >
> >
> > I'd appreciate any help or pointer in the right direction (even an
> > alternate mailing list).
>
>
> Hey there,
>
> Could you kindly provide a download link to the sample of the input
> file you're working on?
> That way we can reproduce what you're seeing here, thanks!

Re: Trouble transcoding with cuda

Brainiarc7
On Wed, 4 Sep 2019 at 07:38, Ray Randomnic <[hidden email]> wrote:

>
> Hey,
>
> Sure, any video taken by a Samsung device (such as Note or Galaxy S9 or
> S10) with the HDR10+ setting will do. A sample is posted here:
> http://awakeman.redirectme.net/web/testvideo/sample.mp4
>
> Thanks.

Another thing:

What version of FFmpeg are you running?
Re: Trouble transcoding with cuda

Brainiarc7
In reply to this post by Ray Randomnic
On Wed, 4 Sep 2019 at 07:38, Ray Randomnic <[hidden email]> wrote:

>
> Hey,
>
> Sure, any video taken by a Samsung device (such as Note or Galaxy S9 or
> S10) with the HDR10+ setting will do. A sample is posted here:
> http://awakeman.redirectme.net/web/testvideo/sample.mp4
>
> Thanks.

The link you provided for the sample is dead.
I'll try to reproduce this on my end with clips recorded from a Samsung S8.
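
For anyone else reproducing this, the nine combinations from the first email could be generated with a small script along these lines. This is only a sketch: the option values and file names are copied from the commands in the original post, while the helper name `build_command` and the `COMBINATIONS` list are made up here for illustration. It only prints the commands; it does not run ffmpeg.

```python
import shlex

# (test number, -hwaccel value, -hwaccel_output_format value, -vf value)
# Values are taken from the summary table in the first email;
# None means the option was left blank in that test.
COMBINATIONS = [
    (1, "cuda", None, None),
    (2, "cuda", "cuda", None),
    (3, "cuda", "yuv420p", None),
    (4, "cuda", "cuda", "format=yuv420p"),
    (5, "cuvid", "cuda", "format=yuv420p"),
    (6, "cuda", None, "format=yuv420p"),
    (7, "cuvid", None, "format=yuv420p"),
    (8, "cuda", None, "scale_npp=format=yuv420p"),
    (9, "cuda", "cuda", "scale_npp=format=yuv420p"),
]


def build_command(hwaccel, output_format=None, vf=None,
                  src="input.mp4", dst="output.mp4"):
    """Build the ffmpeg argument list for one test (hypothetical helper)."""
    cmd = ["ffmpeg", "-loglevel", "verbose", "-hwaccel", hwaccel]
    if output_format:
        # Input option, so it has to come before -i.
        cmd += ["-hwaccel_output_format", output_format]
    cmd += ["-i", src]
    if vf:
        cmd += ["-vf", vf]
    cmd += ["-c:v", "h264_nvenc", dst]
    return cmd


if __name__ == "__main__":
    for number, hwaccel, output_format, vf in COMBINATIONS:
        print(f"{number}. {shlex.join(build_command(hwaccel, output_format, vf))}")
```

Each printed line can be pasted into a shell, or the list returned by `build_command` can be handed to `subprocess.run` to time the encodes one after another.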
Re: Trouble transcoding with cuda

Ray Randomnic
In reply to this post by Brainiarc7
The link for the sample is still alive; I am able to download it. Try
right-clicking it and choosing save, or fetch it with wget or curl:
http://awakeman.redirectme.net/web/testvideo/sample.mp4

As mentioned in the first email, the ffmpeg version is
N-94578-gd6bd902599-gcff309097a+3

It's compiled from the latest source as of a week ago.
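
Since the version keeps getting asked about, a short script could pull the exact string out of the `ffmpeg -version` banner so it can be pasted into a report. This is a rough sketch, assuming the first line of the banner begins with "ffmpeg version" followed by the version token; the function names `parse_version` and `installed_version` are made up for illustration.

```python
import re
import subprocess


def parse_version(banner):
    """Return the version token from `ffmpeg -version` output, or None."""
    match = re.match(r"ffmpeg version (\S+)", banner)
    return match.group(1) if match else None


def installed_version(binary="ffmpeg"):
    """Run `<binary> -version` and return its version token (needs ffmpeg on PATH)."""
    result = subprocess.run([binary, "-version"],
                            capture_output=True, text=True, check=True)
    return parse_version(result.stdout)


if __name__ == "__main__":
    # Illustrative banner line built around the version quoted in this thread.
    sample = "ffmpeg version N-94578-gd6bd902599-gcff309097a+3 Copyright (c) ..."
    print(parse_version(sample))
```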



> > > >    [h264_nvenc @ 0000021638590f40] supports NVENC
> > > >
> > > >    Lets see if I can do format conversion in the GPU (instead of GPU
> ->
> > > CPU
> > > >    -> GPU), by using the scale_npp filter.
> > > >
> > > >    8. ffmpeg -loglevel verbose -hwaccel cuda -i input.mp4 -vf
> > > >    scale_npp=format=yuv420p -c:v h264_nvenc output.mp4
> > > >
> > > >    Fails
> > > >
> > > >    [graph_1_in_0_1 @ 0000022f3001e080] tb:1/48000 samplefmt:fltp
> > > >    samplerate:48000 chlayout:0x3
> > > >    [hevc @ 0000022f207d7f40] NVDEC capabilities:
> > > >    [hevc @ 0000022f207d7f40] format supported: yes, max_mb_count:
> 262144
> > > >    [hevc @ 0000022f207d7f40] min_width: 144, max_width: 8192
> > > >    [hevc @ 0000022f207d7f40] min_height: 144, max_height: 8192
> > > >    [graph 0 input from stream 0:0 @ 0000022f3034ee80] w:1920 h:1080
> > > >    pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > > >    [auto_scaler_0 @ 0000022f47b2d300] w:iw h:ih flags:'bicubic'
> interl:0
> > > >    [Parsed_scale_npp_0 @ 0000022f20c49b40] auto-inserting filter
> > > >    'auto_scaler_0' between the filter 'graph 0 input from stream 0:0'
> > > and the
> > > >    filter 'Parsed_scale_npp_0'
> > > >    Impossible to convert between the formats supported by the filter
> > > 'graph
> > > >    0 input from stream 0:0' and the filter 'auto_scaler_0'
> > > >    Error reinitializing filters!
> > > >    Failed to inject frame into filter network: Function not
> implemented
> > > >    Error while processing the decoded data for stream #0:0
> > > >
> > > >
> > > >    9. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format
> cuda
> > > -i
> > > >    in.mp4 -vf scale_npp=format=yuv420p -c:v h264_nvenc out.mp4
> > > >
> > > >    Fails:
> > > >
> > > >    [graph_1_in_0_1 @ 00000200040adac0] tb:1/48000 samplefmt:fltp
> > > >    samplerate:48000 chlayout:0x3
> > > >    [hevc @ 00000200747b65c0] NVDEC capabilities:
> > > >    [hevc @ 00000200747b65c0] format supported: yes, max_mb_count:
> 262144
> > > >    [hevc @ 00000200747b65c0] min_width: 144, max_width: 8192
> > > >    [hevc @ 00000200747b65c0] min_height: 144, max_height: 8192
> > > >    [graph 0 input from stream 0:0 @ 00000200040aa8c0] w:1920 h:1080
> > > >    pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > > >    [Parsed_scale_npp_0 @ 0000020074c75b80] Unsupported input format:
> > > p010le
> > > >    [Parsed_scale_npp_0 @ 0000020074c75b80] Failed to configure
> output pad
> > > >    on Parsed_scale_npp_0
> > > >    Error reinitializing filters!
> > > >    Failed to inject frame into filter network: Function not
> implemented
> > > >    Error while processing the decoded data for stream #0:0
> > > >
> > > >
> > > > I'd appreciate any help or pointer in the right direction (even an
> > > > alternate mailing list).
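
A note on tests 4, 8 and 9 above, offered as a sketch rather than a confirmed fix. With -hwaccel_output_format cuda the decoded frames stay in GPU memory, and the software format filter (and the auto-inserted scaler) cannot read them directly, which would explain the "Impossible to convert between the formats" error. One commonly used workaround is to put hwdownload in front of the format filter. The conversion to 8-bit then still happens on the CPU, so this is not expected to beat test 6. The script below only builds and prints the command; the helper name build_cmd, the RUN variable and the file names are made up for illustration, and the command has not been run against the sample from this thread.

```shell
#!/bin/sh
# Sketch only: GPU decode -> download to system memory -> CPU 10-bit to 8-bit
# conversion -> NVENC encode. File names are placeholders (no spaces).
build_cmd() {
    # $1 = input file, $2 = output file
    # hwdownload     : copy the CUDA frames back to system memory
    # format=p010le  : name the software layout of the downloaded frames
    # format=yuv420p : 8-bit 4:2:0 for h264_nvenc (converted on the CPU)
    printf '%s' "ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i $1 -vf hwdownload,format=p010le,format=yuv420p -c:v h264_nvenc $2"
}

CMD=$(build_cmd input.mp4 output.mp4)
echo "$CMD"

# Dry run by default. Set RUN=1 to execute, and only if ffmpeg is on the PATH.
if [ "${RUN:-0}" = "1" ] && command -v ffmpeg >/dev/null 2>&1; then
    $CMD
fi
```

Newer FFmpeg releases document a format option on the scale_cuda filter, which might keep the conversion on the GPU. Whether that works for p010le input, and whether it exists in the build used in this thread, is an assumption that would need checking.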
> > >
> > >
> > > Hey there,
> > >
> > > Could you kindly provide a download link to the sample of the input
> > > file you're working on?
> > > That way we can reproduce what you're seeing here, thanks!
>
> Another thing:
>
> What version of FFmpeg are you running?
_______________________________________________
ffmpeg-user mailing list
[hidden email]
https://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[hidden email] with subject "unsubscribe".

Re: Trouble transcoding with cuda

Ray Randomnic
ffmpeg version N-94578-gd6bd902599-gcff309097a+3 Copyright (c) 2000-2019
the FFmpeg developers
  built with gcc 9.2.0 (Rev1, Built by MSYS2 project)
  configuration:  --disable-autodetect --enable-amf --enable-bzlib
--enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-iconv
--enable-lzma --enable-nvenc --enable-zlib --enable-sdl2 --enable-ffnvcodec
--enable-nvdec --enable-cuda-llvm --enable-libmp3lame --enable-libopus
--enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265
--enable-libdav1d --disable-debug --enable-fontconfig --enable-libass
--enable-libbluray --enable-libfreetype --enable-libmfx --enable-libmysofa
--enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
--enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora
--enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc
--enable-libwavpack --enable-libwebp --enable-libxml2 --enable-libzimg
--enable-libshine --enable-gpl --enable-avisynth --enable-libxvid
--enable-libaom --enable-libopenmpt --enable-version3 --enable-chromaprint
--enable-decklink --enable-frei0r --enable-libbs2b --enable-libcaca
--enable-libcdio --enable-libfdk-aac --enable-libflite --enable-libfribidi
--enable-libgme --enable-libgsm --enable-libilbc --enable-libsvthevc
--enable-libkvazaar --enable-libmodplug --enable-librtmp
--enable-librubberband --enable-libssh --enable-libtesseract
--enable-libxavs --enable-libzmq --enable-libzvbi --enable-openal
--enable-libvmaf --enable-libcodec2 --enable-libsrt --enable-ladspa
--enable-opencl --enable-opengl --enable-libnpp --enable-libopenh264
--enable-openssl --extra-cflags=-fopenmp --extra-libs=-lgomp
--extra-cflags=-DLIBTWOLAME_STATIC --extra-libs=-lstdc++
--extra-cflags=-DLIBSSH_STATIC
--extra-ldflags='-Wl,--allow-multiple-definition'
--extra-cflags=-DCACA_STATIC --extra-cflags=-DMODPLUG_STATIC
--extra-cflags=-DCHROMAPRINT_NODLL --extra-libs=-lstdc++
--extra-cflags=-DZMQ_STATIC --extra-libs=-lpsapi
--extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv --disable-w32threads
--extra-cflags=-DKVZ_STATIC_LIB --enable-nonfree
--extra-cflags='-IC:/PROGRA~1/NVIDIA~2/CUDA/v10.1/include'
--extra-ldflags='-LC:/PROGRA~1/NVIDIA~2/CUDA/v10.1/lib/x64'
  libavutil      56. 33.100 / 56. 33.100
  libavcodec     58. 55.100 / 58. 55.100
  libavformat    58. 31.101 / 58. 31.101
  libavdevice    58.  9.100 / 58.  9.100
  libavfilter     7. 58.100 /  7. 58.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Hyper fast Audio and Video encoder
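
For anyone comparing builds, a rough way to confirm that the pieces relevant to this thread are compiled in is to look for the matching configure flags in the configuration line of the banner above. The sketch below only parses a string. The helper name has_flag and the shortened CONFIG value are made up for illustration, and the reading of each flag (for example, that --enable-libnpp is what provides scale_npp) is my understanding rather than something verified here.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a configure flag appears in a config string.
has_flag() {
    # $1 = space-separated configuration string
    # $2 = flag name without the leading "--"
    # Padding with spaces keeps "enable-cuda" from matching "enable-cuda-llvm".
    case " $1 " in
        *" --$2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Shortened copy of the configuration line from the banner above
# (assumption: only the CUDA/NVENC related flags are kept).
CONFIG="--enable-cuda --enable-cuvid --enable-nvenc --enable-ffnvcodec --enable-nvdec --enable-cuda-llvm --enable-libnpp --enable-nonfree"

for flag in enable-nvdec enable-nvenc enable-libnpp enable-cuda-llvm; do
    if has_flag "$CONFIG" "$flag"; then
        echo "$flag: present"
    else
        echo "$flag: missing"
    fi
done
```

On a live system, ffmpeg -hide_banner -filters and ffmpeg -hide_banner -hwaccels list what the binary actually exposes, which is a more direct check than the configure flags.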

On Wed, Sep 4, 2019 at 2:05 PM Ray Randomnic <[hidden email]> wrote:

> Link for the sample is still alive. I am able to download it. Right-click
> and click save (or use wget or curl?)?
> http://awakeman.redirectme.net/web/testvideo/sample.mp4
>
> As mentioned in the first email, the ffmpeg version is
> N-94578-gd6bd902599-gcff309097a+3
>
> It's compiled from the latest source as of a week ago.

Re: Trouble transcoding with cuda

Ray Randomnic
Hey folks, any luck with this??



On Wed, Sep 4, 2019 at 2:07 PM Ray Randomnic <[hidden email]> wrote:

> ffmpeg version N-94578-gd6bd902599-gcff309097a+3 Copyright (c) 2000-2019
> the FFmpeg developers
>   built with gcc 9.2.0 (Rev1, Built by MSYS2 project)
>   configuration:  --disable-autodetect --enable-amf --enable-bzlib
> --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-iconv
> --enable-lzma --enable-nvenc --enable-zlib --enable-sdl2 --enable-ffnvcodec
> --enable-nvdec --enable-cuda-llvm --enable-libmp3lame --enable-libopus
> --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265
> --enable-libdav1d --disable-debug --enable-fontconfig --enable-libass
> --enable-libbluray --enable-libfreetype --enable-libmfx --enable-libmysofa
> --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
> --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora
> --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc
> --enable-libwavpack --enable-libwebp --enable-libxml2 --enable-libzimg
> --enable-libshine --enable-gpl --enable-avisynth --enable-libxvid
> --enable-libaom --enable-libopenmpt --enable-version3 --enable-chromaprint
> --enable-decklink --enable-frei0r --enable-libbs2b --enable-libcaca
> --enable-libcdio --enable-libfdk-aac --enable-libflite --enable-libfribidi
> --enable-libgme --enable-libgsm --enable-libilbc --enable-libsvthevc
> --enable-libkvazaar --enable-libmodplug --enable-librtmp
> --enable-librubberband --enable-libssh --enable-libtesseract
> --enable-libxavs --enable-libzmq --enable-libzvbi --enable-openal
> --enable-libvmaf --enable-libcodec2 --enable-libsrt --enable-ladspa
> --enable-opencl --enable-opengl --enable-libnpp --enable-libopenh264
> --enable-openssl --extra-cflags=-fopenmp --extra-libs=-lgomp
> --extra-cflags=-DLIBTWOLAME_STATIC --extra-libs=-lstdc++
> --extra-cflags=-DLIBSSH_STATIC
> --extra-ldflags='-Wl,--allow-multiple-definition'
> --extra-cflags=-DCACA_STATIC --extra-cflags=-DMODPLUG_STATIC
> --extra-cflags=-DCHROMAPRINT_NODLL --extra-libs=-lstdc++
> --extra-cflags=-DZMQ_STATIC --extra-libs=-lpsapi
> --extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv --disable-w32threads
> --extra-cflags=-DKVZ_STATIC_LIB --enable-nonfree
> --extra-cflags='-IC:/PROGRA~1/NVIDIA~2/CUDA/v10.1/include'
> --extra-ldflags='-LC:/PROGRA~1/NVIDIA~2/CUDA/v10.1/lib/x64'
>   libavutil      56. 33.100 / 56. 33.100
>   libavcodec     58. 55.100 / 58. 55.100
>   libavformat    58. 31.101 / 58. 31.101
>   libavdevice    58.  9.100 / 58.  9.100
>   libavfilter     7. 58.100 /  7. 58.100
>   libswscale      5.  6.100 /  5.  6.100
>   libswresample   3.  6.100 /  3.  6.100
>   libpostproc    55.  6.100 / 55.  6.100
> Hyper fast Audio and Video encoder

Re: Trouble transcoding with cuda

Michael Shaffer
I'm using NVENC on a GeForce 1060 to run a "-vf curves" filter (for
removing fog/haze from a webcam view). A while ago someone on here told me
that the CPU actually runs the filter and the GPU only does the H264
encoding. Just an idea, but maybe the CPU is also handling the filter you're
trying to run, and that's why it's so slow.
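
One way to check this from a saved verbose log is sketched below. It is only
a rough helper, not an ffmpeg feature: summarize_log and log.txt are made-up
names. It relies on two things visible in the logs posted earlier in this
thread: the pixfmt: value on the graph-input line (cuda means the frames are
still in GPU memory), and the auto_scaler_0 filter, which is the libswscale
scaler that ffmpeg inserts on its own and which runs on the CPU.

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg): summarise a saved
# "-loglevel verbose" log, e.g. one captured with  ffmpeg ... 2> log.txt
summarize_log() {
    log_file="$1"

    # Pixel format the decoder hands to the filter graph.
    # "cuda" means the frames are still in GPU memory.
    fmt=$(sed -n 's/.*pixfmt:\([a-z0-9_]*\).*/\1/p' "$log_file" | head -n 1)
    echo "decoder output: ${fmt:-unknown}"

    # auto_scaler_0 is the libswscale filter that ffmpeg inserts by itself;
    # libswscale does its work on the CPU.
    if grep -q 'auto_scaler_0' "$log_file"; then
        echo "conversion: software scaler (CPU) auto-inserted"
    else
        echo "conversion: no software scaler auto-inserted"
    fi
}

# Run only when a log file is given on the command line.
if [ -n "$1" ]; then
    summarize_log "$1"
fi
```

Usage would be something like saving the script as summarize.sh (again a
made-up name), capturing stderr from the ffmpeg run into log.txt, and then
running "sh summarize.sh log.txt". For the log of test 6 this should report
p010le together with an auto-inserted software scaler, which would be
consistent with the format change happening on the CPU between decode and
encode.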

On Fri, Sep 6, 2019 at 1:10 PM Ray Randomnic <[hidden email]> wrote:

> Hey folks, any luck with this??
> >>> interl:0
> >>> > > >    [Parsed_format_0 @ 000002152d3fee40] auto-inserting filter
> >>> > > >    'auto_scaler_0' between the filter 'graph 0 input from stream
> >>> 0:0'
> >>> > > and the
> >>> > > >    filter 'Parsed_format_0'
> >>> > > >    [auto_scaler_0 @ 000002152ca176c0] w:1920 h:1080
> fmt:yuv420p10le
> >>> > > sar:1/1
> >>> > > >    -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> >>> > > >    [h264_nvenc @ 000002152ac31800] Loaded Nvenc version 9.0
> >>> > > >    [h264_nvenc @ 000002152ac31800] Nvenc initialized successfully
> >>> > > >    [h264_nvenc @ 000002152ac31800] 1 CUDA capable devices found
> >>> > > >    [h264_nvenc @ 000002152ac31800] [ GPU #0 - < GeForce GTX 1650
> >
> >>> has
> >>> > > >    Compute SM 7.5 ]
> >>> > > >    [h264_nvenc @ 000002152ac31800] supports NVENC
> >>> > > >
> >>> > > >    Take out hwaccel_output:
> >>> > > >
> >>> > > >    6. ffmpeg -loglevel verbose -hwaccel cuda -i in.mp4 -vf
> >>> format=yuv420p
> >>> > > >    -c:v h264_nvenc out.mp4
> >>> > > >
> >>> > > >    Succeeds, encodes at 161 fps (using both hardware GPU decoder
> >>> and
> >>> > > >    encoder, but I believe the changing of format is happening on
> >>> the CPU
> >>> > > >    between the two stages).
> >>> > > >
> >>> > > >    [graph_1_in_0_1 @ 0000025491bf2b00] tb:1/48000 samplefmt:fltp
> >>> > > >    samplerate:48000 chlayout:0x3
> >>> > > >    [hevc @ 0000025491b84900] NVDEC capabilities:
> >>> > > >    [hevc @ 0000025491b84900] format supported: yes, max_mb_count:
> >>> 262144
> >>> > > >    [hevc @ 0000025491b84900] min_width: 144, max_width: 8192
> >>> > > >    [hevc @ 0000025491b84900] min_height: 144, max_height: 8192
> >>> > > >    [graph 0 input from stream 0:0 @ 0000025491c0eec0] w:1920
> h:1080
> >>> > > >    pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >>> > > >    [auto_scaler_0 @ 00000254b747cfc0] w:iw h:ih flags:'bicubic'
> >>> interl:0
> >>> > > >    [Parsed_format_0 @ 000002549203d840] auto-inserting filter
> >>> > > >    'auto_scaler_0' between the filter 'graph 0 input from stream
> >>> 0:0'
> >>> > > and the
> >>> > > >    filter 'Parsed_format_0'
> >>> > > >    [auto_scaler_0 @ 00000254b747cfc0] w:1920 h:1080 fmt:p010le
> >>> sar:1/1 ->
> >>> > > >    w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> >>> > > >    [h264_nvenc @ 00000254920a0f40] Loaded Nvenc version 9.0
> >>> > > >    [h264_nvenc @ 00000254920a0f40] Nvenc initialized successfully
> >>> > > >    [h264_nvenc @ 00000254920a0f40] 1 CUDA capable devices found
> >>> > > >    [h264_nvenc @ 00000254920a0f40] [ GPU #0 - < GeForce GTX 1650
> >
> >>> has
> >>> > > >    Compute SM 7.5 ]
> >>> > > >    [h264_nvenc @ 00000254920a0f40] supports NVENC
> >>> > > >
> >>> > > >
> >>> > > >    7. ffmpeg -loglevel verbose -hwaccel cuvid -i in.mp4 -vf
> >>> > > format=yuv420p
> >>> > > >    -c:v h264_nvenc out.mp4
> >>> > > >
> >>> > > >    Only encoding on GPU, not decoding (91 fps).
> >>> > > >
> >>> > > >    [graph_1_in_0_1 @ 000002163875b5c0] tb:1/48000 samplefmt:fltp
> >>> > > >    samplerate:48000 chlayout:0x3
> >>> > > >    [hevc @ 00000216380c3c00] Initializing cuvid hwaccel
> >>> > > >    [AVHWFramesContext @ 00000216387fc300] Pixel format
> >>> 'yuv420p10le' is
> >>> > > not
> >>> > > >    supported
> >>> > > >    [hevc @ 00000216380c3c00] Error initializing a CUDA frame pool
> >>> > > >    cuvid hwaccel requested for input stream #0:0, but cannot be
> >>> > > initialized.
> >>> > > >    [hevc @ 00000216380c3c00] Error parsing NAL unit #2.
> >>> > > >    [hevc @ 000002163813d300] Could not find ref with POC 0
> >>> > > >    Error while decoding stream #0:0: Operation not permitted
> >>> > > >    [graph 0 input from stream 0:0 @ 00000216387594c0] w:1920
> h:1080
> >>> > > >    pixfmt:yuv420p10le tb:1/90000 fr:30/1 sar:1/1
> sws_param:flags=2
> >>> > > >    [auto_scaler_0 @ 000002164f8a0c40] w:iw h:ih flags:'bicubic'
> >>> interl:0
> >>> > > >    [Parsed_format_0 @ 00000216387593c0] auto-inserting filter
> >>> > > >    'auto_scaler_0' between the filter 'graph 0 input from stream
> >>> 0:0'
> >>> > > and the
> >>> > > >    filter 'Parsed_format_0'
> >>> > > >    [auto_scaler_0 @ 000002164f8a0c40] w:1920 h:1080
> fmt:yuv420p10le
> >>> > > sar:1/1
> >>> > > >    -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> >>> > > >    [h264_nvenc @ 0000021638590f40] Loaded Nvenc version 9.0
> >>> > > >    [h264_nvenc @ 0000021638590f40] Nvenc initialized successfully
> >>> > > >    [h264_nvenc @ 0000021638590f40] 1 CUDA capable devices found
> >>> > > >    [h264_nvenc @ 0000021638590f40] [ GPU #0 - < GeForce GTX 1650
> >
> >>> has
> >>> > > >    Compute SM 7.5 ]
> >>> > > >    [h264_nvenc @ 0000021638590f40] supports NVENC
> >>> > > >
> >>> > > >    Lets see if I can do format conversion in the GPU (instead of
> >>> GPU ->
> >>> > > CPU
> >>> > > >    -> GPU), by using the scale_npp filter.
> >>> > > >
> >>> > > >    8. ffmpeg -loglevel verbose -hwaccel cuda -i input.mp4 -vf
> >>> > > >    scale_npp=format=yuv420p -c:v h264_nvenc output.mp4
> >>> > > >
> >>> > > >    Fails
> >>> > > >
> >>> > > >    [graph_1_in_0_1 @ 0000022f3001e080] tb:1/48000 samplefmt:fltp
> >>> > > >    samplerate:48000 chlayout:0x3
> >>> > > >    [hevc @ 0000022f207d7f40] NVDEC capabilities:
> >>> > > >    [hevc @ 0000022f207d7f40] format supported: yes, max_mb_count:
> >>> 262144
> >>> > > >    [hevc @ 0000022f207d7f40] min_width: 144, max_width: 8192
> >>> > > >    [hevc @ 0000022f207d7f40] min_height: 144, max_height: 8192
> >>> > > >    [graph 0 input from stream 0:0 @ 0000022f3034ee80] w:1920
> h:1080
> >>> > > >    pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >>> > > >    [auto_scaler_0 @ 0000022f47b2d300] w:iw h:ih flags:'bicubic'
> >>> interl:0
> >>> > > >    [Parsed_scale_npp_0 @ 0000022f20c49b40] auto-inserting filter
> >>> > > >    'auto_scaler_0' between the filter 'graph 0 input from stream
> >>> 0:0'
> >>> > > and the
> >>> > > >    filter 'Parsed_scale_npp_0'
> >>> > > >    Impossible to convert between the formats supported by the
> >>> filter
> >>> > > 'graph
> >>> > > >    0 input from stream 0:0' and the filter 'auto_scaler_0'
> >>> > > >    Error reinitializing filters!
> >>> > > >    Failed to inject frame into filter network: Function not
> >>> implemented
> >>> > > >    Error while processing the decoded data for stream #0:0
> >>> > > >
> >>> > > >
> >>> > > >    9. ffmpeg -loglevel verbose -hwaccel cuda
> >>> -hwaccel_output_format cuda
> >>> > > -i
> >>> > > >    in.mp4 -vf scale_npp=format=yuv420p -c:v h264_nvenc out.mp4
> >>> > > >
> >>> > > >    Fails:
> >>> > > >
> >>> > > >    [graph_1_in_0_1 @ 00000200040adac0] tb:1/48000 samplefmt:fltp
> >>> > > >    samplerate:48000 chlayout:0x3
> >>> > > >    [hevc @ 00000200747b65c0] NVDEC capabilities:
> >>> > > >    [hevc @ 00000200747b65c0] format supported: yes, max_mb_count:
> >>> 262144
> >>> > > >    [hevc @ 00000200747b65c0] min_width: 144, max_width: 8192
> >>> > > >    [hevc @ 00000200747b65c0] min_height: 144, max_height: 8192
> >>> > > >    [graph 0 input from stream 0:0 @ 00000200040aa8c0] w:1920
> h:1080
> >>> > > >    pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> >>> > > >    [Parsed_scale_npp_0 @ 0000020074c75b80] Unsupported input
> >>> format:
> >>> > > p010le
> >>> > > >    [Parsed_scale_npp_0 @ 0000020074c75b80] Failed to configure
> >>> output pad
> >>> > > >    on Parsed_scale_npp_0
> >>> > > >    Error reinitializing filters!
> >>> > > >    Failed to inject frame into filter network: Function not
> >>> implemented
> >>> > > >    Error while processing the decoded data for stream #0:0
> >>> > > >
> >>> > > >
> >>> > > > I'd appreciate any help or pointer in the right direction (even
> an
> >>> > > > alternate mailing list).
> >>> > >
> >>> > >
> >>> > > Hey there,
> >>> > >
> >>> > > Could you kindly provide a download link to the sample of the input
> >>> > > file you're working on?
> >>> > > That way we can reproduce what you're seeing here, thanks!
> >>>
> >>> Another thing:
> >>>
> >>> What version of FFmpeg are you running?

Re: Trouble transcoding with cuda

Ray Randomnic
Yep, that matches my understanding, as described in the original email.

As I understand it:
- Decode happens on the GPU, and the output format is 10-bit (yuv420p10le)
- The GPU encoder does not accept this decoder output, so we need to convert
it to something the encoder does accept, such as yuv420p

I can think of two ways to achieve this: apply a filter, or have the
decoder directly output something the encoder accepts.

*Method 1:*
- GPU decodes the yuv420p10le stream and outputs it
- Apply a filter to change the format to something that the encoder
accepts (such as *yuv420p*)
  - Do it on the GPU: *scale_npp=format=yuv420p*. *This does NOT work, and
I don't know why. Can anyone explain?*
  - Do it on the CPU: *format=yuv420p*. *This works, but is obviously slow*
(the frames are downloaded from the GPU, converted, and then uploaded again
for the encode)
- GPU encoder takes in the format (which it accepts) and just encodes
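
To make the two branches of Method 1 concrete, here is a rough dry-run sketch. It only prints the ffmpeg command lines, so it can be tried without an NVIDIA GPU. The file names and the two function names are made up for illustration. The CPU branch spells out the download step that test 6 performs implicitly (the logs show the decoder handing over p010le). The GPU branch is an assumption: it relies on a newer FFmpeg build whose scale_cuda filter has a format option, and I have not checked it against N-94578.

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# input.mp4 / output.mp4 are placeholder names.

IN=input.mp4
OUT=output.mp4

# CPU conversion, made explicit: decoded frames stay in CUDA memory, are
# downloaded as p010le, converted to yuv420p on the CPU, and then passed
# to h264_nvenc (which has to upload them again).
cpu_convert_cmd() {
    printf '%s\n' "ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i $1 -vf hwdownload,format=p010le,format=yuv420p -c:v h264_nvenc $2"
}

# GPU conversion. ASSUMPTION: a newer FFmpeg build in which scale_cuda
# accepts a 'format' option; not verified for the build in this thread.
gpu_convert_cmd() {
    printf '%s\n' "ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i $1 -vf scale_cuda=format=yuv420p -c:v h264_nvenc $2"
}

cpu_convert_cmd "$IN" "$OUT"
gpu_convert_cmd "$IN" "$OUT"
```

If the GPU branch works on a given build, the frames should never leave GPU memory, which is the decode -> convert -> encode chain asked about in the original email.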


*Method 2:*
- GPU decodes the yuv420p10le stream and directly outputs a format that the
encoder accepts (such as yuv420p)
   - I tried doing this with *-hwaccel_output_format yuv420p*, but the
colors are all messed up. *How do I convert from yuv420p10le -> yuv420p
without messing up the colors? Or how do I get the GPU decoder to output a
specific format?*
- GPU encoder takes in the format (which it accepts) and just encodes
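
For reference, the bit-depth part of a 10-bit to 8-bit conversion is plain arithmetic, sketched below. This is only an illustration of the sample layout, not what ffmpeg does internally: the function names are made up, and truncation is an assumption (a real converter may round or dither). It also does nothing about the bt2020/smpte2084 characteristics that ffprobe reports for the input, which a pure pixel-format change leaves untouched.

```shell
#!/bin/sh
# p010le keeps each 10-bit sample in the upper bits of a 16-bit word,
# so the low 6 bits are padding.
p010_to_10bit() { echo $(( $1 >> 6 )); }

# Reducing 10-bit to 8-bit by truncation drops the two least
# significant bits (ASSUMPTION: truncation, no rounding or dithering).
tenbit_to_8bit() { echo $(( $1 >> 2 )); }

word=$(( 0xFFC0 ))               # largest p010 sample word
ten=$(p010_to_10bit "$word")     # 1023
eight=$(tenbit_to_8bit "$ten")   # 255
echo "p010 word $word -> 10-bit $ten -> 8-bit $eight"
```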

Thoughts?

On Fri, Sep 6, 2019 at 2:14 PM Michael Shaffer <[hidden email]>
wrote:

> I'm using NVENC on a GeForce 1060 to do a "-vf curves" type of filter (for
> removing fog/haze from a web cam view). A while ago someone on here told me
> that the CPU actually does the filter. The GPU does the encoding to H264.
> Just an idea, but maybe the CPU is handling the filter you're trying to run
> and that's why it's so slow.
>
> On Fri, Sep 6, 2019 at 1:10 PM Ray Randomnic <[hidden email]>
> wrote:
>
> > Hey folks, any luck with this??
> >
> >
> >
> > On Wed, Sep 4, 2019 at 2:07 PM Ray Randomnic <[hidden email]>
> > wrote:
> >
> > > ffmpeg version N-94578-gd6bd902599-gcff309097a+3 Copyright (c)
> 2000-2019
> > > the FFmpeg developers
> > >   built with gcc 9.2.0 (Rev1, Built by MSYS2 project)
> > >   configuration:  --disable-autodetect --enable-amf --enable-bzlib
> > > --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2
> > --enable-iconv
> > > --enable-lzma --enable-nvenc --enable-zlib --enable-sdl2
> > --enable-ffnvcodec
> > > --enable-nvdec --enable-cuda-llvm --enable-libmp3lame --enable-libopus
> > > --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265
> > > --enable-libdav1d --disable-debug --enable-fontconfig --enable-libass
> > > --enable-libbluray --enable-libfreetype --enable-libmfx
> > --enable-libmysofa
> > > --enable-libopencore-amrnb --enable-libopencore-amrwb
> > --enable-libopenjpeg
> > > --enable-libsnappy --enable-libsoxr --enable-libspeex
> --enable-libtheora
> > > --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc
> > > --enable-libwavpack --enable-libwebp --enable-libxml2 --enable-libzimg
> > > --enable-libshine --enable-gpl --enable-avisynth --enable-libxvid
> > > --enable-libaom --enable-libopenmpt --enable-version3
> > --enable-chromaprint
> > > --enable-decklink --enable-frei0r --enable-libbs2b --enable-libcaca
> > > --enable-libcdio --enable-libfdk-aac --enable-libflite
> > --enable-libfribidi
> > > --enable-libgme --enable-libgsm --enable-libilbc --enable-libsvthevc
> > > --enable-libkvazaar --enable-libmodplug --enable-librtmp
> > > --enable-librubberband --enable-libssh --enable-libtesseract
> > > --enable-libxavs --enable-libzmq --enable-libzvbi --enable-openal
> > > --enable-libvmaf --enable-libcodec2 --enable-libsrt --enable-ladspa
> > > --enable-opencl --enable-opengl --enable-libnpp --enable-libopenh264
> > > --enable-openssl --extra-cflags=-fopenmp --extra-libs=-lgomp
> > > --extra-cflags=-DLIBTWOLAME_STATIC --extra-libs=-lstdc++
> > > --extra-cflags=-DLIBSSH_STATIC
> > > --extra-ldflags='-Wl,--allow-multiple-definition'
> > > --extra-cflags=-DCACA_STATIC --extra-cflags=-DMODPLUG_STATIC
> > > --extra-cflags=-DCHROMAPRINT_NODLL --extra-libs=-lstdc++
> > > --extra-cflags=-DZMQ_STATIC --extra-libs=-lpsapi
> > > --extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv
> --disable-w32threads
> > > --extra-cflags=-DKVZ_STATIC_LIB --enable-nonfree
> > > --extra-cflags='-IC:/PROGRA~1/NVIDIA~2/CUDA/v10.1/include'
> > > --extra-ldflags='-LC:/PROGRA~1/NVIDIA~2/CUDA/v10.1/lib/x64'
> > >   libavutil      56. 33.100 / 56. 33.100
> > >   libavcodec     58. 55.100 / 58. 55.100
> > >   libavformat    58. 31.101 / 58. 31.101
> > >   libavdevice    58.  9.100 / 58.  9.100
> > >   libavfilter     7. 58.100 /  7. 58.100
> > >   libswscale      5.  6.100 /  5.  6.100
> > >   libswresample   3.  6.100 /  3.  6.100
> > >   libpostproc    55.  6.100 / 55.  6.100
> > > Hyper fast Audio and Video encoder
> > >
> > > On Wed, Sep 4, 2019 at 2:05 PM Ray Randomnic <[hidden email]>
> > > wrote:
> > >
> > >> Link for the sample is still alive. I am able to download it.
> > Right-click
> > >> and click save (or use wget or curl?)?
> > >> http://awakeman.redirectme.net/web/testvideo/sample.mp4
> > >>
> > >> As mentioned in the first email, the ffmpeg version is
> > >> N-94578-gd6bd902599-gcff309097a+3
> > >>
> > >> It's compiled from the latest source as of a week ago.
> > >>
> > >>
> > >>
> > >> On Wed, Sep 4, 2019 at 11:39 AM Dennis Mungai <[hidden email]>
> > wrote:
> > >>
> > >>> On Wed, 4 Sep 2019 at 07:38, Ray Randomnic <[hidden email]>
> > >>> wrote:
> > >>> >
> > >>> > Hey,
> > >>> >
> > >>> > Sure, any video taken by a Samsung device (such as Note or Galaxy
> S9
> > or
> > >>> > S10) with the HDR10+ setting will do. A sample is posted here:
> > >>> > http://awakeman.redirectme.net/web/testvideo/sample.mp4
> > >>> >
> > >>> > Thanks.
> > >>> >
> > >>> > On Tue, Sep 3, 2019 at 10:07 PM Dennis Mungai <[hidden email]>
> > >>> wrote:
> > >>> >
> > >>> > > On Wed, 4 Sep 2019 at 04:32, Ray Randomnic <
> [hidden email]
> > >
> > >>> wrote:
> > >>> > > >
> > >>> > > > Hey folks,
> > >>> > > >
> > >>> > > > I'm trying to transcode an HEVC (yuv420p10le) encoded file to
> > H264
> > >>> using
> > >>> > > a
> > >>> > > > GTX 1650 nvenc and having issues with what I assume are the
> pixel
> > >>> formats
> > >>> > > > conversions on hardware. My encode speed (in fps) is pretty low
> > >>> (see
> > >>> > > > below), far lower than I get when transcoding HEVC -> HEVC.
> > ffmpeg
> > >>> > > version
> > >>> > > > is N-94578-gd6bd902599-gcff309097a+3 (on a Windows 10 OS,
> though
> > I
> > >>> don't
> > >>> > > > think this is relevant). For the purposes of this experiment,
> > >>> let's say
> > >>> > > I'm
> > >>> > > > not concerned with lossiness with format conversions.
> > >>> > > >
> > >>> > > > I'd like to know what I'm doing wrong and what commands I can
> > >>> issue for
> > >>> > > the
> > >>> > > > following:
> > >>> > > > decode on GPU -> format conversion (if necessary) on GPU ->
> > encode
> > >>> on
> > >>> > > GPU.
> > >>> > > > I might not be understanding a few concepts.
> > >>> > > >
> > >>> > > > The combination of options that I thought were available and I
> > >>> tried out
> > >>> > > > are:
> > >>> > > > - decoder (I mostly left this blank for auto) and encoder
> (always
> > >>> > > > h264_nvenc)
> > >>> > > > - hwaccel
> > >>> > > > - hwaccel_output_format
> > >>> > > > - filters (vf):
> > >>> > > >   - format
> > >>> > > >   - scale_npp (for format conversion on gpu)
> > >>> > > >
> > >>> > > > I have no idea what the options pix_fmt or other filters like
> > >>> colorspace
> > >>> > > do
> > >>> > > > for hardware (how is pix_fmt different from
> > >>> hwaccel_output_format?). At
> > >>> > > > this point I'm kind of stuck. Don't know how to convert formats
> > on
> > >>> the
> > >>> > > GPU
> > >>> > > > (I assume the format conversion is happening on the CPU).
> > >>> > > >
> > >>> > > > Input details:
> > >>> > > > ffprobe input.mp4
> > >>> > > >
> > >>> > > > Stream #0:0(eng): Video: hevc (Main 10) (hvc1 / 0x31637668),
> > >>> > > > yuv420p10le(tv, bt2020nc/bt2020/smpte2084), 1920x1080, 24886
> > kb/s,
> > >>> SAR
> > >>> > > 1:1
> > >>> > > > DAR 16:9, 29.99 fps, ...
> > >>> > > >
> > >>> > > > Summary of various combinations (- indicates left blank):
> > >>> > > > test | hwaccel | hwaccel_output_format | filter (vf)              | encodefps | note
> > >>> > > > 1    | cuda    | -                     | -                        | X         | Failed
> > >>> > > > 2    | cuda    | cuda                  | -                        | X         | Failed
> > >>> > > > 3    | cuda    | yuv420p               | -                        | 361       | Video messed up
> > >>> > > > 4    | cuda    | cuda                  | format=yuv420p           | X         | Failed
> > >>> > > > 5    | cuvid   | cuda                  | format=yuv420p           | 91        | Not using GPU decode
> > >>> > > > 6    | cuda    | -                     | format=yuv420p           | 161       | Not using GPU format conversion
> > >>> > > > 7    | cuvid   | -                     | format=yuv420p           | 91        | Not using GPU decode
> > >>> > > > 8    | cuda    | -                     | scale_npp=format=yuv420p | X         | Failed
> > >>> > > > 9    | cuda    | cuda                  | scale_npp=format=yuv420p | X         | Failed
> > >>> > > >
> > >>> > > > I would expect a speed of around test 3 (without the screwed up
> > >>> video).
> > >>> > > Is
> > >>> > > > there any way to convert the pixel formats on the hardware
> > without
> > >>> > > screwing
> > >>> > > > up the video? On a similar note, I'd love for someone to
> explain
> > >>> the
> > >>> > > > failing encodes.
> > >>> > > >
> > >>> > > > Here are the details for corresponding encodes:
> > >>> > > >
> > >>> > > >    1. ffmpeg -loglevel verbose -hwaccel cuda -i input.mp4 -c:v
> > >>> h264_nvenc
> > >>> > > >    output.mp4
> > >>> > > >
> > >>> > > >    Fails with the following:
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 000001cc9670e4c0] tb:1/48000
> samplefmt:fltp
> > >>> > > >    samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 000001cc8740fc00] NVDEC capabilities:
> > >>> > > >    [hevc @ 000001cc8740fc00] format supported: yes,
> max_mb_count:
> > >>> 262144
> > >>> > > >    [hevc @ 000001cc8740fc00] min_width: 144, max_width: 8192
> > >>> > > >    [hevc @ 000001cc8740fc00] min_height: 144, max_height: 8192
> > >>> > > >    [graph 0 input from stream 0:0 @ 000001cc87420840] w:1920
> > h:1080
> > >>> > > >    pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > >>> > > >    [h264_nvenc @ 000001cc8747fbc0] Loaded Nvenc version 9.0
> > >>> > > >    [h264_nvenc @ 000001cc8747fbc0] Nvenc initialized
> successfully
> > >>> > > >    [h264_nvenc @ 000001cc8747fbc0] 1 CUDA capable devices found
> > >>> > > >    [h264_nvenc @ 000001cc8747fbc0] [ GPU #0 - < GeForce GTX
> 1650
> > >
> > >>> has
> > >>> > > >    Compute SM 7.5 ]
> > >>> > > >    [h264_nvenc @ 000001cc8747fbc0] 10 bit encode not supported
> > >>> > > >    [h264_nvenc @ 000001cc8747fbc0] No NVENC capable devices
> found
> > >>> > > >    [h264_nvenc @ 000001cc8747fbc0] Nvenc unloaded
> > >>> > > >    Error initializing output stream 0:0 -- Error while opening
> > >>> encoder
> > >>> > > for
> > >>> > > >    output stream #0:0 - maybe incorrect parameters such as
> > >>> bit_rate,
> > >>> > > rate,
> > >>> > > >    width or height
> > >>> > > >
> > >>> > > >    2. ffmpeg -loglevel verbose -hwaccel cuda
> > >>> -hwaccel_output_format cuda
> > >>> > > -i
> > >>> > > >    input.mp4 -c:v h264_nvenc output.mp4
> > >>> > > >
> > >>> > > >    Fails with the following:
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 00000240b7932340] tb:1/48000
> samplefmt:fltp
> > >>> > > >    samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 00000240b79e37c0] NVDEC capabilities:
> > >>> > > >    [hevc @ 00000240b79e37c0] format supported: yes,
> max_mb_count:
> > >>> 262144
> > >>> > > >    [hevc @ 00000240b79e37c0] min_width: 144, max_width: 8192
> > >>> > > >    [hevc @ 00000240b79e37c0] min_height: 144, max_height: 8192
> > >>> > > >    [graph 0 input from stream 0:0 @ 00000240b7937e00] w:1920
> > h:1080
> > >>> > > >    pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > >>> > > >    [h264_nvenc @ 00000240b7483700] Loaded Nvenc version 9.0
> > >>> > > >    [h264_nvenc @ 00000240b7483700] Nvenc initialized
> successfully
> > >>> > > >    [h264_nvenc @ 00000240b7483700] 10 bit encode not supported
> > >>> > > >    [h264_nvenc @ 00000240b7483700] Provided device doesn't
> > support
> > >>> > > required
> > >>> > > >    NVENC features
> > >>> > > >    [h264_nvenc @ 00000240b7483700] Nvenc unloaded
> > >>> > > >    Error initializing output stream 0:0 -- Error while opening
> > >>> encoder
> > >>> > > for
> > >>> > > >    output stream #0:0 - maybe incorrect parameters such as
> > >>> bit_rate,
> > >>> > > rate,
> > >>> > > >    width or height
> > >>> > > >
> > >>> > > >    Alright, so it seems that the hardware h264 encoder doesn't
> > >>> support 10
> > >>> > > >    bit encodes (that's coming from the decoder). So lets try
> > >>> changing the
> > >>> > > >    format:
> > >>> > > >
> > >>> > > >
> > >>> > > >    3. ffmpeg -loglevel verbose -hwaccel cuda
> > -hwaccel_output_format
> > >>> > > yuv420p
> > >>> > > >    -i input.mp4 -c:v h264_nvenc output.mp4
> > >>> > > >
> > >>> > > >    Pretty decent encode at ~ 360 fps. Alas, the video is
> screwed
> > >>> up.
> > >>> > > Colors
> > >>> > > >    are weird:
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 00000256c9ac7b40] tb:1/48000
> samplefmt:fltp
> > >>> > > >    samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 00000256cbb737c0] NVDEC capabilities:
> > >>> > > >    [hevc @ 00000256cbb737c0] format supported: yes,
> max_mb_count:
> > >>> 262144
> > >>> > > >    [hevc @ 00000256cbb737c0] min_width: 144, max_width: 8192
> > >>> > > >    [hevc @ 00000256cbb737c0] min_height: 144, max_height: 8192
> > >>> > > >    [graph 0 input from stream 0:0 @ 00000256cbac7e00] w:1920
> > h:1080
> > >>> > > >    pixfmt:yuv420p tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > >>> > > >    [h264_nvenc @ 00000256cb693700] Loaded Nvenc version 9.0
> > >>> > > >    [h264_nvenc @ 00000256cb693700] Nvenc initialized
> successfully
> > >>> > > >    [h264_nvenc @ 00000256cb693700] 1 CUDA capable devices found
> > >>> > > >    [h264_nvenc @ 00000256cb693700] [ GPU #0 - < GeForce GTX
> 1650
> > >
> > >>> has
> > >>> > > >    Compute SM 7.5 ]
> > >>> > > >    [h264_nvenc @ 00000256cb693700] supports NVENC
> > >>> > > >
> > >>> > > >    Let's use a format filter to change format:
> > >>> > > >
> > >>> > > >    4. ffmpeg -loglevel verbose -hwaccel cuda
> > >>> -hwaccel_output_format cuda
> > >>> > > -i
> > >>> > > >    input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4
> > >>> > > >
> > >>> > > >    Fails with the following:
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 0000019390de5c80] tb:1/48000
> samplefmt:fltp
> > >>> > > >    samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 00000193908675c0] NVDEC capabilities:
> > >>> > > >    [hevc @ 00000193908675c0] format supported: yes,
> max_mb_count:
> > >>> 262144
> > >>> > > >    [hevc @ 00000193908675c0] min_width: 144, max_width: 8192
> > >>> > > >    [hevc @ 00000193908675c0] min_height: 144, max_height: 8192
> > >>> > > >    [graph 0 input from stream 0:0 @ 00000193a031ee80] w:1920
> > h:1080
> > >>> > > >    pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > >>> > > >    [auto_scaler_0 @ 00000193b7aee780] w:iw h:ih flags:'bicubic'
> > >>> interl:0
> > >>> > > >    [Parsed_format_0 @ 00000193908eee80] auto-inserting filter
> > >>> > > >    'auto_scaler_0' between the filter 'graph 0 input from
> stream
> > >>> 0:0'
> > >>> > > and the
> > >>> > > >    filter 'Parsed_format_0'
> > >>> > > >    Impossible to convert between the formats supported by the
> > >>> filter
> > >>> > > 'graph
> > >>> > > >    0 input from stream 0:0' and the filter 'auto_scaler_0'
> > >>> > > >    Error reinitializing filters!
> > >>> > > >    Failed to inject frame into filter network: Function not
> > >>> implemented
> > >>> > > >    Error while processing the decoded data for stream #0:0
> > >>> > > >
> > >>> > > >    5. ffmpeg -loglevel verbose -hwaccel cuvid
> > >>> -hwaccel_output_format cuda
> > >>> > > >    -i input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4
> > >>> > > >
> > >>> > > >    Succeeds, but only encodes at around 91 fps, due to, I
> assume,
> > >>> not
> > >>> > > using
> > >>> > > >    GPU decoder. What is the difference between cuvid and cuda
> > >>> hwaccel
> > >>> > > (why did
> > >>> > > >    the previous fail and this succeed)? Here is the relevant
> > >>> output:
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 000002152cc3cc00] tb:1/48000
> samplefmt:fltp
> > >>> > > >    samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 000002152ac33700] Initializing cuvid hwaccel
> > >>> > > >    [AVHWFramesContext @ 000002152cc3f0c0] Pixel format
> > >>> 'yuv420p10le' is
> > >>> > > not
> > >>> > > >    supported
> > >>> > > >    [hevc @ 000002152ac33700] Error initializing a CUDA frame
> pool
> > >>> > > >    cuvid hwaccel requested for input stream #0:0, but cannot be
> > >>> > > initialized.
> > >>> > > >    [hevc @ 000002152ac33700] Error parsing NAL unit #2.
> > >>> > > >    [hevc @ 000002152ac79180] Could not find ref with POC 0
> > >>> > > >    Error while decoding stream #0:0: Operation not permitted
> > >>> > > >    [graph 0 input from stream 0:0 @ 000002152d638b80] w:1920
> > h:1080
> > >>> > > >    pixfmt:yuv420p10le tb:1/90000 fr:30/1 sar:1/1
> > sws_param:flags=2
> > >>> > > >    [auto_scaler_0 @ 000002152ca176c0] w:iw h:ih flags:'bicubic'
> > >>> interl:0
> > >>> > > >    [Parsed_format_0 @ 000002152d3fee40] auto-inserting filter
> > >>> > > >    'auto_scaler_0' between the filter 'graph 0 input from
> stream
> > >>> 0:0'
> > >>> > > and the
> > >>> > > >    filter 'Parsed_format_0'
> > >>> > > >    [auto_scaler_0 @ 000002152ca176c0] w:1920 h:1080
> > fmt:yuv420p10le
> > >>> > > sar:1/1
> > >>> > > >    -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> > >>> > > >    [h264_nvenc @ 000002152ac31800] Loaded Nvenc version 9.0
> > >>> > > >    [h264_nvenc @ 000002152ac31800] Nvenc initialized
> successfully
> > >>> > > >    [h264_nvenc @ 000002152ac31800] 1 CUDA capable devices found
> > >>> > > >    [h264_nvenc @ 000002152ac31800] [ GPU #0 - < GeForce GTX
> 1650
> > >
> > >>> has
> > >>> > > >    Compute SM 7.5 ]
> > >>> > > >    [h264_nvenc @ 000002152ac31800] supports NVENC
> > >>> > > >
> > >>> > > >    Take out hwaccel_output:
> > >>> > > >
> > >>> > > >    6. ffmpeg -loglevel verbose -hwaccel cuda -i in.mp4 -vf
> > >>> format=yuv420p
> > >>> > > >    -c:v h264_nvenc out.mp4
> > >>> > > >
> > >>> > > >    Succeeds, encodes at 161 fps (using both hardware GPU
> decoder
> > >>> and
> > >>> > > >    encoder, but I believe the changing of format is happening
> on
> > >>> the CPU
> > >>> > > >    between the two stages).
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 0000025491bf2b00] tb:1/48000
> samplefmt:fltp
> > >>> > > >    samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 0000025491b84900] NVDEC capabilities:
> > >>> > > >    [hevc @ 0000025491b84900] format supported: yes,
> max_mb_count:
> > >>> 262144
> > >>> > > >    [hevc @ 0000025491b84900] min_width: 144, max_width: 8192
> > >>> > > >    [hevc @ 0000025491b84900] min_height: 144, max_height: 8192
> > >>> > > >    [graph 0 input from stream 0:0 @ 0000025491c0eec0] w:1920
> > h:1080
> > >>> > > >    pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > >>> > > >    [auto_scaler_0 @ 00000254b747cfc0] w:iw h:ih flags:'bicubic'
> > >>> interl:0
> > >>> > > >    [Parsed_format_0 @ 000002549203d840] auto-inserting filter
> > >>> > > >    'auto_scaler_0' between the filter 'graph 0 input from
> stream
> > >>> 0:0'
> > >>> > > and the
> > >>> > > >    filter 'Parsed_format_0'
> > >>> > > >    [auto_scaler_0 @ 00000254b747cfc0] w:1920 h:1080 fmt:p010le
> > >>> sar:1/1 ->
> > >>> > > >    w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> > >>> > > >    [h264_nvenc @ 00000254920a0f40] Loaded Nvenc version 9.0
> > >>> > > >    [h264_nvenc @ 00000254920a0f40] Nvenc initialized
> successfully
> > >>> > > >    [h264_nvenc @ 00000254920a0f40] 1 CUDA capable devices found
> > >>> > > >    [h264_nvenc @ 00000254920a0f40] [ GPU #0 - < GeForce GTX
> 1650
> > >
> > >>> has
> > >>> > > >    Compute SM 7.5 ]
> > >>> > > >    [h264_nvenc @ 00000254920a0f40] supports NVENC
> > >>> > > >
> > >>> > > >
> > >>> > > >    7. ffmpeg -loglevel verbose -hwaccel cuvid -i in.mp4 -vf
> > >>> > > format=yuv420p
> > >>> > > >    -c:v h264_nvenc out.mp4
> > >>> > > >
> > >>> > > >    Only encoding on GPU, not decoding (91 fps).
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 000002163875b5c0] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 00000216380c3c00] Initializing cuvid hwaccel
> > >>> > > >    [AVHWFramesContext @ 00000216387fc300] Pixel format 'yuv420p10le' is not supported
> > >>> > > >    [hevc @ 00000216380c3c00] Error initializing a CUDA frame pool
> > >>> > > >    cuvid hwaccel requested for input stream #0:0, but cannot be initialized.
> > >>> > > >    [hevc @ 00000216380c3c00] Error parsing NAL unit #2.
> > >>> > > >    [hevc @ 000002163813d300] Could not find ref with POC 0
> > >>> > > >    Error while decoding stream #0:0: Operation not permitted
> > >>> > > >    [graph 0 input from stream 0:0 @ 00000216387594c0] w:1920 h:1080 pixfmt:yuv420p10le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > >>> > > >    [auto_scaler_0 @ 000002164f8a0c40] w:iw h:ih flags:'bicubic' interl:0
> > >>> > > >    [Parsed_format_0 @ 00000216387593c0] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_format_0'
> > >>> > > >    [auto_scaler_0 @ 000002164f8a0c40] w:1920 h:1080 fmt:yuv420p10le sar:1/1 -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
> > >>> > > >    [h264_nvenc @ 0000021638590f40] Loaded Nvenc version 9.0
> > >>> > > >    [h264_nvenc @ 0000021638590f40] Nvenc initialized successfully
> > >>> > > >    [h264_nvenc @ 0000021638590f40] 1 CUDA capable devices found
> > >>> > > >    [h264_nvenc @ 0000021638590f40] [ GPU #0 - < GeForce GTX 1650 > has Compute SM 7.5 ]
> > >>> > > >    [h264_nvenc @ 0000021638590f40] supports NVENC
> > >>> > > >
> > >>> > > >    Let's see if I can do the format conversion on the GPU (instead of
> > >>> > > >    GPU -> CPU -> GPU) by using the scale_npp filter.
> > >>> > > >
> > >>> > > >    8. ffmpeg -loglevel verbose -hwaccel cuda -i input.mp4 -vf
> > >>> > > >    scale_npp=format=yuv420p -c:v h264_nvenc output.mp4
> > >>> > > >
> > >>> > > >    Fails
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 0000022f3001e080] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 0000022f207d7f40] NVDEC capabilities:
> > >>> > > >    [hevc @ 0000022f207d7f40] format supported: yes, max_mb_count: 262144
> > >>> > > >    [hevc @ 0000022f207d7f40] min_width: 144, max_width: 8192
> > >>> > > >    [hevc @ 0000022f207d7f40] min_height: 144, max_height: 8192
> > >>> > > >    [graph 0 input from stream 0:0 @ 0000022f3034ee80] w:1920 h:1080 pixfmt:p010le tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > >>> > > >    [auto_scaler_0 @ 0000022f47b2d300] w:iw h:ih flags:'bicubic' interl:0
> > >>> > > >    [Parsed_scale_npp_0 @ 0000022f20c49b40] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_scale_npp_0'
> > >>> > > >    Impossible to convert between the formats supported by the filter 'graph 0 input from stream 0:0' and the filter 'auto_scaler_0'
> > >>> > > >    Error reinitializing filters!
> > >>> > > >    Failed to inject frame into filter network: Function not implemented
> > >>> > > >    Error while processing the decoded data for stream #0:0
> > >>> > > >
> > >>> > > >
> > >>> > > >    9. ffmpeg -loglevel verbose -hwaccel cuda -hwaccel_output_format cuda -i
> > >>> > > >    in.mp4 -vf scale_npp=format=yuv420p -c:v h264_nvenc out.mp4
> > >>> > > >
> > >>> > > >    Fails:
> > >>> > > >
> > >>> > > >    [graph_1_in_0_1 @ 00000200040adac0] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
> > >>> > > >    [hevc @ 00000200747b65c0] NVDEC capabilities:
> > >>> > > >    [hevc @ 00000200747b65c0] format supported: yes, max_mb_count: 262144
> > >>> > > >    [hevc @ 00000200747b65c0] min_width: 144, max_width: 8192
> > >>> > > >    [hevc @ 00000200747b65c0] min_height: 144, max_height: 8192
> > >>> > > >    [graph 0 input from stream 0:0 @ 00000200040aa8c0] w:1920 h:1080 pixfmt:cuda tb:1/90000 fr:30/1 sar:1/1 sws_param:flags=2
> > >>> > > >    [Parsed_scale_npp_0 @ 0000020074c75b80] Unsupported input format: p010le
> > >>> > > >    [Parsed_scale_npp_0 @ 0000020074c75b80] Failed to configure output pad on Parsed_scale_npp_0
> > >>> > > >    Error reinitializing filters!
> > >>> > > >    Failed to inject frame into filter network: Function not implemented
> > >>> > > >    Error while processing the decoded data for stream #0:0
> > >>> > > >
> > >>> > > >
> > >>> > > > I'd appreciate any help or pointer in the right direction (even an
> > >>> > > > alternate mailing list).
> > >>> > >
> > >>> > >
> > >>> > > Hey there,
> > >>> > >
> > >>> > > Could you kindly provide a download link to the sample of the input
> > >>> > > file you're working on?
> > >>> > > That way we can reproduce what you're seeing here, thanks!
> > >>>
> > >>> Another thing:
> > >>>
> > >>> What version of FFmpeg are you running?
> > >>> _______________________________________________
> > >>> ffmpeg-user mailing list
> > >>> [hidden email]
> > >>> https://ffmpeg.org/mailman/listinfo/ffmpeg-user
> > >>>
> > >>> To unsubscribe, visit link above, or email
> > >>> [hidden email] with subject "unsubscribe".
_______________________________________________
ffmpeg-user mailing list
[hidden email]
https://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[hidden email] with subject "unsubscribe".
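
The quoted logs show one reliable sign that the format change ran on the CPU: ffmpeg reports an auto-inserted 'auto_scaler_N' filter, which is the software (libswscale) scaler. A minimal sketch for spotting this in a saved -loglevel verbose log is below. It assumes the log lines are unwrapped, as ffmpeg prints them, not re-wrapped by a mail client; the function name software_conversions is made up for illustration and is not part of ffmpeg.

```python
import re

# Matches the verbose-log line in which an auto-inserted software scaler
# reports its input and output pixel formats, for example:
# [auto_scaler_0 @ 00000254b747cfc0] w:1920 h:1080 fmt:p010le sar:1/1 -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4
AUTO_SCALER = re.compile(
    r"\[auto_scaler_\d+ @ [0-9a-fA-F]+\].*?fmt:(\w+).*?->.*?fmt:(\w+)"
)


def software_conversions(log_text):
    """Return (src_fmt, dst_fmt) pairs, one per auto-inserted scaler line."""
    return AUTO_SCALER.findall(log_text)


# Two lines copied from the log of the format=yuv420p test quoted above.
sample = (
    "[auto_scaler_0 @ 00000254b747cfc0] w:iw h:ih flags:'bicubic' interl:0\n"
    "[auto_scaler_0 @ 00000254b747cfc0] w:1920 h:1080 fmt:p010le sar:1/1 -> "
    "w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x4\n"
)

for src, dst in software_conversions(sample):
    print(f"software conversion: {src} -> {dst}")
```

An empty result does not prove the whole pipeline stayed on the GPU; it only means no auto-inserted scaler reported a conversion in that log.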