How to use ocr filter

nicolab
When using the ocr filter, how do I write the recognized text to a text file?
https://ffmpeg.org/ffmpeg-filters.html#ocr

img.png


ffmpeg -f lavfi -i "movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text" out.png -y -loglevel 99
ffmpeg version 2.8.git Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --prefix=/mingw/i686-w64-mingw32 --enable-version3 --enable-gpl --enable-memalign-hack --enable-w32threads --enable-libtesseract --disable-outdev=sdl --disable-ffplay --disable-ffprobe --disable-ffserver --disable-doc --disable-htmlpages --disable-manpages --disable-podpages --disable-txtpages --disable-debug --pkg-config-flags=--static
  libavutil      55.  2.100 / 55.  2.100
  libavcodec     57.  2.100 / 57.  2.100
  libavformat    57.  2.100 / 57.  2.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6.  4.100 /  6.  4.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.100 /  2.  0.100
  libpostproc    54.  0.100 / 54.  0.100
Splitting the commandline.
Reading option '-f' ... matched as option 'f' (force format) with argument 'lavfi'.
Reading option '-i' ... matched as input file with argument 'movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text'.
Reading option 'out.png' ... matched as output file.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option y (overwrite output files) with argument 1.
Applying option loglevel (set logging level) with argument 99.
Successfully parsed a group of options.
Parsing a group of options: input file movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text.
Applying option f (force format) with argument lavfi.
Successfully parsed a group of options.
Opening an input file: movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text.
detected 4 logical cores
[Parsed_movie_0 @ 02438040] Setting 'filename' to value 'img.png'
Probing image2 score:50 size:929
Probing mp3 score:1 size:929
Probing png_pipe score:99 size:929
[png_pipe @ 02438480] Format png_pipe probed with size=2048 and score=99
[png_pipe @ 02438480] Before avformat_find_stream_info() pos: 0 bytes read:929 seeks:0
[png_pipe @ 02438480] 0: start_time: -9223372036854.775 duration: -9223372036854.775
[png_pipe @ 02438480] stream: start_time: -9223372036854.775 duration: -9223372036854.775 bitrate=0 kb/s
[png_pipe @ 02438480] After avformat_find_stream_info() pos: 929 bytes read:929 seeks:0 frames:1
[Parsed_movie_0 @ 02438040] seek_point:0 format_name:(null) file_name:img.png stream_index:-1
[Parsed_ocr_1 @ 04813f80] Setting 'datapath' to value 'tessdata'
[Parsed_ocr_1 @ 04813f80] Setting 'language' to value 'eng'
[Parsed_ocr_1 @ 04813f80] Tesseract version: 3.02
[Parsed_drawgraph_2 @ 024375e0] Setting 'm1' to value 'lavfi.ocr.text'
[auto-inserted scaler 0 @ 048187c0] w:iw h:ih flags:'bilinear' interl:0
[Parsed_ocr_1 @ 04813f80] auto-inserting filter 'auto-inserted scaler 0' between the filter 'Parsed_movie_0' and the filter 'Parsed_ocr_1'
[AVFilterGraph @ 02437580] query_formats: 4 queried, 2 merged, 1 already done, 0 delayed
[auto-inserted scaler 0 @ 048187c0] picking yuv444p out of 15 ref:rgb24 alpha:0
[auto-inserted scaler 0 @ 048187c0] w:160 h:48 fmt:rgb24 sar:1/1 -> w:160 h:48 fmt:yuv444p sar:1/1 flags:0x2
[lavfi @ 024331e0] All info found
[lavfi @ 024331e0] 0: start_time: 0.000 duration: -9223372036854.775
[lavfi @ 024331e0] stream: start_time: 0.000 duration: -9223372036854.775 bitrate=0 kb/s
Input #0, lavfi, from 'movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text':
  Duration: N/A, start: 0.000000, bitrate: N/A
    Stream #0:0, 1, 1/25: Video: rawvideo, 1 reference frame (RGBA / 0x41424752), rgba, 900x256 [SAR 1:1 DAR 225:64], 1/25, 25 tbr, 25 tbn, 25 tbc
Successfully opened the file.
Parsing a group of options: output file out.png.
Successfully parsed a group of options.
Opening an output file: out.png.
Successfully opened the file.
[graph 0 input from stream 0:0 @ 04838fa0] Setting 'video_size' to value '900x256'
[graph 0 input from stream 0:0 @ 04838fa0] Setting 'pix_fmt' to value '28'
[graph 0 input from stream 0:0 @ 04838fa0] Setting 'time_base' to value '1/25'
[graph 0 input from stream 0:0 @ 04838fa0] Setting 'pixel_aspect' to value '1/1'
[graph 0 input from stream 0:0 @ 04838fa0] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 04838fa0] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 0:0 @ 04838fa0] w:900 h:256 pixfmt:rgba tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
[format @ 04838a60] compat: called with args=[rgb24|rgba|rgb48be|rgba64be|pal8|gray|ya8|gray16be|ya16be|monob]
[format @ 04838a60] Setting 'pix_fmts' to value 'rgb24|rgba|rgb48be|rgba64be|pal8|gray|ya8|gray16be|ya16be|monob'
[AVFilterGraph @ 04817400] query_formats: 4 queried, 3 merged, 0 already done, 0 delayed
Output #0, image2, to 'out.png':
  Metadata:
    encoder         : Lavf57.2.100
    Stream #0:0, 0, 1/25: Video: png, 1 reference frame, rgba, 900x256 [SAR 1:1 DAR 225:64], 1/25, q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
    Metadata:
      encoder         : Lavc57.2.100 png
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> png (native))
Press [q] to stop, [?] for help
Cliping frame in rate conversion by 0.000008
[output stream 0:0 @ 048391e0] EOF on sink link output stream 0:0:default.
No more output streams to write to, finishing.
[AVIOContext @ 048416e0] Statistics: 0 seeks, 1 writeouts
frame=    1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.04 bitrate=N/A
video:2kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text):
  Input stream #0:0 (video): 1 packets read (921638 bytes); 1 frames decoded;
  Total: 1 packets (921638 bytes) demuxed
Output file #0 (out.png):
  Output stream #0:0 (video): 1 frames encoded; 1 packets muxed (1543 bytes);
  Total: 1 packets (1543 bytes) muxed
1 frames successfully decoded, 0 decoding errors
[AVIOContext @ 02438a80] Statistics: 929 bytes read, 0 seeks
https://twitter.com/nico_lab http://nico-lab.net/
Re: How to use ocr filter

Paul B Mahol
On 9/17/15, nicolab <[hidden email]> wrote:
> When I using ocr filter, how to output ocr text file ?
> https://ffmpeg.org/ffmpeg-filters.html#ocr
>
> img.png
> <http://ffmpeg-users.933282.n4.nabble.com/file/n4672454/img.png>
>
> ffmpeg -f lavfi -i
> "movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text"
> out.png -y -loglevel 99

drawgraph accepts only float values, so it cannot plot the string stored in
lavfi.ocr.text. Use drawtext to render the text instead:

ffplay ~/img.png -vf "ocr,split[ocr][o1],[ocr]lutyuv=y=0:u=128:v=128,drawtext=fontcolor=white:x=10:y=10:text=%{metadata\\:lavfi.ocr.text}[o2],[o1][o2]vstack"

_______________________________________________
ffmpeg-user mailing list
[hidden email]
http://ffmpeg.org/mailman/listinfo/ffmpeg-user
Re: How to use ocr filter

Moritz Barsnick
On Thu, Sep 17, 2015 at 15:12:17 +0000, Paul B Mahol wrote:
> On 9/17/15, nicolab <[hidden email]> wrote:
> > When I using ocr filter, how to output ocr text file ?

> drawgraph accepts only floats values.
>
> ffplay ~/img.png -vf "ocr,split[ocr][o1],[ocr]lutyuv=y=0:u=128:v=128,drawtext=fontcolor=white:x=10:y=10:text=%{metadata\\:lavfi.ocr.text}[o2],[o1][o2]vstack"

Whatever you're doing, Paul, isn't outputting to a text file, as nicolab
requested. ;-)

I think ffprobe is the correct approach. It can output - as text - the
metadata which the ocr filter inserts.

I can't test, since none of my binaries has ocr support, but probably
something like:

$ ffprobe -show_frames -f lavfi -i "movie=img.png,ocr"

and some additional options for selecting the correct fields.

ffmpeg also has an output format "ffmetadata", that might help as well
(but I guess it contains _all_ metadata, and fields can't be selected
or filtered).

Moritz
Re: How to use ocr filter

Paul B Mahol
On 9/17/15, Moritz Barsnick <[hidden email]> wrote:

> $ ffprobe -show_frames -f lavfi -i "movie=img.png,ocr"
>
> and some additional options for selecting the correct fields.

ffprobe -show_entries frame_tags=lavfi.ocr.text -f lavfi -i "movie=img.png,ocr"
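
A possible refinement, as a minimal sketch only: ffprobe's default writer
wraps each frame in [FRAME] ... [/FRAME] and prints the tag together with its
key. Assuming an ffprobe build that includes the ocr filter
(--enable-libtesseract), adding -v error and
-of default=noprint_wrappers=1:nokey=1 should leave just the recognized text
on stdout. The function name ocr_print is hypothetical, and the image path is
assumed to contain no characters that are special inside a filtergraph.

```shell
# Hypothetical helper (the name is not part of FFmpeg): print only the
# recognized text for one image.
# Assumption: "$1" has no filtergraph-special characters such as : , ; [ ] '
ocr_print() {
    # -v error                 hide the banner and informational messages
    # noprint_wrappers=1       drop the [FRAME] / [/FRAME] lines
    # nokey=1                  print the tag value without its key
    ffprobe -v error \
        -show_entries frame_tags=lavfi.ocr.text \
        -of default=noprint_wrappers=1:nokey=1 \
        -f lavfi -i "movie=$1,ocr"
}
```

Usage would then be along the lines of: ocr_print img.png > ocr.txt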

Re: How to use ocr filter

nicolab
>ffprobe -show_entries frame_tags=lavfi.ocr.text -f lavfi -i "movie=img.png,ocr"

Thanks! I can now write the OCR text to a file:

ffprobe -show_entries frame_tags=lavfi.ocr.text -f lavfi -i "movie=img.png,ocr" > ocr.txt

https://www.ffmpeg.org/ffprobe.html
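
If ocr.txt was written with the default writer, as in the command above, it
still contains the frame wrappers and the tag key. A rough sketch for
stripping them afterwards, assuming the file has the shape

  [FRAME]
  TAG:lavfi.ocr.text=<text, possibly spanning several lines>
  [/FRAME]

The function name strip_ocr_wrappers is hypothetical, and the sketch assumes
lavfi.ocr.text is the only tag printed per frame.

```shell
# Hypothetical helper: read ffprobe default-writer output on stdin and
# print only the OCR text.
strip_ocr_wrappers() {
    awk '
        # Tolerate CRLF line endings (for example from a Windows console).
        { sub(/\r$/, "") }
        # End of a frame section: stop copying lines.
        /^\[\/FRAME\]$/ { in_text = 0; next }
        # Tag line: remove the key prefix and start copying.
        sub(/^TAG:lavfi\.ocr\.text=/, "") { in_text = 1; print; next }
        # Continuation lines of a multi-line OCR result.
        in_text { print }
    '
}
```

For example: strip_ocr_wrappers < ocr.txt > ocr_clean.txt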
