This mode lets you perform OCR (optical character recognition) to extract text from the scanned image and create a searchable PDF or OOXML (pptx/docx) file. If you select PDF as the file format, you can also set <Compact>.
1 | Select <PDF>, then press <OCR (Prioritize Speed)> or <OCR (Prioritize Precision)>.
<OCR (Prioritize Precision)> cannot be combined with the following settings: <Trace & Smooth>; <Limited Color> (C7700 Series / C5700 Series only).
If you set both <OCR (Prioritize Precision)> and <Compact>, the <Image Quality Level for Limited Color/Compact> or <Image Quality Level for Compact> setting is disabled. See <Image Quality Level for Limited Color/Compact>/<Image Quality Level for Compact>.
If you create a PDF file with both <OCR (Prioritize Precision)> and <Compact> set, the image quality may differ from a PDF file created with <OCR (Prioritize Speed)> and <Compact> set.
To change the language used for OCR, press <OCR Language>, select a language, then press <OK>.
Only European languages can be detected with <OCR (Prioritize Precision)>. See Settings and Languages for OCR Processing.
1 | Select <OOXML>, then <Word>. To change the language used for OCR, press <Change>, select a language or language group, then press <OK>. Choose the language or language group that matches the language of the scanned documents.
1 | Select <OOXML>, then <PowerPoint>, then <OCR (Text Searchable)>.
2 | Select a language to use for OCR, then press <OK>.
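The PDF restrictions described in the first procedure above can be sketched as a small settings check. This is only an illustration: `PdfScanSettings`, `conflicts`, and `image_quality_level_enabled` are hypothetical names and do not correspond to any interface on the machine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PdfScanSettings:
    """Hypothetical model of the PDF options named in the text."""
    ocr_mode: Optional[str] = None   # "speed", "precision", or None for no OCR
    compact: bool = False
    trace_and_smooth: bool = False
    limited_color: bool = False      # the text ties this case to C7700/C5700 Series


def conflicts(settings: PdfScanSettings) -> List[str]:
    """Return the combinations the text says cannot be used together."""
    found = []
    if settings.ocr_mode == "precision":
        if settings.trace_and_smooth:
            found.append("<OCR (Prioritize Precision)> with <Trace & Smooth>")
        if settings.limited_color:
            found.append("<OCR (Prioritize Precision)> with <Limited Color>")
    return found


def image_quality_level_enabled(settings: PdfScanSettings) -> bool:
    """The image quality level setting is disabled for Precision plus Compact."""
    return not (settings.ocr_mode == "precision" and settings.compact)
```

For example, `conflicts(PdfScanSettings(ocr_mode="speed", trace_and_smooth=True))` returns an empty list, because the text restricts only <OCR (Prioritize Precision)>.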
Long strip originals (17" (432 mm) or longer) cannot be used with <OCR (Text Searchable)>.
If you select <PDF (OCR)> or <OOXML (OCR)> as the file format, and <Smart Scan> is set to <On> in <OCR (Text Searchable) Settings>/<OCR (Prioritize Speed)>, the orientation of the original is detected, and the document is automatically rotated if necessary before it is sent. See <OCR (Text Searchable) Settings>.
If you select <OCR (Text Searchable)>, you can only send at a zoom ratio of <Direct>/<1:1> or <Auto>.
If you select <PDF> as the file format, you can set <Compact> and <OCR (Text Searchable)> at the same time. In that case, <PDF (Compact)> is displayed as the file format on the Scan and Send Basic Features screen.
If you select <Word> for <OOXML>, you can choose to delete the scanned background images. This generates Word files that are easy to edit and free of unwanted images. See <Include Background Images in Word File>.
If you are currently using the <Scan and Store> function, the OCR language can only be specified when <Word> is selected for <OOXML> or <OCR (Prioritize Precision)> is selected for <PDF>.
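The size and zoom restrictions in the notes above can be expressed as a simple pre-send check. The sketch below is illustrative only: the function name, the constants, and the idea of passing the original length in millimetres are assumptions, not part of the machine's software.

```python
from typing import List

# Values taken from the notes above; the names themselves are hypothetical.
LONG_STRIP_MM = 432                      # originals this long or longer are long strips
ALLOWED_ZOOM = {"Direct", "1:1", "Auto"}  # <Direct>/<1:1> or <Auto>


def searchable_ocr_issues(original_length_mm: float, zoom: str) -> List[str]:
    """List the reasons <OCR (Text Searchable)> could not be used for a job."""
    issues = []
    if original_length_mm >= LONG_STRIP_MM:
        issues.append("long strip original (432 mm or longer)")
    if zoom not in ALLOWED_ZOOM:
        issues.append("zoom ratio must be <Direct>/<1:1> or <Auto>")
    return issues
```

An empty list means that neither restriction applies; for instance, an original 297 mm long sent at `"Auto"` yields no issues under this sketch.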
Item | Details |
Language Settings for Character Recognition | When a language is specified with OCR selected in <File Format>: characters are recognized based on the language you select for each file format. When no language is specified with OCR selected in <File Format>: characters are recognized based on the language selected in <Switch Language/Keyboard>.*1
Recognizable Asian Languages*2 | Japanese, Chinese (Simplified), Chinese (Traditional), Korean. See Recognizable Character Types and Fonts (Asian Languages).
Recognizable European Languages and Language Groups | Languages: English, French, Italian, German, Spanish, Dutch, Portuguese, Albanian, Catalan, Danish, Finnish, Icelandic, Norwegian, Swedish, Croatian, Czech, Hungarian, Polish, Slovak, Estonian, Latvian, Lithuanian, Russian, Greek, Turkish, Slovenian*3, Romanian*3, Bulgarian*3, Hebrew*3. Language groups: Western European (ISO)*4, Central European (ISO)*5, Baltic (ISO)*6. See Recognizable Character Types and Fonts (European Languages).
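The language selection rule in the table above amounts to a simple fallback. The following sketch assumes hypothetical function and parameter names, and it does not model the footnoted exceptions (*1 to *6), whose text is not reproduced here.

```python
from typing import Optional

# Asian languages listed in the table above.
ASIAN_LANGUAGES = {
    "Japanese",
    "Chinese (Simplified)",
    "Chinese (Traditional)",
    "Korean",
}


def ocr_language(format_language: Optional[str], display_language: str) -> str:
    """Pick the language used for character recognition.

    format_language: language chosen for the file format, or None if not specified.
    display_language: language selected in <Switch Language/Keyboard>.
    """
    if format_language is not None:
        return format_language
    return display_language


def is_asian_language(language: str) -> bool:
    """True if the language is one of the recognizable Asian languages."""
    return language in ASIAN_LANGUAGES
```

With this sketch, `ocr_language(None, "German")` falls back to the display language, while `ocr_language("Japanese", "English")` uses the language set for the file format.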
Recognizable Character Types and Fonts (Asian Languages)
Item | Details
Recognizable Character Types | Japanese: alphanumeric characters, kana characters, kanji characters (JIS first level and some of the JIS second level), symbols. Chinese (Simplified): alphanumeric characters, Chinese characters, symbols (GB2312-80). Chinese (Traditional): alphanumeric characters, Chinese characters, symbols (Big5). Korean: alphanumeric characters, Chinese characters, Hangul characters, symbols (KSC5601).
Recognizable Fonts | Multiple fonts are supported. (Ming-cho type is recommended.) Italicized characters cannot be recognized. |
Fonts Used for Converted Characters (Only when Word is selected as the file format) | Japanese: MS Mincho for Asian characters, Century for European characters. Chinese (Simplified): SimSun for Asian characters, Calibri for European characters. Chinese (Traditional): PMingLiU for Asian characters, Calibri for European characters.
Recognizable Character Types and Fonts (European Languages)
Item | Details
Recognizable Character Types | Alphanumeric characters, Special characters of the recognized language*1, Symbols |
Recognizable Fonts | Multiple fonts are supported. (Times, Century, and Arial are recommended.)*2 Italicized characters can be recognized. |
Fonts Used for Converted Characters (Only when Word is selected as the file format) | Calibri. Italic style is not reproduced.
Item | Details |
Original Format | Printed documents, Word processor documents (documents consisting of text, graphics, photographs, or tables, and with no character slant) |
Text Format | Horizontal and vertical writing (documents containing both horizontal and vertical writing can also be recognized). For European languages and Korean text, only horizontal writing can be recognized. Documents with one to three columns and no complex column settings.
Character Size | 8 to 40 point |
Table Format (For Word Format Only) | Tables that meet all of the following conditions: they consist of squares divided by solid lines; they have up to 32 columns; they have up to 32 rows.
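The numeric limits in the table above (character size, column count, table dimensions, and the horizontal-only rule) can be collected into a rough suitability check. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and treating every language outside the Asian list as European is a simplification made for the example.

```python
from typing import List, Optional, Tuple

ASIAN_LANGUAGES = {"Japanese", "Chinese (Simplified)", "Chinese (Traditional)", "Korean"}


def suitability_issues(
    language: str,
    char_size_pt: float,
    text_columns: int,
    vertical_writing: bool = False,
    table_shape: Optional[Tuple[int, int]] = None,  # (columns, rows), Word format only
) -> List[str]:
    """List the ways an original falls outside the limits in the table."""
    issues = []
    if not 8 <= char_size_pt <= 40:
        issues.append("character size outside 8 to 40 point")
    if not 1 <= text_columns <= 3:
        issues.append("more than three text columns")
    # Assumption: any language not in the Asian list is handled as European.
    horizontal_only = language == "Korean" or language not in ASIAN_LANGUAGES
    if vertical_writing and horizontal_only:
        issues.append("vertical writing is not recognized for this language")
    if table_shape is not None:
        columns, rows = table_shape
        if columns > 32 or rows > 32:
            issues.append("table exceeds 32 columns or 32 rows")
    return issues
```

An empty result only means the original is within these stated limits; it does not guarantee accurate recognition.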
Some originals suitable for OCR processing may not be processed properly.
High accuracy may not be achieved with originals that include a large amount of text on each page.
Characters may be replaced with unintended characters or be missing due to the background color of the original, the form and size of the characters, or slanted characters.*
Paragraphs, line breaks, or tables may not be reproduced.*
Some parts of illustrations, photographs, or seal impressions may be recognized as characters and replaced with characters.*
* When Word is selected as the file format.