IMPORTANT
|
Even if you perform OCR in the language used in the originals, the text may not be recognized correctly, depending on the text and file format of the originals.
|
Item
|
File Format
|
|
PDF/XPS/PowerPoint
|
Word
|
|
Recognition Language
|
Characters are recognized as one of the following languages or language groups, according to the language selected in [Language/Keyboard Switch] in [Preferences] (Settings/Registration).*1 *2
|
Press [Change] to select the language used in the originals from the following languages or language groups. Characters are recognized according to the selected language.
|
Asian Languages
|
Text in the following languages is recognized:
Japanese, Chinese (Simplified), Chinese (Traditional), Korean
|
|
European Languages
|
Text in the following languages or language groups is recognized:
Languages
English, French, Italian, German, Spanish, Dutch, Portuguese, Albanian, Catalan, Danish, Finnish, Icelandic, Norwegian, Swedish, Croatian, Czech, Hungarian, Polish, Slovak, Estonian, Latvian, Lithuanian, Russian, Greek, Turkish
Language Groups
Western European (ISO), Central European (ISO), Baltic (ISO) *3
|
|
Character Recognition for Asian Languages
|
||
Recognition Character Type
|
Japanese: Alphanumeric characters, Kana characters, Kanji characters (JIS first level, JIS second level (partly)), Symbols
Chinese (Simplified): Alphanumeric characters, Chinese characters, Symbols (GB2312-80)
Chinese (Traditional): Alphanumeric characters, Chinese characters, Symbols (Big5)
Korean: Alphanumeric characters, Kanji characters, Korean Hangul characters, Symbols (KSC5601)
|
|
Recognition Font
|
Multi font supported (Ming-cho type is recommended)
Italic type cannot be recognized
|
|
Converted Font
|
-
|
When Japanese is selected:
Asian text: MS Mincho
European text: Century
When Chinese (Simplified) is selected:
Asian text: SimSun
European text: Calibri
When Chinese (Traditional) is selected:
Asian text: PMingLiU
European text: Calibri
|
Character Recognition for European Languages
|
||
Recognition Character Type
|
Alphanumeric characters, Special characters of the recognized language*4, Symbols
|
|
Recognition Font
|
Multi font supported (Times, Century, and Arial are recommended)
Italic type can be recognized
|
|
Converted Font
|
-
|
Displayed in Calibri
Italic type cannot be converted
|
Western European (ISO):
|
English, French, Italian, German, Spanish, Dutch, Portuguese, Albanian, Catalan, Danish, Finnish, Icelandic, Norwegian, Swedish
|
Central European (ISO):
|
Croatian, Czech, Hungarian, Polish, Slovak
|
Baltic (ISO):
|
Estonian, Latvian, Lithuanian
|
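The language groups listed above can be read as a simple lookup. The following Python sketch is illustrative only and is not part of the machine's software; the names `LANGUAGE_GROUPS` and `group_for` are hypothetical. It returns `None` for a language that appears in none of the three groups, such as Russian, Greek, or Turkish.

```python
# Language groups exactly as listed in the table above.
LANGUAGE_GROUPS = {
    "Western European (ISO)": [
        "English", "French", "Italian", "German", "Spanish", "Dutch",
        "Portuguese", "Albanian", "Catalan", "Danish", "Finnish",
        "Icelandic", "Norwegian", "Swedish",
    ],
    "Central European (ISO)": [
        "Croatian", "Czech", "Hungarian", "Polish", "Slovak",
    ],
    "Baltic (ISO)": [
        "Estonian", "Latvian", "Lithuanian",
    ],
}


def group_for(language):
    """Return the name of the group that lists `language`, or None if no group does."""
    for group, languages in LANGUAGE_GROUPS.items():
        if language in languages:
            return group
    return None


print(group_for("Polish"))   # a Central European language
print(group_for("Russian"))  # not listed in any group
```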
IMPORTANT
|
If you use originals that contain a large amount of text per page, OCR may not be performed properly.
When you select Word format, OCR may not be performed properly even if you use originals in the recommended file format.
Depending on the background colour, character style, character size, and character slant, some characters may be misrecognized or missing in the OCR result.
Paragraphs, breaks, and tables in the original may not be recognized.
Part of an image, such as a graphic, photo, or seal imprint, may be recognized and replaced with text.
|
Item
|
Details
|
Format of Original
|
Printed documents, Text documents (documents that consist of text, figures, images, and tables, with no slanted characters)
|
Format of Text
|
Horizontal writing, Vertical writing
Documents which contain both horizontal and vertical writing can be recognized.
Only horizontal writing can be recognized for European languages and Korean.
Documents without complex columns
|
Character Size
|
8 to 40 point
|
Format of Table
(only for Word documents)
|
Tables that meet the following conditions
Square tables with solid lines
The number of rows is 32 or below
The number of columns is 32 or below
|
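The numeric conditions in the table above can be checked before scanning. The following is a minimal Python sketch, assuming you already know the character size and table dimensions of the original; the function name `check_original` and its parameters are hypothetical and are not a feature of the machine.

```python
# Limits taken from the table above.
MIN_POINT_SIZE = 8
MAX_POINT_SIZE = 40
MAX_TABLE_ROWS = 32
MAX_TABLE_COLUMNS = 32


def check_original(point_size, table_rows=0, table_columns=0):
    """Return the list of limits the original exceeds; an empty list means none."""
    problems = []
    if not MIN_POINT_SIZE <= point_size <= MAX_POINT_SIZE:
        problems.append("character size outside 8 to 40 point")
    if table_rows > MAX_TABLE_ROWS:
        problems.append("table has more than 32 rows")
    if table_columns > MAX_TABLE_COLUMNS:
        problems.append("table has more than 32 columns")
    return problems


print(check_original(12, table_rows=10, table_columns=4))  # within all limits
print(check_original(6, table_rows=40, table_columns=4))   # size and row count exceed limits
```

Passing these checks does not guarantee a correct OCR result; the cautions listed under IMPORTANT still apply.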