Text Is Not OCR Processed Correctly

Text may not be OCR processed correctly when creating a text-searchable file. In this case, check whether the language setting for OCR processing and the original being used are appropriate.
Change the language setting to match the original. You can also improve processing accuracy by using an original that is suitable for OCR processing and whose character types and fonts the machine can recognize.
IMPORTANT
When OCR Processing Is Not Performed Correctly Even with an Appropriate Language Setting and Original
Processing accuracy may not improve for originals with a large volume of text per page.
Note the following when creating a Word-format OOXML file:
Text may be replaced with unintended characters, or characters may be missing, depending on the background color, font, font size, italicization, and other factors.
Paragraphs, line breaks, and tables are not reproduced.
Some images such as diagrams, photos, and seals may be recognized as text and replaced with text.

Settings and Languages Standard for OCR Processing

Standard language settings for character recognition
The languages selected when setting OCR processing are the basis for character recognition. See "Creating a Text-Searchable File (OCR Processing)".
Recognizable Asian Languages
Japanese, Chinese (Simplified), Chinese (Traditional), Korean
* For the character types and fonts, see "Recognizable Character Types and Fonts (Asian Languages)".
Recognizable European Languages and Language Groups
Languages
English, French, Italian, German, Spanish, Dutch, Portuguese, Albanian, Catalan, Danish, Finnish, Icelandic, Norwegian, Swedish, Croatian, Czech, Hungarian, Polish, Slovak, Estonian, Latvian, Lithuanian, Russian, Greek, Turkish, Slovenian*1, Romanian*1, Bulgarian*1, Hebrew*1
Language Groups
Western European (ISO)*2, Central European (ISO)*3, Baltic (ISO)*4
* For the character types and fonts, see "Recognizable Character Types and Fonts (European Languages)".
*1 Can only be selected with [OCR (European Languages)].
*2 Includes English, French, Italian, German, Spanish, Dutch, Portuguese, Albanian, Catalan, Danish, Finnish, Icelandic, Norwegian, and Swedish.
*3 Includes English, Croatian, Czech, Hungarian, Polish, and Slovak.
*4 Includes English, Estonian, Latvian, and Lithuanian.
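
When an original mixes several languages, it can help to see which language group, if any, covers all of them. The following Python sketch restates the group membership from the notes above as a lookup table. The names LANGUAGE_GROUPS, EUROPEAN_OCR_ONLY, and groups_covering are hypothetical and used only for illustration; they are not part of the machine's software or any API.

```python
# Group membership copied from the notes above (illustrative data only).
LANGUAGE_GROUPS = {
    "Western European (ISO)": [
        "English", "French", "Italian", "German", "Spanish", "Dutch",
        "Portuguese", "Albanian", "Catalan", "Danish", "Finnish",
        "Icelandic", "Norwegian", "Swedish",
    ],
    "Central European (ISO)": [
        "English", "Croatian", "Czech", "Hungarian", "Polish", "Slovak",
    ],
    "Baltic (ISO)": ["English", "Estonian", "Latvian", "Lithuanian"],
}

# Languages that can only be selected with [OCR (European Languages)].
EUROPEAN_OCR_ONLY = {"Slovenian", "Romanian", "Bulgarian", "Hebrew"}


def groups_covering(languages):
    """Return the names of the groups that include every given language."""
    wanted = set(languages)
    return [
        name
        for name, members in LANGUAGE_GROUPS.items()
        if wanted <= set(members)
    ]


if __name__ == "__main__":
    print(groups_covering(["French", "German"]))
    print(groups_covering(["Polish", "Latvian"]))
```

An empty result means that no single group covers the combination. Languages such as Russian, Greek, and Turkish belong to no group and must be selected individually.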

Recognizable Character Types and Fonts (Asian Languages)

Recognizable character types
Japanese
Alphabet, numbers, kanji*1, symbols, hiragana, and katakana
Chinese (Simplified)
GB2312-80 (alphabet, numbers, kanji, and symbols)
Chinese (Traditional)
Big5 (alphabet, numbers, kanji, and symbols)
Korean
KSC5601 (alphabet, numbers, kanji, symbols, and Hangul)
Recognizable fonts
Multi-font support (Recommended: Mincho)
* Italicized characters cannot be recognized.
Recognizable font sizes
8 pt. to 48 pt.
Fonts used after OCR processing*2
Japanese
Asian characters: MS Mincho
European characters: Century
Chinese (Simplified)
Asian characters: SimSun
European characters: Calibri
Chinese (Traditional)
Asian characters: PMingLiU
European characters: Calibri
Korean
Asian characters: Malgun Gothic
European characters: Calibri
*1 All JIS 1 standard kanji and some JIS 2 standard kanji
*2 Only when creating a Word-format OOXML file

Recognizable Character Types and Fonts (European Languages)

Recognizable character types
Alphabet, characters unique to the recognition language*1, numbers, symbols
Recognizable fonts
Multi-font support (Recommended: Times, Century, Arial)*2
* Italicized characters can be recognized.
Recognizable font sizes
6 pt. to 72 pt.
Fonts used after OCR processing*3
Calibri
* Italics cannot be reproduced.
*1 Depending on the language, some unique characters may not be recognized.
*2 Arial, Times New Roman, and Courier New fonts can be recognized with [OCR (European Languages)].
*3 Only when creating a Word-format OOXML file

Originals Suitable for OCR Processing

You can improve the OCR processing accuracy by using an original suitable for OCR processing.
File format of original
Printed documents and word processing documents
Originals that are composed of text, diagrams, photos, and/or tables and that are not slanted
Text format
Horizontal or vertical writing (Documents with both horizontal and vertical writing can also be recognized)*1
Documents with one to three columns and a simple layout
Font size
8 pt. to 40 pt.
Table format*2
Tables that meet the following conditions:
Rectangular format consisting of solid border lines
32 columns or less
32 rows or less
*1 Only horizontal writing is recognizable for European languages and Korean.
*2 Only when creating a Word-format OOXML file
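
The numeric conditions above can be checked before scanning. The following Python sketch is a minimal illustration that assumes you already know the font size, column count, and table dimensions of the original. The Original class and the suitability_issues function are hypothetical names introduced here, not features of the machine.

```python
from dataclasses import dataclass


@dataclass
class Original:
    """Hypothetical description of an original; not a machine setting."""
    font_size_pt: float
    text_columns: int
    table_rows: int = 0      # 0 means the original has no table
    table_columns: int = 0
    slanted: bool = False


def suitability_issues(original):
    """List the conditions from this section that the original does not meet."""
    issues = []
    if original.slanted:
        issues.append("original is slanted")
    if not 8 <= original.font_size_pt <= 40:
        issues.append("font size is outside 8 pt. to 40 pt.")
    if not 1 <= original.text_columns <= 3:
        issues.append("text is not laid out in one to three columns")
    if original.table_rows > 32:
        issues.append("table has more than 32 rows")
    if original.table_columns > 32:
        issues.append("table has more than 32 columns")
    return issues


if __name__ == "__main__":
    for issue in suitability_issues(Original(font_size_pt=6, text_columns=4)):
        print(issue)
```

An empty list means that the original meets the numeric conditions listed here. It does not guarantee correct recognition, which also depends on the font, background color, and layout complexity.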