Improvements shared by conventional OCR libraries and deep learning OCR libraries
- Recognition of characters in contact with ruled lines: Characters that touch ruled lines can now be recognized.
- Automatic italic detection, automatic vertical/horizontal writing detection, automatic orientation detection: Italic recognition was already available, but the application had to specify the angle of the italic text. With automatic italic detection, the library recognizes the original text and several italic variants, compares the certainty of each recognition, and selects the result with the highest certainty.
In the table below, characters touch ruled lines in more than 100 places, on the top, bottom, left, and right sides.
Let's enlarge part of it.
Let's enlarge that part even further.
With this improvement, ruled lines and characters are easier to separate and recognize automatically. The green parts are the characters.
Let's enlarge part of the recognition results.
Similarly, the writing direction (vertical or horizontal) and the orientation could previously be specified by the application. With automatic detection, the library decides by comparing the certainty of recognition across the necessary combinations of writing direction (vertical, horizontal) and orientation (90°, 180°, 270° rotation).
Note, however, that when the certainty of recognition on the original is low, automatic italic detection recognizes the text at several italic angles, which takes approximately 1 to 3 times the normal recognition time. When only part of the text is italic, the recognition time stays mostly the same.
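The fallback described above can be sketched as follows. This is a minimal illustration, not the library's API: the `recognize` callable, the italic angles, and the certainty threshold are all hypothetical stand-ins chosen for the example.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical recognizer: takes an image and an italic angle in degrees and
# returns (text, certainty in [0, 1]). This is NOT the library's real API.
Recognizer = Callable[[object, float], Tuple[str, float]]


def recognize_with_italic_fallback(
    image: object,
    recognize: Recognizer,
    italic_angles: Sequence[float] = (10.0, 20.0),  # assumed angles
    threshold: float = 0.8,                         # assumed certainty cutoff
) -> Tuple[str, float, int]:
    """Return (text, certainty, number_of_recognition_passes)."""
    # First pass: recognize the original, non-italic text.
    best_text, best_certainty = recognize(image, 0.0)
    passes = 1
    if best_certainty >= threshold:
        return best_text, best_certainty, passes  # no extra cost

    # Low certainty: try each italic angle and keep the most certain result.
    for angle in italic_angles:
        text, certainty = recognize(image, angle)
        passes += 1
        if certainty > best_certainty:
            best_text, best_certainty = text, certainty
    return best_text, best_certainty, passes


def fake_recognize(image: object, angle: float) -> Tuple[str, float]:
    """Toy recognizer whose certainty peaks at a 10-degree italic angle."""
    table = {0.0: ("l23", 0.40), 10.0: ("123", 0.95), 20.0: ("1Z3", 0.60)}
    return table[angle]


result = recognize_with_italic_fallback("image", fake_recognize)
print(result)
```

With two italic angles, the worst case runs three recognition passes, which matches the "1 to 3 times" cost range: one pass when the original is recognized with high certainty, three when every angle has to be tried.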
The same process is used for vertical/horizontal writing detection. The text is first recognized in one writing direction (horizontal or vertical), and if the certainty is low, it is recognized again in the other direction and the two certainties are compared.
In automatic orientation detection, the library first treats the lower side of the image (the left side for vertical writing) as the bottom (left) of the string. If the certainty is low, it rotates the image in 90-degree increments and compares the certainty at each rotation.
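A minimal sketch of the orientation fallback, assuming a hypothetical `recognize` callable and certainty threshold (neither is taken from the library). The image is modeled as a small grid so the rotation itself can be shown, and clockwise rotation is an assumption; the source does not state the direction.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]
# Hypothetical recognizer: returns (text, certainty in [0, 1]) for an image.
Recognizer = Callable[[Grid], Tuple[str, float]]


def rotate_90_clockwise(grid: Grid) -> Grid:
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def detect_orientation(
    image: Grid, recognize: Recognizer, threshold: float = 0.8
) -> Tuple[int, str, float]:
    """Return (rotation_in_degrees, text, certainty) of the best result."""
    # First pass: assume the image is already upright.
    text, certainty = recognize(image)
    best = (0, text, certainty)
    if certainty >= threshold:
        return best

    # Low certainty: rotate in 90-degree steps and compare certainties.
    rotated = image
    for degrees in (90, 180, 270):
        rotated = rotate_90_clockwise(rotated)
        text, certainty = recognize(rotated)
        if certainty > best[2]:
            best = (degrees, text, certainty)
    return best


UPRIGHT: Grid = [[1, 2], [3, 4]]


def fake_recognize(grid: Grid) -> Tuple[str, float]:
    """Toy recognizer that is only certain about the upright grid."""
    return ("AB", 0.9) if grid == UPRIGHT else ("??", 0.2)


# Build an input that needs one more clockwise turn to become upright.
sideways = UPRIGHT
for _ in range(3):
    sideways = rotate_90_clockwise(sideways)

orientation = detect_orientation(sideways, fake_recognize)
print(orientation)
```

The same try-then-fall-back pattern applies to vertical/horizontal writing detection, with the two writing directions in place of the four rotations.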
Note that combining automatic italic detection, automatic vertical/horizontal writing detection, and automatic orientation detection can take up to 24 times the normal recognition time: 3 (original, italic 1, italic 2) x 2 (horizontal writing, vertical writing) x 4 (0°, 90°, 180°, 270° rotation) = 24.
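The worst-case count can be checked by enumerating every combination. The labels below simply mirror the text and are illustrative only; they are not identifiers from the library.

```python
from itertools import product

# Candidate settings for each automatic detection feature.
italics = ("original", "italic 1", "italic 2")
directions = ("horizontal", "vertical")
rotations = (0, 90, 180, 270)

# Worst case: every combination is recognized once.
combinations = list(product(italics, directions, rotations))
print(len(combinations))  # 24
```

Because each detection only falls back to the other candidates when the certainty is low, this is an upper bound rather than the typical cost.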