Effects in Common with Other Deep Learning OCR Engines
- Improved accuracy through deep learning support: over 15,000 font patterns and over 200 GB of (grayscale) training data, with a model of 5.6 million parameters (an illustrative sketch of a model in this size range follows this list).
(Conventional OCR uses statistically compressed pattern data of 600 font patterns and 800 MB of monochrome images.)
- Improved accuracy through grayscale and color support.
(Conventional OCR supports only monochrome binary images; images are binarized before the library is invoked.)
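As a rough illustration of what a character classifier of the quoted size could look like, the sketch below builds a small Keras CNN for 48 × 48 grayscale character images with 5,438 output classes (the figures used in the benchmarks and specifications below). The architecture, layer sizes, and resulting parameter count are assumptions for illustration only; the actual network used by the library is not disclosed.

```python
# Illustrative only: the library's real architecture is not published.
# A small CNN over 48x48 grayscale character crops with 5,438 classes,
# landing in roughly the same parameter range as the quoted 5.6 million.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5438  # size of the recognized character set (see spec below)

model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),                      # 48x48 grayscale character image
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),      # one probability per character
])
model.summary()  # about 5.2 million trainable parameters with these choices
```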
What makes our deep learning support different from others
- High-speed recognition: 650 characters per second on a Core i7-9750H 2.59 GHz laptop, and 1,300 characters per second on a Core i9-7900X 3.3 GHz desktop PC.
- Training on patterns that could not be recognized completes instantly (within 1 ms), and the result is reflected immediately in recognition output.
- Existing assets such as user pattern dictionaries and user language dictionaries registered with the conventional OCR library can be used as is.
- Multi-threading raises recognition speed roughly 3 to 4 times (over 2,000 characters per second) on a laptop, and to approximately 13,000 characters per second (about 10 times faster) on a desktop PC.
- Multiple recognition processes can run simultaneously: 4 to 8 processes on a laptop and 12 to 16 processes on a desktop PC with no decrease in speed.
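As a rough sketch of the multi-process usage pattern described above, the snippet below spreads page images across several worker processes. The recognize_page function is a hypothetical placeholder for however the OCR library is invoked on one page; it is not the library's actual API.

```python
# Rough illustration of running several recognition processes in parallel.
# recognize_page is a hypothetical stand-in, not the library's real interface.
from concurrent.futures import ProcessPoolExecutor

def recognize_page(image_path):
    # placeholder: call the OCR library on a single page image here
    return image_path, "recognized text"

if __name__ == "__main__":
    pages = [f"page_{i:03d}.png" for i in range(8)]  # example file names
    # 4-8 worker processes on a laptop, 12-16 on a desktop per the figures above
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(recognize_page, pages))
```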
Comparison: Recognition speed with Python + TensorFlow (GPU enabled)
With python + TensorFlow (GPU enabled), the speed is 350 characters per second (on a laptop with an Intel Core i7-9750H 2.59 GHz and an NVIDIA GeForce RTX 2060, 32-bit execution). This includes paragraph extraction and line extraction processing.
| Condition (python + TensorFlow, GPU enabled) | Speed |
|---|---|
| Inference during training: character images provided as 48 × 48 pixels from the start, mini-batch size 1024 | 7,000 characters per second (multi-threading and multi-processing not available) |
| Inference called one character at a time: character images provided as 48 × 48 pixels from the start, mini-batch size 1 | 700 characters per second (multi-threading and multi-processing not available) |
| Inference called one character at a time, under almost the same conditions as our library (without language processing): paragraph extraction / line extraction / character extraction / normalization to 48 × 48 pixels, mini-batch size 1 | 350 characters per second (multi-threading and multi-processing not available) |
| Inference called one line (average 17 characters) at a time (without language processing): paragraph extraction / line extraction / character extraction / normalization to 48 × 48 pixels, mini-batch size 17 | 1,200 characters per second (multi-threading and multi-processing not available) |
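The gap between the per-character and per-line rows above comes down to the mini-batch size handed to each inference call. The sketch below contrasts the two invocation patterns, assuming a hypothetical Keras character classifier over 48 × 48 grayscale crops; the model file and function names are placeholders, not the benchmark's actual code.

```python
# Minimal sketch: mini-batch size 1 per character vs. one batched call per line.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("char_classifier.keras")  # hypothetical model file

def recognize_per_character(char_images):
    """Mini-batch size 1: one predict() call per character crop (slow)."""
    return [int(np.argmax(model.predict(img[None, ..., None], verbose=0)))
            for img in char_images]

def recognize_per_line(char_images):
    """Mini-batch size = line length (~17): one predict() call per line (faster)."""
    batch = np.stack(char_images)[..., None]            # shape (n_chars, 48, 48, 1)
    return list(np.argmax(model.predict(batch, verbose=0), axis=1))
```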
The speed of our deep learning compatible OCR library (C++ version) is as follows. Conditions: the training results from python + TensorFlow (GPU enabled) are used; the library operates one character at a time; paragraph extraction, line extraction, character extraction, normalization to 48 × 48 pixels, and language processing are included.

- Single-threaded: 650 characters per second (32-bit version, laptop conditions); 1,300 characters per second or more (64-bit version, desktop conditions).
- With multi-threading, speed can be raised to roughly 2,000 to 10,000 characters per second.
- 2 threads: about 1.8 times faster, i.e. 1,200 characters per second (laptop conditions), roughly 3 times the recognition speed of python + TensorFlow (GPU enabled).
- 4 threads: about 3.5 times faster, i.e. 2,200 characters per second (laptop conditions), more than 6 times the recognition speed of python + TensorFlow (GPU enabled). (The reason speed does not reach 4 times with 4 threads appears to be that some threads cannot make effective use of the CPU cache.)
- Multi-processing is possible: 4 or more processes (laptop) and 8 or more processes (desktop PC) can run simultaneously with no decrease in speed.
Approach: Compatibility with conventional OCR libraries
- Support for existing library users even in 32-bit environments: the library is provided in both 32-bit and 64-bit versions. A lightweight deep learning model ensures compatibility with the 32-bit applications commonly used by existing users; the 64-bit version has been confirmed to run even faster.
- Operation without a GPU, and the need for multi-threaded operation: the inference portion is written in C++, and parallelization through multi-threading was chosen instead of relying on a GPU. As a result, even single-threaded operation is faster than Python/TensorFlow/Keras with a GPU.
- Improved accuracy through deep learning: unlike the conventional OCR library, which targeted binary images, grayscale and color document images are recognized directly, eliminating the blurring and distortion introduced by binarization.
- Inference-based language processing (scheduled for release in the first quarter of 2023): the conventional OCR library used statistical language processing based on 3-grams, i.e. frequency information for sequences of three consecutive characters. The inference-based AI language processing instead predicts a character from a total of 6 context characters, the 3 before and the 3 after.
- Inheritance of assets from the conventional OCR library: recognition can give priority to registered patterns, and specialized terminology dictionaries for language processing (in the old format) can be used as is. Using specialized dictionaries for drawing terms, annotations (prefecture, city, town, and village names), and personal names makes language processing more accurate.
Note that a Python version (using numpy, GPU enabled) of the deep learning OCR has also been tested, but it is not publicly released and is used only for performance and functionality comparisons.
Operating Environment

| Conventional OCR | Deep Learning OCR | Python Version of Deep Learning OCR |
|---|---|---|
| 32-bit/64-bit | 32-bit/64-bit | 64-bit |
| Parallel operation possible with multi-threading | Parallel operation possible with multi-threading | Multi-threaded operation not possible |
| GPU not required | GPU not required | GPU required (very slow without one) |
| C++ implementation | C++ implementation | Python/TensorFlow/Keras implementation |
The number of recognized characters has been increased from 4,000 to 5,438. The number of font patterns per character has also increased more than 100-fold, from 300 monochrome binary image patterns in the conventional OCR to more than 35,000 grayscale images (over 200 GB in total).
The total size of the font images used for training has grown from about 150 MB in the conventional OCR to about 200 GB, due to the larger character set, the larger number of font patterns per character, and the move to grayscale.
In addition to the increase in data volume, the use of deep learning has improved the accuracy of the recognition algorithm itself.
In short: over 1,400 additional recognized characters, a two-orders-of-magnitude increase in font patterns, and a switch to grayscale training images.
The 3-gram dictionary is created by counting all combinations of 3 consecutive characters in a Japanese corpus of roughly 300 MB (comprising a Japanese corpus of around 30 MB, a personal-name dictionary, a national corporate dictionary based on the My Number database, a Japanese address database based on the JP database, the complete list of headwords in the Kojien dictionary, the full text of 100 Shincho Bunko volumes, and the complete list of entries in Japanese Wikipedia, among others).
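As a minimal sketch of how such a 3-gram (three-consecutive-character) frequency dictionary can be counted from a corpus file; the file name and in-memory format here are assumptions, not the library's actual dictionary format.

```python
# Count every sequence of 3 consecutive characters in a corpus file.
from collections import Counter

def build_trigram_counts(path):
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            text = line.strip()
            for i in range(len(text) - 2):   # all runs of 3 consecutive characters
                counts[text[i:i + 3]] += 1
    return counts

trigrams = build_trigram_counts("japanese_corpus.txt")  # hypothetical corpus file
print(trigrams.most_common(10))                         # most frequent 3-grams
```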
The inference-based AI dictionary adapts a neural network that predicts a central word from its surrounding words so that it instead predicts a central character from the surrounding Japanese characters.
The inference dictionary used for prediction is trained on a corpus totaling 3.7 GB, combining the corpus used for the conventional OCR with the main text of Japanese Wikipedia, which alone exceeds 3 GB.
The inference-based method is inherently more accurate, and with a corpus more than 10 times larger it achieves even higher accuracy in language processing.
The source corpus of the language dictionary is over 12 times larger, and the approach has moved from count-based to AI inference-based.
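The sketch below illustrates the "predict the central character from the 3 characters before and after" idea as a CBOW-style model adapted from words to characters. The vocabulary size, embedding dimension, and layer choices are illustrative assumptions; the actual model used by the library is not disclosed.

```python
# CBOW-style sketch: 6 context character IDs in, probability of the center character out.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB = 5438       # character vocabulary (matching the recognized character set)
CONTEXT = 6        # 3 characters before + 3 characters after the center position
EMBED_DIM = 128    # illustrative embedding size

model = models.Sequential([
    layers.Input(shape=(CONTEXT,), dtype="int32"),  # context character IDs
    layers.Embedding(VOCAB, EMBED_DIM),
    layers.GlobalAveragePooling1D(),                # average the 6 context embeddings
    layers.Dense(VOCAB, activation="softmax"),      # distribution over the central character
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```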
Function and Performance Comparison

| | Conventional OCR | Deep Learning OCR (without GPU, 5.6 million parameters) | Deep Learning OCR (with GPU, 5.6 million parameters) |
|---|---|---|---|
| Overview | OCR based on a traditional method released in 2000, without GPU support (C/C++) | A mode that inherits assets from the conventional OCR while gaining the benefits of deep learning (C/C++, without GPU support) | A mode dedicated to speed comparison, using Python + TensorFlow + Keras with GPU support |
| Recognition accuracy | High quality: 99.0%~; low quality: 95.0%~ | High quality: 99.5%~ (half the misrecognition rate of conventional OCR); low quality: 98%~ (large effect on low-quality images); misrecognition further reduced by AI language processing; accuracy-priority or speed-priority operation selectable | High quality: 99.5%~ (half the misrecognition rate of conventional OCR); low quality: 98%~ (large effect on low-quality images) |
| Recognition speed (including paragraph, line, and character extraction) | 1,300 characters/second; 2 to 10 times faster with multi-threading (depending on the number of CPU cores); approximately 4 times faster than python + TensorFlow with GPU in single-threaded operation | 650 characters/second; 2 to 10 times faster with multi-threading (depending on the number of CPU cores); simultaneous multi-process execution possible, including of multi-threaded processes; approximately 2 times faster than python + TensorFlow with GPU in single-threaded operation | 350 characters/second; cannot be multi-threaded or run as multiple simultaneous processes |
| Registered pattern dictionary from conventional OCR | Referenced with priority | Referenced with priority | Not available |
| Language processing | 3-gram dictionary (co-occurrence frequency dictionary); specialized terminology dictionary | 3-gram dictionary (co-occurrence frequency dictionary); specialized terminology dictionary; AI dictionary | None |
| User-registered language dictionary from conventional OCR | Referenced with priority | Referenced with priority | Not available |
| Supported images | Monochrome binary (grayscale/color can be used by converting to monochrome binary outside the library) | Monochrome binary / grayscale / color | Grayscale only |