Nguyen Đình Cong
- Title = Real-time LoG-based operator for scene text detection
Abstract: In this thesis, a novel real-time Laplacian of Gaussian (RT-LoG) operator is proposed for scene text detection. The operator applies a two-step process for box selection within the spatial and spatial/scale-space domains and kernel decomposition with the box filtering method. Two levels of optimization are provided. The first level, within the spatial domain, is obtained by box mutualization. The second level, within the spatial/scale-space domains, is performed using a mixed method for box selection. The proposed RT-LoG operator emerges as the top operator for scene text detection, with a balanced trade-off between accuracy and processing time. It runs approximately three times faster than the brute-force operator while halving the latency at the same resolution level. We have embedded this operator into a new two-stage system for scene text detection. Within this system, a dedicated keypoint grouping method is proposed that uses the spatial/scale-space representation of the RT-LoG operator. The grouping is optimized through a scale-space partitioning strategy. The proposed grouping method is nearly scale- and contrast-invariant and supports a normalization process. A CNN is used in the final stage for text verification. The overall system is competitive with the most accurate systems in the literature while reducing the required processing resources by almost two orders of magnitude.
Keywords: Text detection, Laplacian of Gaussian (LoG), blobs, keypoints, Gaussian filtering, scale-space, stroke model, RT-LoG operator, real-time, predictability.
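
As a rough illustration of the box filtering idea mentioned in the abstract, the sketch below approximates a LoG-like blob response using only box filters computed from an integral image. It is not the RT-LoG operator: box selection, box mutualization, and the scale-space handling of the thesis are not reproduced here. The function names (`box_filter`, `approx_gaussian`, `approx_log`), the use of three box passes to approximate a Gaussian, and the use of a difference of Gaussians as a stand-in for the LoG are all assumptions made for this example.

```python
import numpy as np


def box_filter(img, radius):
    """Mean filter over a (2*radius+1)^2 window, computed from an integral image."""
    k = 2 * radius + 1
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    sat[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    # Four table lookups per pixel, independent of the window size.
    total = (sat[k:k + h, k:k + w] - sat[:h, k:k + w]
             - sat[k:k + h, :w] + sat[:h, :w])
    return total / (k * k)


def approx_gaussian(img, radius, passes=3):
    """Approximate Gaussian smoothing by repeated box filtering (assumption: 3 passes)."""
    out = img.astype(np.float64)
    for _ in range(passes):
        out = box_filter(out, radius)
    return out


def approx_log(img, r_small, r_large):
    """LoG-like response as a difference of two box-approximated Gaussians.

    Bright blobs on a dark background give negative responses, as with the LoG.
    """
    return approx_gaussian(img, r_large) - approx_gaussian(img, r_small)


# Hypothetical test image: a 5x5 bright square centred in a 41x41 dark image.
image = np.zeros((41, 41))
image[18:23, 18:23] = 1.0
response = approx_log(image, r_small=1, r_large=3)
blob_center = np.unravel_index(np.argmin(response), response.shape)
```

The cost of each `box_filter` call does not depend on the window radius, which is the property that makes box-based approximations attractive for real-time use; the actual speed-ups reported in the abstract come from the thesis's own optimizations, not from this sketch.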