ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
Baoguang Shi, Mingkun Yang, Xinggang Wang, Pengyuan Lyu, Cong Yao, and Xiang Bai (Huazhong University of Science and Technology)
PAMI 2018
Code: https://github.com/bgshih/aster Author homepage: http://cloud.eic.hust.edu.cn:8071/~xbai/
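A central challenge in scene text recognition is handling distorted or irregular text. In particular, perspective text and curved text are common in natural scenes and are difficult to recognize. ASTER consists of a rectification network and a recognition network. The rectification network adaptively transforms the input image into a new image in which the text is rectified. The recognition network is an attentional sequence-to-sequence model.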
Training requires only images and their groundtruth text. In addition, ASTER can enhance the performance of a text detector.
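The two-stage structure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that rectification is realized by resampling the input image at predicted source coordinates with bilinear interpolation, and the names `bilinear_sample`, `recognize_with_rectification`, `predict_grid`, and `recognize` are hypothetical stand-ins for the two networks.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]                # grayscale image as rows of pixel values
Grid = List[List[Tuple[float, float]]]   # one (x, y) source coordinate per output pixel


def bilinear_sample(image: Image, grid: Grid) -> Image:
    """Build the output image by reading the input at fractional (x, y) points."""
    h, w = len(image), len(image[0])
    out: Image = []
    for row in grid:
        out_row = []
        for x, y in row:
            # Clamp the sampling point to the image bounds.
            x = min(max(x, 0.0), w - 1.0)
            y = min(max(y, 0.0), h - 1.0)
            x0, y0 = int(x), int(y)
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            fx, fy = x - x0, y - y0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            out_row.append(top * (1 - fy) + bottom * fy)
        out.append(out_row)
    return out


def recognize_with_rectification(
    image: Image,
    predict_grid: Callable[[Image], Grid],   # hypothetical stand-in for the rectification network
    recognize: Callable[[Image], str],       # hypothetical stand-in for the recognition network
) -> str:
    """Rectify the input first, then run recognition on the rectified image."""
    rectified = bilinear_sample(image, predict_grid(image))
    return recognize(rectified)


def identity_grid(image: Image) -> Grid:
    """A grid that samples every pixel at its own location (no rectification)."""
    return [[(float(x), float(y)) for x in range(len(image[0]))]
            for y in range(len(image))]
```

Because the sampling step is built only from multiplications and additions of pixel values, a supervision signal on the recognized text can in principle flow back to whatever predicts the grid, which is consistent with training from images and groundtruth text alone.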
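Scene text detection and recognition are difficult because the large variations in background, appearance, and layout pose significant challenges. Typical cases of irregular text include oriented text, perspective text, and curved text. ASTER addresses the irregular text problem through an explicit rectification mechanism. The rectification network operates through a parameterized Thin-Plate Spline (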