Paper Reading and Analysis: Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition

Original post · first published 2023-03-13 21:30:00 · last modified 2023-03-24 23:11:49

License: CC 4.0 BY-SA

Tags: #paper-reading #deep-learning #artificial-intelligence

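This series of posts covers the task of handwritten mathematical expression recognition (HMER). The paper reviewed here combines DenseNet with a multi-scale attention model to improve the recognition of tiny symbols such as decimal points. The core model pairs a multi-scale dense encoder, which extracts features at two resolutions, with a GRU and a coverage-based attention mechanism that process those features. The experiments indicate that deeper dense blocks improve recognition performance and that the model compares favorably with other models.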

HMER paper series (posts on KPer_Yang's blog)
1. Paper reading and analysis: When Counting Meets HMER: Counting-Aware Network for HMER
2. Paper reading and analysis: Syntax-Aware Network for Handwritten Mathematical Expression Recognition
3. Paper reading and analysis: A Tree-Structured Decoder for Image-to-Markup Generation
4. Paper reading and analysis: Watch, attend and parse: An end-to-end neural network based approach to HMER
5. Paper reading and analysis: Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition
6. Paper reading and analysis: Mathematical formula recognition using graph grammar
7. Paper reading and analysis: Hybrid Mathematical Symbol Recognition using Support Vector Machines
8. Paper reading and analysis: HMM-BASED HANDWRITTEN SYMBOL RECOGNITION USING ON-LINE AND OFF-LINE FEATURES

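Main contributions:

1. Applies DenseNet, a popular architecture at the time, to the HMER task.

2. Uses a multi-scale attention model that fuses high-resolution, low-level-semantic features with low-resolution, high-level-semantic features, so that tiny symbols such as decimal points can be recognized.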

Core model implementation:

1. Multi-scale attention encoder:

[Figure: architecture of the multi-scale attention encoder; image not included]

Notation:

k: the growth rate of each dense block.

D: the depth of each dense block, i.e. the number of convolution layers it contains.

For example, D = 32 means that each block has 16 1 × 1 convolution layers and 16 3 × 3 convolution layers. Each convolution layer is followed by a batch normalization layer [24] and then a ReLU activation layer [25].
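The relationship between k, D, and the width of a block can be checked with a few lines of arithmetic. The sketch below assumes the standard DenseNet bottleneck design, in which each 1 × 1 / 3 × 3 pair concatenates k new feature maps onto its input. The function name and the sample values (48 input channels, k = 24) are illustrative assumptions, not figures taken from the text above.

```python
def dense_block_out_channels(in_channels, growth_rate, depth):
    """Channels leaving a bottleneck dense block (hypothetical helper).

    depth counts every convolution layer in the block, so the block stacks
    depth // 2 units, each made of one 1x1 and one 3x3 convolution.
    Assumption: each unit concatenates growth_rate new feature maps.
    """
    if depth % 2 != 0:
        raise ValueError("depth must be even: one 1x1 and one 3x3 conv per unit")
    units = depth // 2
    return in_channels + units * growth_rate


# D = 32 gives 16 units, i.e. 16 1x1 layers and 16 3x3 layers.
units = 32 // 2
# Illustrative values only: 48 input channels, growth rate k = 24.
out_channels = dense_block_out_channels(in_channels=48, growth_rate=24, depth=32)
print(units, out_channels)  # 16 432
```

Under these assumptions the block widens by 16 × 24 = 384 channels, which shows why raising D increases capacity and cost at the same time.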

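Procedure:

(1) Branch off from the standard DenseNet just before its last pooling layer and pass the branch through an additional dense block (DenseB) to obtain the high-resolution features $\mathbf{B} \in \mathbb{R}^{2H \times 2W \times C'}$. Skipping that one pooling step is why $\mathbf{B}$ keeps twice the height and width of the main-branch output $\mathbf{A}$.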

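(2) The GRU computes the hidden-state prediction $\hat{\mathbf{s}}_t$ at step $t$ from the previous output symbol $\mathbf{y}_{t-1}$ and the previous hidden state $\mathbf{s}_{t-1}$:
$\hat{\mathbf{s}}_t = \mathrm{GRU}\left(\mathbf{y}_{t-1}, \mathbf{s}_{t-1}\right)$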
(3) Apply a single-scale coverage-based attention model $f_{\mathrm{catt}}$ to $\mathbf{A}$ and to $\mathbf{B}$ separately to obtain the context vectors $\mathbf{cA}_t$ and $\mathbf{cB}_t$:
$\mathbf{cA}_t = f_{\mathrm{catt}}\left(\mathbf{A}, \hat{\mathbf{s}}_t\right)$
$\mathbf{cB}_t = f_{\mathrm{catt}}\left(\mathbf{B}, \hat{\mathbf{s}}_t\right)$
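To make step (3) more concrete, here is a minimal pure-Python sketch of one coverage-based attention step applied to a low-resolution grid and a high-resolution grid with the same $\hat{\mathbf{s}}_t$. It assumes the common additive scoring form with a coverage term, and it simplifies that term to the running sum of past attention weights per position, without the convolution that coverage models usually apply to the sum. Every function name, weight, and toy size is hypothetical, and $\hat{\mathbf{s}}_t$ is drawn at random rather than produced by a GRU.

```python
import math
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def coverage_attention(features, s_hat, coverage, W, U, u_f, v):
    """One attention step over a flattened feature grid (simplified f_catt).

    features: list of L feature vectors, one per grid position
    s_hat:    hidden-state prediction for the current step
    coverage: list of L floats, sum of past attention weights per position
    Returns (context vector, attention weights, updated coverage).
    """
    Ws = matvec(W, s_hat)
    scores = []
    for a_i, beta_i in zip(features, coverage):
        Ua = matvec(U, a_i)
        hidden = [math.tanh(ws + ua + uf * beta_i)
                  for ws, ua, uf in zip(Ws, Ua, u_f)]
        scores.append(sum(vj * hj for vj, hj in zip(v, hidden)))
    alpha = softmax(scores)
    dim = len(features[0])
    context = [sum(alpha[i] * features[i][c] for i in range(len(features)))
               for c in range(dim)]
    new_coverage = [b + a for b, a in zip(coverage, alpha)]
    return context, alpha, new_coverage


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n, d = 4, 3                      # toy hidden size and attention size
H, W_grid, C, C2 = 1, 2, 2, 3    # toy grid size and channel counts


def init_params(feat_dim):
    """Random attention parameters for one scale (illustrative only)."""
    return {"W": rand_matrix(d, n, rng),
            "U": rand_matrix(d, feat_dim, rng),
            "u_f": [rng.uniform(-0.5, 0.5) for _ in range(d)],
            "v": [rng.uniform(-0.5, 0.5) for _ in range(d)]}


A = rand_matrix(H * W_grid, C, rng)            # H x W positions, C channels
B = rand_matrix(2 * H * 2 * W_grid, C2, rng)   # 2H x 2W positions, C' channels
s_hat = [rng.uniform(-1.0, 1.0) for _ in range(n)]
params_A, params_B = init_params(C), init_params(C2)

# Each scale keeps its own coverage, starting from zero at the first step.
cA, alpha_A, cov_A = coverage_attention(A, s_hat, [0.0] * len(A), **params_A)
cB, alpha_B, cov_B = coverage_attention(B, s_hat, [0.0] * len(B), **params_B)
print(len(cA), len(cB), len(alpha_A), len(alpha_B))  # 2 3 2 8
```

The two context vectors have different lengths (C and C') because each one is a weighted average over its own grid. The high-resolution grid has four times as many positions to attend over, so a tiny symbol such as a decimal point is less likely to be averaged away.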