[ESWA 2023] Multi-scale receptive fields: Graph attention neural network for hyperspectral image classification

Paper link: Multi-scale receptive fields: Graph attention neural network for hyperspectral image classification - ScienceDirect

The English here is all hand-typed! It is my summarizing and paraphrasing of the original paper. Spelling and grammar mistakes are hard to avoid; if you spot any, feel free to point them out in the comments! This post leans toward personal notes, so read with caution.

目录

1. Takeaways

2. Section-by-section close reading

2.1. Abstract

2.2. Introduction

2.3. Related concepts and definitions

2.3.1. Problem definition

2.3.2. Graph convolutional neural network (GCN)

2.3.3. Graph attention convolutional neural network (GAT)

2.4. Proposed method

2.4.1. The overview of MRGAT

2.4.2. Spectral-spatial transformer module (STM)

2.4.3. Multi-features attention module (MFaM)

2.4.4. Multi-scale receptive fields construction module (MRcM)

2.4.5. Feature fusion and attention decision module (FaDM)

2.4.6. HSI classification using MRGAT

2.4.7. Computational complexity analysis

2.5. Experimental results

2.5.1. Experimental Setup

2.5.2. Dataset description and processing

2.5.3. Classification results

2.5.4. Analysis of the parameter effect

2.5.5. The performances with limited labeled samples

2.5.6. Ablation study

2.5.7. Training time comparison

2.6. Conclusion

3. Reference


1. Takeaways

(1) Continued from the previous post

(2) At a glance it already looks very similar

(3) What a strange writing style... not very friendly to read... it feels like things that are not that hard are written to seem very hard...

(4) Why on earth did they have to include the hyperspectral instrument??? Is it even yours, that you just put it in?

2. Section-by-section close reading

2.1. Abstract

        ①Existing problems: GNNs are time consuming, inefficient in information description, and poor in anti-noise robustness

        ②Thus, they proposed multi-scale receptive fields graph attention neural network (MRGAT)

2.2. Introduction

        ①Challenges in hyperspectral image (HSI) classification: label deficiency, high data dimension, spectrum similarity, pixel blending

        ②A review of applications, from traditional machine learning to CNNs to GNNs

        ③Weaknesses of existing GNNs on HSI classification: a) high computational complexity, b) only focus on local information, c) noise in nodes

2.3. Related concepts and definitions

2.3.1. Problem definition

        ①The labelled set of a graph is defined as \{(\boldsymbol{x}_i,\boldsymbol{y}_i)\}_{i=1}^l, where l of the m nodes are labelled, \boldsymbol{x}_i denotes the spectral vector of node i, \boldsymbol{y}_i \in \mathcal{L}=\left \{ 1,...,c \right \} denotes the label of node i, and c denotes the number of classes

        ②Mapping: f:\mathcal{X}^{m}\mapsto\mathcal{Y}^{m}

2.3.2. Graph convolutional neural network (GCN)

        ①Undirected graph: \mathcal{G}=(\mathcal{V},\mathcal{E},\boldsymbol{A}), where \mathcal{V} denotes vertex set, \mathcal{E} denotes edge set, A\in\mathbb{R}^{m\times m} denotes adjacency matrix, X\in\mathbb{R}^{m\times d} is node feature matrix

        ②Aggregation operation of GCN:

\boldsymbol{X}_{i+1}=\sigma\left(\boldsymbol{D}^{-\frac{1}{2}}\widehat{\boldsymbol{A}}\boldsymbol{D}^{-\frac{1}{2}}\boldsymbol{X}_{i}\boldsymbol{Q}_{i}\right)

where \sigma denotes the nonlinear activation function, \widehat{\boldsymbol{A}}=\boldsymbol{A}+\boldsymbol{I}, \boldsymbol{X}_{0}=\boldsymbol{X}, \boldsymbol{D} denotes the degree matrix, and \boldsymbol{Q}_i\in\mathbb{R}^{c_i\times c_{i+1}} denotes the learnable matrix at layer i
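The aggregation rule above can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: tanh stands in for \sigma, and the weights Q are random in place of learned parameters.

```python
import numpy as np

def gcn_layer(A, X, Q, sigma=np.tanh):
    """One GCN aggregation: sigma(D^{-1/2} (A + I) D^{-1/2} X Q)."""
    A_hat = A + np.eye(A.shape[0])                    # add self-loops: A_hat = A + I
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))  # degree normalization
    return sigma(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ Q)

# toy graph: 3 nodes in a chain, 2 input features, 2 output features
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.random.rand(3, 2)
Q = np.random.rand(2, 2)   # stand-in for the learnable matrix Q_i
H = gcn_layer(A, X, Q)
print(H.shape)  # (3, 2)
```

Each row of H is the node's own feature mixed with its normalized neighbors', then linearly transformed and squashed.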

2.3.3. Graph attention convolutional neural network (GAT)

        ①Implement linear transformation on node features:

\boldsymbol{x}_{i}^{\prime}=\boldsymbol{W}^{T}\boldsymbol{x}_{i},\boldsymbol{W}\in\mathbb{R}^{F\times F^{\prime}}

        ②Importance score:

e_{ij}=\sigma\left(\boldsymbol{a}^T\left[\boldsymbol{W}^T\boldsymbol{x}_i||\boldsymbol{W}^T\boldsymbol{x}_j\right]\right)

where \boldsymbol{a}\in\mathbb{R}^{2F^{\prime}} is a learnable parameter vector

        ③Apply Softmax on e_{ij}:

a_{ij}=softmax(e_{ij})=\frac{\exp(\sigma(\boldsymbol{a}^{T}[\boldsymbol{W}^{T}\boldsymbol{x}_{i}||\boldsymbol{W}^{T}\boldsymbol{x}_{j}]))}{\sum_{j\in N_{i}}\exp(\sigma(\boldsymbol{a}^{T}[\boldsymbol{W}^{T}\boldsymbol{x}_{i}||\boldsymbol{W}^{T}\boldsymbol{x}_{j}]))}

where N_i denotes the set of neighbors of node i

        ④Aggregation of GAT:

h_i^{\prime}=\sigma\left(\sum_{j\in N_i}a_{ij}\cdot W^Tx_j\right)
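A minimal NumPy sketch of one GAT head, covering steps ①–④ above. Assumptions not in the original: LeakyReLU as the \sigma inside the scores, tanh as the output activation, and random values in place of the learned W and a.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())   # subtract max for numerical stability
    return e / e.sum()

def gat_layer(A, X, W, a, leaky=0.2):
    """One GAT head: score e_ij = LeakyReLU(a^T [Wx_i || Wx_j]),
    softmax over neighbors, then weighted aggregation."""
    Xp = X @ W                                   # linear transform of node features
    H = np.zeros_like(Xp)
    for i in range(A.shape[0]):
        nbrs = np.where(A[i] > 0)[0]
        e = np.array([a @ np.concatenate([Xp[i], Xp[j]]) for j in nbrs])
        e = np.where(e > 0, e, leaky * e)        # LeakyReLU on raw scores
        alpha = softmax(e)                        # attention coefficients a_ij
        H[i] = np.tanh(alpha @ Xp[nbrs])          # aggregate neighbor features
    return H

A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])  # adjacency with self-loops
X = np.random.rand(3, 4)      # F = 4 input features
W = np.random.rand(4, 2)      # F' = 2 output features
a = np.random.rand(4)         # attention vector, length 2F'
H = gat_layer(A, X, W, a)
print(H.shape)  # (3, 2)
```

The softmax makes each node's attention coefficients sum to one over its neighborhood, which is what distinguishes GAT's aggregation from GCN's fixed degree normalization.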

2.4. Proposed method

2.4.1. The overview of MRGAT

        ①Overall framework of MRGAT:

2.4.2. Spectral-spatial transformer module (STM)

        ①For an HSI cube I_{B}=\{x_{1},x_{2},\cdots,x_{m}\}\in\mathbb{R}^{m\times B} with B spectral channels and m=W\times H pixels, x_i denotes the feature vector of pixel i, W denotes the width, and H denotes the height

        ②They apply PCA to reduce dimension and employ SLIC to segment image to superpixels:

\mathrm{HSI}=\cup_{i=1}^KS_i,S_i\cap S_j=\emptyset,i\neq j;i,j=1,2,\cdots,K

where S_{i}=\{p_{i,1},\cdots,p_{i,n_{i}}\} denotes superpixel i with n_i pixels, K is the total number of superpixels

        ③To retain more features in the superpixels, they design a spectral transformer:

        ④They add location on the feature:

\boldsymbol{p}_0=(x,y)

h(\boldsymbol{p}_0)=(X_1(\boldsymbol{p}_0),X_2(\boldsymbol{p}_0),\cdots,X_B(\boldsymbol{p}_0))

where X_i(\boldsymbol{p}_0) is the pixel’s spectral value in the i-th spectral channel (why does h never show up in the figure, and what exactly is the intermediary role of \boldsymbol{p}_0?)

        ⑤The output of 1 × 1 Conv at channel i:

X_i^l(\boldsymbol{p}_0)=\sigma\left(\boldsymbol{W}_i^l\cdot\widetilde{X}_i^{l-1}(\boldsymbol{p}_0)+a_i^l\right)

        ⑥An association matrix M\in\mathbb{R}^{HW\times K} for reflecting the relationship between pixels and superpixels:

\boldsymbol{M}_{i,j}=\left\{ \begin{array} {ll}j & \quad\mathrm{if}\ \boldsymbol{x}_i\in S_j \\ 0 & \quad\mathrm{otherwise} \end{array}\right.,\quad I_B=\mathrm{Flatten}(\mathrm{HSI})

        ⑦HSI feature:

\begin{aligned} & H=\left[H_{1},H_{2},\ldots H_{K}\right]^{T} \\ & =\left[\frac{1}{n_{1}}\sum_{k=1}^{n_{1}}h_{k}^{1},\frac{1}{n_{2}}\sum_{k=1}^{n_{2}}h_{k}^{2},\ldots,\frac{1}{n_{K}}\sum_{k=1}^{n_{K}}h_{k}^{K}\right]^{T} \end{aligned}
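Steps ⑥–⑦ amount to mean-pooling pixel features into superpixel features through the pixel–superpixel association matrix. A small NumPy sketch (a minimal stand-in, assuming a labels array that records each pixel's superpixel, i.e. a one-hot encoding of M):

```python
import numpy as np

def superpixel_features(pixel_feats, labels, K):
    """H_i = (1/n_i) * sum of the features of the n_i pixels in superpixel S_i.
    labels[p] in {0..K-1} gives the superpixel index of pixel p."""
    m = pixel_feats.shape[0]
    M = np.zeros((m, K))
    M[np.arange(m), labels] = 1.0                    # pixel-superpixel incidence
    counts = M.sum(axis=0, keepdims=True).T          # n_i for each superpixel
    return (M.T @ pixel_feats) / counts              # K x B matrix of means

# 3 pixels with 2 features each, grouped into K = 2 superpixels
feats = np.array([[1., 0.],
                  [3., 0.],
                  [0., 2.]])
labels = np.array([0, 0, 1])
H = superpixel_features(feats, labels, K=2)
print(H)  # [[2. 0.], [0. 2.]]
```

Working on K superpixel nodes instead of W*H pixel nodes is what keeps the graph small enough for the later attention layers.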

        ⑧Reshape the spatial relations:

HSI_r=reshape(M_{i,j}V,H),V_i=\left(\frac{1}{n_i}\sum_{k=1}^{n_i}x_i,\frac{1}{n_i}\sum_{k=1}^{n_i}y_i\right)

2.4.3. Multi-features attention module (MFaM)

        ①Pipeline of MFaM:

        ②The l-th conv layer:

x_i^l=\sigma\left(\sum_{j\in N_i}e_{ij}^nW_n^T\widetilde{x}_j^{l-1}\right)

where e_{ij}^{n} denotes the learned attention coefficients of the neighbors

        ③Multilayer conv:

x_{i}\leftarrow e_{i1}^{n}x_{i1}+e_{i2}^{n}x_{i2}+\cdots+e_{iK}^{n}x_{iK}=\sum_{k=1}^{K}e_{ik}^{n}x_{ik}

where \leftarrow denotes the assignment symbol, i denotes the i-th hop neighbors of node x, K denotes the total number of neighbors in the i-th hop of node x, and e_{ik}^n denotes the importance coefficient of x_{ik}

        ④A Gaussian distance to represent node relationship:

a_{ij}=\left\{ \begin{array} {ll}e^{-\gamma\|h_i-h_j\|^2}, & \mathrm{if}\ h_i\in N_t(h_j)\ \mathrm{or}\ h_j\in N_t(h_i) \\ 0, & \mathrm{otherwise} \end{array}\right.

where \gamma is an empirical value, set to 0.2
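A NumPy sketch of this Gaussian kNN adjacency, assuming N_t(·) means the t nearest neighbors (t = 2 below is an arbitrary choice for illustration):

```python
import numpy as np

def gaussian_adjacency(H, t=2, gamma=0.2):
    """a_ij = exp(-gamma * ||h_i - h_j||^2) if j is among the t nearest
    neighbors of i or vice versa; 0 otherwise."""
    d2 = ((H[:, None, :] - H[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    A = np.zeros_like(d2)
    for i in range(len(H)):
        nbrs = np.argsort(d2[i])[1:t + 1]                # t nearest, skipping self
        A[i, nbrs] = np.exp(-gamma * d2[i, nbrs])
    return np.maximum(A, A.T)                            # symmetrize: the "or" condition

H = np.random.rand(5, 3)   # 5 nodes with 3-dimensional features
A = gaussian_adjacency(H)
print(A.shape)  # (5, 5)
```

Entries decay smoothly with feature distance, so close node pairs get strong edges while the kNN mask keeps the graph sparse.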

        ⑤The edge attention conv:

a_i^l=\sigma\left(\sum_{j\in N_i}e_{ij}^a\cdot W_a^T\tilde{a}_j^{l-1}\right)

where e_{ij}^{a} denotes the learned attention coefficients of the edges

        ⑥The final a_i:

a_{i}\leftarrow e_{i1}^{a}a_{i1}+e_{i2}^{a}a_{i2}+\cdots+e_{iK}^{a}a_{iK}=\sum_{k=1}^{K}e_{ik}^{a}a_{ik}

        ⑦Feature fusion attention:

\boldsymbol{x}=\sigma\left(e_i^n\boldsymbol{W}^T\boldsymbol{x}_i+e_i^a\boldsymbol{W}^T\boldsymbol{a}_i\right)

        ⑧The centroid node:

\mathrm{x}=\alpha_{1}x_{i1}+\alpha_{2}x_{i2}+\cdots+\alpha_{K}x_{iK}+\beta_{1}a_{i1}+\beta_{2}a_{i2}+\cdots+\beta_{K}a_{iK}=\sum_{k=1}^{K}(\alpha_{k}x_{ik}+\beta_{k}a_{ik})

where \alpha_k and \beta_k are weight coefficients

2.4.4. Multi-scale receptive fields construction module (MRcM)

        ①The receptive field of the node x:

R_i(\boldsymbol{x})=R_{i-1}(\boldsymbol{x})\cup R_1(R_{i-1}(\boldsymbol{x}))

where the subscript of R denotes the hop number, R_{0}(x)=x
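The recursion grows the receptive field by one hop per step; a minimal Python sketch over an adjacency dictionary (a hypothetical toy graph, not from the paper):

```python
def receptive_field(adj, x, i):
    """R_i(x): nodes reachable from x within i hops.
    R_0(x) = {x}; R_i(x) = R_{i-1}(x) | one-hop neighbors of R_{i-1}(x).
    adj maps each node to its set of 1-hop neighbors."""
    R = {x}
    for _ in range(i):
        R = R | {n for v in R for n in adj[v]}  # union with R_1(R_{i-1}(x))
    return R

# a 4-node path graph: 0 - 1 - 2 - 3
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(receptive_field(adj, 0, 2))  # {0, 1, 2}
```

Stacking the receptive fields R_1, R_2, ..., R_S is what gives MRGAT its multi-scale view of each centroid node.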

        ②Feature of centroid node:

x^i=\sum_{k=1}^K(\alpha_ix_{ik}+\beta_ia_{ik})

        ③Vis of hop:

2.4.5. Feature fusion and attention decision module (FaDM)

        ①The output of MRcM:

O=\sigma\left(\sum_{i\in S}e_{i}\cdot W^{T}x^{i}\right)

        ②Softmax classification:

O_l=\frac{e^{k_i\cdot O+b_i}}{\sum_{j=1}^{C}e^{k_j\cdot O+b_j}}

where C denotes the number of classes

2.4.6. HSI classification using MRGAT

        ①Loss function:

L=-\sum_{z\in y_G}\sum_{f=1}^CY_{zf}\ln O_{Gzf}^{(final)}

where Y_{zf} denotes the label matrix and \mathbf{y}_{G} denotes the labeled example set
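This is a cross-entropy summed only over labeled nodes; a NumPy sketch (the small epsilon guarding log(0) is an added safety measure, not from the paper):

```python
import numpy as np

def masked_cross_entropy(O, Y, labeled_idx):
    """L = -sum over labeled nodes z and classes f of Y_zf * ln(O_zf).
    O: softmax outputs (m x C); Y: one-hot label matrix (m x C)."""
    O_l = O[labeled_idx]                        # restrict to the labeled set y_G
    Y_l = Y[labeled_idx]
    return -np.sum(Y_l * np.log(O_l + 1e-12))   # epsilon guards log(0)

O = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.5, 0.5]])
Y = np.array([[1, 0],
              [0, 1],
              [1, 0]])
loss = masked_cross_entropy(O, Y, [0, 1])  # only the first two nodes are labeled
print(round(loss, 4))  # 0.3285, i.e. -ln(0.9) - ln(0.8)
```

Masking to the labeled subset is what makes this semi-supervised: the unlabeled nodes still shape the graph aggregation but contribute nothing to the loss.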

        ②Optimizer: Adam gradient descent

        ③Algorithm of MRGAT:

2.4.7. Computational complexity analysis

        ①I'll just paste the original text here; this part is already fairly concise, so I won't simplify it further:

2.5. Experimental results

2.5.1. Experimental Setup

        ①Hyperparameters setting:

        ②The architectural details of MRGAT:

        ③Framework of MRGAT:

        ④Running times: 10

        ⑤Training samples per class: 30

2.5.2. Dataset description and processing

        ①Pavia University, with 103 of 115 bands after processing, 9 classes and size of 610*340:

        ②Salinas, with 204 of 224 bands after removing water vapor absorption bands, 16 categories and size of 512*217:

        ③Houston 2013, with 144 bands, 15 classes, and a spectral range of 364-1046 nm:

        ④The awe-inspiring hyperspectral instrument, even though nobody knows where it came from. Is there really no copyright issue?? It doesn't look like the authors collected it themselves either

2.5.3. Classification results

        ①Performance on Pavia University:

        ②Performance on Salinas:

        ③Performance on Houston 2013:

2.5.4. Analysis of the parameter effect

        ①Ablation of L and K ((a) Pavia University. (b) Salinas. (c) Houston 2013):

        ②Ablation of N and T ((a) Pavia University. (b) Salinas. (c) Houston 2013):

2.5.5. The performances with limited labeled samples

        ①Performance at limited labelled data trained:

2.5.6. Ablation study

        ①Module ablation:

2.5.7. Training time comparison

        ①Training time:

2.6. Conclusion

        ~

3. Reference

Ding, Y. et al. (2023) Multi-scale receptive fields: Graph attention neural network for hyperspectral image classification, Expert Systems with Applications, 223.
