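Paper Reading: MC-Loss

This paper proposes the mutual-channel loss (MC-Loss) for fine-grained image classification, which requires no complex network design or training mechanism. MC-Loss consists of a discriminality component and a diversity component: the former makes the feature channels discriminative, and the latter makes the channels attend to different regions. The loss introduces no extra parameters and can be applied to any network architecture. These notes also cover the experiments and the handling of a special case.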

The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification(MC-Loss)

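2020, TIP

Abstract

The paper shows that subtle details can be learned without complex network designs or training mechanisms; a single loss is enough. The main trick is to delve into individual feature channels early on, rather than starting from the consolidated feature map.

The proposed loss function, called the mutual-channel loss (MC-Loss), consists of two channel-specific components: a discriminality component and a diversity component. The discriminality component uses a channel-wise attention mechanism to force all feature channels belonging to the same class to be discriminative. The diversity component additionally constrains the channels so that they become mutually exclusive in space. The end result is a set of feature channels, each reflecting a different locally discriminative region for a specific class.

Code: https://github.com/dongliangchang/Mutual-Channel-Loss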

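1 Introduction

Part-based (region-detecting) methods have two components: explicitly detecting parts, and ensuring that the learned features are discriminative.

This paper introduces no network component for explicit part detection. Instead, a single loss achieves discriminative feature learning and part localization at the same time. Advantages:

  1. No extra network parameters are introduced, which makes the network easier to train
  2. In principle, it can be applied to any network architecture

The key is to delve into the feature channels early on, instead of learning fine-grained part-level features directly on the feature map.

Core idea: assume a fixed number of feature channels represents each class, and impose the loss directly on the channels so that all channels belonging to the same class are discriminative (each contributes to distinguishing the class from the others) and mutually exclusive (each channel is associated with a different local part). The end result is a set of class-aligned feature channels, each discriminating a mutually distinct local part, as shown in the figure below: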

image-20210531163956115

MC-Loss

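  1. Discriminality component. Before fusion, it forces the feature channels corresponding to a class to be discriminative. A channel-wise attention mechanism is introduced: during training, a fixed proportion of channels is randomly masked out, forcing the remaining channels to be discriminative for the given class. Cross-channel max pooling is then applied to fuse the feature channels and produce the final feature map, which is now class-aligned and optimally discriminative.
  2. Diversity component. It makes each channel attend to a mutually distinct part by requiring that the spatial distance between the peaks of channels belonging to the same class be maximized. This is achieved by applying cross-channel max pooling again and then requiring the spatial sum to be maximal.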

3 MC-Loss

Overall training framework:

image-20210531170322542

The feature map is split into groups, one per class. Each group has $\xi$ channels and there are $c$ groups in total ($c$ is also the number of classes; $\xi$ is a hyperparameter).

After reshaping, each channel is written as $\mathcal{F}_i\in R^{WH}$, and the channels of each group as $\mathbf{F}_i\in R^{\xi \times WH}$:
$$\mathbf{F}_i=\{\mathcal{F}_{i\times\xi+1},\mathcal{F}_{i\times\xi+2},...,\mathcal{F}_{i\times\xi+\xi}\}$$
The whole feature map is then $\mathbf{F}=\{\mathbf{F}_0,\mathbf{F}_1,...,\mathbf{F}_{c-1}\}$. The network is trained with the cross-entropy loss together with MC-Loss:
$$Loss(\mathbf F)=L_{CE}(\mathbf F) + \mu \times L_{MC}(\mathbf F)$$
$$L_{MC}(\mathbf F)=L_{dis}(\mathbf F)-\lambda\times L_{div}(\mathbf F)$$
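The grouping and the weighted sum can be sketched in a few lines of plain Python. This is only an illustration of the two formulas above, not the authors' implementation: the helper names `split_into_groups` and `total_loss` are hypothetical, channels are indexed from 0 (the formula counts from 1), and the loss values passed in are made-up numbers.

```python
def split_into_groups(channels, num_classes, xi):
    """Split a flat list of channels into num_classes groups of xi channels each."""
    if len(channels) != num_classes * xi:
        raise ValueError("expected exactly num_classes * xi channels")
    # Group i holds channels i*xi ... i*xi + xi - 1 (0-based indexing).
    return [channels[i * xi:(i + 1) * xi] for i in range(num_classes)]


def total_loss(l_ce, l_dis, l_div, mu, lam):
    """Loss = L_CE + mu * L_MC, with L_MC = L_dis - lam * L_div."""
    return l_ce + mu * (l_dis - lam * l_div)


# Six channels, c = 3 classes, xi = 2 channels per class.
groups = split_into_groups(["ch0", "ch1", "ch2", "ch3", "ch4", "ch5"], num_classes=3, xi=2)
print(groups)
```

Because $L_{div}$ enters with a minus sign, minimizing the total loss pushes the diversity term up while pushing the two classification terms down.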

3.1 Discriminality component

Purpose: force the feature channels to be class-aligned (tied to a specific class), with each channel being sufficiently discriminative.
$$L_{dis}(\mathbf F)=L_{CE}\left(y,\underbrace{\frac{[e^{g(\mathbf F_0)},e^{g(\mathbf F_1)},...,e^{g(\mathbf F_{c-1})}]^T}{\sum^{c-1}_{i=0}e^{g(\mathbf F_i)}}}_{softmax}\right)$$
Here $y$ is the ground-truth class, and $g$ is defined as:
$$g(\mathbf F_i)=\underbrace{\frac{1}{WH}\sum^{WH}_{k=1}}_{GAP} \underbrace{\max_{j=1,2,...,\xi}}_{CCMP} \underbrace{[M_i\cdot \mathbf F_{i,j,k}]}_{CWA}$$

  • GAP: global average pooling. $c\times WH\to c\times 1$
  • CCMP: cross-channel max pooling. $c\times(\xi,WH)\to c\times(1,WH)$. This yields $c$ class-specific vectors.
  • CWA: channel-wise attention. $c\times(\xi,WH)\to c\times(\xi,WH)$

$\mathbf{Mask}_i$ is a 0-1 mask in which $\lfloor\frac{\xi}{2}\rfloor$ randomly chosen entries are zero, and $M_i=diag(\mathbf{Mask}_i)$ is the corresponding diagonal matrix. (If the diagonal of $M_i$ were all ones, it would be the identity matrix and leave $\mathbf F_i$ unchanged; the rows (channels) that correspond to zeros are eliminated. In principle this closely resembles dropout.)

image-20210601210205764
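As a rough sketch of the discriminality component, the pure-Python code below applies CWA, CCMP and GAP to one group and then a softmax cross-entropy over the $c$ group scores. It is an illustration under stated assumptions rather than the released code: each channel is a flat list of $WH$ values, activations are assumed non-negative (for example post-ReLU) so that a zeroed channel never wins the max over a live one, and the names `g`, `random_mask` and `discriminality_loss` are chosen here for readability.

```python
import math
import random


def g(group, mask):
    """CWA -> CCMP -> GAP for one class group.

    group: xi channels, each a flat list of W*H activations.
    mask:  xi values in {0, 1}; a 0 zeroes out the whole channel.
    """
    wh = len(group[0])
    masked = [[m * v for v in channel] for channel, m in zip(group, mask)]  # CWA
    pooled = [max(ch[k] for ch in masked) for k in range(wh)]               # CCMP
    return sum(pooled) / wh                                                 # GAP


def random_mask(xi, rng):
    """0-1 mask with floor(xi / 2) zeros placed at random positions."""
    mask = [0] * (xi // 2) + [1] * (xi - xi // 2)
    rng.shuffle(mask)
    return mask


def discriminality_loss(groups, label, rng):
    """Cross-entropy between the label and softmax(g(F_0), ..., g(F_{c-1}))."""
    logits = [g(group, random_mask(len(group), rng)) for group in groups]
    peak = max(logits)  # subtract the max before exp for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


# c = 2 classes, xi = 2 channels per class, W*H = 4 positions (made-up values).
rng = random.Random(0)
groups = [
    [[3.0, 0.0, 0.0, 0.0], [0.0, 3.0, 0.0, 0.0]],
    [[0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.5, 0.0]],
]
loss = discriminality_loss(groups, label=0, rng=rng)
```

Since half of each group is dropped at random on every call, a class cannot rely on a single strong channel: whichever channels survive the mask must carry enough evidence on their own.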

3.2 Diversity component

image-20210601211522702

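Purpose: an approximate distance measure between feature channels, used to compute the total similarity of all channels. The different feature channels of a class should attend to different regions of the image, rather than all attending to the single most discriminative region. The component reduces redundant information by diversifying the feature channels within each group, and helps discover different discriminative regions for each class. The operation can be interpreted as channel decoupling, so that details are captured from different salient regions of the image.
$$L_{div}(\mathbf F)=\frac{1}{c}\sum^{c-1}_{i=0}h(\mathbf F_i)$$
$$h(\mathbf F_i)=\sum^{WH}_{k=1} \underbrace{\max_{j=1,2,...,\xi}}_{CCMP} \underbrace{\left[\frac{e^{\mathbf F_{i,j,k}}}{\sum^{WH}_{k'=1}e^{\mathbf F_{i,j,k'}}}\right]}_{Softmax}$$
The softmax normalizes over the spatial dimension, and CCMP plays the same role here as in the discriminality component.

The diversity component cannot be used for classification on its own. It acts as a regularizer on top of the discriminality component, implicitly discovering different discriminative regions in the image.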
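The behaviour of $h$ is easiest to see on a toy example. The sketch below is a plain-Python reading of the two formulas above, with each channel stored as a flat list of $WH$ values; the function names and the example activations are illustrative assumptions, not taken from the paper's code.

```python
import math


def spatial_softmax(channel):
    """Normalize one channel over its W*H positions so the values sum to 1."""
    peak = max(channel)
    exps = [math.exp(v - peak) for v in channel]
    total = sum(exps)
    return [e / total for e in exps]


def h(group):
    """Softmax over space, CCMP over the xi channels, then sum over positions."""
    normalized = [spatial_softmax(channel) for channel in group]
    wh = len(group[0])
    return sum(max(ch[k] for ch in normalized) for k in range(wh))


def diversity_loss(groups):
    """L_div: average of h(F_i) over the c class groups."""
    return sum(h(group) for group in groups) / len(groups)


# xi = 2 channels, W*H = 4 positions (made-up activations).
same_peak = [[9.0, 0.0, 0.0, 0.0], [9.0, 0.0, 0.0, 0.0]]  # both channels peak at position 0
apart = [[9.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 9.0]]      # peaks at different positions

print(h(same_peak), h(apart))
```

Each spatially normalized channel sums to 1, so $h(\mathbf F_i)$ lies between 1 (all $\xi$ channels identical) and $\xi$ (peaks that do not overlap at all). In the example, `h(same_peak)` sits at the lower bound and `h(apart)` is close to the upper bound $\xi = 2$, which is why maximizing $L_{div}$ spreads the channels of a class over different regions.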

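4 Experiments

Special case: each class is assumed to have $\xi$ channels, but what if the total number of channels in the feature map does not match $c\times\xi$?

Solution: let $\xi$ be non-uniform. For example, assign 2 channels to each of the first 88 classes and 3 feature channels to each of the remaining classes.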
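One way to produce such a split is to give every class the integer quotient and hand the leftover channels to the last classes. The helper below is a hypothetical sketch; the figures of 512 channels and 200 classes are assumptions chosen here because they reproduce the 88/2 and 3-channel split mentioned above, and they are not stated in these notes.

```python
def allocate_channels(total_channels, num_classes):
    """Give each class floor(total / classes) channels; the last classes get one extra."""
    base, extra = divmod(total_channels, num_classes)
    return [base] * (num_classes - extra) + [base + 1] * extra


# Assumed setting: 512 feature channels shared among 200 classes.
allocation = allocate_channels(512, 200)
print(allocation.count(2), allocation.count(3), sum(allocation))
```

With these assumed numbers, 88 classes receive 2 channels and 112 classes receive 3, and the counts add up to all 512 channels.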

Comparison with SOTA

image-20210601213815952

image-20210601213438978

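On the CUB dataset each class has only two channels, but the birds in the dataset have rich discriminative regions, so it is hard to obtain a robust description when the number of channels is insufficient.

Ablation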

image-20210601214233647

image-20210601213233418
