Keywords:
Decoding
Abstract:
In RGB-T salient object detection (SOD), accurate detection depends on exploiting the different characteristics of the RGB and thermal modalities effectively. Most previous methods focus only on reducing the differences between modalities and may therefore discard modality-specific features that are crucial for salient object detection, leading to suboptimal results. To address this issue, we propose an RGB-T SOD network that reduces modality differences while preserving modality-specific features. Specifically, we construct a modality differences reduction and specific features preserving module (MDRSFPM), which bridges the gap between modalities and enhances the specific features of each one. In MDRSFPM, a dynamic vector generated from the interaction of RGB and thermal features is used to reduce modality differences; a dual branch then processes the RGB and thermal modalities separately, combining channel-level and spatial-level operations to preserve their respective specific features. In addition, a multi-scale global feature enhancement module (MGFEM) enhances global contextual information to guide the subsequent decoding stage, so that the model can localize salient objects more easily. Furthermore, a fully fusion and gate module (FFGM) uses dynamically generated importance maps to selectively filter and fuse features during decoding. Extensive experiments demonstrate that the proposed model substantially outperforms other state-of-the-art models on three publicly available RGB-T datasets. © 2024, The Authors. All rights reserved.
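The idea of filtering and fusing features with an importance map can be illustrated with a minimal sketch. This is not the FFGM described in the abstract: the function name `gated_fuse` and the gate `sigmoid(r - t)` are hypothetical stand-ins chosen for illustration, whereas a real network would presumably predict the importance map with learned layers. The sketch only shows how a per-pixel weight in (0, 1) can blend two feature maps.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(rgb, thermal):
    """Blend two equally sized 2D feature maps with a per-pixel gate.

    Assumption: the gate is a fixed function of the inputs, used here
    as a stand-in for a learned importance map.
    """
    same_rows = len(rgb) == len(thermal)
    if not same_rows or any(len(r) != len(t) for r, t in zip(rgb, thermal)):
        raise ValueError("feature maps must have the same shape")

    fused = []
    for rgb_row, thermal_row in zip(rgb, thermal):
        row = []
        for r, t in zip(rgb_row, thermal_row):
            # Hypothetical gate: favours whichever modality responds more strongly.
            g = sigmoid(r - t)
            # Convex combination, so the result lies between r and t.
            row.append(g * r + (1.0 - g) * t)
        fused.append(row)
    return fused


if __name__ == "__main__":
    rgb_features = [[1.0, 0.0], [0.5, 2.0]]
    thermal_features = [[1.0, 2.0], [0.5, 0.0]]
    print(gated_fuse(rgb_features, thermal_features))
```

Because the gate yields a convex combination, each fused value stays between the two input responses; where the modalities agree the output equals their common value, and where they disagree it leans toward the stronger response.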