Conference papers

A hybrid multi-modal visual data cross fusion network for indoor and outdoor segmentation

Abstract : Multi-modal scene parsing is a prevalent topic in robotics and autonomous driving, since the knowledge from different modalities can complement each other. Recently, the success of self-attention-based methods has demonstrated the effectiveness of capturing long-range dependencies. However, the tremendous computational cost of self-attention dramatically limits its application in multi-modal fusion. To alleviate this problem, this paper designs a multi-modal cross-fusion block (AC) and its elegant variant (EAC), both based on an additive attention mechanism, to capture global awareness among different modalities efficiently. Moreover, a simple yet efficient transformer-based trans-context block (TC) is presented to connect the contextual information. Based on these components, we propose a light HCFNet, which can explore long-range dependencies of multi-modal information while keeping local details. Finally, we conduct comprehensive experiments and analyses on both indoor (NYUv2-13, -40) and outdoor (Cityscapes-11) datasets. Experimental results show that the proposed HCFNet achieves 66.9% and 51.5% mIoU on the NYUv2-13 and -40 class settings, respectively, outperforming current state-of-the-art multi-modal methods. Our model also shows a competitive mIoU of 80.6% on the Cityscapes-11 dataset. The code will be available at
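
The results in the abstract are reported as mIoU (mean intersection-over-union). As a reading aid, the sketch below shows how this metric is commonly computed from flattened per-pixel class labels. It is not taken from the paper: the function name `mean_iou`, its arguments, and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions made for illustration, and the authors' evaluation protocol (for example, the handling of void or ignored labels) may differ.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Illustrative mIoU over flattened label sequences (hypothetical helper).

    Assumes every label is an integer in [0, num_classes). Classes that occur
    in neither pred nor target are left out of the average.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where both agree on class c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious: List[float] = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both sequences
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example with three classes and six "pixels".
    print(mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3))
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5. A reported figure such as 51.5% mIoU corresponds to this average taken over all classes of the benchmark and accumulated over the whole test set.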
Document type : Conference papers
Contributor : Désiré Sidibé
Submitted on : Monday, July 11, 2022 - 11:22:52 AM
Last modification on : Wednesday, July 20, 2022 - 3:27:09 AM


ICPR2022 (1).pdf
Files produced by the author(s)


  • HAL Id : hal-03719440, version 1


Sijie Hu, Fabien Bonardi, Samia Bouchafa, Désiré Sidibé. A hybrid multi-modal visual data cross fusion network for indoor and outdoor segmentation. 26th International Conference on Pattern Recognition (ICPR 2022), Aug 2022, Montreal, Canada. ⟨hal-03719440⟩


