Y. Ohta, T. Kanade, and T. Sakai, An analysis system for scenes containing objects with substructures, Proceedings of the Fourth International Joint Conference on Pattern Recognition, pp.752-754, 1978.

A. Garcia-Garcia, S. Orts-Escolano, S. Oprea, V. Villena-Martinez, and J. Garcia-Rodriguez, A Review on Deep Learning Techniques Applied to Semantic Segmentation, 2017.

H. Yu, Z. Yang, L. Tan, Y. Wang, W. Sun et al., Methods and datasets on semantic segmentation: A review, Neurocomputing, vol.304, pp.82-103, 2018.

S. Edelman and T. Poggio, Integrating visual cues for object segmentation and recognition, Optics News, vol.15, 1989.

A. Kirillov, K. He, R. Girshick, C. Rother, and P. Dollár, Panoptic Segmentation, 2018.

B. Cheng, M. D. Collins, Y. Zhu, T. Liu, T. S. Huang et al., Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation, 2019.

Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol.521, p.436, 2015.

I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, 2016.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, pp.2278-2324, 1998.

Z. C. Lipton, J. Berkowitz, and C. Elkan, A critical review of recurrent neural networks for sequence learning, 2015.

F. Visin, M. Ciccone, A. Romero, K. Kastner, K. Cho et al., ReSeg: A recurrent neural network-based model for semantic segmentation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp.41-48, 2016.

R. Gade and T. B. Moeslund, Thermal cameras and applications: a survey, Machine vision and applications, vol.25, pp.245-262, 2014.

A. Eitel, J. T. Springenberg, L. Spinello, M. Riedmiller, and W. Burgard, Multimodal deep learning for robust RGB-D object recognition, pp.681-687, 2015.

J. Liu, S. Zhang, S. Wang, and D. N. Metaxas, Multispectral deep neural networks for pedestrian detection, British Machine Vision Conference 2016, BMVC, 2016.

X. Xu, Y. Li, G. Wu, and J. Luo, Multi-modal deep feature learning for rgb-d object detection, Pattern Recognition, vol.72, pp.300-313, 2017.

A. Asvadi, L. Garrote, C. Premebida, P. Peixoto, and U. J. Nunes, Multimodal vehicle detection: fusing 3d-lidar and color camera data, vol.115, pp.20-29, 2018.

M. Y. Yang, B. Rosenhahn, and V. Murino, Multimodal Scene Understanding: Algorithms, Applications and Deep Learning, 2019.

X. Liu, Z. Deng, and Y. Yang, Recent progress in semantic image segmentation, Artificial Intelligence Review, vol.52, pp.1089-1106, 2018.

S. A. Taghanaki, K. Abhishek, J. P. Cohen, J. Cohen-Adad, and G. Hamarneh, Deep semantic segmentation of natural and medical images: A review, 2019.

M. Naseer, S. Khan, and F. Porikli, Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey, IEEE Access, vol.7, pp.1859-1887, 2019.

F. Fooladgar and S. Kasaei, A survey on indoor RGB-D semantic segmentation: from hand-crafted features to deep convolutional neural networks, Multimedia Tools and Applications, pp.1-26, 2019.

D. Feng, C. Haase-Schuetz, L. Rosenbaum, H. Hertlein, F. Duffhauss, C. Glaeser, W. Wiesbeck, and K. Dietmayer, Deep multi-modal object detection and semantic segmentation for autonomous driving: Datasets, methods, and challenges, 2019.

Z. Zhang, X. Zhang, C. Peng, X. Xue, and J. Sun, Exfuse: Enhancing feature fusion for semantic segmentation, Proceedings of the European Conference on Computer Vision (ECCV), pp.269-284, 2018.

H. Li, P. Xiong, H. Fan, and J. Sun, DFANet: Deep Feature Aggregation for

S. Hung, S. Lo, and H. Hang, Incorporating Luminance, Depth and Color Information by a Fusion-based Network for Semantic Segmentation, 2018.

L. Chen, Y. Yang, J. Wang, W. Xu, and A. L. Yuille, Attention to scale: Scale-aware semantic image segmentation, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.3640-3649, 2016.

M. Ren and R. S. Zemel, End-to-end instance segmentation and counting with recurrent attention, 2016.

Y. Li, X. Chen, Z. Zhu, L. Xie, G. Huang et al., Attention-guided unified network for panoptic segmentation, 2018.

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones et al., Attention is all you need, 2017.

B. Khaleghi, A. Khamis, F. O. Karray, and S. N. Razavi, Multisensor data fusion: A review of the state-of-the-art, Information Fusion, vol.14, pp.990-1018, 2013.

L. I. Kuncheva, Combining pattern classifiers: methods and algorithms, 2014.

J. Fiérrez-Aguilar, J. Ortega-Garcia, and J. Gonzalez-Rodriguez, Fusion strategies in multimodal biometric verification, Conference on Multimedia and Expo. ICME'03. Proceedings (Cat. No. 03TH8698), vol.3, 2003.

A. González, D. Vázquez, A. M. López, and J. Amores, On-board object detection: Multicue, multimodal, and multiview random forest of local experts, IEEE transactions on cybernetics, vol.47, pp.3980-3990, 2016.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 3rd International Conference on Learning Representations, 2015.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, 2015.

C. Couprie, C. Farabet, L. Najman, and Y. LeCun, Indoor Semantic Segmentation using depth information, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00805105

C. Hazirbas, L. Ma, C. Domokos, and D. Cremers, FuseNet: Incorporating depth into semantic segmentation via fusion-based CNN architecture, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol.10111, 2017.

L. Schneider, M. Jasch, B. Fröhlich, T. Weber, U. Franke et al., Multimodal neural networks: RGB-D for semantic segmentation and object detection, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol.10269, pp.98-109, 2017.

A. Valada, G. L. Oliveira, T. Brox, and W. Burgard, Deep Multispectral Semantic Scene Understanding of Forested Environments Using Multimodal Fusion, International Symposium on Experimental Robotics, pp.465-477, 2017.

T. Karasawa, K. Watanabe, Q. Ha, A. Tejero-de-Pablos, Y. Ushiku et al., Thematic Workshops 2017 - Proceedings of the Thematic Workshops of ACM Multimedia, pp.35-43, 2017.

M. Bijelic, F. Mannan, T. Gruber, W. Ritter, K. Dietmayer et al., Seeing Through Fog Without Seeing Fog: Deep Sensor Fusion in the Absence of Labeled Training Data, 2019.

O. Mees, A. Eitel, and W. Burgard, Choosing smartly: Adaptive multimodal fusion for object detection in changing environments, IEEE International Conference on Intelligent Robots and Systems, vol.2016, pp.151-156, 2016.

J. Wagner, V. Fischer, M. Herman, and S. Behnke, Multispectral pedestrian detection using deep fusion convolutional neural networks, ESANN 2016 - 24th European Symposium on Artificial Neural Networks, pp.509-514, 2016.

J. Guerry, B. Le Saux, and D. Filliat, "Look at this one" detection sharing between modality-independent classifiers for robotic discovery of people, 2017 European Conference on Mobile Robots (ECMR), pp.1-6, 2017.

H. Chen and Y. Li, Progressively complementarity-aware fusion network for RGB-D salient object detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3051-3060, 2018.

N. Wang and X. Gong, Adaptive Fusion for RGB-D Salient Object Detection, 2019.

S. McMahon, N. Sunderhauf, B. Upcroft, and M. Milford, Multimodal Trip Hazard Affordance Detection on Construction Sites, IEEE Robotics and Automation Letters, vol.3, pp.1-8, 2018.

R. Zhang, S. A. Candra, K. Vetter, and A. Zakhor, Sensor fusion for semantic segmentation of urban scenes, 2015 IEEE International Conference on Robotics and Automation (ICRA), pp.1850-1857, 2015.

J. Janai, F. Güney, A. Behl, and A. Geiger, Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art, 2017.

N. Patel, A. Choromanska, P. Krishnamurthy, and F. Khorrami, Sensor modality fusion with CNNs for UGV autonomous driving in indoor environments, IEEE International Conference on Intelligent Robots and Systems, vol.2017, pp.1531-1536, 2017.

X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, Multi-view 3D object detection network for autonomous driving, Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, pp.6526-6534, 2017.

A. Pfeuffer and K. Dietmayer, Optimal sensor data fusion architecture for object detection in adverse weather conditions, 2018 21st International Conference on Information Fusion (FUSION), pp.1-8, 2018.

D. Xu, D. Anguelov, and A. Jain, PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.244-253, 2018.

Y. Xiao, F. Codevilla, A. Gurram, O. Urfalioglu, and A. M. López, Multimodal End-to-End Autonomous Driving, 2019.

K. Liu, Y. Li, N. Xu, and P. Natarajan, Learn to combine modalities in multimodal deep learning, 2018.

J. Wang, Z. Wang, D. Tao, S. See, and G. Wang, Learning common and specific features for RGB-D semantic segmentation with deconvolutional networks, European Conference on Computer Vision, pp.664-679, 2016.

L. Ma, J. Stückler, C. Kerl, and D. Cremers, Multi-view deep learning for consistent semantic mapping with rgb-d cameras, IEEE, 2017.

Y. Zhang, O. Morel, M. Blanchon, R. Seulin, M. Rastgoo et al., Exploration of Deep Learning-based Multimodal Fusion for Semantic Road Scene Segmentation, VISIGRAPP 2019 - Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, vol.5, pp.336-343, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02060222

R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, Adaptive mixtures of local experts, vol.3, pp.79-87, 1991.

D. Eigen, M. Ranzato, and I. Sutskever, Learning factored representations in a deep mixture of experts, 2013.

S. Park, K. Hong, and S. Lee, RDFNet: RGB-D multi-level residual feature fusion for indoor semantic segmentation, Proceedings of the IEEE International Conference on Computer Vision, pp.4980-4989, 2017.

J. Jiang, Z. Zhang, Y. Huang, and L. Zheng, Incorporating depth into both CNN and CRF for indoor semantic segmentation, Conference on Software Engineering and Service Science (ICSESS), pp.525-530, 2017.

Y. Li, J. Zhang, Y. Cheng, K. Huang, and T. Tan, Semantics-guided multilevel RGB-D feature fusion for indoor semantic segmentation, Proceedings - International Conference on Image Processing, pp.1262-1266, 2018.

D. Lin, G. Chen, D. Cohen-Or, P. A. Heng, and H. Huang, Cascaded Feature Network for Semantic Segmentation of RGB-D Images, Proceedings of the IEEE International Conference on Computer Vision, pp.1320-1328, 2017.

J. Jiang, L. Zheng, F. Luo, and Z. Zhang, RedNet: Residual Encoder-Decoder Network for indoor RGB-D Semantic Segmentation, 2018.

H. Blum, A. Gawel, R. Siegwart, and C. Cadena, Modular sensor fusion for semantic segmentation, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.3670-3677, 2018.

L. Xu, A. Krzyzak, and C. Y. Suen, Methods of combining multiple classifiers and their applications to handwriting recognition, IEEE transactions on systems, man, and cybernetics, vol.22, pp.418-435, 1992.

N. Piasco, D. Sidibé, V. Gouet-Brunet, and C. Demonceaux, Learning scene geometry for visual localization in challenging conditions, International Conference on Robotics and Automation, ICRA 2019, pp.9094-9100, 2019.

J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara et al., Speed/accuracy trade-offs for modern convolutional object detectors, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.7310-7311, 2017.

A. Holliday, M. Barekatain, J. Laurmaa, C. Kandaswamy, and H. Prendinger, Speedup of deep learning ensembles for semantic segmentation using a model compression technique, Comput. Vis. Image Underst, vol.164, pp.16-26, 2017.

M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, The PASCAL visual object classes (VOC) challenge, International journal of computer vision, vol.88, pp.303-338, 2010.

M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler et al., The Cityscapes Dataset for Semantic Urban Scene Understanding, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.3213-3223, 2016.

N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, Indoor segmentation and support inference from RGBD images, European Conference on Computer Vision, pp.746-760, 2012.

S. Song, S. P. Lichtenberg, and J. Xiao, SUN RGB-D: A RGB-D scene understanding benchmark suite, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.567-576, 2015.

J. Shotton, J. Winn, C. Rother, and A. Criminisi, Textonboost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation, European conference on computer vision, pp.1-15, 2006.

R. Mottaghi, X. Chen, X. Liu, N. Cho, S. Lee et al., The role of context for object detection and semantic segmentation in the wild, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.891-898, 2014.

X. Chen, R. Mottaghi, X. Liu, S. Fidler, R. Urtasun et al., Detect what you can: Detecting and representing objects using holistic models and body parts, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1971-1978, 2014.

B. Hariharan, P. Arbeláez, L. Bourdev, S. Maji, and J. Malik, Semantic contours from inverse detectors, Proceedings of the IEEE International Conference on Computer Vision, pp.991-998, 2011.

T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona et al., European conference on computer vision, pp.740-755, 2014.

S. Gould, R. Fulton, and D. Koller, Decomposing a scene into geometric and semantically consistent regions, 2009 IEEE 12th international conference on computer vision, pp.1-8, 2009.

G. J. Brostow, J. Shotton, J. Fauqueur, and R. Cipolla, Segmentation and recognition using structure from motion point clouds, European conference on computer vision, pp.44-57, 2008.

A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, Vision meets robotics: The KITTI dataset, International Journal of Robotics Research, vol.32, pp.1231-1237, 2013.

G. Ros, L. Sellart, J. Materzynska, D. Vazquez, A. M. Lopez et al., A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.3234-3243, 2016.

S. R. Richter, V. Vineet, S. Roth, and V. Koltun, Playing for data: Ground truth from computer games, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol.9906, pp.102-118, 2016.

B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso et al., Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, pp.5122-5130, 2017.

G. Neuhold, T. Ollmann, S. Bulò, and P. Kontschieder, The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes

O. Zendel, K. Honauer, M. Murschitz, D. Steininger, and G. F. Dominguez, WildDash-creating hazard-aware benchmarks, Proceedings of the European Conference on Computer Vision (ECCV), pp.402-416, 2018.

H. Yin and C. Berger, When to use what data set for your self-driving car algorithm: An overview of publicly available driving datasets, IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, pp.1-8, 2018.

O. Zendel, M. Murschitz, M. Zeilinger, D. Steininger, S. Abbasi et al., RailSem19: A dataset for semantic rail scene understanding, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.

A. Dai, A. X. Chang, M. Savva, M. Halber, T. Funkhouser et al., ScanNet: Richly-annotated 3D reconstructions of indoor scenes, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.5828-5839, 2017.

Q. Ha, K. Watanabe, T. Karasawa, Y. Ushiku, and T. Harada, MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes, IEEE International Conference on Intelligent Robots and Systems, vol.2017, pp.5108-5115, 2017.

W. Treible, P. Saponaro, Y. Liu, A. D. Gupta, V. Veerendraveer et al., RANUS: RGB and NIR urban scene dataset for deep scene parsing, CVPR Workshops, vol.3, pp.1808-1815, 2018.

S. S. Shivakumar, N. Rodrigues, A. Zhou, I. D. Miller, V. Kumar et al., PST900: RGB-Thermal Calibration, Dataset and Segmentation Network, 2019.

P. Kirsanov, A. Gaskarov, F. Konokhov, K. Sofiiuk, A. Vorontsova et al., DISCOMAN: Dataset of Indoor SCenes for Odometry, Mapping And Navigation, 2019.

M. Schwarz, H. Schulz, and S. Behnke, Rgb-d object recognition and pose estimation based on pre-trained convolutional neural network features, 2015 IEEE international conference on robotics and automation (ICRA), pp.1329-1335, 2015.

M. Velte, Semantic image segmentation combining visible and near-infrared channels with depth information, 2015.

X. Jin, Q. Jiang, S. Yao, D. Zhou, R. Nie et al., A survey of infrared and visual image fusion methods, Infrared Physics & Technology, vol.85, pp.478-501, 2017.

Y. Choi, N. Kim, S. Hwang, K. Park, J. S. Yoon et al., KAIST Multi-Spectral Day/Night Data Set for Autonomous and Assisted Driving, IEEE Transactions on Intelligent Transportation Systems, vol.19, pp.934-948, 2018.

C. Li, D. Song, R. Tong, and M. Tang, Illumination-aware Faster R-CNN for robust multispectral pedestrian detection, Pattern Recognition, vol.85, pp.161-171, 2019.

S. Farokhi, J. Flusser, and U. U. Sheikh, Near infrared face recognition: A literature survey, Computer Science Review, vol.21, pp.1-17, 2016.

D. Jang and R. Park, Colour image dehazing using near-infrared fusion, IET Image Processing, vol.11, pp.587-594, 2017.

F. Dümbgen, M. E. Helou, N. Gucevska, and S. Süsstrunk, Near-infrared fusion for photorealistic image dehazing, Electronic Imaging, pp.321-322, 2018.

Y. Benezeth, D. Sidibé, and J. Thomas, Background subtraction with multispectral video sequences, 2014.

R. Gade and T. Moeslund, Thermal cameras and applications: A survey, vol.25, pp.245-262, 2014.

L. B. Wolff, Polarization vision: A new sensory approach to image understanding, Image and Vision Computing, vol.15, pp.81-93, 1997.

L. B. Wolff and T. E. Boult, Constraining object features using a polarization reflectance model, IEEE Trans. Pattern Anal. Mach. Intell, vol.13, pp.635-657, 1991.

Y. Y. Schechner, S. G. Narasimhan, and S. K. Nayar, Instant dehazing of images using polarization, Proceedings of the Conference on Computer Vision and Pattern Recognition. CVPR, pp.I-I, 2001.

W. A. Smith, R. Ramamoorthi, and S. Tozza, Linear depth estimation from an uncalibrated, monocular polarisation image, European Conference on Computer Vision, pp.109-125, 2016.

D. Zhu and W. A. Smith, Depth from a polarisation + rgb stereo pair, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

M. Blanchon, O. Morel, Y. Zhang, R. Seulin, N. Crombez et al., Outdoor Scenes Pixel-wise Semantic Segmentation using Polarimetry and Fully Convolutional Network, 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, vol.5, pp.328-335, 2019.

D. Sun, X. Huang, and K. Yang, A multimodal vision sensor for autonomous driving, Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies III, vol.11166, p.111660, 2019.

S. Qiu, Q. Fu, C. Wang, and W. Heidrich, Polarization demosaicking for monochrome and color polarization focal plane arrays, 2019.

N. A. Rubin, G. D'Aversa, P. Chevalier, Z. Shi, W. T. Chen et al., Matrix Fourier optics enables a compact full-Stokes polarization camera, Science, vol.365, p.1839, 2019.

K. Yang, L. M. Bergasa, E. Romera, X. Huang, and K. Wang, Predicting Polarization beyond Semantics for Wearable Robotics, vol.2018, pp.96-103, 2019.

M. Rastgoo, C. Demonceaux, R. Seulin, and O. Morel, Attitude estimation from polarimetric cameras, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.8397-8403, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01865236

E. Fernandez-Moral, R. Martins, D. Wolf, and P. Rives, A new metric for evaluating semantic segmentation: leveraging global and contour accuracy, 2018 IEEE Intelligent Vehicles Symposium (IV), pp.1051-1056, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01581525

M. Sokolova and G. Lapalme, A systematic analysis of performance measures for classification tasks, Information Processing and Management, vol.45, pp.427-437, 2009.

F. Chollet, Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, pp.1800-1807, 2017.

L. Fan, W. Wang, F. Zha, and J. Yan, Exploring new backbone and attention module for semantic segmentation in street scenes, IEEE Access, vol.6, pp.71566-71580, 2018.

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., ImageNet: A Large-Scale Hierarchical Image Database, p.9, 2009.

A. Kendall, V. Badrinarayanan, and R. Cipolla, Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding, British Machine Vision Conference, 2017.

G. Lin, C. Shen, A. van den Hengel, and I. Reid, Exploring context with deep structured models for semantic segmentation, IEEE transactions, vol.40, pp.1352-1366, 2017.

L. M. Bergasa, R. Arroyo, E. Romera, and M. Alvarez, Efficient ConvNet for Real-time Semantic Segmentation, IEEE Intelligent Vehicles Symposium, Proceedings, IV, pp.1789-1794, 2017.

H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, Pyramid scene parsing network, Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, pp.6230-6239, 2017.

P. Wang, P. Chen, Y. Yuan, D. Liu, Z. Huang et al., Understanding convolution for semantic segmentation, 2018 IEEE winter conference on applications of computer vision (WACV), pp.1451-1460, 2018.