Journal articles

Learnable pooling weights for facial expression recognition

Abstract : Pooling layers are spatial down-sampling layers used in convolutional neural networks (CNNs) to gradually downscale the feature map, increase the receptive field size and reduce the number of parameters in the model. Pooling layers lower computational complexity and memory consumption, but they also introduce invariance to certain filter distortions, which may cause the loss of subtle details. This behaviour is undesirable for some fine-grained recognition tasks such as facial expression recognition (FER), which relies heavily on detecting specific regional distortions. In this paper, we introduce a pooling layer based on kernel functions that is more aware of filter distortions. The proposed pooling reduces the feature map dimensions while keeping track of most of the information fed to the next layer instead of ignoring part of it. Experiments on the RAF, FER2013 and ExpW databases demonstrate the benefits of such a layer and show that our model achieves results competitive with state-of-the-art approaches.
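To make the contrast in the abstract concrete, here is a minimal NumPy sketch of the general idea of pooling with learnable weights. It is not the kernel-function layer proposed in the paper, whose details this record does not give. It assumes non-overlapping k x k windows and a single weight vector shared by all windows and normalized with a softmax. The names `extract_windows`, `max_pool`, `weighted_pool` and `logits` are hypothetical and chosen for illustration only.

```python
import numpy as np


def extract_windows(x, k):
    """Split an (H, W) feature map into non-overlapping k x k windows.

    Returns an array of shape (H // k, W // k, k * k); rows and columns
    that do not fill a complete window are dropped.
    """
    h_out, w_out = x.shape[0] // k, x.shape[1] // k
    x = x[: h_out * k, : w_out * k]
    windows = x.reshape(h_out, k, w_out, k).transpose(0, 2, 1, 3)
    return windows.reshape(h_out, w_out, k * k)


def max_pool(x, k=2):
    """Standard max pooling: keeps one value per window, discards the rest."""
    return extract_windows(x, k).max(axis=-1)


def weighted_pool(x, logits, k=2):
    """Pooling as a weighted sum over every value in each window.

    `logits` is a vector of k * k trainable parameters (an assumption of
    this sketch). The softmax keeps the weights positive and summing to
    one, so each output is a convex combination of the window's values.
    """
    weights = np.exp(logits - np.max(logits))
    weights = weights / weights.sum()
    return extract_windows(x, k) @ weights
```

With all-zero `logits` the weights are uniform and `weighted_pool` reduces to average pooling. A very large logit at one position makes the layer select that position in every window. Training the logits lets the layer settle anywhere between these extremes, so every input value can contribute to the output.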
Contributor : Frédéric Davesne
Submitted on : Friday, October 9, 2020 - 11:50:56 PM
Last modification on : Monday, December 13, 2021 - 9:17:17 AM



M. Amine Mahmoudi, Aladine Chetouani, Fatma Boufera, Hedi Tabia. Learnable pooling weights for facial expression recognition. Pattern Recognition Letters, 2020, 138, pp. 644-650. ⟨10.1016/j.patrec.2020.09.001⟩. ⟨hal-02963286⟩