Conference papers

Speech Emotion Recognition using Time-frequency Random Circular Shift and Deep Neural Networks

Abstract : This paper addresses the problem of emotion recognition from a speech signal. We investigate a data augmentation technique based on a circular shift of the input time-frequency representation, which significantly improves the emotion prediction results obtained with a deep convolutional neural network. After investigating the best combination of the method's parameters, we compare several neural network architectures (AlexNet, ResNet and Inception) using our approach on two publicly available datasets: eNTERFACE05 and EMO-DB. Our results show improved prediction accuracy compared with a more complicated state-of-the-art technique based on Discriminant Temporal Pyramid Matching (DCNN-DTPM).
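The augmentation named in the abstract can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function name `random_circular_shift`, the `max_time_shift` and `max_freq_shift` parameters, and the choice to shift along both axes are assumptions made for illustration, since the abstract does not state which axes or shift ranges the paper uses.

```python
import numpy as np


def random_circular_shift(spec, rng, max_time_shift=None, max_freq_shift=0):
    """Return a copy of `spec` circularly shifted by random offsets.

    spec: 2-D array of shape (n_freq_bins, n_frames), e.g. a spectrogram.
    rng: a numpy.random.Generator, passed in so results are reproducible.
    max_time_shift, max_freq_shift: hypothetical bounds on the shift, in
        frames and frequency bins; None for the time axis means any shift.
    """
    n_freq, n_time = spec.shape
    if max_time_shift is None:
        max_time_shift = n_time - 1
    # Draw integer offsets uniformly from [-max, +max] for each axis.
    dt = int(rng.integers(-max_time_shift, max_time_shift + 1))
    df = int(rng.integers(-max_freq_shift, max_freq_shift + 1))
    # np.roll wraps values that leave one edge back in at the opposite edge,
    # so the output keeps the input's shape and contains the same values.
    return np.roll(spec, shift=(df, dt), axis=(0, 1))


rng = np.random.default_rng(0)
spec = np.arange(12, dtype=float).reshape(3, 4)  # toy 3 x 4 "spectrogram"
augmented = random_circular_shift(spec, rng, max_time_shift=2, max_freq_shift=1)
```

Because the shift is circular, no padding or cropping is needed, and each call yields a new training example with the same dimensions as the original input.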

https://hal.archives-ouvertes.fr/hal-03583535
Contributor : Dominique Fourer
Submitted on : Monday, February 21, 2022 - 7:58:13 PM
Last modification on : Tuesday, November 1, 2022 - 3:28:31 AM
Long-term archiving on : Sunday, May 22, 2022 - 7:17:47 PM

File

article.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03583535, version 1

Citation

Sylvain Xia, Dominique Fourer, Liliana Audin, Jean-Luc Rouas, Takaaki Shochi. Speech Emotion Recognition using Time-frequency Random Circular Shift and Deep Neural Networks. Speech Prosody 2022, May 2022, Lisbon, Portugal. ⟨hal-03583535⟩
