
Comparing Fusion Strategies for Multimodal Emotion Prediction Using Deep Physiological Features

Tamantini C.; Orlandini A.; Fracasso F.
2025

Abstract

Recognizing affective states from physiological signals is essential for enabling emotion-aware systems, particularly in human-robot interaction. This paper presents a hybrid deep learning framework for multimodal emotion recognition that integrates deep feature extraction with handcrafted physiological descriptors. The system processes electrocardiogram, photoplethysmogram, and galvanic skin response signals to predict arousal and valence in a continuous regression setting. To this aim, we evaluate two fusion strategies, feature-level and decision-level fusion, using two public affective datasets (AMIGOS and DEAP). Features extracted from each modality via a shared one-dimensional convolutional neural network, together with signal-specific physiological metrics, are either concatenated (feature-level fusion) or modeled separately and combined at the prediction level (decision-level fusion). A broad set of machine learning regressors, including boosting methods and tree ensembles, is explored. Experiments were conducted with a leave-one-subject-out cross-validation protocol to assess generalization across users. Results show that feature-level fusion generally outperforms decision-level fusion, achieving the best root mean square error of 0.089 for arousal and 0.053 for valence. Statistical analyses confirm the significance of these differences, particularly favoring adaptive boosting and random forest under feature fusion. The proposed architecture offers a robust and interpretable solution for physiological emotion recognition and provides a solid foundation for real-time applications in emotion-aware social robotics and human-centered adaptive systems.
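The two fusion strategies the abstract contrasts can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature dimensions, the synthetic data, and the choice of a random forest regressor stand in for the paper's CNN features, handcrafted descriptors, and regressor pool.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical per-modality features: 64 deep (CNN) features plus
# 8 handcrafted physiological descriptors per signal, for 200 windows.
n = 200
ecg = np.hstack([rng.normal(size=(n, 64)), rng.normal(size=(n, 8))])
ppg = np.hstack([rng.normal(size=(n, 64)), rng.normal(size=(n, 8))])
gsr = np.hstack([rng.normal(size=(n, 64)), rng.normal(size=(n, 8))])
arousal = rng.uniform(size=n)  # continuous regression target

# Feature-level fusion: concatenate all modality features and fit
# a single regressor on the joint representation.
X_feat = np.hstack([ecg, ppg, gsr])
model_feat = RandomForestRegressor(n_estimators=20, random_state=0)
model_feat.fit(X_feat, arousal)

# Decision-level fusion: fit one regressor per modality and combine
# the per-modality predictions (here, by simple averaging).
preds = []
for X_mod in (ecg, ppg, gsr):
    m = RandomForestRegressor(n_estimators=20, random_state=0)
    m.fit(X_mod, arousal)
    preds.append(m.predict(X_mod))
y_decision = np.mean(preds, axis=0)
```

In practice the evaluation would wrap either strategy in a leave-one-subject-out split (e.g. scikit-learn's `LeaveOneGroupOut` with subject IDs as groups) so that test subjects are never seen during training.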
Istituto di Scienze e Tecnologie della Cognizione - ISTC
Affective Computing
Deep Feature Extraction
Emotion Recognition
Physiological Signals
Files in this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/562024
Warning: the displayed data have not been validated by the institution.

Citations
  • PMC: not available
  • Scopus: 1
  • Web of Science: not available