Computer-human interaction is more frequent now than ever before; thus, the main goal of this research area is to make communication with computers as natural as possible. A key aspect of achieving such interaction is the affective component, often missing from the developments of the last decade. To improve computer-human interaction, in this paper we present a method to convert the discrete (categorical) output of a CNN emotion classifier trained on Mel-scale spectrograms into a two-dimensional model, pursuing the integration of the human voice as a feature in multimodal emotional inference frameworks. Lastly, we discuss preliminary results obtained by presenting audiovisual stimuli to different subjects and comparing the dimensional arousal-valence results with their SAM surveys.
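One common way to map discrete classifier outputs to a dimensional model is a probability-weighted average over fixed arousal-valence anchor points, one per emotion class. The sketch below illustrates this idea only; the anchor coordinates, label set, and function names are hypothetical assumptions loosely inspired by Russell's circumplex model, not the paper's calibrated values or actual method.

```python
import numpy as np

# Hypothetical (valence, arousal) anchor coordinates per categorical
# emotion, loosely based on Russell's circumplex model. Illustrative
# only; the paper's actual calibration may differ.
AV_ANCHORS = {
    "anger":     (-0.6,  0.7),
    "happiness": ( 0.8,  0.5),
    "sadness":   (-0.7, -0.4),
    "fear":      (-0.6,  0.6),
    "neutral":   ( 0.0,  0.0),
}
LABELS = list(AV_ANCHORS)

def softmax_to_arousal_valence(probs):
    """Map a CNN softmax distribution over categorical emotions to a
    point in the two-dimensional arousal-valence space, computed as
    the probability-weighted average of the anchor coordinates."""
    anchors = np.array([AV_ANCHORS[label] for label in LABELS])  # (K, 2)
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()          # guard against unnormalized input
    valence, arousal = probs @ anchors   # weighted sum over anchors
    return float(valence), float(arousal)

# Example: a clip classified mostly as "happiness" with some "neutral".
v, a = softmax_to_arousal_valence([0.05, 0.70, 0.05, 0.05, 0.15])
print(f"valence={v:+.2f}, arousal={a:+.2f}")
```

This weighted-average design keeps the mapping continuous, so small changes in the classifier's confidence move the predicted point smoothly through the arousal-valence plane rather than jumping between category anchors.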
General information
Presentation date: 2021
Publication date: 2021
Document language: English
Event: IX Jornadas de Cloud Computing, Big Data & Emerging Topics (virtual, June 22-25, 2021)
Except where explicitly stated otherwise, this item is published under the following Creative Commons license: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)