Disruptive Situations Detection on Public Transports through Speech Emotion Recognition

Mancini, Eleonora (2021) Disruptive Situations Detection on Public Transports through Speech Emotion Recognition. [Master's thesis], Università di Bologna, Degree Programme in Artificial intelligence [LM-DM270]

Full-text documents available:

PDF document (Thesis)
Available under licence: Subject to any broader permissions granted by the author, the thesis may be freely consulted, and a copy may be saved and printed strictly for personal purposes of study, research and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.
Download (3MB)

Abstract

In this thesis, we describe a study on the application of Machine Learning and Deep Learning methods to Voice Activity Detection (VAD) and Speech Emotion Recognition (SER). The study was carried out in the context of a European project whose objective is to detect disruptive situations on public transport. To this end, we developed an architecture, implemented a prototype and ran validation tests on a variety of options. The architecture consists of several modules. The denoising module was realized with a filter and the VAD module with an open-source toolkit, while the SER system was developed entirely in this thesis. For the SER architecture, we used two audio features (MFCC and RMS) and two kinds of classifiers, namely CNN and SVM, to detect emotions indicative of disruptive situations such as fighting or shouting. We aggregated several models through ensemble learning. The ensemble was evaluated on several datasets and showed encouraging experimental results, even when compared with state-of-the-art baselines. The code is available at: https://github.com/helemanc/ambient-intelligence
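As an illustration only, two of the ideas mentioned in the abstract (a frame-level RMS feature and the aggregation of several models' outputs) can be sketched in plain Python. This is not the code from the thesis repository: the function names `frame_rms` and `majority_vote`, the frame and hop lengths, the toy signal, the labels, and the choice of majority voting as the aggregation rule are all assumptions made for this sketch. The abstract does not state which ensemble strategy was used.

```python
import math
from collections import Counter


def frame_rms(samples, frame_length=4, hop_length=2):
    """Return the root-mean-square energy of each full frame of a mono signal.

    frame_length and hop_length are illustrative defaults, not thesis values.
    """
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")
    values = []
    # Slide a fixed-size window over the signal; trailing partial frames are dropped.
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frame = samples[start:start + frame_length]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_length))
    return values


def majority_vote(predictions):
    """Combine per-model labels by majority vote (assumed aggregation rule).

    Ties are resolved in favour of the tied label that appears first.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Toy signal: silence followed by a loud alternating segment.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0]
energies = frame_rms(signal)

# Hypothetical labels from three classifiers (for example two CNNs and one SVM).
votes = ["disruptive", "neutral", "disruptive"]
decision = majority_vote(votes)
```

In this toy run, the RMS value rises from the silent frame to the loud one, which is the kind of loudness cue that could help separate shouting from calm speech, and the three hypothetical votes are reduced to a single label.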

Document type

Degree thesis (Master's degree)

Thesis author

Mancini, Eleonora

Thesis supervisor

Torroni, Paolo

Thesis co-supervisors

Galassi, Andrea ; Ruggeri, Federico ; Escrig Escrig, Josep ; Huerta Casado, Ivan

School

Engineering and Architecture

Degree programme

Artificial intelligence [LM-DM270]

Degree programme regulation

DM270

Keywords

Speech Emotion Recognition, Speech Recognition, Voice Activity Detection, Machine Learning, Natural Language Processing, Deep Learning, Convolutional Neural Network, Support Vector Machine, MFCC

Thesis defence date

3 December 2021

URI

https://amslaurea.unibo.it/id/eprint/24721
