Sentiment Analysis from Sound Spectrograms via Soft BoVW and Temporal Structure Modelling

Scientific publication - Conference paper uoadl:3188490

Unit:
NKUA research material
Title:
Sentiment Analysis from Sound Spectrograms via Soft BoVW and Temporal
Structure Modelling
Document languages:
English
Abstract:
Monitoring and analysis of human sentiments is currently one of the
hottest research topics in the field of human-computer interaction,
having many applications. However, in order to become practical in daily
life, sentiment recognition techniques should analyze data collected in
an unobtrusive way. For this reason, analyzing audio signals of human
speech (as opposed to, say, biometrics) is considered key to potential
emotion recognition systems. In this work, we expand upon previous
efforts to analyze speech signals using computer vision techniques on
their spectrograms. In particular, we utilize ORB descriptors on
keypoints distributed on a regular grid over the spectrogram to obtain
an intermediate representation. Firstly, a technique similar to
Bag-of-Visual-Words (BoVW) is used, where a visual vocabulary is created
by clustering keypoint descriptors, but a soft candidacy score is used
instead of hard assignment to construct the histogram descriptors of the signal.
Furthermore, a technique which takes into account the temporal structure
of the spectrograms is examined, allowing for effective model
regularization. Both of these techniques are evaluated on several
popular emotion recognition datasets, with results indicating an
improvement over the simple BoVW method.
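
The soft histogram step described in the abstract can be illustrated with a minimal sketch. The record does not give the exact form of the soft candidacy score, so the softmax over negative Hamming distances used here, the `beta` sharpness parameter, and the function names `hamming` and `soft_histogram` are all assumptions made for illustration, not the authors' formulation. Descriptors are modelled as short byte strings, standing in for binary ORB descriptors, and the vocabulary is taken as already computed.

```python
import math


def hamming(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def soft_histogram(descriptors, vocabulary, beta=0.1):
    """Build a normalized histogram from soft word memberships.

    Assumption (not from the paper): the membership of descriptor d in
    visual word w_k is softmax_k(-beta * hamming(d, w_k)).
    """
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        dists = [hamming(d, w) for w in vocabulary]
        nearest = min(dists)
        # Subtract the smallest distance before exponentiating for stability.
        scores = [math.exp(-beta * (dist - nearest)) for dist in dists]
        total = sum(scores)
        for k, score in enumerate(scores):
            hist[k] += score / total
    n = len(descriptors)
    # Average over descriptors so the histogram sums to 1.
    return [h / n for h in hist] if n else hist


# Hypothetical toy data: three 8-bit visual words and three 8-bit descriptors.
vocabulary = [bytes([0b00000000]), bytes([0b11111111]), bytes([0b00001111])]
descriptors = [bytes([0b00000001]), bytes([0b00000111]), bytes([0b11111110])]

hist = soft_histogram(descriptors, vocabulary, beta=0.5)
```

In this sketch every descriptor contributes to all histogram bins, weighted by closeness, whereas plain BoVW would add a full count to the nearest word only. As `beta` grows, the weights concentrate on the nearest word and the result approaches the hard-assignment histogram.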
Publication year:
2020
Authors:
Pikramenos, George
Smyrnis, Georgios
Vernikos, Ioannis
Konidaris, Thomas
Spyrou, Evaggelos
Perantonis, Stavros
Publisher:
SCITEPRESS
Conference title:
ICPRAM: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON PATTERN
RECOGNITION APPLICATIONS AND METHODS
Pages:
361-369
Keywords:
Sentiment Analysis; Speech Analysis; Bag-of-Visual-Words
Official URL (Publisher):
DOI:
10.5220/0009174503610369
The digital material of this item is not available.