Modeling the acoustic representation of document typography through expressive synthetic speech for blind and sighted users

Doctoral Dissertation uoadl:1309192

Unit:
Track / specialization: Signal and Information Processing-Learning (ΕΜΠ)
Library of the School of Science
Deposit date:
2012-04-05
Year:
2012
Author:
Τσώνος Δημήτριος
Dissertation committee:
Associate Professor Georgios Kouroupetroglou
Original Title:
Μοντελοποίηση της ακουστικής αναπαράστασης της τυπογραφίας εγγράφων μέσω εκφραστικής συνθετικής ομιλίας για τυφλούς και βλέποντες
Languages:
Greek
Summary:
This dissertation addresses the sonification of the Visual Presentation
Elements in Documents (VPED) metadata during the transformation of documents to
speech. The approach comprises: a) the automatic extraction of the reader's
emotional states induced by the VPED and b) their acoustic rendition using
expressive, emotional synthetic speech. A novel architecture is proposed for
the multimodal universal accessibility of documents, regardless of their
natural language, content and culture, based on the automatic extraction of the
VPED-induced emotional states and the annotation of documents with this
information. A quantitative model is developed for the sonification of VPED
typographic alterations by: i) mathematically formulating the induced reader's
emotional state, based on the dimensional nature of emotions ("Pleasure",
"Arousal" and "Dominance"), and ii) mapping it onto alterations of the prosodic
characteristics of the expressive synthetic speech. To evaluate the prosodic
model, we examined whether listeners can acoustically recognize the typographic
alterations; the results were positive even for listeners without any prior
training. Evaluation of the model with sighted and blind primary-school
students shows an improvement of their performance during the didactic process.
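Illustrative sketch: the two-step quantitative model outlined above (typography metadata mapped to a Pleasure-Arousal-Dominance emotional state, which is then mapped to prosody alterations) can be sketched in Python as follows. The element names, PAD offsets and linear coefficients below are hypothetical placeholders chosen only to show the shape of such a mapping; they are not values from the dissertation.

    from dataclasses import dataclass

    @dataclass
    class PAD:
        """Dimensional emotional state, each dimension in [-1, +1]."""
        pleasure: float = 0.0
        arousal: float = 0.0
        dominance: float = 0.0

    # Hypothetical PAD offsets induced by a few typographic (VPED) elements.
    VPED_TO_PAD = {
        "bold":      PAD(pleasure=0.05, arousal=0.30, dominance=0.25),
        "italics":   PAD(pleasure=0.10, arousal=0.15, dominance=0.05),
        "heading":   PAD(pleasure=0.05, arousal=0.35, dominance=0.30),
    }

    def clip(x: float) -> float:
        # Keep every dimension inside the [-1, +1] range.
        return max(-1.0, min(1.0, x))

    def emotional_state(elements) -> PAD:
        """Accumulate the PAD offsets of the typographic elements of a text span."""
        state = PAD()
        for name in elements:
            delta = VPED_TO_PAD.get(name, PAD())
            state.pleasure = clip(state.pleasure + delta.pleasure)
            state.arousal = clip(state.arousal + delta.arousal)
            state.dominance = clip(state.dominance + delta.dominance)
        return state

    def prosody_changes(state: PAD) -> dict:
        """Map a PAD state to relative (%) changes of pitch, speech rate and
        volume of the synthetic voice; the coefficients are placeholders."""
        return {
            "pitch_percent":  20.0 * state.arousal + 5.0 * state.pleasure,
            "rate_percent":   15.0 * state.arousal,
            "volume_percent": 10.0 * state.dominance + 5.0 * state.arousal,
        }

    if __name__ == "__main__":
        span_typography = ["bold", "heading"]   # typography found on a text span
        state = emotional_state(span_typography)
        print("PAD state:", state)
        print("Prosody changes:", prosody_changes(state))

In such a scheme, the resulting percentage changes would be applied to the baseline prosodic settings of a speech synthesizer for the affected text span.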
Keywords:
Human-Computer Interaction, Universal Accessibility, Design-for-All, Emotions, Expressive Speech Synthesis
Index:
Yes
Number of index pages:
17, 165
Contains images:
Yes
Number of references:
213
Number of pages:
180
document.pdf (2 MB)