Unit:
Department of Informatics and Telecommunications; Informatics
Author:
KARAGEORGOU IOANNA
Supervisors info:
Alexis Delis (Αλέξης Δελής), Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens (NKUA)
Original Title:
Just-in-time Sentiment Analysis for Multilingual Streams
Translated title:
English
Summary:
The growth of social-media platforms has been remarkable in terms of both number of
users and volume of content generated. As citizens tend to freely express their sentiments
on social platforms, Twitter has inherently become an indispensable source of public
discourse on a wide variety of topics. Carrying out sentiment analysis in a timely manner
on streamed tweets is undoubtedly a demanding endeavor. In this thesis, we propose a
Spark-based Twitter sentiment analysis software architecture that receives online multilingual
streamed messages and compiles analytics. We outline the main elements of our proposal
and discuss how they collectively help address the challenges involved in this big-data
processing task. In particular, our framework: i) exploits the Spark machine-learning library
to classify Greek, French and English tweets in a timely manner, ii) manages streamed
tweets in synergy with contemporary queuing and in-memory data systems, and iii) determines
with high accuracy whether a sentiment is expressed by a genuine account.
Main subject category:
Technology - Computer science
Keywords:
apache spark, spark streaming, machine learning, multilingual data