Clusterix: A visual analytics approach to clustering

Graduate Thesis uoadl:1324498

Unit:
Division of Computer Systems and Applications
Library of the School of Science
Deposit date:
2016-07-22
Year:
2016
Author:
Κουτσάκης Ηλίας
Supervisors info:
Παναγιώτης Σταματόπουλος
Original Title:
Clusterix: A visual analytics approach to clustering
Languages:
English
Translated title:
Clusterix: Συσταδοποίηση δεδομένων με οπτικοποίηση
Summary:
In this thesis I present Clusterix, a web-based visual analytics tool that
supports clustering tasks by placing the analyst at the center of the
workflow. Clusterix provides the facilities to:

- load and preview CSV data;
- select columns to be used by the clustering algorithm and modify weights;
- select and run one or more clustering algorithms (K-Means, Hierarchical
Clustering) with varying parameters;
- view and interact with the results in a browser environment;
- modify the parameters or input data to correct the clustering output.

Such an iterative, visual analytics approach allows users to quickly determine
the best clustering algorithm and parameters for their data, and to correct any
errors in the clustering output. Clusterix has been applied to heterogeneous
data sets, in particular to clustering author affiliations in publications for
a recommendation system on InspireHEP, the largest High Energy Physics library
in the world, based at CERN.
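
The workflow described in this summary (load CSV data, select and weight
columns, run a clustering algorithm with varying parameters) can be illustrated
with a minimal sketch. This is not the Clusterix implementation: the function
names, the sample data, the per-column weighting scheme, and the choice to seed
centroids with the first k rows are all assumptions made for illustration, and
only the Python standard library is used.

```python
import csv
import io
import math

# Hypothetical sample data; the "name" column is deliberately left unselected.
CSV_TEXT = """x,y,name
1.0,1.0,a
5.0,5.1,c
1.2,0.8,b
5.2,4.9,d
"""

def load_rows(csv_text, columns):
    """Parse CSV text and keep only the selected numeric columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [[float(row[c]) for c in columns] for row in reader]

def weighted_distance(a, b, weights):
    """Euclidean distance with one (assumed) weight per selected column."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def kmeans(points, k, weights, iterations=20):
    """Plain Lloyd-style K-Means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: weighted_distance(p, centroids[j], weights))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Select columns and weights, then run the algorithm with varying k,
# mirroring the "varying parameters" step of the workflow.
points = load_rows(CSV_TEXT, ["x", "y"])
weights = [1.0, 1.0]
results = {k: kmeans(points, k, weights) for k in (1, 2)}

for k, labels in results.items():
    print(k, labels)
```

In an interactive tool the analyst would inspect each labelling visually and
then adjust the weights, the selected columns, or k and rerun. K-Means is
sensitive to its starting centroids, so a different row order or seeding
strategy can produce a different clustering.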
Keywords:
visualization, clustering, machine learning, diagram, data analysis
Index:
Yes
Number of index pages:
10-11
Contains images:
Yes
Number of references:
47
Number of pages:
55
document.pdf (1 MB)