Analytics and Job market: NLP, clustering and statistical analysis on data related job openings

Postgraduate Thesis uoadl:2885415

Unit:
Specialization in Economic, Administrative and Information Systems for Businesses
Library of the Faculty of Economics and of the Faculty of Business Administration
Deposit date:
2019-11-12
Year:
2019
Author:
Mitsika Marina
Supervisors info:
Sotiris Papakonstantinou (Σωτήρης Παπακωνσταντίνου), PhD, Department of Economics, National and Kapodistrian University of Athens
Original Title:
Analytics and Job market: NLP, clustering and statistical analysis on data related job openings
Languages:
English
Translated title:
Analytics and Job market: NLP, clustering and statistical analysis on data related job openings
Summary:
This dissertation studies data-related jobs and is divided into two main parts. The first presents the theoretical approach and the research carried out on the characteristics of data-related jobs as they take shape in the global job market. The second attempts to analyse and classify a sample of job postings according to their titles and required qualifications, by applying text processing techniques and clustering methodologies.
The purpose of classifying the postings is to perform a statistical analysis of the demand for the technologies and methodologies that appear in each category.
The study aims to explore and answer questions about the criteria that companies regard as prerequisites for a candidate to be selected over others and hired. In particular, it examines the technologies, methodologies, type of studies and educational level that every applicant should possess.
This thesis attempts to answer these questions by implementing text processing techniques (Bag of Words), employing clustering algorithms (K-means clustering, hierarchical clustering) and carrying out a statistical analysis of the resulting groups using frequency diagrams.
It should be stated at the outset that the case study results confirm, to a satisfactory degree, the assumptions presented in the first, theoretical part. That is, the analysis of the chosen sample shows that the requirements of certain data-related job postings are highly similar, that the technologies are accurately classified into the posting categories, and that the number of ads is increasing significantly over time.
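As a rough illustration of the kind of pipeline the summary describes (Bag of Words, K-means, then term frequencies per cluster), the following minimal sketch may help. It is not the thesis's implementation: the postings, the function names (`bag_of_words`, `k_means`) and the choice of initial centroids are all hypothetical, the code uses only the Python standard library, and hierarchical clustering is not covered.

```python
from collections import Counter

# Hypothetical job-posting texts; NOT taken from the thesis dataset.
postings = [
    "data scientist python machine learning",
    "machine learning engineer python",
    "data analyst sql excel reporting",
    "business analyst sql excel",
]


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, initial_centroids, max_iter=100):
    """Plain K-means: assign each vector to its nearest centroid, then
    recompute each centroid as the mean of its members, until the
    assignments stop changing or max_iter is reached."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


vocab, vectors = bag_of_words(postings)

# Assumption: seed the two clusters with the first and third postings
# so that the sketch is deterministic (real runs usually seed randomly).
labels, centroids = k_means(vectors, [vectors[0], vectors[2]])

# Term frequencies per cluster, the raw data behind a frequency diagram.
cluster_terms = {}
for doc, label in zip(postings, labels):
    cluster_terms.setdefault(label, Counter()).update(doc.split())

for label in sorted(cluster_terms):
    print(label, cluster_terms[label].most_common(3))
```

With these toy inputs the two machine-learning postings fall into one cluster and the two analyst postings into the other; the per-cluster counters are what a frequency diagram would then plot.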
Main subject category:
Technology - Computer science
Keywords:
Clustering, K-means, Hierarchical, NLP, Job Market, Analytics
Index:
No
Number of index pages:
0
Contains images:
Yes
Number of references:
27
Number of pages:
58
Marina Mitsika Thesis.pdf (2 MB)