Web application for morphosyntactic recognition of verb forms «VerbTagGr++»

Graduate Thesis uoadl:1324165

Unit:
Division of Computer Systems and Applications
Library of the School of Science
Deposit date:
2016-03-11
Year:
2016
Author:
Ζέρβα Ιωάννα
Κωλέτση Βασιλική
Μωραγιάννη Νικολέτα
Supervisors info:
Αφροδίτη Τσαλγατίδου, Μαρία Γρηγοριάδου, Πηνελόπη Λεμπέση
Original Title:
Διαδικτυακό Περιβάλλον Στατιστικής Μορφοσυντακτικής Αναγνώρισης των Μονολεκτικών Ρηματικών Τύπων της Νέας Ελληνικής «VERBTAGGR++»
Languages:
Greek
Translated title:
Web application for morphosyntactic recognition of verb forms «VerbTagGr++»
Summary:
This BA thesis presents the implementation of a web-based tool that performs
morphosyntactic recognition of one-word Modern Greek verb forms and is
available in Greek and English. Its linguistic data originate in Penelope
Lembessi’s PhD dissertation, "Statistical Morphosyntactic Disambiguation and
Lemmatization of the Modern Greek Verbal Class" (Marc Bloch University,
Strasbourg, 2005). The thesis builds on Stauroula Kroustalli’s thesis, "Web
application for morphosyntactic recognition of verb forms «VerbTagGr»", whose
tool included only the data for verb forms ending in unaccented '-α' and
'-ν'. The present thesis includes all verbs, so recognition is provided for
all forms.
Although the tool currently covers only the morphosyntactic recognition of
verb forms, a future extension to lemmatization is foreseen. For every input
verb, the tool therefore provides the data from Lembessi’s PhD dissertation
that are needed to produce the canonical form (lemma). A first approach to
lemmatization has been made, but further work is needed before it is
functional. In addition, dedicated software has been implemented to process
the linguistic data and check their correctness. The technologies used are
Java (J2SE 1.4.2), JSP, Web Services, MySQL and Tomcat.
Keywords:
Morphosyntactic, Computational Linguistics, Web Service, Lemmatization
Index:
Yes
Number of index pages:
115, 116
Contains images:
Yes
Number of references:
16
Number of pages:
117
document.pdf (5 MB)
attachments.zip
2 MB
File access is restricted.