Basic quantitative characteristics of the Modern Greek language using the Hellenic National Corpus

Scientific publication - Journal article uoadl:2995279

Unit:
Research material of NKUA (ΕΚΠΑ)
Title:
Basic quantitative characteristics of the Modern Greek language using the Hellenic National Corpus
Document languages:
English
Abstract:
Modern Greek is one of the least quantitatively studied modern European languages, and the goal of this paper is to fill this relative void. We use the Hellenic National Corpus (HNC), a growing corpus that currently includes 33 million words. The corpus and all the tools used in our work were developed by the Institute for Language and Speech Processing (ILSP). In this paper we focus on three main areas: the lists of the 1000 most common words and lemmas, word length, and letter frequency. We also make some comparisons with earlier work, in which we used the previous 13-million-word edition of the HNC. © Taylor & Francis Group Ltd.
Year of publication:
2005
Authors:
Mikros, G.
Hatzigeorgiu, N.
Carayannis, G.
Journal:
Journal of Quantitative Linguistics
Volume:
12
Number / issue:
2-3
Pages:
167-184
Official URL (Publisher):
DOI:
10.1080/09296170500172478
The digital material of this item is not available.
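
The abstract names three kinds of measurement: word frequency lists, word length, and letter frequency. As a rough illustration of what such counts involve, here is a minimal Python sketch on a one-sentence toy sample. It is not the ILSP tooling and does not use the HNC. The function names, the regex tokenizer, and the decision to fold accented vowels into their base letters are all assumptions made for this example, and lemma counts are left out because they would require a lemmatizer.

```python
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Return lowercase word tokens (runs of letters) from the text."""
    # Normalize to NFC first so accented letters are single code points.
    normalized = unicodedata.normalize("NFC", text)
    return re.findall(r"[^\W\d_]+", normalized.lower())


def strip_accents(word):
    """Drop combining marks so that 'ώ' and 'ω' count as the same letter.

    This is an assumption of the sketch; it also removes the dialytika.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def basic_counts(text):
    """Count word forms, word lengths (in letters), and letters."""
    words = tokenize(text)
    word_freq = Counter(words)
    length_freq = Counter(len(w) for w in words)
    letter_freq = Counter(ch for w in words for ch in strip_accents(w))
    return word_freq, length_freq, letter_freq


# Toy sample, invented for illustration only.
sample = "Η γλώσσα είναι ζωντανή και η γλώσσα αλλάζει"
word_freq, length_freq, letter_freq = basic_counts(sample)

mean_length = sum(n * c for n, c in length_freq.items()) / sum(length_freq.values())

print(word_freq.most_common(3))
print(sorted(length_freq.items()))
print(letter_freq.most_common(5))
print(mean_length)
```

On a real corpus the same counters would be fed from files rather than a string, and choices such as case folding, accent handling, and the treatment of numerals or abbreviations would need to follow whatever conventions the study itself adopted.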