Supervisors info:
Ioannis Emiris, Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
Summary:
The prediction of payment delays for the monthly bills of long-term customers, with
or without contracts, is valuable for financial planning, cash-flow forecasting,
strategic decisions that reduce losses, and factoring in general. For small and
medium enterprises in particular, it has been estimated that up to half of their
invoices are paid late, which creates a significant problem. Estimating the expected
delay together with the corresponding probability allows customers to be ranked
according to the risk of loss. The type of product or service offered by the company
usually determines which features are available, and these features consequently
differ in importance for the prediction process.
Additionally, the volume of data collected from customers is often huge, distributed
across different databases, and of varying quality.

In this thesis, both classification (late, non-late) and regression models (days until the
bill is settled) are evaluated, using minimal information from the current bill and the
customer’s history. Furthermore, additional features are generated that summarize the
customer’s profile up to a specific date and capture recent trends, without including any
information not known at the time of issuing the bill. Thus, the focus is on the customer’s
behaviour, without the strict time component found in classic time-series.

Initially, basic machine learning algorithms that are often encountered in relevant
applications in the literature are evaluated; ensemble learning methods built on these
basic models are then tested. Finally, their performance is compared to that of
models that use classic time-series.
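
As an illustration of the profile features described in the summary, the following is a minimal sketch of how a customer's history could be summarized using only earlier bills. It is not the thesis's implementation: the function name history_features, the field names customer_id, issue_date and days_to_pay, and the chosen statistics are all hypothetical. It also assumes, for simplicity, that every earlier bill has already been settled when the next one is issued; otherwise the history would have to be filtered by settlement date as well.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def history_features(bills, window=3):
    """Build one feature row per bill from that customer's earlier bills only.

    `bills` is a list of dicts with the hypothetical keys
    customer_id, issue_date and days_to_pay.
    """
    past = defaultdict(list)  # customer_id -> payment delays of earlier bills
    rows = []
    ordered = sorted(bills, key=lambda b: (b["customer_id"], b["issue_date"]))
    for bill in ordered:
        delays = past[bill["customer_id"]]
        rows.append({
            "customer_id": bill["customer_id"],
            "issue_date": bill["issue_date"],
            "n_prev_bills": len(delays),
            # Long-run profile: mean delay over all earlier bills.
            "hist_mean_delay": mean(delays) if delays else None,
            # Recent trend: mean delay over the last `window` earlier bills.
            "recent_mean_delay": mean(delays[-window:]) if delays else None,
        })
        # The current bill's outcome is recorded only after its features are
        # built, so it can never leak into its own feature row.
        delays.append(bill["days_to_pay"])
    return rows


# Made-up example data, for illustration only.
example_bills = [
    {"customer_id": "A", "issue_date": date(2023, 1, 1), "days_to_pay": 10},
    {"customer_id": "A", "issue_date": date(2023, 2, 1), "days_to_pay": 20},
    {"customer_id": "A", "issue_date": date(2023, 3, 1), "days_to_pay": 30},
    {"customer_id": "A", "issue_date": date(2023, 4, 1), "days_to_pay": 40},
    {"customer_id": "B", "issue_date": date(2023, 1, 15), "days_to_pay": 5},
]

features = history_features(example_bills, window=2)
for row in features:
    print(row)
```

A customer's first bill has no history, so its two delay features are left as None; how such cold-start cases are handled (imputation, a separate model, or exclusion) is a modelling choice that this sketch leaves open.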
Keywords:
late payment prediction, customer profile, time-series, classification, regression