Domain- and Structure-Agnostic End-to-End Entity Resolution with JedAI

Scientific publication - Journal article uoadl:3070870

Unit:
NKUA (ΕΚΠΑ) research material
Title:
Domain- and Structure-Agnostic End-to-End Entity Resolution with JedAI
Document languages:
English
Abstract:
We present JedAI, a new open-source toolkit for end-to-end Entity Resolution. JedAI is domain-agnostic in the sense that it does not depend on background expert knowledge, applying seamlessly to data of any domain with minimal human intervention. JedAI is also structure-agnostic, as it can process any type of data, ranging from structured (relational) to semi-structured (RDF) and unstructured (free-text) entity descriptions. JedAI consists of two parts: (i) JedAI-core is a library of numerous state-of-the-art methods that can be mixed and matched to form (thousands of) end-to-end workflows, making it easy to benchmark their relative performance. (ii) JedAI-gui is a user-friendly desktop application that facilitates the composition of complex workflows via a wizard-like interface. It is suitable for both lay and power users, offering concrete guidelines and automatic configuration, as well as manual configuration options, visual exploration, and detailed statistics for each method's performance. In this paper, we also delve into the new features of JedAI's latest version (2.1), and demonstrate its performance experimentally. © 2020 Association for Computing Machinery. All rights reserved.
Publication year:
2020
Authors:
Papadakis, G.
Tsekouras, L.
Thanos, E.
Giannakopoulos, G.
Palpanas, T.
Koubarakis, M.
Journal:
ACM SIGMOD Record
Publisher:
Association for Computing Machinery
Volume:
48
Number / issue:
4
Pages:
30-36
Keywords:
Information systems; Software engineering; Automatic configuration; Configuration options; Desktop applications; Entity resolutions; Human intervention; Relative performance; State-of-the-art methods; Visual exploration; Benchmarking
DOI:
10.1145/3385658.3385664