Data Democratisation with Deep Learning: Structured Query Translation from and to Natural Language

Postgraduate Thesis uoadl:3320444

Unit:
Specialization in Big Data and Artificial Intelligence
Informatics
Deposit date:
2023-04-10
Year:
2023
Author:
Katsogiannis-Meimarakis Georgios
Supervisors info:
Georgia Koutrika, Research Director, Athena Research Center
Original Title:
Data Democratisation with Deep Learning: Structured Query Translation from and to Natural Language
Languages:
English
Translated title:
Data Democratisation with Deep Learning: Structured Query Translation from and to Natural Language
Summary:
While data guides and influences many human activities, the barriers posed by the tools that are needed to retrieve it, such as Structured Query Language (SQL), make data inaccessible for many users. To lift these barriers, researchers have been working on creating natural language interfaces that would allow users to access databases solely through natural language.

Natural language interfaces employ Text-to-SQL systems that translate a natural language question from the user into an SQL query that retrieves the data they need. Recently, novel Text-to-SQL systems have been adopting deep learning methods with very promising results. At the same time, several challenges remain open, making this an active and flourishing field of research and development. To make real progress in building Text-to-SQL systems, we need to demystify what has been done, understand how and when each approach can be used, and, finally, identify the research challenges ahead of us. We present a detailed taxonomy of neural Text-to-SQL systems that will enable a deeper study of all the parts of such a system. This taxonomy will allow us to make a better comparison between different approaches, as well as highlight specific challenges in each step of the process, thus enabling researchers to better strategize their quest towards the "holy grail" of database accessibility.

However, how can users verify that the generated SQL query matches their intent if they are not familiar with SQL? To tackle this problem, a system that can translate the SQL query back to natural language is needed (also known as an SQL-to-Text system). We explore the SQL-to-Text problem, examine its challenges and peculiarities, and present a Transformer-based model that can generate fluent query explanations. Additionally, we look into the difficulties of automatically evaluating the performance of such a system and examine how different metrics behave in the SQL-to-Text setting.
Main subject category:
Technology - Computer science
Keywords:
Semantic Parsing, Natural Language Generation, Databases, Deep Learning, Metric Learning, Machine Translation
Index:
Yes
Number of index pages:
4
Contains images:
Yes
Number of references:
136
Number of pages:
89
katsogiannis_master_thesis.pdf (1 MB)