Recipe Web Scraper

Graduate Thesis uoadl:3257816

Unit:
Department of Informatics and Telecommunications
Informatics
Deposit date:
2023-01-24
Year:
2023
Author:
PETRIDOU ANNA
Supervisors info:
Ντούλας Αλέξανδρος (Ntoulas Alexandros), Assistant Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
Original Title:
Recipe Web Scraper
Languages:
Greek
English
Translated title:
Recipe Web Scraper
Summary:
The purpose of this project is to create an application that collects data from many different sources and presents it to the user collectively, with functionality that makes the data easier to use.
Specifically, the application scans 5 large cooking-recipe websites, collects the information it needs from them, and stores it in a database. Then, through the backend and frontend that we built, it offers users a set of functions and features that they can access through the user interface.
In more detail, we first created the 5 web crawlers, which scan the websites, filter all their links, and save only the recipe-related ones to a separate text file. The crawlers were implemented in Python using "scrapy".
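The link-filtering step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's standard library (`html.parser`) instead of scrapy so that it runs without dependencies, and the `/recipe/` URL pattern and the names `LinkCollector`, `extract_recipe_links`, and `save_links` are hypothetical, not taken from the thesis.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_recipe_links(html, base_url, marker="/recipe/"):
    """Return absolute, de-duplicated links whose path contains `marker`.

    `marker` is a hypothetical pattern; each real site would need its own rule.
    """
    parser = LinkCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        if marker in urlparse(url).path and url not in links:
            links.append(url)
    return links


def save_links(links, path):
    """Write one link per line, the form a later scraping stage could read."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(links))
```

Calling `extract_recipe_links(page_html, "https://example.com")` on a fetched page and passing the result to `save_links` would produce one such text file per site; `example.com` stands in for a real recipe site.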
Next, the 5 web scrapers, which we implemented in Java, visit each link read from the text files and store the desired information for each recipe in a MySQL database. The desired data is extracted from each site with the help of the "Jsoup" library.
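As a rough sketch of the scrape-and-store step, the example below parses one recipe page and writes it to a database. It is not the thesis implementation: it is written in Python with `html.parser` and an in-memory SQLite database standing in for Java, Jsoup, and MySQL, and the page structure (`<h1>` title, `<li class="ingredient">` items), the `recipe` table, and all function names are assumptions made for illustration.

```python
import sqlite3
from html.parser import HTMLParser


class RecipeParser(HTMLParser):
    """Extracts the <h1> text as the title and <li class="ingredient"> items."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.ingredients = []
        self._target = None  # field that the current text belongs to

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._target = "title"
        elif tag == "li" and ("class", "ingredient") in attrs:
            self._target = "ingredient"
            self.ingredients.append("")

    def handle_endtag(self, tag):
        if tag in ("h1", "li"):
            self._target = None

    def handle_data(self, data):
        if self._target == "title":
            self.title += data
        elif self._target == "ingredient":
            self.ingredients[-1] += data


def scrape_recipe(html):
    """Return (title, ingredients) for one page of the assumed structure."""
    parser = RecipeParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), [item.strip() for item in parser.ingredients]


def store_recipe(conn, url, title, ingredients):
    """Insert or update one recipe row in the hypothetical `recipe` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recipe "
        "(url TEXT PRIMARY KEY, title TEXT, ingredients TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO recipe VALUES (?, ?, ?)",
        (url, title, "\n".join(ingredients)),
    )
    conn.commit()
```

Keying the table on the URL means that re-running the scraper over the same link file updates existing rows instead of duplicating them, which is one plausible design choice among several.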
Once all the necessary information is in the database, the REST API that we implemented in Java with "Spring Boot" provides a set of search functions and delivers their results to the frontend that we implemented.
Finally, the user interface (the frontend), for which we used "Vue.js", presents the data and functions of the application in a user-friendly way.
The result of this work is a functional and useful application, built by going through every phase of development, which let us experience and learn a complete cycle of planning and implementing an application.
Main subject category:
Technology - Computer science
Keywords:
scraping, crawling, backend, frontend, user-interface
Index:
Yes
Number of index pages:
7
Contains images:
Yes
Number of references:
38
Number of pages:
91
RecipeWebScraper.pdf (9 MB)