Unit:
Department of Informatics and Telecommunications
Informatics
Author:
MYSTRIOTIS DIMITRIOS
Supervisors info:
Panagiotis Stamatopoulos, Assistant Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
Original Title:
Review of the MuZero Algorithm with Implementation on Quoridor
Translated title:
Review of the MuZero Algorithm with Implementation on Quoridor
Summary:
This thesis reviews the MuZero algorithm, developed by DeepMind, and its application to the game of Quoridor. MuZero is a deep reinforcement learning algorithm that builds on earlier algorithms to achieve exceptional performance in learning and planning. Its key difference from its predecessors is the ability to operate in complex environments without any prior knowledge: everything it knows about the game's rules and dynamics is learned through interaction with the environment. The algorithm is trained through self-play, playing games against itself and using the data generated to improve its performance. The thesis also describes the Quoridor environment, a competitive two-player strategy board game, and the application of MuZero to it.
Main subject category:
Technology - Computer science
Keywords:
Machine learning, reinforcement learning, deep learning, neural networks, Markov decision process, Monte Carlo tree search, deep reinforcement learning, board games