Mathematical Models and Algorithms for Contextual Multi-armed Bandit Problems

Postgraduate Thesis uoadl:3356103

Unit:
Specialization in Statistics and Operations Research
Library of the School of Science
Deposit date:
2023-09-18
Year:
2023
Author:
Zacharis Dimitrios
Supervisors info:
Apostolos Burnetas, Professor, Department of Mathematics, NKUA,
Panagiotis Mertikopoulos, Professor, Department of Mathematics, NKUA,
Antonis Economou, Professor, Department of Mathematics, NKUA
Original Title:
Mathematical Models and Algorithms for Contextual Multi-armed Bandit Problems
Languages:
English
Translated title:
Mathematical Models and Algorithms for Contextual Multi-armed Bandit Problems
Summary:
This thesis studies a special class of bandit problems, contextual bandits, and algorithms for the associated learning problems. Contextual bandits belong to the field of reinforcement learning: at each round the algorithm must choose an action based on a context, which encodes information about the current state of the system and possibly observations collected in previous rounds. The goal of the algorithm is to learn a policy that, over time, selects the actions with the greatest expected payoff.
The thesis first presents basic concepts, definitions, examples, and applications related to bandit problems. After reviewing the main results on stochastic and adversarial bandits, it develops the theory and algorithms for contextual bandits, with a focus on the Thompson Sampling algorithm. Simulation studies are also designed to evaluate the Thompson Sampling and LinUCB algorithms on different classes of reinforcement learning problems.
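To illustrate the Thompson Sampling algorithm named in the summary, here is a minimal sketch for the simplest (non-contextual) Bernoulli bandit with Beta priors; the arm success probabilities, horizon, and random seed are illustrative assumptions, not values from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-armed Bernoulli bandit; arm 2 has the highest mean reward.
true_probs = [0.3, 0.5, 0.8]
n_rounds = 2000

# Beta(1, 1) prior per arm, tracked as (successes + 1, failures + 1).
successes = np.ones(len(true_probs))
failures = np.ones(len(true_probs))

for _ in range(n_rounds):
    # Sample a plausible mean reward for each arm from its posterior
    # and play the arm whose sampled value is largest.
    samples = rng.beta(successes, failures)
    arm = int(np.argmax(samples))
    # Observe a Bernoulli reward and update that arm's posterior.
    reward = rng.random() < true_probs[arm]
    if reward:
        successes[arm] += 1
    else:
        failures[arm] += 1

# Pull counts per arm (subtracting the prior pseudo-counts).
pulls = successes + failures - 2
print("pulls per arm:", pulls.astype(int))
```

Over time the posterior of the best arm concentrates, so most pulls go to arm 2; the contextual variants studied in the thesis replace the Beta posteriors with posteriors over the parameters of a reward model that depends on the observed context.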
Main subject category:
Science
Keywords:
Bandit problems, regret, contextual bandits, contexts, learner, reward, environment, stochastic bandits, LinUCB, Thompson Sampling, adversarial bandits
Index:
No
Number of index pages:
0
Contains images:
Yes
Number of references:
15
Number of pages:
52
Diploma thesis.pdf (1 MB)