Summary:
The goal of this thesis is to present and mathematically analyze various
algorithms for Multi-Armed Bandit (MAB) problems as applied to Recommendation
Systems (RecSys).
The MAB problem concerns finding a selection policy when the reward of each
choice is unknown in advance. As the policy runs, the reward of each choice is
estimated from the observed outcomes, so that the optimal choice is discovered
and, with it, the maximum reward of the process is approached.
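As a rough illustration of estimating rewards while selecting, the following is a minimal sketch of one simple policy, epsilon-greedy, on arms with Bernoulli (0/1) rewards. The choice of policy, the function name `epsilon_greedy`, and all parameter values are assumptions made for this example; they are not taken from the thesis.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (illustrative assumption).

    true_means: hidden success probability of each arm (unknown to the policy).
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of times each arm was chosen
    estimates = [0.0] * n_arms   # running mean reward observed per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the chosen arm's hidden distribution.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward
```

Calling `epsilon_greedy([0.2, 0.5, 0.8])` returns the per-arm reward estimates, how often each arm was chosen, and the accumulated reward. In this sketch the policy only ever sees the sampled rewards, never `true_means` directly.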
The goal of a Recommendation System is to suggest options or products to a
user based on their needs and requirements. The suggestion criteria are learned
and applied based on the user's past interactions within the system and on
correctly associating the user with product categories.