Abstract
Reinforcement learning is a machine learning paradigm for finding an optimal sequence of actions to achieve a given goal. An agent receives a reward for each action it performs and must find a policy that maximizes the expected cumulative reward. In this paper, linear function approximation and SARSA learning are explained and implemented, then applied to the computer game Breakout. Experiments with training episodes of different lengths and with different hyperparameters were performed, and the results are presented. Owing to the selected features, very good results were achieved after only a short training period.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2019 Matthias Haselmaier, Alexander Schwarz, Tim Hallyburton
