The course is connected to the following study programs

Teaching language

English

Course contents

This course teaches the foundations of applied reinforcement learning and deep reinforcement learning. At course completion, students will be able to build and train state-of-the-art reinforcement learning algorithms, understand the importance of hyperparameter tuning, and understand when to apply different algorithms. Students will also learn the theoretical foundation of reinforcement learning and the concepts used to improve the performance of reinforcement learning algorithms, including deep neural networks.

The course adapts to the state-of-the-art literature and covers theoretical and practical topics of reinforcement learning research. These topics include:

  • Fundamentals of reinforcement learning, including Markov decision processes, model-based and model-free RL, and function approximation.
  • Multi-armed bandits, policy iteration, and dynamic programming.
  • Reinforcement learning architectures, including TD-learning, direct policy search, and deep reinforcement learning.
  • Working end-to-end pipelines, including performance evaluation, policy loss, value loss, and entropy.
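
As an illustration of the multi-armed bandit topic listed above, the following is a minimal sketch of an epsilon-greedy agent. It is not taken from the course material: the function name `run_epsilon_greedy`, the Gaussian reward model with unit variance, and all parameter values are assumptions chosen for the example.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a Gaussian multi-armed bandit (illustrative only).

    true_means, steps, epsilon and seed are hypothetical example parameters.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: Gaussian noise with unit standard deviation.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


estimates, counts, total_reward = run_epsilon_greedy([0.1, 0.5, 0.9])
print("estimated means:", [round(e, 2) for e in estimates])
print("pull counts:", counts)
```

The `epsilon` parameter controls the balance between exploring arms at random and exploiting the current best estimate; a larger value explores more at the cost of short-term reward.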

Learning outcomes

On successful completion of this course, the student should:

  • Understand the key features of reinforcement learning.
  • Be able to distinguish reinforcement learning from other AI approaches and from non-interactive machine learning.
  • Be able to determine if reinforcement learning is suitable for a given application problem and be able to define it formally using concepts like state space, action space, dynamics, and reward models. In addition, the student should be able to identify and justify which class of algorithms is best suited to tackle the problem.
  • Be able to implement common reinforcement learning algorithms in a high-level programming language.
  • Understand concepts and metrics in RL such as regret, sample complexity, computational complexity, empirical performance, and convergence.
  • Be familiar with the exploration-exploitation trade-off.
  • Be able to distinguish and grasp concepts such as model-based and model-free RL, and policy-based and value-based RL.
  • Be able to use RL algorithms on practical problems.
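
As a sketch of what a formal problem definition in terms of state space, action space, dynamics, and reward model can look like, a problem is commonly written as a Markov decision process. The notation below is a standard textbook convention rather than something specified by the course; the discount factor $\gamma$ in particular is an assumption not mentioned in the outcomes above.

```latex
% Markov decision process as a tuple (standard notation, assumed here)
\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma)

% \mathcal{S}: state space, \mathcal{A}: action space
% Dynamics: probability of reaching s' after taking action a in state s
P(s' \mid s, a) = \Pr\left(S_{t+1} = s' \mid S_t = s, A_t = a\right)

% Reward model: expected immediate reward
R(s, a) = \mathbb{E}\left[ R_{t+1} \mid S_t = s, A_t = a \right]

% Objective: find a policy \pi maximizing expected discounted return
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} R_{t+1} \right],
\qquad 0 \le \gamma < 1
```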

Teaching methods

A combination of lectures, assignments, paper studies, lab work, report writing, and self-study. Tasks are done individually or in groups of two students with group supervision.

The workload for the average student is approximately 200 hours.

Evaluation

The person responsible for the course decides, in cooperation with the student representative, the form of student evaluation and whether the course is to have a midway or end-of-course evaluation, in accordance with the quality system for education, chapter 4.1.

Offered as Single Standing Module

Yes, if there are places available.

Admission Requirement if given as Single Standing Module

Admission requirements for the course are the same as for the master’s programme in ICT.

Assessment methods and criteria

Graded portfolio assessment, individually or in groups. Groups are given joint grades.

Information about the portfolio content will be given in Canvas by the start of the semester.

Last updated from FS (Common Student System) June 30, 2024 6:45:16 PM