IKT725 Deep Reinforcement Learning

ECTS Credits: 5
Responsible department: Faculty of Engineering and Science
Course Leader: Baltasar Enrique Beferull Lozano
Lecture Semester: Spring
Duration: 1 term


The course is connected to the following study programs

PhD Programme in Engineering and Science

Prerequisites

Probability and stochastic processes, linear algebra, optimization, calculus, basic signal processing, and programming. Students may implement the algorithms in the programming language and libraries of their choice, such as Python (e.g. TensorFlow, PyTorch) or MATLAB.

Recommended prerequisites

IKT 724 Deep Learning and IKT 720 Optimization

Course contents

This course offers an in-depth study of both the theoretical and the algorithmic foundations of Deep Reinforcement Learning (DRL). The contents are relevant for applications in many domains with complex decision-making tasks, including robotic control, autonomous vehicles and navigation, resource allocation in wireless networks, optimization of inventory and industrial processes, resource management in computing clusters, traffic control, games, finance, medicine, personalized recommendations, bidding and advertising, and web navigation.

The course covers the following main topics:

Introduction: from supervised learning to decision making, formal framework of reinforcement learning (RL), Markov Decision Processes (MDPs), motivation, comparison with the exact dynamic programming framework, applications in different domains, learnable functions, role of deep learning (DL) in RL, on-policy and off-policy methods, overview of DRL schemes and algorithms.
Tutorials and review of TensorFlow/PyTorch and neural networks.
Policy-based algorithms: policy-gradient derivation, stochastic policy gradient, deterministic policy gradient.
Value-based algorithms: the Q- and V-functions, Q-learning, Deep Q-networks.
Model-based algorithms
Combined methods: actor-critic methods, trust region optimization and proximal policy optimization, integrating model-free and model-based algorithms.
Practical details: algorithm implementation, parallelization methods, performance evaluation, choice of hyperparameters, choice of neural network architectures.
Other advanced topics: connection between RL-based control and inference, exploration methods, inverse reinforcement learning, concept of generalization, transfer learning and multi-task learning, meta-learning.
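
As an informal illustration of the value-based family listed above, the following is a minimal sketch of tabular Q-learning. It is not taken from the course material: the toy environment (a 1-D chain with a reward of 1 at the last state), the function names `q_learning_chain` and `greedy_policy`, and all hyperparameter values are assumptions chosen only to keep the example short and self-contained.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain MDP.

    The agent starts in state 0 and receives reward 1 when it reaches the
    last state, which ends the episode. Actions: 0 = left, 1 = right.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # Q-table: q[s][a] estimates the return of taking action a in state s.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy behaviour policy; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            # Deterministic toy dynamics: move left (bounded at 0) or right.
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # Off-policy target: bootstrap from the greedy value of s_next.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the greedy action name for every non-terminal state."""
    return ["right" if qa[1] > qa[0] else "left" for qa in q[:-1]]
```

Calling `greedy_policy(q_learning_chain())` returns the greedy action learned for each non-terminal state. Deep Q-networks replace the table `q` with a neural network, which is what makes the approach applicable to large or continuous state spaces.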

Learning outcomes

Upon successful completion of this course, the students should:

Be able to decide whether a given application problem should be formulated as a Deep Reinforcement Learning (DRL) problem.
Be able to formulate the problem correctly and design the most suitable algorithm from the different classes of DRL algorithms, justifying the choice.
Understand the criteria for analysing and evaluating DRL algorithms, including regret, sample complexity, computational complexity, empirical performance, and convergence.
Be able to implement the main DRL algorithms in code and apply them to practical problems in different application domains, evaluating their performance experimentally.

Examination requirements

Compulsory attendance is the only requirement.

Teaching methods

Lectures, homework exercises, final project, self-study.

Admission for external candidates

Yes

Assessment methods and criteria

The final grade is pass (A or B) or fail, based on a weighted combination of the homework grade (60%) and the final project grade (40%). Passing the course is contingent upon attending all lectures and successfully completing the homework problems and the final project. The assessment criteria are based on both technical correctness and clarity of exposition.

Last updated from FS (Common Student System) July 1, 2024 2:01:51 AM