IKT725 Deep Reinforcement Learning

Credits: 5
Responsible department: Faculty of Engineering and Science
Course coordinator: Baltasar Enrique Beferull Lozano
Teaching semester: Spring
Language of instruction: English
Duration: 1 semester

The course is associated with the following study programmes

PhD Programme in Engineering and Science

Language of instruction

English

Prerequisites

Probability and stochastic processes, linear algebra, optimization, calculus, basic signal processing, and programming. Students may implement the algorithms in different programming languages and libraries, such as Python (e.g. TensorFlow, PyTorch) or MATLAB libraries.

Recommended prior knowledge

IKT 724 Deep Learning and IKT 720 Optimization

Content

This course offers an in-depth study of both the theoretical and algorithmic foundations of Deep Reinforcement Learning (DRL). Its contents are relevant to applications in many domains that involve complex decision-making tasks, such as robotic control, autonomous vehicles and navigation, resource allocation in wireless networks, optimization of inventory or industrial processes, resource management in computing clusters, traffic control, games, finance, medicine, personalized recommendations, bidding and advertising, and web navigation.

The course covers the following main topics:

Introduction: from supervised learning to decision making, formal framework of reinforcement learning (RL), Markov Decision Process (MDP), motivation, comparison with the exact dynamic programming framework, applications in different domains, learnable functions, role of deep learning (DL) for RL, on-policy and off-policy methods, overview of schemes and algorithms of Deep Reinforcement Learning (DRL).
Tutorials and review of TensorFlow/PyTorch and neural networks.
Policy-based algorithms: Policy-gradient derivation, Stochastic policy gradient, Deterministic policy gradient.
Value-based algorithms: the Q- and V-functions, Q-learning, Deep Q-networks.
Model-based algorithms.
Combined methods: actor-critic methods, trust region optimization and proximal policy optimization, integrating model-free and model-based algorithms.
Practical details: algorithm implementation, parallelization methods, performance evaluation, choice of hyperparameters, choice of neural network architectures.
Other advanced topics: connection between RL-based control and inference, exploration methods, inverse reinforcement learning, concept of generalization, transfer learning and multi-task learning, meta-learning.

Learning outcomes

Upon successful completion of this course, students should:

Be able to decide whether a given application problem should be formulated as a Deep Reinforcement Learning (DRL) problem.
Be able to formulate the problem correctly and to design the most suitable algorithm from the different classes of DRL algorithms, justifying the choice.
Understand the criteria for analysing and evaluating DRL algorithms on the relevant metrics: regret, sample complexity, computational complexity, empirical performance, and convergence.
Be able to implement the main DRL algorithms in code and apply them to solve practical problems in different application domains, evaluating their performance experimentally.

Requirements for taking the examination

Compulsory attendance is the only requirement.

Teaching and learning methods

Lectures, homework exercises, final project, self-study.

Offered as a single-standing course

Yes, subject to available places/capacity.

Examination

The final grade is pass (A or B) or fail, based on 60% of the homework grade and 40% of the final project grade. Passing the course is contingent upon attending all lectures and successfully completing the homework problems and the final project. The assessment criteria are based on both technical correctness and clarity of exposition.

Last retrieved from Felles Studentsystem (FS) on 18 July 2024 08:06:09