Decision Making Systems and Reinforcement Learning
DATA 890, 2025 Spring, UNC-CH, School of Data Science and Society
Overview
This graduate-level course is designed for students interested in machine learning, artificial intelligence, and statistical methodologies. Advanced undergraduate students are also encouraged to enroll. Sequential decision-making systems, especially those powered by reinforcement learning, are essential for the development of autonomous AI systems and a core application of modern machine learning. The course covers foundational theories and concepts in decision-making algorithms, with a focus on reinforcement learning (RL) techniques. Key topics include the principles of Markov Decision Processes (MDPs), Q-learning, and policy-based algorithms, along with hands-on analysis and exploration of their applications.
Prerequisites
Calculus (MATH 522 or similar), Linear Algebra (MATH 347, 577 or similar), Probability Theory (MATH 535, 635 or similar), Python Programming (COMP 116 or similar), Machine Learning (COMP 562, 755 or similar)
Textbook (optional)
- Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto, book link
- Reinforcement Learning: Theory and Algorithms, by Alekh Agarwal et al., book link
Logistics
- Time: Tuesday and Thursday 11:00AM - 12:15PM, 2025 Spring
- Location: ITS Manning 1101
- Instructor: Weitong Zhang, Email
- Office hours:
  - After the lecture, 12:15PM - 1:00PM, Tuesday and Thursday
  - Additional office hours will be posted separately or held by appointment
Grading Policy
Grades will be computed based on the following factors:
- Attendance: 10%
- Reading report: a report of about one page on one (or two) paper(s) published in 2023, 2024, or 2025 on
  - Empirical Topics (10%)
  - Theoretical Topics (10%)
  - Advanced Topics / RL + GenAI (10%)
- Final Project (60%) (detailed ratio subject to change)
  - Completeness (30%)
  - Peer-reviewed Quality (30%)
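As a rough illustration of how these weights combine, here is a minimal sketch that computes a weighted total. It assumes every component is scored on a 0-100 scale and that the 30%/30% project split stays as listed (the syllabus notes this ratio may change). The component names, the `final_grade` helper, and the example scores are hypothetical and are not part of any official grading tool.

```python
# Weights copied from the grading policy (fractions of the final grade).
# The project split (30% + 30%) is the currently listed ratio and may change.
WEIGHTS = {
    "attendance": 0.10,
    "report_empirical": 0.10,
    "report_theoretical": 0.10,
    "report_advanced": 0.10,
    "project_completeness": 0.30,
    "project_peer_review": 0.30,
}


def final_grade(scores):
    """Return the weighted total on a 0-100 scale.

    `scores` maps each component name in WEIGHTS to a score in [0, 100].
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "attendance": 100,
    "report_empirical": 90,
    "report_theoretical": 80,
    "report_advanced": 85,
    "project_completeness": 90,
    "project_peer_review": 88,
}

if __name__ == "__main__":
    print(f"Weighted total: {final_grade(example_scores):.1f}")
```

Because the final project carries 60% of the weight, a change of ten points on the project moves the total by six points, while the same change on any single reading report moves it by one.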
Related Courses (optional)
- UIUC, Statistical Reinforcement Learning, by Prof. Nan Jiang, course link
- Princeton, Foundations of Reinforcement Learning, by Prof. Chi Jin, course link
- UC Berkeley, Deep Reinforcement Learning, by Prof. Sergey Levine, course link
- Cornell, Foundations of Reinforcement Learning, by Prof. Wen Sun, course link
Schedule
Dates (T, TH) | Tuesday | Thursday | Reading | Assignment |
---|---|---|---|---|
01/07, 01/09 | no class held | Overview, logistics, supervised behavior learning | ||
01/14, 01/16 | Markov Decision Processes and Planning in RL | Coding foundations, behavioral cloning | ||
01/21, 01/23 | Value Iteration and Value-based methods | Practical deep RL with Q-learning | ||
01/28, 01/30 | Policy Iteration and Policy Gradient Methods | Actor-Critic Algorithms | ||
02/04, 02/06 | Advanced Policy-based algorithms | Model-based RL, MPC and World Models | ||
02/11, 02/13 | Theoretical Foundation: Multi-armed bandits | Exploration and Value Iteration in Tabular MDPs | ||
02/18, 02/20 | Function Approximation, Linear Bandits | Least-Squares Value Iteration in Linear MDPs (I) | ||
02/25, 02/27 | Least-Squares Value Iteration in Linear MDPs (II) | RL with General Function Approximation | ||
03/04, 03/06 | Offline RL: Distribution Shift and Pessimism | Offline RL algorithms: CQL, IQL and more | ||
03/18, 03/20 | Hybrid RL, safety and constraints in RL | Unsupervised RL and reward-free exploration | ||
03/25, 03/27 | Game Theory, two-player zero-sum games | Multi-agent RL, general-sum games and federated RL | ||
04/01, 04/03 | Inverse RL and Reward Modeling | Introduction to LLMs and Diffusion Models | ||
04/08, 04/10 | RL with Human Feedback; Alignment | RL with Sequence Models, In-context RL | ||
04/15, 04/17 | RL with Diffusion Models | Well-being day - no class held | ||
04/22, 04/24 | Advanced Topics in RL / Guest Lectures | Afterword, Challenges and Open Problems | ||
04/29, 05/01 | In-Class Final Presentation | Examination Days - no class held | ||