| Event | Date | Description | Course Materials |
|---|---|---|---|
| Lecture | R 8/23 | 1b. Multi-agent systems. Introduction to reinforcement learning. Course overview. | [slides] |
| H0 | R 8/23 | Homework 0 released | [Hw0] |
| H1 | 8/28 | Homework 1 released | [Homework 1] |
| Lecture | 8/28 | 2a. Deep RL and Markov Decision Processes. | [slides] Sutton and Barto, Ch 3 |
| Lecture | 8/30 | Traffic control and dynamics. Microscopic models. Macroscopic models. Formulating MDPs. | [slides] Treiber and Kesting, Ch 6.1 to 6.3, 10.1 & 10.2, and 11.3 |
| Discussion | | Flow, SUMO, OpenAI Gym setup | [slides] links to tutorials |
| Lecture | 9/4 | Tabular MDPs. Value iteration. Policy iteration. | [slides] Sutton and Barto, Ch 4.1-4.4 |
| Lecture | 9/6 | Math review. Linear algebra. Calculus review. Matrix calculus. Basic optimization (GD, SGD). | [slides] |
| Discussion | | | |
| Lecture | 9/11 | Introduction to Neural Networks | [slides] Link 1 Link 2 |
| Lecture | 9/13 | Approximate Dynamic Programming | [slides] |
| H1 | 9/13 | Homework 1 due | |
| H2 | 9/13 | Homework 2 released | |
| Lecture | 9/18 | Traffic data and estimation | [slides] Treiber and Kesting, Ch 16 |
| Lecture | 9/20 | Traffic congestion | [slides] Treiber and Kesting, Ch 15 |
| Lecture | 9/25 | Policy optimization: Derivative-free methods + Finite Difference Methods | [slides] ??? |
| H2 | 9/27 | Homework 2 due | |
| H3 | 9/27 | Homework 3 released | |
| Lecture | 9/27 | Policy optimization: Policy Gradients I | [slides] ??? |
| Lecture | 10/2 | Policy optimization: Policy Gradients II | |
| Lecture | 10/4 | Cooperative Multi-agent RL: Nash Equilibria, Non-stationarity, Multi-agent architectures, Project Introductions | |
| Project | 10/9 | Project Proposals due | |
| Lecture | 10/9 | Advanced Cooperative Multi-agent RL: Techniques for handling non-stationarity | |
| Lecture | 10/11 | Monte-Carlo Tree Search | |
| Lecture | 10/16 | Project Proposal Presentations | |
| Lecture | 10/18 | Advanced Cooperative Multi-agent RL: Hierarchical RL, Self-play, GANs | |
| H3 | 10/18 | Homework 3 due | |
| H4 | 10/18 | Homework 4 released | |
| Lecture | 10/23 | Policy optimization: Policy Gradients III | |
| Lecture | 10/25 | Exploration + Reward shaping | |
| Lecture | 10/30 | Best practices in RL | |
| Lecture | 11/1 | Introduction to Game Theory | |
| H4 | 11/1 | Homework 4 due | |
| | 11/6 | No Class | |
| | 11/8 | No Class | |
| Lecture | 11/13 | Lecture from Flow team | |
| Project | 11/13 | Project Update Due | |
| Lecture | 11/15 | Guest lecture | |
| | 11/20 | Open Project Office Hours | |
| | 11/22 | Thanksgiving. No Class | |
| | 11/27 | Guest lecture or Open Project Office Hours | |
| | 11/29 | Guest lecture or Open Project Office Hours | |
| | 12/4 | RRR | |
| | 12/6 | Project paper / Presentations | |
| | 12/11 | Finals Week | |
| | 12/13 | Finals Week | |