Reinforcement Learning Fundamentals

Learn how agents interact with environments to make sequential decisions using reward-based learning strategies.

📄️ Q-Learning

Mastering the Bellman Equation, Temporal Difference learning, and the Exploration-Exploitation trade-off.

Optimizing the policy directly: understanding the REINFORCE algorithm, stochastic policies, and the Policy Gradient Theorem.

Combining value-based and policy-based methods for stable and efficient reinforcement learning.

Scaling Reinforcement Learning with Deep Learning using Experience Replay and Target Networks.