One doc tagged with "policy-gradients"

Policy Gradients

Optimizing the policy directly: understanding the REINFORCE algorithm, stochastic policies, and the Policy Gradient Theorem.