In this course, you will learn how to solve problems with large, high-dimensional, and potentially infinite state spaces. You will see that estimating value functions can be cast as a supervised learning problem---function approximation---allowing you to build agents that carefully balance generalization and discrimination in order to maximize reward. We will begin this journey by investigating how our policy evaluation (prediction) methods, such as Monte Carlo and TD, can be extended to the function approximation setting. You will learn about feature construction techniques for RL, and about representation learning via neural networks and backpropagation. We conclude this course with a deep dive into policy gradient methods: a way to learn policies directly, without learning a value function. In this course you will solve two continuous-state control tasks and investigate the benefits of policy gradient methods in a continuous-action environment.
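To make the idea of value estimation as supervised learning concrete, here is a minimal sketch (not taken from the course materials) of semi-gradient TD(0) with linear function approximation on a toy 5-state random walk. The environment, one-hot features, step size, and episode count are all illustrative assumptions; in the course, features would come from techniques such as tile coding.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states = 5
# One-hot features for illustration; tile coding would produce richer features.
features = np.eye(n_states)
w = np.zeros(n_states)           # linear value estimate: v_hat(s) = w @ features[s]
alpha, gamma = 0.1, 1.0          # step size and discount (assumed values)

for _ in range(2000):
    s = n_states // 2            # each episode starts in the middle state
    while True:
        s_next = s + rng.choice([-1, 1])      # uniform random-walk policy
        if s_next < 0 or s_next >= n_states:  # episode terminates off either end
            r = 1.0 if s_next >= n_states else 0.0
            # Terminal state has value 0, so the TD target is just the reward.
            w += alpha * (r - w @ features[s]) * features[s]
            break
        r = 0.0
        # Semi-gradient TD(0) update toward the bootstrapped target.
        target = r + gamma * w @ features[s_next]
        w += alpha * (target - w @ features[s]) * features[s]
        s = s_next

# True values for this walk are approximately [1/6, 2/6, 3/6, 4/6, 5/6].
print(np.round(w, 2))
```

Each update is a supervised-learning-style step: the TD target plays the role of the training label, and the weights move a fraction `alpha` of the way toward it along the feature vector.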
- 5 stars: 84.26%
- 4 stars: 12.93%
- 3 stars: 2%
- 2 stars: 0.53%
- 1 star: 0.26%
Top reviews from PREDICTION AND CONTROL WITH FUNCTION APPROXIMATION
Would have liked more detailed explanation of some of the assignments and of how state values are obtained with tile coding, but overall a great experience!
Martha and Adam are excellent instructors. This course is so well organized and presented. I have learned a lot! Thanks very much!
Gives a nice theoretical foundation. I found RL courses abstract, but the programming assignments give a nice conceptualization.
Adam & Martha really make the walk through Sutton & Barto's book a real pleasure and easy to understand. The notebooks and the practice quizzes greatly help to consolidate the material.