In this video, Brian explains how to use reinforcement learning to train a bipedal robot to walk. He starts by introducing the reinforcement learning workflow and then walks through an example of using RL to get a robot to walk in a straight line. He then discusses the limitations of this approach and shows how to modify the problem to combine the benefits of traditional control design with reinforcement learning. Finally, he shows how to train a robot to avoid obstacles using a rich sensor.
Which of the following is not a reward function used in the video?
What is the name of the algorithm used to train the agent?
What is the disadvantage of the policy learned in the video?