Unit 2 • Chapter 1

Q-Learning in Reinforcement Learning

Video Summary

Q-learning is a reinforcement learning technique for learning the optimal policy in a Markov decision process. The agent iteratively updates the Q-value of each state-action pair using the Bellman equation until the Q-function converges to the optimal Q-function, Q*. This iterative approach is called value iteration. Once the optimal Q-values are known, the optimal policy follows directly: in each state, choose the action with the highest Q-value.
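
To make the iterative update concrete, here is a minimal sketch of tabular Q-learning. The environment is a hypothetical five-cell corridor invented for illustration (it does not come from the video), and the function names and the values of alpha (learning rate), gamma (discount factor), and epsilon (exploration rate) are likewise assumptions, not recommended settings.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor of n_states cells.

    Actions: 0 = left, 1 = right. Entering the last cell yields reward 1
    and ends the episode; every other step yields reward 0.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy choice: explore with probability epsilon,
            # otherwise exploit the current estimate (ties broken randomly).
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Deterministic corridor dynamics (assumed for illustration).
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Bellman-based update: move Q(s, a) toward
            # r + gamma * max over a' of Q(s', a').
            # The goal row is never updated, so it stays at 0 (terminal state).
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q[:-1]]
```

Under these assumptions, calling greedy_policy(q_learning()) reads the learned policy off the Q-table: in this corridor it should prefer moving right in every non-terminal cell, since that is the shortest path to the reward.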

Knowledge Check

What is the objective of Q-learning?

In Q-learning, what is the trade-off between exploration and exploitation?

What is an epsilon-greedy strategy?