Q-Learning
What is the role of the learning rate parameter in Q-learning?
How does increasing the grid size in a Q-learning problem affect the number of steps per iteration?
How does increasing the grid size in a Q-learning problem affect the number of iterations required for convergence?
Which of the following is an advantage of Q-learning over other reinforcement learning methods?
How does the choice of reward function affect the behavior of an agent?
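
The questions above can be explored with a small experiment. The following is a minimal sketch of tabular Q-learning on a square grid, not a reference implementation. Everything in it is an assumption made for illustration: the names `q_update`, `step`, and `train`, the start cell (0, 0), the goal in the far corner, the reward of -1 per move and +10 at the goal, the epsilon-greedy policy, and the `max_steps` cap. The learning rate `alpha` sets how far each update moves the current estimate toward the new target.

```python
import random

# Hypothetical action set: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def q_update(q, state, action, reward, next_state, alpha, gamma, done=False):
    """Move Q(state, action) a fraction alpha toward reward + gamma * max Q(next_state)."""
    best_next = 0.0 if done else max(q[next_state])
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def step(state, action, grid_size):
    """Assumed environment: walls clamp movement, -1 per move, +10 at the far corner."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), grid_size - 1)
    col = min(max(state[1] + dc, 0), grid_size - 1)
    next_state = (row, col)
    done = next_state == (grid_size - 1, grid_size - 1)
    return next_state, (10.0 if done else -1.0), done


def train(grid_size=4, episodes=200, alpha=0.5, gamma=0.9,
          epsilon=0.1, max_steps=1000, seed=0):
    """Run epsilon-greedy Q-learning and record the steps taken in each episode."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(grid_size) for c in range(grid_size)}
    steps_per_episode = []
    for _ in range(episodes):
        state, steps, done = (0, 0), 0, False
        while not done and steps < max_steps:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.randrange(len(ACTIONS))
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, action, grid_size)
            q_update(q, state, action, reward, next_state, alpha, gamma, done)
            state = next_state
            steps += 1
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Calling `train` with different values of `grid_size` or `alpha` and comparing the returned `steps_per_episode` lists is one way to investigate the grid-size and learning-rate questions. Editing the rewards in `step` does the same for the reward-function question.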