Value Iteration
Value Iteration is an example of which programming method?
Which of the following is true regarding Value Iteration?
Which of the following is 'false' regarding the optimal value function?
Adding a constant to the rewards affects the optimal policy in:
Value Iteration always finds the optimal policy when run to convergence.
The optimal policy can be reached before all the Vk(s) (the value estimates of the states at iteration k) reach their optimal values.
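The last two statements can be checked on a small example. The sketch below is only an illustration, not part of the quiz: the three-state MDP, the discount factor of 0.9, the tolerance, and all names (`P`, `value_iteration`, `first_stable`) are assumptions chosen for the demo. It runs synchronous Value Iteration, records the greedy policy after every sweep, and compares the sweep at which the policy stops changing with the sweep at which the values converge.

```python
# Hypothetical toy MDP (an assumption for illustration, not from the quiz).
# P[s][a] = list of (probability, next_state, reward); action 0 = left, 1 = right.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 2, 0.0)]},
    2: {0: [(1.0, 1, 0.0)], 1: [(1.0, 2, 1.0)]},  # staying in state 2 pays 1
}


def value_iteration(P, gamma=0.9, tol=1e-8, max_sweeps=10_000):
    """Synchronous value iteration; returns final values and per-sweep greedy policies."""
    V = {s: 0.0 for s in P}
    history = []  # greedy policy computed in each sweep
    for _ in range(max_sweeps):
        # One-step lookahead (Bellman optimality backup) for every state-action pair.
        Q = {
            s: {
                a: sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for a, outcomes in P[s].items()
            }
            for s in P
        }
        new_V = {s: max(Q[s].values()) for s in P}
        # Ties are broken in favour of the first action listed.
        history.append({s: max(Q[s], key=Q[s].get) for s in P})
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < tol:  # values have (numerically) converged
            break
    return V, history


V, history = value_iteration(P)
final_policy = history[-1]
# First sweep from which the greedy policy never changes again.
first_stable = next(
    k for k in range(1, len(history) + 1)
    if all(h == final_policy for h in history[k - 1:])
)
print("policy stable from sweep:", first_stable)
print("values converged at sweep:", len(history))
print("final policy:", final_policy)
```

In this toy setup the greedy policy settles within the first few sweeps, while the value estimates keep changing for many more sweeps before they fall within the tolerance. The convergence behaviour shown here relies on the assumed finite state space and a discount factor below 1.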