Virtual Labs

Value Iteration is an example of which programming method?

a: Greedy

b: Dynamic Programming

c: Object-oriented programming

d: None of them

Which of the following is true regarding Value Iteration?

a: The utilities of many states do not change in one iteration, but the process has to continue as long as some state's utility still changes

b: The discount factor affects the convergence of the algorithm

c: Sometimes the corresponding policy has already converged to optimal, but the values have not converged and therefore we have to continue the value iteration process

d: All of the above

Which of the following is 'false' regarding the optimal value function?

a: It is the maximum value function over all policies

b: For every finite MDP, the optimal value function is unique

c: For a finite MDP, the optimal value function is not guaranteed to be unique

d: All of the above

Adding a constant to every reward affects the optimal policy in:

a: Both episodic tasks and continuous tasks

b: Continuous tasks only

c: Episodic tasks only

d: None of them

Value Iteration always finds the optimal policy when run to convergence.

a: True

b: False

c: Cannot Say

d: None of them

The optimal policy can be reached before all the Vk(s) (the value function of each state) reach their optimal values.

a: True, after a certain stage the policy does not depend on the utility values of the states.

b: True, because the policy can remain the same while the utility values of the states change

c: False, the optimal policy can be declared only after all the states reach their optimal utility values

d: None of them