Value Iteration

Step 1: Modify Cell States

  • In the practice section, single-click or double-click a cell to change its type:
    • Terminal States: Goal states like the charging station.
    • Blocked States: Inaccessible areas or obstacles.
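
As a rough illustration of the cell types above, a grid can be modelled as a 2-D list of cell kinds. This is only a sketch: the names Cell and make_grid are hypothetical and are not taken from the tool itself.

```python
from enum import Enum


class Cell(Enum):
    """Hypothetical cell kinds mirroring the ones described above."""
    FREE = "free"
    TERMINAL = "terminal"  # goal state, e.g. the charging station
    BLOCKED = "blocked"    # inaccessible area or obstacle


def make_grid(rows, cols, terminals=(), blocked=()):
    """Build a rows x cols grid of FREE cells, then mark the special ones."""
    grid = [[Cell.FREE for _ in range(cols)] for _ in range(rows)]
    for r, c in terminals:
        grid[r][c] = Cell.TERMINAL
    for r, c in blocked:
        grid[r][c] = Cell.BLOCKED
    return grid


# A 3 x 4 grid with one goal in the top-right corner and one obstacle.
grid = make_grid(3, 4, terminals=[(0, 3)], blocked=[(1, 1)])
```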

Step 2: Adjust Parameters and Grid Size

  • Use the Control Menu to modify algorithm parameters and grid size.

Step 3: State Value Function Representation

  • The grid displays the State Value Function for each state, along with the expected return of moving in each direction: Left, Up, Right, and Down.
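
The per-direction numbers can be read as one Bellman backup per action. The sketch below assumes deterministic moves, a reward of -1 per step, and a discount factor of 0.9. These settings, and the function name action_values, are illustrative assumptions and not the tool's actual configuration.

```python
# Row/column offsets for the four directions shown in the grid.
ACTIONS = {"Left": (0, -1), "Up": (-1, 0), "Right": (0, 1), "Down": (1, 0)}


def action_values(values, state, blocked=(), gamma=0.9, step_reward=-1.0):
    """Return the value of each action in `state` under the current estimates.

    A move that would leave the grid or enter a blocked cell keeps the
    agent where it is (an assumption of this sketch).
    """
    rows, cols = len(values), len(values[0])
    r, c = state
    q = {}
    for name, (dr, dc) in ACTIONS.items():
        nr, nc = r + dr, c + dc
        if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in blocked:
            nr, nc = r, c
        q[name] = step_reward + gamma * values[nr][nc]
    return q


values = [[0.0, 0.0],
          [0.0, 10.0]]
q = action_values(values, (0, 1))
# Down leads to the cell valued 10.0: -1 + 0.9 * 10.0 = 8.0
state_value = max(q.values())
```

The state value is then the largest of the four action values, which is what the value iteration update writes back into the grid.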

Step 4: Progressing Through Iterations

  • Click "Next Value" to calculate the State Value Function for the next state in the current iteration.
    • When a terminal state is reached or the maximum number of steps per iteration is exceeded, the iteration count increases by one and the step count resets to 0.
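
The counter behaviour described above can be sketched as follows. The names Progress and advance are hypothetical, and reading "exceeded" as "strictly greater than the maximum" is an assumption about how the tool counts.

```python
from dataclasses import dataclass


@dataclass
class Progress:
    """Hypothetical bookkeeping for the iteration and step counters."""
    iteration: int = 0
    step: int = 0


def advance(progress, reached_terminal, max_steps):
    """Record one "Next Value" click and roll over to a new iteration if needed."""
    progress.step += 1
    if reached_terminal or progress.step > max_steps:
        progress.iteration += 1
        progress.step = 0
    return progress


progress = Progress()
advance(progress, reached_terminal=False, max_steps=2)  # step 1 of iteration 0
advance(progress, reached_terminal=True, max_steps=2)   # terminal: new iteration
```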

Step 5: Moving to Next Iteration

  • Select "Next Iteration" to advance to the following cycle of the algorithm.

Step 6: Policy Representation

  • The arrows in the left grid show the current policy: in each state, the arrow points in the direction with the highest value.
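
One way to derive such arrows from a table of state values is to pick, in each state, the move that leads to the best-valued neighbour. The sketch assumes deterministic moves and the same step reward for every move, so comparing neighbour values is equivalent to comparing action values. The function name greedy_policy, the "T" and "#" markers, and the tie-breaking order are assumptions.

```python
ACTIONS = {"Left": (0, -1), "Up": (-1, 0), "Right": (0, 1), "Down": (1, 0)}
ARROWS = {"Left": "←", "Up": "↑", "Right": "→", "Down": "↓"}


def greedy_policy(values, terminals=(), blocked=()):
    """Return a grid of arrows pointing at the best-valued reachable neighbour."""
    rows, cols = len(values), len(values[0])
    policy = [["" for _ in range(cols)] for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if (r, c) in terminals:
                policy[r][c] = "T"
                continue
            if (r, c) in blocked:
                policy[r][c] = "#"
                continue
            best_name, best_value = None, float("-inf")
            for name, (dr, dc) in ACTIONS.items():
                nr, nc = r + dr, c + dc
                # Illegal moves leave the agent in place.
                if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in blocked:
                    nr, nc = r, c
                # Strict comparison: ties go to the first action in ACTIONS.
                if values[nr][nc] > best_value:
                    best_name, best_value = name, values[nr][nc]
            policy[r][c] = ARROWS[best_name]
    return policy


values = [[0.0, 8.0],
          [8.0, 10.0]]
policy = greedy_policy(values, terminals={(1, 1)})
```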

Step 7: Convergence to Optimal Policy

  • A message is displayed once the State Value Function has converged for every state, indicating that the Optimal Policy has been found.
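
Convergence is commonly tested by tracking the largest change in any state value during a sweep and stopping once it drops below a small threshold. The sketch below puts the whole loop together under assumed settings (deterministic moves, reward -1 per step, discount 0.9, threshold theta). The tool's own stopping rule and parameter names may differ.

```python
ACTIONS = [(0, -1), (-1, 0), (0, 1), (1, 0)]  # Left, Up, Right, Down


def value_iteration(rows, cols, terminals, blocked=(), gamma=0.9,
                    step_reward=-1.0, theta=1e-6, max_iterations=1000):
    """Run in-place value iteration and return (values, iterations_used)."""
    values = [[0.0] * cols for _ in range(rows)]
    for iteration in range(1, max_iterations + 1):
        delta = 0.0  # largest value change seen in this sweep
        for r in range(rows):
            for c in range(cols):
                if (r, c) in terminals or (r, c) in blocked:
                    continue  # these cells keep a value of 0
                best = float("-inf")
                for dr, dc in ACTIONS:
                    nr, nc = r + dr, c + dc
                    # Illegal moves leave the agent in place.
                    if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in blocked:
                        nr, nc = r, c
                    best = max(best, step_reward + gamma * values[nr][nc])
                delta = max(delta, abs(best - values[r][c]))
                values[r][c] = best
        if delta < theta:
            return values, iteration  # converged
    return values, max_iterations


# A 1 x 3 corridor with the goal at the right end.
values, iterations = value_iteration(1, 3, terminals={(0, 2)})
```

In this corridor the cell next to the goal settles at -1 and the far cell at -1 + 0.9 * (-1) = -1.9, after which a further sweep changes nothing and the loop stops.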