Policy Iteration Practice

Instructions
  • Reset: Reinitializes the simulation to its default state.
  • Next Value: Computes the value for the subsequent state in the grid.
  • Next Iteration: Completes the current iteration by calculating values for all states (see the sketch after this list).
  • Adjustments: Modify grid size, rewards, and discount factor using the dropdowns provided.
  • Create Obstacles: Click any cell in the left grid to make it an obstacle.
  • Set Rewards: Double-click a cell to toggle its reward state: green (reward: +1), red (reward: -1), or neutral.
  • Visualization: The animation on the left grid shows the calculation for the current state, while the right grid displays the corresponding state value.
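
For reference, the sweep behind the Next Value and Next Iteration buttons can be sketched as below. This is a minimal Python sketch, assuming a deterministic four-action gridworld and an equiprobable random policy; the names next_value, next_iteration, rewards, obstacles, and GAMMA are illustrative and are not taken from the page itself.

```python
GAMMA = 0.9                                   # discount factor (dropdown-adjustable; 0.9 is an assumption)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def next_value(values, rewards, obstacles, state):
    """One 'Next Value' click: Bellman expectation update for a single state."""
    rows, cols = len(values), len(values[0])
    r, c = state
    if (r, c) in obstacles:
        return 0.0
    total = 0.0
    for dr, dc in ACTIONS:
        nr, nc = r + dr, c + dc
        # Moves off the grid or into an obstacle leave the agent in place.
        if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in obstacles:
            nr, nc = r, c
        # Equiprobable random policy: each action has probability 0.25.
        total += 0.25 * (rewards[nr][nc] + GAMMA * values[nr][nc])
    return total

def next_iteration(values, rewards, obstacles):
    """One 'Next Iteration' click: update every state once (a full sweep)."""
    rows, cols = len(values), len(values[0])
    return [[next_value(values, rewards, obstacles, (r, c))
             for c in range(cols)]
            for r in range(rows)]
```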

The calculation of the value function for the current state appears here.

Policy Representation

Calculation of State Values

Controls


The arrows represent the randomly initialized (or improved) policy. We will evaluate this policy by calculating the values of the states until they converge.
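
A minimal sketch of this policy-evaluation loop, assuming the hypothetical next_iteration helper sketched above and an illustrative convergence tolerance THETA (the page does not expose one):

```python
THETA = 1e-4   # illustrative convergence tolerance

def evaluate_policy(values, rewards, obstacles):
    """Repeat full sweeps until the largest value change falls below THETA."""
    while True:
        new_values = next_iteration(values, rewards, obstacles)
        delta = max(abs(new_values[r][c] - values[r][c])
                    for r in range(len(values))
                    for c in range(len(values[0])))
        values = new_values
        if delta < THETA:
            return values   # converged values shown on the right grid
```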

CONVERGED! The values shown are the converged state values. We will compute the improved policy using these state values.
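
A minimal sketch of this improvement step, assuming the new policy is taken greedily with respect to the converged values; the arrow characters and GAMMA here are illustrative:

```python
GAMMA = 0.9                                                      # discount factor (assumption)
ARROWS = {(-1, 0): '^', (1, 0): 'v', (0, -1): '<', (0, 1): '>'}  # display symbols

def improve_policy(values, rewards, obstacles):
    """Greedy improvement: for each state, pick the action whose one-step
    return (successor reward plus discounted successor value) is largest;
    the chosen actions become the new arrows on the left grid."""
    rows, cols = len(values), len(values[0])
    policy = {}
    for r in range(rows):
        for c in range(cols):
            if (r, c) in obstacles:
                continue
            best_action, best_return = None, float('-inf')
            for dr, dc in ARROWS:
                nr, nc = r + dr, c + dc
                # Blocked moves leave the agent in place, as in evaluation.
                if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in obstacles:
                    nr, nc = r, c
                ret = rewards[nr][nc] + GAMMA * values[nr][nc]
                if ret > best_return:
                    best_action, best_return = (dr, dc), ret
            policy[(r, c)] = ARROWS[best_action]
    return policy
```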
