Q-Learning
Step 1: Modify Cell States
- In the practice section, single-click or double-click on cells to change their state (a sketch of how these states might be modeled follows this list).
- Terminal States: Mark goal states (e.g., charging stations) by single-clicking to turn them green.
- Blocked States: Mark obstacles or inaccessible areas by double-clicking to turn them black.
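The tool manages these cell states internally; as a rough sketch (the grid size, coordinates, and variable names below are illustrative assumptions, not taken from the tool), terminal and blocked cells can be modeled as sets of grid coordinates:

```python
# Illustrative grid dimensions; the real size is set in the Control Menu (Step 2).
ROWS, COLS = 4, 4

# Assumed representation: terminal (green) and blocked (black) cells as coordinate sets.
terminal_states = {(0, 3)}           # e.g. a charging station / goal cell
blocked_states = {(1, 1), (2, 2)}    # obstacles the agent cannot enter
```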
Step 2: Adjust Parameters and Grid Size
- Modify algorithm parameters and grid size through the Control Menu.
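The exact parameter names exposed in the Control Menu are not listed here; a typical Q-learning setup uses a learning rate, a discount factor, an exploration rate, and a cap on steps per iteration. The names and defaults below are assumptions for illustration only:

```python
# Hypothetical parameter set; the tool's actual names and default values may differ.
params = {
    "alpha": 0.1,      # learning rate: how strongly each update moves a Q value
    "gamma": 0.9,      # discount factor: weight given to future rewards
    "epsilon": 0.2,    # exploration rate for epsilon-greedy action selection
    "max_steps": 50,   # maximum steps allowed per iteration (see Step 4)
}
```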
Step 3: Q Value Representation
- Each cell of the grid shows a Q value for each of the four actions: Left, Up, Right, and Down.
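A common way to store these values is a table with one entry per (cell, action) pair. Continuing the sketch above (the array layout and action ordering are assumptions, not the tool's actual data structure):

```python
import numpy as np

# Assumed action ordering matching the four directions shown per cell.
ACTIONS = ["Left", "Up", "Right", "Down"]

# Q table: one value per (row, column, action), initialized to zero.
Q = np.zeros((ROWS, COLS, len(ACTIONS)))
```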
Step 4: Advancing Steps in Iteration
- Click "Next Value" to proceed to the next step in the current iteration.
- The step count will increase with each click.
- When the agent reaches a terminal state or the maximum number of steps allowed per iteration, the iteration count increases and the step count resets to 0 (the sketch below shows the update performed at each step).
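Conceptually, each "Next Value" click corresponds to one temporal-difference update of the current cell's Q value: Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',·) − Q(s,a)]. A minimal sketch of that update, continuing the code above (the function and its arguments are illustrative):

```python
def q_update(Q, state, action, reward, next_state, done, params):
    """One Q-learning step, roughly what a single 'Next Value' click performs."""
    r, c = state
    nr, nc = next_state
    # Bootstrapped target: immediate reward, plus the discounted best Q value of the
    # next cell unless that cell is terminal.
    target = reward if done else reward + params["gamma"] * Q[nr, nc].max()
    Q[r, c, action] += params["alpha"] * (target - Q[r, c, action])
```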
Step 5: Progressing to the Next Iteration
- Click "Next Iteration" to move on to the subsequent iteration.
Step 6: Policy Visualization
- The arrows in the left grid show the currently learned policy based on the Q values.
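Each arrow is the greedy action for its cell, i.e. the action with the highest Q value. A small sketch of how the arrows can be derived from the Q table above (the arrow symbols and helper function are illustrative):

```python
# One arrow per action index, matching the assumed Left, Up, Right, Down ordering.
ARROWS = ["←", "↑", "→", "↓"]

def greedy_policy(Q):
    """For every cell, pick the action with the largest Q value; that is the arrow shown."""
    return [[ARROWS[int(Q[r, c].argmax())] for c in range(COLS)] for r in range(ROWS)]
```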
Step 7: Achieving Optimal Policy
- A notification is displayed when the Q values for all actions have converged, indicating that an optimal policy has been found.
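The tool's exact convergence criterion is not documented here; a common test is to stop once no Q value changes by more than a small tolerance between iterations (the threshold below is an assumption):

```python
def has_converged(Q, Q_previous, tol=1e-4):
    """Declare convergence when the largest change in any Q value falls below tol."""
    return np.abs(Q - Q_previous).max() < tol
```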