Q-Learning
Step 1: Modify Cell States
- In the practice section, single-click or double-click on cells to change their state (a sketch of how these states might be modeled follows this list).
- Terminal States: Mark goal states (e.g., charging stations) by single-clicking to turn them green.
- Blocked States: Mark obstacles or inaccessible areas by double-clicking to turn them black.
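The tool manages these cell states internally; as a rough sketch (the grid size, coordinates, and variable names below are illustrative assumptions, not taken from the tool), terminal and blocked cells can be modeled as sets of grid coordinates:

```python
# Illustrative grid dimensions; the real size is set in the Control Menu (Step 2).
ROWS, COLS = 4, 4

# Assumed representation: terminal (green) and blocked (black) cells as coordinate sets.
terminal_states = {(0, 3)}           # e.g. a charging station / goal cell
blocked_states = {(1, 1), (2, 2)}    # obstacles the agent cannot enter
```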
Step 2: Adjust Parameters and Grid Size
- Modify algorithm parameters and grid size through the Control Menu.
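The exact parameter names exposed in the Control Menu are not listed here; a typical Q-learning setup uses a learning rate, a discount factor, an exploration rate, and a cap on steps per iteration. The names and defaults below are assumptions for illustration only:

```python
# Hypothetical parameter set; the tool's actual names and default values may differ.
params = {
    "alpha": 0.1,      # learning rate: how strongly each update moves a Q value
    "gamma": 0.9,      # discount factor: weight given to future rewards
    "epsilon": 0.2,    # exploration rate for epsilon-greedy action selection
    "max_steps": 50,   # maximum steps allowed per iteration (see Step 4)
}
```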
Step 3: Q Value Representation
- Each cell of the grid shows a Q value for each of the four actions: Left, Up, Right, and Down.
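A common way to store these values is a table with one entry per (cell, action) pair. Continuing the sketch above (the array layout and action ordering are assumptions, not the tool's actual data structure):

```python
import numpy as np

# Assumed action ordering matching the four directions shown per cell.
ACTIONS = ["Left", "Up", "Right", "Down"]

# Q table: one value per (row, column, action), initialized to zero.
Q = np.zeros((ROWS, COLS, len(ACTIONS)))
```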
Step 4: Advancing Steps in Iteration
- Click "Next Value" to proceed to the next step in the current iteration.
- The step count will increase with each click.
- When the agent reaches a terminal state or the maximum number of steps allowed per iteration, the iteration count increases and the step count resets to 0 (the sketch below shows the update performed at each step).
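Conceptually, each "Next Value" click corresponds to one temporal-difference update of the current cell's Q value: Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',·) − Q(s,a)]. A minimal sketch of that update, continuing the code above (the function and its arguments are illustrative):

```python
def q_update(Q, state, action, reward, next_state, done, params):
    """One Q-learning step, roughly what a single 'Next Value' click performs."""
    r, c = state
    nr, nc = next_state
    # Bootstrapped target: immediate reward, plus the discounted best Q value of the
    # next cell unless that cell is terminal.
    target = reward if done else reward + params["gamma"] * Q[nr, nc].max()
    Q[r, c, action] += params["alpha"] * (target - Q[r, c, action])
```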
Step 5: Progressing to the Next Iteration
- Click "Next Iteration" to move on to the subsequent iteration.
Step 6: Policy Visualization
- The arrows in the left grid show the currently learned policy based on the Q values.
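Each arrow is the greedy action for its cell, i.e. the action with the highest Q value. A small sketch of how the arrows can be derived from the Q table above (the arrow symbols and helper function are illustrative):

```python
# One arrow per action index, matching the assumed Left, Up, Right, Down ordering.
ARROWS = ["←", "↑", "→", "↓"]

def greedy_policy(Q):
    """For every cell, pick the action with the largest Q value; that is the arrow shown."""
    return [[ARROWS[int(Q[r, c].argmax())] for c in range(COLS)] for r in range(ROWS)]
```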
Step 7: Achieving Optimal Policy
- A notification is displayed when the Q values for all actions have converged, indicating that an optimal policy has been found.
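The tool's exact convergence criterion is not documented here; a common test is to stop once no Q value changes by more than a small tolerance between iterations (the threshold below is an assumption):

```python
def has_converged(Q, Q_previous, tol=1e-4):
    """Declare convergence when the largest change in any Q value falls below tol."""
    return np.abs(Q - Q_previous).max() < tol
```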