[Mllab] exercise sheet 5 - doubt
Dear Prof. Garke,

I have a question about the first task of the current sheet. While my implementation of the value iteration algorithm works fine for TicTacToe, it does not work for the LGame. Playing around with the LGame in the given template, I noticed that in every state in LGame.unique_states the player to move is always Player 1, and the only possible winner in all of these states is Player 2. In the Bellman equation the reward will therefore always be zero, since Player 1 can never win (Player 2 never moves and therefore cannot lose). I am sure I am misunderstanding something, but I am not able to figure out what.

Thank you for your time,
Best regards,
Valerio Cini
Hi Valerio,
> Playing around with the LGame in the given template, I have seen that in all the states of LGame.unique_states the player who has to play is always Player 1. Similarly the only possible winner in all the states is Player 2.
this is on purpose. The unique_states list contains one representative from each equivalence class of states with equal value. In Tic-tac-toe, the board positions player 1 can see are different from the positions player 2 can see; for example, only the starting player sees the empty board. In the L-Game, however, it is possible to return to the starting position but with player 1 and player 2 switched. Consequently, we do not have to distinguish between positions for player 1 and positions for player 2. W.l.o.g. we can always assume to be player 1; more precisely, we only need to compute the value for the positions of one player, since the other player has identical positions. Storing states for player 2 is not needed, as they would have the same value as the corresponding state with the players switched. Because you can return to the starting position with the players switched during the game, it follows that for every state in the state space there is a state with the players swapped. Strategically, the player number therefore does not matter. In Tic-tac-toe this is different.
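To make the equivalence concrete, here is a minimal sketch of the player-swap canonicalisation. The (board, to_move) layout and the cell encoding (0 empty, 1/2 player marks) are assumptions for illustration, not the template's actual representation:

```python
# Hypothetical state layout: (board, to_move), where board cells are
# 0 (empty), 1 (player 1's mark) or 2 (player 2's mark).

def swap_players(board):
    """Exchange the marks of player 1 and player 2."""
    return tuple({0: 0, 1: 2, 2: 1}[c] for c in board)

def canonical(state):
    """Representative of a state's equivalence class: player 1 to move."""
    board, to_move = state
    return (swap_players(board), 1) if to_move == 2 else state

# The same position with the players exchanged maps to one representative,
# so only positions with player 1 to move need to be stored:
s_player2 = ((1, 2, 0), 2)
s_player1 = ((2, 1, 0), 1)
assert canonical(s_player2) == canonical(s_player1) == s_player1
```

This also explains the observation about the winner: a terminal position where some player has just won is stored with the players swapped so that the loser is to move, i.e. the stored winner is always "the other" player.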
> In the Bellman equation the reward therefore will always be zero as Player 1 can never win (since Player 2 never moves and therefore cannot lose). I am sure I am misunderstanding something but I am not able to figure out what.
Let s be a state where player 2 has to move. The corresponding unique state ss is then (potentially) a reflection or rotation of s with the players switched. By the assumption on the equivalence classes, the value of s is then equal to the value of ss. I hope this helps. Should you still have issues, you can send me your code and I will try to give you a hint about what might be wrong.

Kind regards,
Jannik Schürg
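In code, the lookup through the unique state could look roughly like the sketch below. This is not the sheet's actual interface; the state layout, the negamax sign convention (each value is from the perspective of the player to move), and the helper names are all assumptions made for illustration:

```python
# Sketch only: negamax-style value iteration where states that differ
# by a player swap share one table entry, as in LGame.unique_states.
# States are (board, to_move); cells: 0 empty, 1/2 player marks.

def swap_players(board):
    return tuple({0: 0, 1: 2, 2: 1}[c] for c in board)

def canonical(state):
    """Representative of the equivalence class: player 1 to move."""
    board, to_move = state
    return (swap_players(board), 1) if to_move == 2 else state

def value_iteration(states, successors, reward, iters=50):
    # V is indexed by canonical states only; any state s is looked up
    # via V[canonical(s)], since s and its unique state have equal value.
    V = {canonical(s): 0.0 for s in states}
    for _ in range(iters):
        for s in V:
            succ = successors(s)
            if succ:
                # The mover maximises; a successor's stored value is from
                # the opponent's perspective, hence the sign flip.
                V[s] = max(-V[canonical(s2)] for s2 in succ)
            else:
                V[s] = reward(s)
    return V
```

The key point for the question above is the `V[canonical(s2)]` lookup: successors where player 2 moves are never stored directly, but their values are still reachable through the player-swapped representative.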