A Java machine learning demonstration.


This code learns how to play a perfect game of tic-tac-toe
without instructions on how to do so from the developer.

This program demonstrates that machine learning is possible.

Anyone can code up a program to play tic-tac-toe;
I wouldn't waste your time with that.

I want the computer to learn the rules and correct
its own behavior by itself.

Get the Java source here:
Runner.java

Code written by Eric Leschinski:




Quick demonstration of how this works:

These are the index positions of the board, 0 through 8:
 0 | 1 | 2     
-----------
 3 | 4 | 5 
-----------
 6 | 7 | 8
Below is a 'State' of the tic-tac-toe board.
 X |   | O
---|---|---
   | X | X
---|---|---
 O |   | O
Consider the above state: you're X and it's your turn. Where do you move?

Here is where my learning algorithm surprised me with a new discovery. When it was presented with this situation, it took into account its own "guess and learn" inadequacies and chose a position that forces a win on the board over an immediate win, which I did not expect.
cupContents (the probability of picking a position):
'11333333333377777777773333333377
777777777777777777777777777777'

Moving to position 1 results in a loss, so 1 appears only rarely in cupContents, as I expected. Position 7 is the most likely choice and 3 the second most likely.
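
To make the cup idea concrete, here is a minimal sketch of how a string like cupContents could be tallied and sampled. It is not taken from Runner.java: the class name CupDemo, the helper names countEntries and pickMove, and the idea that a move is chosen by drawing one character uniformly at random are all my assumptions for illustration. The string itself is copied from above, joined across the line break.

```java
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical illustration only; not the code from Runner.java.
class CupDemo {
    // The cupContents string quoted above, joined across its line break.
    static final String CUP_CONTENTS =
          "11333333333377777777773333333377"
        + "777777777777777777777777777777";

    // Count how many times each board position appears in the cup.
    static Map<Character, Integer> countEntries(String cup) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (char c : cup.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    // Assumption: a move is picked by drawing one character uniformly at random,
    // so a position that appears more often is picked more often.
    static int pickMove(String cup, Random rng) {
        return cup.charAt(rng.nextInt(cup.length())) - '0';
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = countEntries(CUP_CONTENTS);
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            System.out.printf("position %c: %d of %d entries%n",
                    e.getKey(), e.getValue(), CUP_CONTENTS.length());
        }
        System.out.println("sampled move: " + pickMove(CUP_CONTENTS, new Random()));
    }
}
```

Running main prints one line per position with its share of the cup, followed by a single sampled move.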
Why is index position 7 a more desirable play? Position 3 is the immediate win!

Is the learning algorithm broken? Not at all. Position 7 creates the triangle 'X' formation, which forces a win even if X's learning is sub-optimal. The learning algorithm is really evolving: it is making provision for what it knows it might not know. Humans do the same thing: we choose a bird in the hand over two in the bush.
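
The claims about positions 1, 3 and 7 can be checked by brute force. The sketch below is not the learning algorithm and is not from Runner.java; the class name ForcedWinCheck and its helpers are hypothetical. It stores the board as nine characters using the index positions shown earlier, tries each empty square for X, and then asks whether X still wins along every possible continuation, even if both sides move at random afterwards.

```java
// Hypothetical illustration only; a brute-force check, not the learner itself.
class ForcedWinCheck {
    // The example state by index position: X at 0, 4, 5; O at 2, 6, 8.
    static final String START = "X O XXO O";

    // The eight winning lines of tic-tac-toe.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}
    };

    // True if player p holds all three squares of some line.
    static boolean wins(char[] b, char p) {
        for (int[] l : LINES) {
            if (b[l[0]] == p && b[l[1]] == p && b[l[2]] == p) return true;
        }
        return false;
    }

    // True only if every legal sequence of moves from here ends with X winning.
    // A full board with no winner counts as a draw, so it returns false.
    static boolean xWinsEveryLine(char[] b, char toMove) {
        if (wins(b, 'X')) return true;
        if (wins(b, 'O')) return false;
        boolean anyMove = false;
        for (int i = 0; i < 9; i++) {
            if (b[i] != ' ') continue;
            anyMove = true;
            b[i] = toMove;
            boolean ok = xWinsEveryLine(b, toMove == 'X' ? 'O' : 'X');
            b[i] = ' '; // undo the trial move
            if (!ok) return false;
        }
        return anyMove;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 9; i++) {
            if (START.charAt(i) != ' ') continue;
            char[] b = START.toCharArray();
            b[i] = 'X';
            System.out.printf("position %d: immediate win=%b, X wins every continuation=%b%n",
                    i, wins(b, 'X'), xWinsEveryLine(b, 'O'));
        }
    }
}
```

For this state the check reports that position 1 is neither an immediate win nor a guaranteed one (O answers at 7), position 3 wins at once, and position 7 does not win at once but leaves X winning whichever of the two remaining squares O takes.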

Humans understand that in complex situations the likelihood of mistakes rises, making simpler, lower-reward states more desirable. My program has discovered that truth in its own way.

I talk about this in much more depth on my programming topics blog: http://sentientmachine.blogspot.com/2009/07/l07-java-code-to-learn-perfect-game-of.html