Overview
I wanted to build a neural network to play snake. My original idea was to use reinforcement learning, but I found a genetic-algorithm solution that seemed simple enough to recreate.
Credit where it's due
I am basically porting this awesome repo to Python in order to understand how the solution works.
My Final Product
Goals
- To get a neural network performing decently on a task.
- Refresh my memory of how to use TensorFlow and Keras. (I've used them before, but never had much success with any of my projects.)
- Enforce some Python best practices I haven't been as rigorous about in the past, including writing unit tests and following a standard project structure.
Genetic algorithm
This model uses a genetic algorithm. That means that rather than optimizing weights by training on lots of examples, I simply initialize a bunch of players with random weights and let them play. Their fitness is based on a scoring function that considers both the score achieved and the length of the game played. I take the top performers and "breed" them to make the next generation of players. This process is repeated until the fitness stabilizes.
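At a high level, the training loop looks something like the sketch below; the function and attribute names are my own stand-ins, not the project's actual API.

```python
import random

def evolve(config, spawn_player, play_and_score, breed_pair, rng=random.Random(0)):
    """Illustrative genetic training loop; all names here are hypothetical."""
    players = [spawn_player(config) for _ in range(config.generation_size)]
    best = float("-inf")
    while True:
        # Let every player play a game and rank the generation by fitness.
        scored = sorted(((play_and_score(p), p) for p in players),
                        key=lambda sp: sp[0], reverse=True)
        top = [p for _, p in scored[:config.number_to_breed]]
        if scored[0][0] <= best:
            return top[0]  # stop once the best fitness stops improving
        best = scored[0][0]
        # Breed random pairs of top performers (with replacement) to refill the generation.
        players = [breed_pair(rng.choice(top), rng.choice(top))
                   for _ in range(config.generation_size)]
```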
Performance metric
When a player plays, the game state increments a counter (call it duration) each time the snake moves (i.e., each time the game state advances by one frame). Each time the snake catches a prize, the game state increments the score (call it score). The fitness function for the game is tunable by the user (it lives in the InitConfig class), but what I have settled on is of the form

fitness = np.log(1 + A*duration) + B*score
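As a sketch, with A and B standing in for whatever coefficients the InitConfig exposes (the names here are mine):

```python
import numpy as np

def fitness(duration, score, A=1.0, B=10.0):
    # The log term rewards survival time with diminishing returns;
    # each prize caught adds a flat B to the fitness.
    return np.log(1 + A * duration) + B * score
```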
Breeding
Each generation, the program spawns generation_size players. Of those, it takes the top number_to_breed performers based on the above metric and breeds pairs at random, with replacement, to form generation_size more players for the next generation. To breed a pair of top performers, it takes some of the weights from one parent and some from the other. I also apply Gaussian noise to introduce some randomness into the child; the variance of this noise can be thought of as a mutation rate and is configurable by the user.
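The mutation step might look like the following sketch, where mutation_std stands in for the user-configurable noise scale (the names are mine):

```python
import numpy as np

def mutate(child_weights, mutation_std):
    # Perturb every weight array with zero-mean Gaussian noise;
    # the noise scale plays the role of a mutation rate.
    rng = np.random.default_rng()
    return [w + rng.normal(0.0, mutation_std, size=w.shape) for w in child_weights]
```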
Remark
The repo I'm borrowing from does this in a more structured fashion, performing a "crossover" operation to produce the child's weights for each layer. In this operation, an index is selected uniformly at random; all child weights lexicographically before that index are drawn from the left parent's layer, while all others are drawn from the right parent's layer. I also tried choosing each weight uniformly at random from one parent or the other, but the crossover approach yielded better offspring.
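For concreteness, a per-layer single-point crossover along those lines could look like this (flattening each layer's weights so that "lexicographic" order is just row-major order; the helper name is mine):

```python
import numpy as np

def crossover_layer(left, right):
    # Pick a split index uniformly at random; the child takes everything before
    # the split from the left parent and everything after it from the right parent.
    rng = np.random.default_rng()
    flat_left, flat_right = left.ravel(), right.ravel()
    split = rng.integers(flat_left.size + 1)
    return np.concatenate([flat_left[:split], flat_right[split:]]).reshape(left.shape)
```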
The neural network
The architecture of the network is somewhat customizable by setting the parameters num_hidden_layers and hidden_layer_size in the InitConfig class. I use a standard fully connected network with an input layer of size 24, num_hidden_layers hidden layers each of size hidden_layer_size, and an output of dimension 4. Unfortunately, these cannot be changed without starting over. The inputs encode what the snake sees when looking in 8 directions (four cardinal and four diagonal): in each direction, it looks for a prize, the distance to a wall, and its own body, which gives the 24 input values. The outputs encode which direction to move.
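A rough Keras sketch of such a network, assuming ReLU hidden layers and a softmax output (those activation choices are my assumption, not necessarily what the project uses):

```python
from tensorflow import keras

def build_model(num_hidden_layers, hidden_layer_size):
    # 24 inputs: 3 observations (prize, wall distance, own body) in each of 8 directions.
    model = keras.Sequential()
    model.add(keras.Input(shape=(24,)))
    for _ in range(num_hidden_layers):
        model.add(keras.layers.Dense(hidden_layer_size, activation="relu"))
    # 4 outputs: one per movement direction.
    model.add(keras.layers.Dense(4, activation="softmax"))
    return model
```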