Sunday, 30 December 2012

Some analysis of the game shut the box

This Christmas, +Zoe Prytherch and I got my mother a small board game called: "Shut the Box" (also known as "Tric-Trac", "Canoga", "Klackers", "Zoltan Box", "Batten Down the Hatches", or "High Rollers" according to the wiki page). It's a neat little game with all sorts of underlying mathematics.
Here's a picture of us playing:

Here's a close up of the box:

A good description of the game can be found on the wiki page but basically the game can be described as follows:
  • It is a Solitaire game that can be played in turns and scores compared (but no strategies arise due to interactions of players).
  • The aim is to "shut" as many tiles (each being one of the digits from 1 to 9) as possible.
  • On any given go, a player rolls two dice and/or has a choice of rolling one dice if the tiles 7,8,9 are "shut".
  • The sum total of the dice roll indicates the sum total of tiles that must be "shut". If I roll 11, I could put down [2,9],[3,8],[1,2,8] etc...
  • The game ends when a player can't "shut" tiles corresponding to the dice roll (including the case when all tiles are "shut").
Here's a plot of a game we played with a few of my mother's friends:

You can see that myself and Jean did fairly poorly while Zoë made the difference just at the end to win :)

The game could be modelled as a Markov decision process but over the past few days I decided to code it in Sage and investigate whether or not I could find out anything cool with regards to strategies. If you're not aware of Sage I thoroughly recommend you taking a look, it's an awesome open source mathematics package. In this instance I used the Partitions function to rapidly obtain various combinations of dice rolls and/or tile options to get the results I wanted.

The code is all in a github repo and I feel I've written an ok README file explaining how it works so if you're interested in playing with it please do! The code basically has two parts, the first allows you to play using the program instead of a game board (with prompts telling you what your available plays are).

Here's a quick screencast of me demonstrating some of the code:

The second part of the code allows for strategies to be written that play the game automatically ("autoplay"). I've considered 4 strateges:
  • Random (Just randomly pick any potential tile combination)
  • Shortest (Choose the tile combination that has shortest length: i.e. [3] instead of [1,2] ~ in case of a tie pick randomly)
  • Longest (Choose the tile combination that has longest length: i.e. [1,2] instead of [3] ~ in case of a tie pick randomly)
  • Greedy. This chooses the tile combination that ensures the best possible chance of the game not ending on the next go. This is calculated by summing the probabilities of obtaining a dice roll that could be obtained.
I've been leaving my computer run the code in the background for a while now so here are some early results. I've got over 90000 instances for Greedy and Random with over 40000 instances for the other two which I thought of later (EDIT: I know have over 400000 instances...). I've collected (as and when one of my home boxes was ideal) more data since writing this and the file is available for anyone to play with here. If I've made any mistakes with my analysis, I'd love to hear it!.

First of all what does the average score look like:

Greedy_Score   Random_Score     Longest_Score   Shortest_Score  
Min.   : 0.00  Min.   : 0.00    Min.   : 0.00   Min.   : 0.00  
1st Qu.: 6.00  1st Qu.:14.00    1st Qu.:18.00   1st Qu.: 7.00   
Median :11.00  Median :21.00    Median :24.00   Median :11.00
Mean   :11.42  Mean   :20.45    Mean   :24.16   Mean   :12.06  
3rd Qu.:16.00  3rd Qu.:27.00    3rd Qu.:30.00   3rd Qu.:17.00  
Max.   :43.00  Max.   :43.00    Max.   :43.00   Max.   :43.00  

Shortest and Greedy seem to do a fair bit better then the other two. If we take a look at the distribution of the scores that is confirmed:

If we also take a look at the length of the game (i.e. how many total dice rolls there were):

Greedy_Length  Random_Length  Longest_Length  Shortest_Length
Min.   :1.000  Min.   :1.000  Min.   :1.00    Min.   :1.00    
1st Qu.:4.000  1st Qu.:3.000  1st Qu.:2.00    1st Qu.:4.00    
Median :5.000  Median :3.000  Median :3.00    Median :5.00    
Mean   :4.897  Mean   :3.414  Mean   :2.86    Mean   :4.82    
3rd Qu.:6.000  3rd Qu.:4.000  3rd Qu.:4.00    3rd Qu.:6.00    
Max.   :9.000  Max.   :9.000  Max.   :8.00    Max.   :9.00    

We see that games last longer when using Greedy and Shortest. If we look at the distribution we see (I've updated this graph since first publishing this post):

It looks like Greedy might allow for slightly longer games than Shortest. Whether or not playing Shortest or Greedy is actually any different seems interesting. A simple analysis of variance (ANOVA) seems to indicate that there is a significant effect:
Df Sum Sq Mean Sq F value P
stbaov$Method 1 11337 11337 196.3 2e-16
Residuals 127169 7345725 58

This is all experimental of course and analysing whether or not my suggested greedy strategy is indeed the optimal strategy would require some markov decision process modelling. I might look in to this with an undegraduate project student next year.

The tricky part about using the greedy method whilst actually playing the game with friends is that it requires a fair bit of calculation. Nothing that Sage can't handle in a split second but something that could keep people waiting a fair bit if you were to work it out with a pen and paper. Despite the significant statistical difference between the methods this analysis seems to indicate that choosing the shortest set of tiles is a pretty good strategy so if in doubt I recommend choosing that...

If anyone cares enough about all this to fork the code or suggest a different strategy I'd be delighted :) In case you missed it above, here's the github repo:

(On a technical note some of the pre analysis is done using python and the actual graphs seen here are done using R)