That's Right, I Game - NFHN Reader

Making an AI to play Mario Golf

Mario Golf is the first of six current entries in the Mario Golf series (no, NES Open Tournament Golf doesn't count). It was developed by Camelot Software and published in 1999, and received positive reviews as both a Mario game and a golf simulator. I own the Japanese release of the game, both because it's cheaper and because the box art rivals "All your base are belong to us".

I've been wanting a reason to gain more exposure to machine learning and neural networks, and teaching a computer to golf seemed like a great opportunity. Golf has a small number of clearly defined inputs and outputs, which in theory makes training a machine learning model simple and effective. In practice, getting good data was less straightforward than I imagined, but I ended up with an AI that I'm alright with. I'll be going through the details of both creating an AI and linking it to an emulator in this post, but if you're only interested in the results you can skip to the end.

I'll be emulating the game using Project64 and mostly writing the code in Perl because I need to learn the language. As a result, I'd highly discourage studying my code too much.

Inputs & Outputs

The first step of the project was figuring out how to read data from an emulator to process, and then control the game based on the results. Fortunately, Perl makes reading data from a separate process in Windows very easy through the Win32 module. Finding which data to read is also a well-documented process. Project64, like many emulators, has a memory searching tool which allows you to find the RAM addresses of values. Finding the values you need is then a matter of guessing how the game's variables are stored in memory, and narrowing your search based on how you believe those variables have changed. Here's a table of the memory addresses I found (some may be off by up to 3 bytes, because the addresses reported by the emulator and by Win32 differ in odd ways that I can't be bothered to recheck):

And here's the Perl code to read them:

There were two values I was hoping to find that I couldn't. The first is some sort of register that controls controller input. While I found a lot of addresses that mirrored controller input, writing values to them didn't seem to affect the game. It's possible that I have to write to the address at the start of the frame for it to have an effect, but regardless it doesn't seem like there's a straightforward way to control controller input through memory.

The next value is lie, or the slope of the ground the ball is on. This affects the trajectory of the shot and needs to be adjusted for. But while I couldn't find the lie, I did find the position of the player, who circles around the ball as you angle your shot. Using the positions of the player and the ball, I can get a good enough value for the slope.
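
As a rough illustration of the slope estimate described above, here is a minimal Python sketch (not the Perl code used in this project). It assumes positions are (x, y, z) tuples with y pointing up, and that the player's feet and the ball both rest on the ground, so the height difference over the horizontal distance approximates the slope along the line between them. The function name and the sample positions are hypothetical.

```python
import math

def slope_between(player, ball):
    """Rise over run from the ball to the player.

    Assumes (x, y, z) positions with y as the vertical axis.
    """
    dx = player[0] - ball[0]
    dz = player[2] - ball[2]
    run = math.hypot(dx, dz)          # horizontal distance
    if run == 0.0:
        return 0.0                    # same spot: no usable slope
    return (player[1] - ball[1]) / run

# Hypothetical positions: player 2 units away horizontally, 0.5 units higher.
print(slope_between((3.0, 10.5, 4.0), (3.0, 10.0, 6.0)))
```

Because the player stands beside the ball, a value like this mostly describes the slope across the line of the shot rather than along it.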

Output is pretty simple. Because I can't modify memory to simulate key presses, I used a Perl module called Win32::GuiTest, which has a subroutine called SendRawKey to simulate pressing or releasing a certain key. I can't simulate joystick inputs using this method, which limits my control of the contact point of each stroke, but apart from that it works well. The code is messy and repetitive so I won't include it here, but if you really want to see it, it's on my GitHub profile.

Gathering Training Data

My initial method of gathering training data was to go to various points in different holes and determine the shot which landed closest to the flag. I made a vaguely binary search-y algorithm to find an optimal rotation, power level, and contact point for each stroke and collected data from 64 different save states. This failed miserably. After days of troubleshooting, I had figured out a few flaws in my method.

While gathering data in actual courses is realistic, there's way too much variation in conditions I don't record to determine meaningful trends, like the random decrease in power based on the surface I hit from or the slope the ball landed on.
The method I was using to find the optimal shot required about 100 strokes. This probably could have been optimized, but I would still be taking tens of strokes per data point.
Because I couldn't simulate joystick input, there were only nine contact points available (-1, 0, or 1 in both dimensions). I think this was too limited to effectively predict.

I scrapped the contact point, but the other two problems were more difficult to address. To solve the second problem, I could just take a random shot and then pretend that the hole was wherever it landed. This would still be limited by the issue of random course conditions, though. If only there were a way to take shots from a tee on a completely flat course.
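
The "pretend the hole is wherever it landed" trick can be sketched like this in Python. This is only an illustration under assumptions: the field names, the 2D (x, z) positions, and the wind tuple are made up, and the sketch skips how rotation would be re-expressed relative to the new target. The rotation and power ranges are the ones listed later in this post.

```python
import math
import random

def random_shot(rng):
    """Pick a random rotation in [-7, 7] and power in [0, 31]."""
    return rng.randint(-7, 7), rng.randint(0, 31)

def make_sample(tee, landing, wind, rotation, power):
    """Build one training row by treating the landing spot as the hole.

    tee and landing are hypothetical (x, z) ground positions.
    """
    distance = math.hypot(landing[0] - tee[0], landing[1] - tee[1])
    return {
        "inputs": {"distance": distance, "wind": wind},
        "targets": {"rotation": rotation, "power": power},
    }

rng = random.Random(1)
rotation, power = random_shot(rng)
# In the real setup the landing position would be read from the emulator.
row = make_sample((0.0, 0.0), (30.0, 40.0), (1.0, -0.5), rotation, power)
print(row["inputs"]["distance"])
```

Every stroke yields exactly one labelled row this way, instead of the roughly 100 strokes per data point that the search-based approach needed.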

Using the driving range does come with its own problems. There won't be any elevation change between the tee and the ground and the surface will always be flat, meaning that the only parameters about the course that I can change are the distance to the "hole" and the wind. The only other parameters I'd really need to consider are the lie and the elevation change to the hole, though, and at this point I was happy to adjust for those outside of the neural network. I ended up running a script to have Wario whack the ball around like a maniac and gathered over 5000 data points overnight.

The Basics of a Neural Network

There are a lot of methods that fall under the "AI" umbrella, but I'll be using a feed-forward neural network. If you're unfamiliar with neural networks, I'll briefly describe them here. If you'd like to learn more about them, there are plenty of other resources more qualified to explain them than I am.

Many of the problems you run into in computing are straightforward. You have inputs and a function, and need to produce outputs.

Input	Function	Output
n = 2	f(x) = 3x + 1	f(n) = ?

But sometimes, you'll have inputs and outputs, but no function. Let's say you're doing a study on toothpaste brands, and you know the concentrations of fluoride and sorbitol in each brand, and the number of dentists that recommend it.

Fluoride (ppm)	Sorbitol (%)	Dentists (/10)
1130	70	2
1185	30	3
1250	20	5
1350	30	9
1550	10	7
1590	40	8
1675	50	6
1715	50	4

You want to make a toothpaste that 10/10 dentists recommend, so you need to figure out the relationship between ingredients and reception. The tricky part is that this relation could take any form. It could be linear, quadratic, cubic, exponential, logarithmic, or some combination of types. How can you teach a program to learn any arbitrary relationship? The answer is neural networks.

In a paragraph, neural networks work like this:
A neural network is made of layers of nodes, with the first layer providing the inputs and the last layer producing the outputs. To get the value of each node, you take a weighted sum of the nodes in the previous layer (plus a bias) and then pass the result through an activation function, like tanh or ReLU. To train the network, you can compare the values of the output nodes to their expected values and adjust the weights of the connections in the network accordingly.
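
To make that paragraph concrete, here is a minimal forward pass in plain Python. The weights, biases, and layer sizes are arbitrary numbers chosen for illustration, not values from any network in this post.

```python
import math

def relu(x):
    return max(0.0, x)

def forward(inputs, layers):
    """Run inputs through a list of (weights, biases, activation) layers.

    Each row of weights holds the connection strengths into one node.
    """
    values = inputs
    for weights, biases, activation in layers:
        values = [
            activation(sum(w * v for w, v in zip(row, values)) + b)
            for row, b in zip(weights, biases)
        ]
    return values

layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0], relu),  # 2 inputs -> 2 hidden nodes
    ([[1.0, 2.0]], [0.0], math.tanh),                # 2 hidden -> 1 output node
]
print(forward([3.0, 1.0], layers))
```

Training is then a matter of nudging the numbers in `layers` until the outputs match the expected values more closely.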

The actual implementation of a neural network is more complicated, but that's the gist. Applying a neural network to our toothpaste problem (2 inputs -> 2 hidden nodes (sigmoid) -> 1 output node (sigmoid)), we get the following results:

We can see that dentists prefer toothpaste with an amount of fluoride around 1440 ppm and high amounts of sorbitol. Rather than bothering them with a flood of toothpastes to rate, you can use past data to approximate what they like.
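
For the curious, a 2 -> 2 (sigmoid) -> 1 (sigmoid) network like the one described can be trained on the table above with a few dozen lines of plain Python. This is a sketch and not the code that produced the results above: the input scaling, learning rate, epoch count, and random seed are all assumptions, and different choices will give a different fit.

```python
import math
import random

# (fluoride ppm, sorbitol %, dentists out of 10) from the table above
DATA = [(1130, 70, 2), (1185, 30, 3), (1250, 20, 5), (1350, 30, 9),
        (1550, 10, 7), (1590, 40, 8), (1675, 50, 6), (1715, 50, 4)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def scale(row):
    """Squash each column into roughly [0, 1] (the scaling is an assumption)."""
    fluoride, sorbitol, dentists = row
    return [(fluoride - 1000) / 1000, sorbitol / 100], dentists / 10

def predict(net, x):
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in net["hidden"]]
    o = net["out"]
    return hidden, sigmoid(o[0] * hidden[0] + o[1] * hidden[1] + o[2])

def mse(net, samples):
    return sum((predict(net, x)[1] - t) ** 2 for x, t in samples) / len(samples)

def train(samples, epochs=5000, rate=0.5, seed=0):
    rng = random.Random(seed)
    # Each node stores [weight, weight, bias].
    net = {"hidden": [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)],
           "out": [rng.uniform(-1, 1) for _ in range(3)]}
    for _ in range(epochs):
        for x, target in samples:
            hidden, y = predict(net, x)
            # Backpropagate the squared error through both sigmoid layers
            # (the constant factor of 2 is folded into the learning rate).
            d_out = (y - target) * y * (1 - y)
            d_hid = [d_out * net["out"][j] * hidden[j] * (1 - hidden[j])
                     for j in range(2)]
            for j in range(2):
                net["out"][j] -= rate * d_out * hidden[j]
                net["hidden"][j][0] -= rate * d_hid[j] * x[0]
                net["hidden"][j][1] -= rate * d_hid[j] * x[1]
                net["hidden"][j][2] -= rate * d_hid[j]
            net["out"][2] -= rate * d_out
    return net

samples = [scale(row) for row in DATA]
net = train(samples)
print("mean squared error:", mse(net, samples))
```

With only eight data points the network can easily memorize noise, so any trend it reports deserves some suspicion.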

Teaching the AI

And now, it's time for the moment you've all been waiting for...

To train the AI, I used a pretty simple feed-forward neural network. It has:

7 inputs

Distance to hole / club's max distance
Distance to hole / 1W max distance (yes, that's probably slightly redundant)
Wind's x direction relative to the player
Wind's z direction relative to the player
Lateral component of club's launch vector (high for driver, low for sand wedge)
Vertical component of club's launch vector (low for driver, high for sand wedge)
Club's draw/fade (how much the ball hooks after being hit)

One hidden layer with 6 nodes using ReLU as an activation function

2 outputs again using ReLU

Rotation (mapped from [-7, 7] to [0, 1])
Power (mapped from [0, 31] to [0, 1])
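
The output mapping is a plain linear rescale. A minimal Python sketch, assuming the game accepts rotation and power as whole-number steps (the helper names are hypothetical):

```python
ROTATION_RANGE = (-7, 7)
POWER_RANGE = (0, 31)

def to_unit(value, low, high):
    """Map a game value in [low, high] to [0, 1] for training."""
    return (value - low) / (high - low)

def from_unit(unit, low, high):
    """Map a network output back to the nearest whole game step.

    Clamps first, because a ReLU output can exceed 1.
    """
    clamped = min(1.0, max(0.0, unit))
    return round(low + clamped * (high - low))

print(to_unit(0, *ROTATION_RANGE))     # straight ahead sits in the middle
print(from_unit(1.2, *POWER_RANGE))    # overshoot is clamped to full power
```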

After training the network for around 30000 epochs, I reached a mean squared error of just over 0.0025. Taking the square root gives a typical (root-mean-square) error of about 0.05 on the normalized [0, 1] scale, which works out to roughly 1.6 power levels out of 31 and 0.7 rotation steps out of 14, although one or the other could have dominated the error.
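
The conversion from the error figure to game units is quick to check. This snippet assumes the error is spread evenly across both outputs, which, as noted, may not be the case:

```python
import math

mse = 0.0025
rmse = math.sqrt(mse)               # typical error on the 0-1 scale
power_error = rmse * (31 - 0)       # power spans [0, 31]
rotation_error = rmse * (7 - -7)    # rotation spans [-7, 7]
print(rmse, power_error, rotation_error)
```

That gives about 0.05, 1.55, and 0.7 respectively; the reported error was slightly above 0.0025, which nudges the power figure up to about 1.6.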

This problem was pretty simple, so I only needed one hidden layer. I probably could have gotten away with using fewer hidden nodes as well, but this is what I settled on. I mainly tested the network with sigmoid and ReLU as activation functions, and ReLU performed faster. I'd guess this means that the relationship between the inputs and outputs is pretty linear, although that could be entirely wrong. Regardless, the network performs at least well enough to take logical shots.

To account for elevation changes, I just added the change in elevation to the distance so that the AI would hit softer shots if the hole is below it and longer shots if it's above it. To account for the lie, I figured out how much the ground sloped in the x direction relative to the player and rotated a little to compensate. I tried adjusting the parameters dealing with the launch vector based on the slope, but that seemed to perform poorly for whatever reason. This network was trained on all clubs except for the putter. The putter is different enough that it needs its own logic, and rather than figure out how to collect meaningful data to train another network I just coded it myself.
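
The two adjustments applied outside the network can be sketched as follows. This is an illustration only: the function names, the `gain` constant, and the sign convention for the slope are assumptions, not values from this project.

```python
def adjusted_distance(distance, elevation_change):
    """Add the elevation change so uphill targets play longer, downhill shorter."""
    return distance + elevation_change

def compensate_rotation(rotation, lateral_slope, gain=4.0):
    """Nudge the rotation against the sideways slope.

    gain is a made-up tuning constant; the result stays within [-7, 7].
    """
    adjusted = rotation - gain * lateral_slope
    return max(-7, min(7, round(adjusted)))

print(adjusted_distance(150.0, -10.0))   # hole is 10 units below the ball
print(compensate_rotation(0, 0.5))       # ground tilts toward positive x
```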

I trained the neural network in Python because I couldn't get any good modules working in Perl, so I have my Perl script send a request to a Python script with the inputs to the network. To avoid slow startup times from both Python and Perl, both scripts run continuously and requests and responses are sent by updating txt files. The Python code is a mess and the neural network part is covered earlier in this post, so I'll just share the Perl script:
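
For reference, the Python half of a file-based exchange like this could look roughly like the sketch below. It is not the script from this project: the file format (comma-separated numbers), the function names, and the polling interval are all assumptions, and `predict` stands in for whatever function runs the trained network.

```python
import os
import time

def handle_request(request_path, response_path, predict):
    """Answer one pending request, if any. Returns True when one was handled."""
    if not os.path.exists(request_path):
        return False
    with open(request_path) as f:
        inputs = [float(v) for v in f.read().split(",") if v.strip()]
    os.remove(request_path)              # mark the request as consumed
    outputs = predict(inputs)
    # Write to a temporary file and rename it, so the other process
    # never reads a half-written response.
    tmp_path = response_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(",".join(str(v) for v in outputs))
    os.replace(tmp_path, response_path)
    return True

def serve(request_path, response_path, predict, poll_seconds=0.05):
    """Keep the process alive and poll for requests forever."""
    while True:
        if not handle_request(request_path, response_path, predict):
            time.sleep(poll_seconds)
```

Keeping the interpreter running and polling a file is crude, but it avoids paying the startup cost of loading the network on every shot.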

Is a neural network overkill for such a simple problem? Probably. But would it be fun to slog through developing an accurate model otherwise? No. And if I graph the error of the AI over the course of creating it, it probably looks something like this:

Maybe the real neural network was the knowledge we gained along the way.