Are neural networks data or code? Game developers always have an interesting twist...
There is no definitive answer as to whether a neural network is data or code; instead, a game developer will tell you to use what works. Hard-coded designs run quite a bit faster than data-consuming designs for many different reasons, but for the same reasons they lose their flexibility. That doesn't always have to be the case, as some new and improved .NET classes start to provide us with more dynamic ways of creating optimized code. Let's break the neural network problem down.
A neural network in an object-oriented world consists of an overall container or network class that holds a series of network layers. Each network layer in turn contains a series of nodes or neurons, each with an activation function. Finally, a whole bunch of weighted connections are drawn from the current layer to the previous layer. In this manner, the input layer is actually accessed by the next layer. If this sounds odd, don't worry, there is a lot of performance work to back up the entire process.
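To make that layout concrete, here is a minimal sketch of the object-oriented design described above. The class names (`Network`, `Layer`, `Neuron`, `Connection`), the sigmoid activation, and the bias term are all assumptions chosen for illustration, not a reference to any particular library.

```python
import math
from dataclasses import dataclass, field


def sigmoid(x: float) -> float:
    """Assumed activation function; any squashing function would do."""
    return 1.0 / (1.0 + math.exp(-x))


@dataclass
class Connection:
    source: "Neuron"  # neuron in the previous layer
    weight: float


@dataclass
class Neuron:
    bias: float = 0.0
    inputs: list = field(default_factory=list)  # connections back to the previous layer
    output: float = 0.0

    def activate(self) -> None:
        # Each neuron pulls its values from the previous layer through its connections.
        total = self.bias + sum(c.weight * c.source.output for c in self.inputs)
        self.output = sigmoid(total)


@dataclass
class Layer:
    neurons: list


@dataclass
class Network:
    layers: list  # layers[0] is the input layer

    def run(self, values):
        # Load the inputs, then let every later layer read from the one before it.
        for neuron, value in zip(self.layers[0].neurons, values):
            neuron.output = value
        for layer in self.layers[1:]:
            for neuron in layer.neurons:
                neuron.activate()
        return [n.output for n in self.layers[-1].neurons]


# Usage: two inputs feeding one output neuron (weights are made up).
a, b = Neuron(), Neuron()
out_neuron = Neuron(bias=0.1, inputs=[Connection(a, 0.5), Connection(b, -1.0)])
network = Network([Layer([a, b]), Layer([out_neuron])])
result = network.run([1.0, 2.0])
```

Every evaluation chases pointers through neurons and connections, which is the memory traffic the next paragraph worries about.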
The important issue here is the amount of data that is jumping around. In an object-oriented design, you are most likely going to be thrashing your memory while running and training the network. An ideal approach would avoid as much memory access as possible, perhaps by using code constants and storing the smallest number of intermediate values on the stack itself. When you start to say it like that, it sure doesn't sound like a data-driven approach anymore. It sounds like a really large inlined function, and that is exactly what a neural network in code would be.
Many users don't realize that a neural network can be reduced to code. After all, the values constantly change, right? Well, once a network is trained you can freeze it, turn off any training, and remove the backpropagation routines. Now the only things that change each round are the input values and the output values. In effect, it becomes an inlined function with some output parameters for any extra output neurons. This trained neural network is what you actually have running in most of your video games. It runs fast, it never breaks, and it is usually initialized with a bunch of gameplay data from playtesting before the game ships. If new techniques are found afterwards, oh well, the computer will never learn.
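As a rough illustration of what "reduced to code" means, here is a frozen 2-2-1 network written as a single function. The weights, biases, and the choice of `tanh` are hypothetical constants invented for this sketch; a real network would get them from training.

```python
import math


def trained_net(x0: float, x1: float) -> float:
    """A frozen 2-2-1 network: no layer objects, no weight tables, no training code."""
    # Hidden layer. Every weight and bias is a literal constant (made-up values).
    h0 = math.tanh(0.5 * x0 - 1.25 * x1 + 0.1)
    h1 = math.tanh(-0.75 * x0 + 0.3 * x1 - 0.2)
    # Output layer. Only the inputs and this return value change between calls.
    return math.tanh(1.5 * h0 - 2.0 * h1 + 0.05)


# Usage: evaluating the network is just a function call.
decision = trained_net(0.2, -0.4)
```

With more than one output neuron, the function would simply return a tuple, which plays the role of the extra output parameters mentioned above.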
Some games do ship with an adaptive neural network in play. These are generally thinker games where it doesn't matter how long the AI takes to process a move. They do suffer the impact of having the neural network break, though. You can trick the computer into training a particular way and then rob it. There are some games based on another AI tool, called genetic programming, that use fitness against the player to determine if the computer should perform better. In these games you can play just a bit better than the computer, so that it doesn't pick its best options to combat you. It considers you a mediocre player and thus plays you that way. Neural networks behave in a similar manner, and are trained with data from the game. If you use an intimate knowledge of the mechanics to feed it poor data samples, the game starts to break down.
Lately, I've been looking more and more at neural networks as code, and less as data. There are some excellent training routines that allow batching and are actually fairly good at post-processing an encounter and then improving afterwards. I think humans play this way too. You get better by thinking about your performance after the fact, and only gradually increase skills while actually playing a game. The benefits I'm implying are the ability to have faster in-game code networks that take advantage of all of the various processor caches and locality rules of today's microprocessors, but at the same time retrain the network later and reduce it back down to new code. Neural networks programmed in this manner are even more adaptive than existing data networks because of some features that I'll note now:
- A neural network compiler is able to remove unused connections without the indirection of a connection graph between the nodes. Modern-day data networks consider a fully connected graph a performance enhancement because conditional connection checks and connection data look-ups do not have to be made. Compiled neural networks remove redundancy and operate without connection look-ups.
- Batch training is run on a generic data-based neural network using the previously compiled network's weights. Any previously removed connections can now be used if required by the new data. Neural networks discover features, and more neurons or layers allow for the discovery of more features in the data at the cost of speed. Reducing connections and removing extra layers improves performance. Batch training can strike a balance, as compiled networks can be retrained to grow or lose connections as deemed necessary.
- Locality, stack reuse, and a whole bunch of other performance wins. By inlining the entire network you get great locality between all of the data needed by a particular expression evaluation in the network. Further, stack space used for intermediate storage can be rolled between layers (re-used on a pair-wise basis) rather than allocating space for every layer. In short, enough stack space for the two largest layers is all that needs to be allocated, and possibly less.
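The features above can be sketched with a toy "neural network compiler". This is only an illustration under stated assumptions: the function name `compile_network`, the `prune_below` threshold, the `(weights, biases)` layer format, and the `tanh` activation are all hypothetical, and Python's `exec` stands in for whatever code generation facility a real implementation would use.

```python
import math


def compile_network(layers, prune_below=1e-6):
    """Turn trained weights into the source of one straight-line function.

    layers: list of (weights, biases); weights[j][i] links input i to neuron j.
    Connections with |weight| < prune_below are dropped from the emitted code.
    """
    n_inputs = len(layers[0][0][0])
    prev = [f"x{i}" for i in range(n_inputs)]
    lines = [f"def net({', '.join(prev)}):"]
    for li, (weights, biases) in enumerate(layers):
        cur = []
        for j, (row, bias) in enumerate(zip(weights, biases)):
            # Pruned connections simply never appear: no graph, no look-ups.
            terms = [f"{w!r} * {src}" for w, src in zip(row, prev)
                     if abs(w) >= prune_below]
            terms.append(repr(bias))
            name = f"l{li}_{j}"
            lines.append(f"    {name} = math.tanh({' + '.join(terms)})")
            cur.append(name)
        prev = cur  # the next layer reads only from this one
    lines.append(f"    return ({', '.join(prev)},)")
    source = "\n".join(lines)
    namespace = {"math": math}
    exec(source, namespace)  # build a callable from the generated source
    return namespace["net"], source


# Usage: a 2-2-1 network with one dead connection (weight 0.0).
layers = [
    ([[0.5, 0.0], [-0.75, 0.3]], [0.1, -0.2]),
    ([[1.5, -2.0]], [0.05]),
]
net, source = compile_network(layers)
outputs = net(0.2, -0.4)
```

The weights stay available as plain data in `layers`, so they can be handed back to a generic trainer, batch-trained, and compiled again. In a language with real stack allocation, the emitted locals would be the intermediate values that get reused between adjacent layers.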
I said in the beginning that game developers have an interesting twist. The twist is that neural networks are a scalpel, and you don't always need a scalpel. You can cut your steak with a scalpel, sure, the butter spreads just as nicely with one, and in a pinch you can ward off an assailant. The point is that there are other tools specialized for those circumstances that work just as well as the neural network would. What neural networks are great for is verifying that you can cut the steak. If you can cut the steak with a scalpel, then maybe you can cut it with a steak knife as well. Often complex problems are reduced down to a very simple and solvable problem set that can then be implemented in a finite-state machine or swizzle. Even further, if you have a particular method in your game, you can test, using a neural network, for hidden features that you hadn't found yet. In this way neural networks can reduce themselves out of the final equation (thank you for doing that) and even give you peace of mind that your existing algorithm really is good enough to handle all of the features in the data (thanks again). If neural network compilers ever catch on, maybe we'll see them optimizing the final code into a finite-state machine or fixed-function swizzle!