ReInventing Neural Networks - Part 3






Now that we've got the basics out of the way, it's time for an improvement!
The Full Series
- Part 1: We create the whole NeuralNetwork class from scratch.
- Part 2: We create an environment in Unity in order to test the neural network within that environment.
- Part 3: We make a great improvement to the neural network already created by adding a new type of mutation to the code.
Introduction
Before I start, I've got to acknowledge that I'm 16 years old, because I've noticed people get a bit of a shock when they find that out, so I thought... maybe if people knew up front, I'd get something more out of it. Just sayin'.
Welcome back, fellas! About a month ago, I posted Part 2 of this series. After that, I got caught up with a bunch of other stuff that kept me from continuing with what I really want to do (sorry about that). However, a few days ago, Michael Bartlett sent me an email asking about a GA operator called safe mutation (or SM for short), referring to this paper.
Background
To follow along with this article, you'll need basic C# and Unity programming knowledge. You'll also need to have read Part 1 and Part 2 of this series.
Understanding the Theory
According to the paper Michael sent, there are two types of SM operators:
- Safe Mutation through Rescaling: In a nutshell, nudge each weight a little to measure how much it affects the output, and use that to scale the mutation so no single change moves the output too drastically (a rough sketch of the idea follows after this list).
- Safe Mutation through Gradients: Use the same machinery backpropagation uses to compute the gradient of the output with respect to each weight, and use that sensitivity information to make an informed mutation.
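To make the rescaling idea a bit more concrete, here is a tiny, self-contained sketch. Everything in it (the toy one-neuron model, the reference input, the constants) is made up purely for illustration; it is not the algorithm from the paper, just the general "probe, measure sensitivity, scale the mutation" pattern described above.

using System;

class SafeMutationSketch
{
    static readonly Random Rng = new Random(0);

    // A toy "network": a single linear output neuron, no bias, no activation.
    static double FeedForward(double[] weights, double[] input)
    {
        double sum = 0;
        for (int i = 0; i < weights.Length; i++)
            sum += weights[i] * input[i];
        return sum;
    }

    static void Main()
    {
        double[] weights = { 0.5, -1.2, 2.0 };
        double[] referenceInput = { 1.0, 0.25, -0.75 }; // an input the network has already seen
        double epsilon = 1e-4;   // size of the probe nudge
        double targetStep = 0.1; // how far we want the OUTPUT (not the weight) to move

        double baseOutput = FeedForward(weights, referenceInput);

        for (int i = 0; i < weights.Length; i++)
        {
            // 1. Nudge the weight a tiny bit and measure how much the output reacts.
            weights[i] += epsilon;
            double sensitivity = Math.Abs(FeedForward(weights, referenceInput) - baseOutput) / epsilon;
            weights[i] -= epsilon;

            // 2. Shrink the random mutation for very sensitive weights, so each
            //    weight's mutation moves the output by roughly the same amount.
            double rawMutation = (Rng.NextDouble() * 2 - 1) * targetStep;
            weights[i] += rawMutation / Math.Max(sensitivity, 1e-8);
        }

        Console.WriteLine("Output before: " + baseOutput + ", after: " + FeedForward(weights, referenceInput));
    }
}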
That still didn't give me enough information to start coding, however, so I kept looking until I found this paper. It shows how big an improvement you can get just by using some of the information that's right in front of you. If you check Figure 4 in that paper, you can see that unbiased mutation gives the worst results, while node mutation does dramatically better. Heck yeah, it even matches crossover!
Node mutation is even compared with safe mutation (through gradients) and backpropagation in Figures 7 and 8: it's roughly a tie with safe mutation, and node mutation crushes backpropagation...
OK then... So what is that magical "Mutate Nodes" operator? Well... normal mutation just picks some individual weights at random and replaces each selected weight with a new random value. Node mutation, on the other hand, picks a few nodes at random and mutates all of the weights coming into each selected node.
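In code, the difference boils down to where the probability check happens. Below is a condensed, side-by-side view of the two operators, assuming the jagged Weights[input][output] array (bias row included) used by the NeuralSection class in this series; the full methods appear in the listings later in this article.

// Regular mutation: roll the dice once per individual weight.
for (int i = 0; i < Weights.Length; i++)
    for (int j = 0; j < Weights[i].Length; j++)
        if (TheRandomizer.NextDouble() < MutationProbablity)
            Weights[i][j] = TheRandomizer.NextDouble() * (MutationAmount * 2) - MutationAmount;

// Node mutation: roll the dice once per output node, then rewrite every
// weight (including the bias weight) feeding into each selected node.
for (int j = 0; j < Weights[0].Length; j++)
    if (TheRandomizer.NextDouble() < MutationProbablity)
        for (int i = 0; i < Weights.Length; i++)
            Weights[i][j] = TheRandomizer.NextDouble() * (MutationAmount * 2) - MutationAmount;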
Using the Code
Well, the code is pretty straightforward. First, add the MutateNodes function to the NeuralSection class:
/// <summary>
/// Mutate the NeuralSection's Nodes.
/// </summary>
/// <param name="MutationProbablity">The probability that a node is going to be mutated. (Ranges 0-1)</param>
/// <param name="MutationAmount">The maximum amount a Mutated Weight would change.</param>
public void MutateNodes(double MutationProbablity, double MutationAmount)
{
    for (int j = 0; j < Weights[0].Length; j++) // For each output node
    {
        if (TheRandomizer.NextDouble() < MutationProbablity) // Check if we are going to mutate this node
        {
            for (int i = 0; i < Weights.Length; i++) // For each input node connected to the current output node
            {
                Weights[i][j] = TheRandomizer.NextDouble() * (MutationAmount * 2) - MutationAmount; // Mutate the weight connecting both nodes
            }
        }
    }
}
Then, add the caller function to the NeuralNetwork class:
/// <summary>
/// Mutate the NeuralNetwork's Nodes.
/// </summary>
/// <param name="MutationProbablity">The probability that a node is going to be mutated. (Ranges 0-1)</param>
/// <param name="MutationAmount">The maximum amount a Mutated Weight would change.</param>
public void MutateNodes(double MutationProbablity = 0.3, double MutationAmount = 2.0)
{
    // Mutate each section
    for (int i = 0; i < Sections.Length; i++)
    {
        Sections[i].MutateNodes(MutationProbablity, MutationAmount);
    }
}
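If you want to try the new operator on its own before touching Unity, here's a quick usage sketch. The topology, seed and parameter values below are arbitrary examples (not taken from the cars project); only the API itself comes from the class we just extended.

// Create a parent network with 5 inputs, one hidden layer of 8 neurons and 2 outputs.
NeuralNetwork Parent = new NeuralNetwork(new UInt32[] { 5, 8, 2 }, 12345);

// Clone it (independent deep copy) and node-mutate the copy, exactly the way
// EvolutionManager will do later in this article.
NeuralNetwork Child = new NeuralNetwork(Parent);
Child.MutateNodes(); // defaults: MutationProbablity = 0.3, MutationAmount = 2.0

// Or tune the operator explicitly:
Child.MutateNodes(0.2, 1.0); // mutate roughly 20% of the nodes, new weights in [-1, 1]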
This is pretty much it! Here's how NeuralNetwork.cs should look now:
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class NeuralNetwork
{
    public UInt32[] Topology // Returns the topology in the form of an array
    {
        get
        {
            UInt32[] Result = new UInt32[TheTopology.Count];
            TheTopology.CopyTo(Result, 0);
            return Result;
        }
    }

    ReadOnlyCollection<UInt32> TheTopology; // Contains the topology of the NeuralNetwork
    NeuralSection[] Sections; // Contains all the sections of the NeuralNetwork
    Random TheRandomizer; // The Random instance used to mutate the NeuralNetwork

    private class NeuralSection
    {
        private double[][] Weights; // Contains all the weights of the section, where [i][j] represents
                                    // the weight from neuron i in the input layer to neuron j in the output layer
        private Random TheRandomizer; // Contains a reference to the Random instance of the NeuralNetwork

        /// <summary>
        /// Initiate a NeuralSection from a topology and a seed.
        /// </summary>
        /// <param name="InputCount">The number of input neurons in the section.</param>
        /// <param name="OutputCount">The number of output neurons in the section.</param>
        /// <param name="Randomizer">The Random instance of the NeuralNetwork.</param>
        public NeuralSection(UInt32 InputCount, UInt32 OutputCount, Random Randomizer)
        {
            // Validation Checks
            if (InputCount == 0)
                throw new ArgumentException("You cannot create a Neural Layer with no input neurons.", "InputCount");
            else if (OutputCount == 0)
                throw new ArgumentException("You cannot create a Neural Layer with no output neurons.", "OutputCount");
            else if (Randomizer == null)
                throw new ArgumentException("The randomizer cannot be set to null.", "Randomizer");

            // Set Randomizer
            TheRandomizer = Randomizer;

            // Initialize the Weights array
            Weights = new double[InputCount + 1][]; // +1 for the Bias Neuron
            for (int i = 0; i < Weights.Length; i++)
                Weights[i] = new double[OutputCount];

            // Set random weights
            for (int i = 0; i < Weights.Length; i++)
                for (int j = 0; j < Weights[i].Length; j++)
                    Weights[i][j] = TheRandomizer.NextDouble() - 0.5f;
        }

        /// <summary>
        /// Initiates an independent Deep-Copy of the NeuralSection provided.
        /// </summary>
        /// <param name="Main">The NeuralSection that should be cloned.</param>
        public NeuralSection(NeuralSection Main)
        {
            // Set Randomizer
            TheRandomizer = Main.TheRandomizer;

            // Initialize Weights
            Weights = new double[Main.Weights.Length][];
            for (int i = 0; i < Weights.Length; i++)
                Weights[i] = new double[Main.Weights[0].Length];

            // Set Weights
            for (int i = 0; i < Weights.Length; i++)
            {
                for (int j = 0; j < Weights[i].Length; j++)
                {
                    Weights[i][j] = Main.Weights[i][j];
                }
            }
        }

        /// <summary>
        /// Feed input through the NeuralSection and get the output.
        /// </summary>
        /// <param name="Input">The values to set the input neurons.</param>
        /// <returns>The values in the output neurons after propagation.</returns>
        public double[] FeedForward(double[] Input)
        {
            // Validation Checks
            if (Input == null)
                throw new ArgumentException("The input array cannot be set to null.", "Input");
            else if (Input.Length != Weights.Length - 1)
                throw new ArgumentException("The input array's length does not match the number of neurons in the input layer.", "Input");

            // Initialize Output Array
            double[] Output = new double[Weights[0].Length];

            // Calculate Value
            for (int i = 0; i < Weights.Length; i++)
            {
                for (int j = 0; j < Weights[i].Length; j++)
                {
                    if (i == Weights.Length - 1) // If it is the Bias Neuron
                        Output[j] += Weights[i][j]; // The bias neuron's value is always one, so just add the weight
                    else
                        Output[j] += Weights[i][j] * Input[i];
                }
            }

            // Apply Activation Function
            for (int i = 0; i < Output.Length; i++)
                Output[i] = ReLU(Output[i]);

            // Return Output
            return Output;
        }

        /// <summary>
        /// Mutate the NeuralSection.
        /// </summary>
        /// <param name="MutationProbablity">The probability that a weight is going to be mutated. (Ranges 0-1)</param>
        /// <param name="MutationAmount">The maximum amount a Mutated Weight would change.</param>
        public void Mutate(double MutationProbablity, double MutationAmount)
        {
            for (int i = 0; i < Weights.Length; i++)
            {
                for (int j = 0; j < Weights[i].Length; j++)
                {
                    if (TheRandomizer.NextDouble() < MutationProbablity)
                        Weights[i][j] = TheRandomizer.NextDouble() * (MutationAmount * 2) - MutationAmount;
                }
            }
        }

        /// <summary>
        /// Mutate the NeuralSection's Nodes.
        /// </summary>
        /// <param name="MutationProbablity">The probability that a node is going to be mutated. (Ranges 0-1)</param>
        /// <param name="MutationAmount">The maximum amount a Mutated Weight would change.</param>
        public void MutateNodes(double MutationProbablity, double MutationAmount)
        {
            for (int j = 0; j < Weights[0].Length; j++) // For each output node
            {
                if (TheRandomizer.NextDouble() < MutationProbablity) // Check if we are going to mutate this node
                {
                    for (int i = 0; i < Weights.Length; i++) // For each input node connected to the current output node
                    {
                        Weights[i][j] = TheRandomizer.NextDouble() * (MutationAmount * 2) - MutationAmount; // Mutate the weight connecting both nodes
                    }
                }
            }
        }

        /// <summary>
        /// Puts a double through the activation function ReLU.
        /// </summary>
        /// <param name="x">The value to put through the function.</param>
        /// <returns>x after it is put through ReLU.</returns>
        private double ReLU(double x)
        {
            if (x >= 0)
                return x;
            else
                return x / 20;
        }
    }

    /// <summary>
    /// Initiates a NeuralNetwork from a Topology and a Seed.
    /// </summary>
    /// <param name="Topology">The Topology of the Neural Network.</param>
    /// <param name="Seed">The Seed of the Neural Network. Set to 'null' to use a Timed Seed.</param>
    public NeuralNetwork(UInt32[] Topology, Int32? Seed = 0)
    {
        // Validation Checks
        if (Topology.Length < 2)
            throw new ArgumentException("A Neural Network cannot contain less than 2 Layers.", "Topology");

        for (int i = 0; i < Topology.Length; i++)
        {
            if (Topology[i] < 1)
                throw new ArgumentException("A single layer of neurons must contain, at least, one neuron.", "Topology");
        }

        // Initialize Randomizer
        if (Seed.HasValue)
            TheRandomizer = new Random(Seed.Value);
        else
            TheRandomizer = new Random();

        // Set Topology
        TheTopology = new List<uint>(Topology).AsReadOnly();

        // Initialize Sections
        Sections = new NeuralSection[TheTopology.Count - 1];

        // Set the Sections
        for (int i = 0; i < Sections.Length; i++)
        {
            Sections[i] = new NeuralSection(TheTopology[i], TheTopology[i + 1], TheRandomizer);
        }
    }

    /// <summary>
    /// Initiates an independent Deep-Copy of the Neural Network provided.
    /// </summary>
    /// <param name="Main">The Neural Network that should be cloned.</param>
    public NeuralNetwork(NeuralNetwork Main)
    {
        // Initialize Randomizer
        TheRandomizer = new Random(Main.TheRandomizer.Next());

        // Set Topology
        TheTopology = Main.TheTopology;

        // Initialize Sections
        Sections = new NeuralSection[TheTopology.Count - 1];

        // Set the Sections
        for (int i = 0; i < Sections.Length; i++)
        {
            Sections[i] = new NeuralSection(Main.Sections[i]);
        }
    }

    /// <summary>
    /// Feed Input through the NeuralNetwork and Get the Output.
    /// </summary>
    /// <param name="Input">The values to set the Input Neurons.</param>
    /// <returns>The values in the output neurons after propagation.</returns>
    public double[] FeedForward(double[] Input)
    {
        // Validation Checks
        if (Input == null)
            throw new ArgumentException("The input array cannot be set to null.", "Input");
        else if (Input.Length != TheTopology[0])
            throw new ArgumentException("The input array's length does not match the number of neurons in the input layer.", "Input");

        double[] Output = Input;

        // Feed values through all sections
        for (int i = 0; i < Sections.Length; i++)
        {
            Output = Sections[i].FeedForward(Output);
        }

        return Output;
    }

    /// <summary>
    /// Mutate the NeuralNetwork.
    /// </summary>
    /// <param name="MutationProbablity">The probability that a weight is going to be mutated. (Ranges 0-1)</param>
    /// <param name="MutationAmount">The maximum amount a mutated weight would change.</param>
    public void Mutate(double MutationProbablity = 0.3, double MutationAmount = 2.0)
    {
        // Mutate each section
        for (int i = 0; i < Sections.Length; i++)
        {
            Sections[i].Mutate(MutationProbablity, MutationAmount);
        }
    }

    /// <summary>
    /// Mutate the NeuralNetwork's Nodes.
    /// </summary>
    /// <param name="MutationProbablity">The probability that a node is going to be mutated. (Ranges 0-1)</param>
    /// <param name="MutationAmount">The maximum amount a Mutated Weight would change.</param>
    public void MutateNodes(double MutationProbablity = 0.3, double MutationAmount = 2.0)
    {
        // Mutate each section
        for (int i = 0; i < Sections.Length; i++)
        {
            Sections[i].MutateNodes(MutationProbablity, MutationAmount);
        }
    }
}
And... no, I'm not gonna leave you like that. Now it's time to make a couple of tiny changes to the cars demo we built in Unity previously, so that we can see the difference for ourselves. First, go to EvolutionManager.cs in our Unity project and add this variable at the beginning of the script:
[SerializeField] bool UseNodeMutation = true; // Should we use node mutation?
Let's also put this variable to use by replacing the call to Car.NextNetwork.Mutate() inside the StartGeneration() function with the following:
if (UseNodeMutation) // Should we use Node Mutation
    Car.NextNetwork.MutateNodes(); // Mutate its nodes
else
    Car.NextNetwork.Mutate(); // Mutate its weights
This way, EvolutionManager.cs should end up looking like this:
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;

public class EvolutionManager : MonoBehaviour
{
    public static EvolutionManager Singleton = null; // The current EvolutionManager Instance

    [SerializeField] bool UseNodeMutation = true; // Should we use node mutation?
    [SerializeField] int CarCount = 100; // The number of cars per generation
    [SerializeField] GameObject CarPrefab; // The Prefab of the car to be created for each instance
    [SerializeField] Text GenerationNumberText; // Some text to write the generation number

    int GenerationCount = 0; // The current generation number
    List<Car> Cars = new List<Car>(); // The list of cars currently alive
    NeuralNetwork BestNeuralNetwork = null; // The best NeuralNetwork currently available
    int BestFitness = -1; // The Fitness of the best NeuralNetwork ever created

    // On Start
    private void Start()
    {
        if (Singleton == null) // If no other instances were created
            Singleton = this; // Make the only instance this one
        else
            gameObject.SetActive(false); // There is another instance already in place. Make this one inactive.

        BestNeuralNetwork = new NeuralNetwork(Car.NextNetwork); // Set the BestNeuralNetwork to a random new network

        StartGeneration();
    }

    // Starts a whole new generation
    void StartGeneration()
    {
        GenerationCount++; // Increment the generation count
        GenerationNumberText.text = "Generation: " + GenerationCount; // Update generation text

        for (int i = 0; i < CarCount; i++)
        {
            if (i == 0)
                Car.NextNetwork = BestNeuralNetwork; // Make sure one car uses the best network
            else
            {
                Car.NextNetwork = new NeuralNetwork(BestNeuralNetwork); // Clone the best neural network and set it to be for the next car

                if (UseNodeMutation) // Should we use Node Mutation
                    Car.NextNetwork.MutateNodes(); // Mutate its nodes
                else
                    Car.NextNetwork.Mutate(); // Mutate its weights
            }

            Cars.Add(Instantiate(CarPrefab, transform.position, Quaternion.identity, transform).GetComponent<Car>()); // Instantiate a new car and add it to the list of cars
        }
    }

    // Gets called by cars when they die
    public void CarDead(Car DeadCar, int Fitness)
    {
        Cars.Remove(DeadCar); // Remove the car from the list
        Destroy(DeadCar.gameObject); // Destroy the dead car

        if (Fitness > BestFitness) // If it is better than the current best car
        {
            BestNeuralNetwork = DeadCar.TheNetwork; // Make sure it becomes the best car
            BestFitness = Fitness; // And also set the best fitness
        }

        if (Cars.Count <= 0) // If there are no cars left
            StartGeneration(); // Create a new generation
    }
}
After stirring it all up and hitting Play, you should see the cars learning the track noticeably faster than they did with plain weight mutation.
Points of Interest
It was fantastic to see such a big improvement in training over the last article just by making a really simple addition. Now that we've got some improvement, it's your turn to tell me what you think about all this. What do you think I should do next? And should I make YouTube videos about AI and similar topics, or stick with articles?
History
- Version 1.0: Main implementation