This is the 4th article of creating .NET library, here are links for Part 1 & 2:
Build Simple AI .NET Library - Part 1 - Basics First
Build Simple AI .NET Library - Part 2 - Machine Learning Introduction
Build Simple AI .NET Library - Part 3 - Perceptron
My objective is to create a simple AI library that covers couple of advanced AI topics such as Genetic algorithms, ANN, Fuzzy logics and other evolutionary algorithms. The only challenge to complete this series would be having enough time working on code and articles.
Having the code itself might not be the main target however, understanding these algorithms is. wish it will be useful to someone someday.
Article Introduction - Part 2 "Beyond Perceptron"
At last article, we have created Perceptron that acts as Binary Linear Classifier, will continue discussion about Perceptron to create more complicated layout for more complicated problems.
Strongly advise to review Build Simple AI .NET Library - Part 2 - Machine Learning Introduction before moving any further in this article.
More Perceptron Examples
As mentioned, Perceptron is simplest processing element of ANN, yet it is still powerful algorithm however it is very limited. Remember, it's mainly used only as binary linear classifier.
What about other complicated classifications problems, what about binary but non-linear classifications. will see through couple of examples how to develop further complex layouts of Perceptron.
To be all in same page; let's verify some definitions:
Binary Classifier - is a problem where output is having only 2 possible answers, classifications or groups. However, classification can be linear or non-linear
Linear Classifier - If inputs are linearly separated. You can draw straight line to separate both groups
Non-linear Classifier - Incase classification is not possible via straight line
Problem Dimensions - Inputs matrix (Vector) can be considered as features of problem being optimized, for last example of article 3, we had 2 inputs, one is X and other one is Y which are coordinates of each point. other way to view inputs is by considering number of inputs as dimensions of the problem. So 4D problem means it has 4 different features (AI can resolve problems of higher dimensions that are impossible to visualize by human brain)
Using Perceptron to Optimize Binary Functions
To better understand Perceptron and its limitation, will check its use in optimization of binary functions as NOT, OR AND & XOR
So this is 1-Diemsional problem. Let's design Perceptron as following
h(x) = W0 + W1 * X1
As output is 0 or 1, Step Activation function is a good choice
Then Y=StepFunction (h(x))
From NOT truth table above, output Y is 1 when X is 0 so h(x) shall be >= 0 if X=0
h(x) = W0 + W1 * X1 >= 0 When X=0
W0 >= 0 for X =0 let's select W0 =1
h(x) = 1 + W1 * X1
Now, second possible value of Y is 0 for X = 1
h(x) < 0 for X =1
1 + W1 * X1 < 0 for X =1
1 + W1 < 0 for X=1
W1 < -1 so let's select W1 = -1.5
h(x) = 1 - 1.5 * X
This is 2-dimenstional problem, let's plot X1 & X2
These are linearly separated groups as straight line can be drawn to separate both groups as the following one
Again, will use Step activation function for this Perceptron
h(x) = W0 + W1 * X1 + W2 * X2
From truth table, we know that Y=o for X1=X2=0 which means
h(x) < 0 for X1=X2=0
W0 < 0 for X1=X2=0 - Let's select W0 as -0.5
h(x) = -0.5 + W1 * X1 + W2 * X2
Selecting one line from the graph that intercepts with X1 at 0.5 and X2 at 0.5 (other lines can work as well as separators)
From Truth table, Y=1 for X1=1 and X2 =0 then h(x) >= 0 for X1=1 and X2 =0
-0.5+ W1 * X1 + W2 * X2 >= 0 for X1=1 and X2 =0
-0.5+ W1 * 1 + W2 * 0 >= 0 for X1=1 and X2 =0
-0.5+ W1 >= 0 for X1=1 and X2 =0
W1 >= 0.5 for X1=1 and X2 =0 let's select W1 = 1
h(x) = -0.5 + 1 * X1 + W2 * X2
Also, Y=1 for X1=0 and X2 =1 then h(x) >= 0 for X1=0 and X2 =1
-0.5 + 1 * X1 + W2 * X2 >= 0 for X1=0 and X2 =1
-0.5 + W2 >= 0 for X1=0 and X2 =1
W2 >= 0.5 for X1=0 and X2 =1 let's select W2 = 1
h(x) = -0.5 + X1 + X2
Let's confirm truth table:
|X1||X2||Desired||h(x) = -0.5 + X1 + X2||Y|
Similarly, it is 2D problem and Perceptron shall be
by following same above OR Procedure we may conclude values for W0, W1 & W2
one possible combination is
h(x) = -1.5 + X1 + X2
To verify truth table
|X1||X2||Desired||h(x) = -1.5 + X1 + X2||Y|
So, final Perceptron shall be
This is a problem, this function can not be linearly separated; there is no single line can separate the 2 groups.
Then Perceptron can not resolve this problem and this is the main and major limitation of Perceptron (only binary linear classifications)
Yet Perceptron is powerful algorithm and can be used maybe in other formations to optimize complicated problems.
Let's be back to XOR function and try to understand this function more. will use
Venn diagrams to help on that. Venn diagrams are graphical representation of different logical operations (here is more about Venn diagrams)
Venn diagram for OR gate shall be
and here is for AND
Here is XOR
From Venn diagrams we may extract the meaning of XOR gate as the result of UNION (OR) excluding INTERSECTION area in other words
A XOR B = (A + B) - (A.B)
We already used Perceptron to implement AND & OR functions above, so why we do not use more than one Perceptron to implement above function. One possible implementation could be
Already 2D AND function is implemented and we may use the same
We did not implement 3D OR function. To do so, let's start by simplifying XOR function truth table
|X1||X2||X1 AND X2||Desired|
So we need to find weights of h(x) of OR function Perceptron that fulfill above table where
h(x) = W0 +W1 * X1 + W2 * X2 + W3 * X3 (X3 = X1 AND X2)
Activation function will be Step as well.
Let's start with last combination of X1=0, X2=0 & X1 AND X2 = 0 then Y = 0
h(x) = W0 +W1 * X1 + W2 * X2 + W3 * X3 <0 for X1=0, X2=0 & X1 AND X2 = 0
W0 <0 for X1=0, X2=0 & X1 AND X2 = 0 Let's select W0 = -1
For combination of X1=1, X2=0 & X1 AND X2 = 0 then Y = 1
h(x) = -1 +W1 * X1 + W2 * X2 + W3 * X3 >= 0 for X1=1, X2=0 & X1 AND X2 = 0
-1 +W1 >= 0 for X1=1, X2=0 & X1 AND X2 = 0
W1 >= 1 Let's select W1 = 2
For combination of X1=0, X2=1 & X1 AND X2 = 0 then Y = 1
h(x) = -1 + 2 * X1 + W2 * X2 + W3 * X3 >= 0 for X1=0, X2=1 & X1 AND X2 = 0
-1 +W2 >= 0 for X1=0, X2=1 & X1 AND X2 = 0
W1 >= 1 Let's select W2 = 2
For combination of X1=1, X2=1 & X1 AND X2 = 1 then Y = 0
h(x) = -1 + 2 * X1 + 2 * X2 + W3 * X3 < 0 for X1=1, X2=1 & X1 AND X2 = 1
-1 + 2 + 2 +W3 < 0 for X1=1, X2=1 & X1 AND X2 = 1
3 +W3 < 0 for X1=1, X2=1 & X1 AND X2 = 1
W3 < -3 for X1=1, X2=1 & X1 AND X2 = 1 Let's select W3 = -4
h(x) =-1 +2 * X1 + 2 * X2 - 4 * X3 (X3 = X1 AND X2)
Final Perceptron Network shall be
Well, let's try to reformulate the graphical representation of above layout. each Perceptron shall be denoted by its function.
Instead of having 1 AND Perceptron, let's add 1 Perceptron that generates X1 and other one to generate X2:
Now, let's add dummy Perceptrons to receive inputs and just pass it to next level of Perceptrons
Clearly above is better representation, and it's called MLP (Multi-Layer Perceptron Network). This is exactly the common layout for ANN (Artificial Neural Network)
Inputs are being received by set of Perceptrons equal to number of inputs, this is called
Output is being generated by Perceptron, where 1 Perceptron per each output. This is called
Processing Perceptrons in the middle between input layer to output layer is called
Each ANN can have only 1 Input Layer and 1 Output Layer but could have one or multiple hidden layers. Number of hidden layers is based on complexity of problem being optimized.
We have demonstrated that by adding 1 Perceptron in addition to output Perceptron could add additional power to the network
A lot of algorithms are there for ANN training and ANN itself has many types as well. will discuss the most common types and algorithms in next articles
However, it is all start from Perceptron concept and build up on it hence, it was important to get as much details as possible about Perceptron although the examples might not seem to be complicated.