Posted 17 Sep 2017


Build Simple AI .NET Library - Part 4 - Beyond Perceptron

This series of articles demonstrates how to build a .NET AI library from scratch.

Series Introduction

This is the 4th article in the series on creating a .NET AI library. Here are the links to Parts 1, 2, and 3:

Build Simple AI .NET Library - Part 1 - Basics First

Build Simple AI .NET Library - Part 2 - Machine Learning Introduction

Build Simple AI .NET Library - Part 3 - Perceptron

My objective is to create a simple AI library that covers a couple of advanced AI topics such as genetic algorithms, ANN, fuzzy logic, and other evolutionary algorithms. The only challenge in completing this series is having enough time to work on the code and articles.

Having the code itself might not be the main target, however; understanding these algorithms is. I hope it will be useful to someone someday.

Article Introduction - Part 4 "Beyond Perceptron"

In the last article, we created a Perceptron that acts as a binary linear classifier. This article continues the discussion of the Perceptron to create more complicated layouts for more complicated problems.

I strongly advise reviewing Build Simple AI .NET Library - Part 2 - Machine Learning Introduction before moving any further in this article.

More Perceptron Examples

As mentioned, the Perceptron is the simplest processing element of an ANN. It is a powerful algorithm, yet a very limited one: remember, it is mainly used only as a binary linear classifier.

What about more complicated classification problems, such as binary but non-linear classification? We will see through a couple of examples how to develop more complex layouts of Perceptrons.

To be all on the same page, let's verify some definitions:

  • Binary Classifier - a problem where the output has only 2 possible answers, classifications, or groups. The classification itself can be linear or non-linear
  • Linear Classifier - applies when the inputs are linearly separable, that is, you can draw a straight line to separate the two groups

Image 1

  • Non-linear Classifier - applies when the classification is not possible via a straight line

Image 2

Image 3

  • Problem Dimensions - the input vector can be considered as the features of the problem being optimized. In the last example of article 3, we had 2 inputs, X and Y, which are the coordinates of each point. Another way to view inputs is to consider the number of inputs as the dimensions of the problem, so a 4D problem has 4 different features (AI can solve problems of higher dimensions that are impossible for the human brain to visualize)

Using Perceptron to Optimize Binary Functions

To better understand the Perceptron and its limitations, we will check its use in optimizing binary functions such as NOT, OR, AND, and XOR.

NOT Function

Image 4

So this is a 1-dimensional problem. Let's design the Perceptron as follows:

Image 5

h(x) = W0 + W1 * X1

As the output is 0 or 1, the Step activation function is a good choice

Image 6

Then Y = StepFunction(h(x))

From the NOT truth table above, output Y is 1 when X1 is 0, so h(x) must be >= 0 when X1 = 0

h(x) = W0 + W1 * X1 >= 0 when X1 = 0

W0 >= 0 for X1 = 0, so let's select W0 = 1

h(x) = 1 + W1 * X1

Now, the second possible value of Y is 0, for X1 = 1

h(x) < 0 for X1 = 1

1 + W1 * X1 < 0 for X1 = 1

1 + W1 < 0

W1 < -1, so let's select W1 = -1.5

Finally, h(x) = 1 - 1.5 * X1
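The derivation above can be checked with a few lines of code. The following is a minimal Python sketch rather than part of the .NET library; the function names `step` and `not_gate` are made up for illustration, and the step function is assumed to return 1 when its input is >= 0, which is what the derivation relies on.

```python
def step(h):
    """Step activation (assumed convention): 1 when h >= 0, otherwise 0."""
    return 1 if h >= 0 else 0


def not_gate(x1, w0=1.0, w1=-1.5):
    """Single-input Perceptron using the weights selected in the text."""
    h = w0 + w1 * x1  # h(x) = 1 - 1.5 * X1
    return step(h)


# Walk through both rows of the NOT truth table.
for x1 in (0, 1):
    h = 1.0 - 1.5 * x1
    print(f"X1={x1}  h(x)={h:+.1f}  Y={not_gate(x1)}")
```

Any other pair with W0 >= 0 and W0 + W1 < 0 would pass the same check; the values 1 and -1.5 are only one convenient choice.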

Image 7


OR Function

Image 8

 

This is a 2-dimensional problem; let's plot X1 & X2

Image 9

These groups are linearly separable, as a straight line such as the following one can be drawn to separate them

Image 10

Again, we will use the Step activation function for this Perceptron

Image 11

h(x) = W0 + W1 * X1 + W2 * X2

Y = Step(h(x))

From the truth table, we know that Y = 0 for X1 = X2 = 0, which means

h(x) < 0 for X1 = X2 = 0

W0 < 0 for X1 = X2 = 0, so let's select W0 = -0.5

h(x) = -0.5 + W1 * X1 + W2 * X2

We select one line from the graph, the one that intercepts the X1 axis at 0.5 and the X2 axis at 0.5 (other lines can work as separators as well)

From the truth table, Y = 1 for X1 = 1 and X2 = 0, so h(x) >= 0 for X1 = 1 and X2 = 0

-0.5 + W1 * X1 + W2 * X2 >= 0 for X1 = 1 and X2 = 0

-0.5 + W1 * 1 + W2 * 0 >= 0

-0.5 + W1 >= 0

W1 >= 0.5, so let's select W1 = 1

h(x) = -0.5 + 1 * X1 + W2 * X2

Also, Y = 1 for X1 = 0 and X2 = 1, so h(x) >= 0 for X1 = 0 and X2 = 1

-0.5 + 1 * X1 + W2 * X2 >= 0 for X1 = 0 and X2 = 1

-0.5 + W2 >= 0

W2 >= 0.5, so let's select W2 = 1

Finally, h(x) = -0.5 + X1 + X2

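Let's confirm the truth table:

| X1 | X2 | Desired | h(x) = -0.5 + X1 + X2 | Y |
|----|----|---------|-----------------------|---|
| 1  | 1  | 1       | 1.5                   | 1 |
| 1  | 0  | 1       | 0.5                   | 1 |
| 0  | 1  | 1       | 0.5                   | 1 |
| 0  | 0  | 0       | -0.5                  | 0 |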
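The OR truth table can also be reproduced programmatically. The sketch below is in Python for brevity and is not taken from the library; `perceptron` and `implements` are hypothetical helper names. It also tries two other weight sets to illustrate the remark that other separating lines can work as well, as long as W0 < 0, W1 >= -W0 and W2 >= -W0.

```python
def step(h):
    """Step activation (assumed convention): 1 when h >= 0, otherwise 0."""
    return 1 if h >= 0 else 0


def perceptron(weights, inputs):
    """weights = (W0, W1, ..., Wn) with W0 as the bias; returns (h, Y)."""
    w0, *ws = weights
    h = w0 + sum(w * x for w, x in zip(ws, inputs))
    return h, step(h)


# OR truth table: (X1, X2) -> desired output
OR_TABLE = {(1, 1): 1, (1, 0): 1, (0, 1): 1, (0, 0): 0}


def implements(weights, table):
    """True when the Perceptron reproduces every row of the table."""
    return all(perceptron(weights, x)[1] == y for x, y in table.items())


# Reproduce the rows for h(x) = -0.5 + X1 + X2.
for (x1, x2), desired in OR_TABLE.items():
    h, y = perceptron((-0.5, 1, 1), (x1, x2))
    print(x1, x2, desired, h, y)

print(implements((-0.5, 1, 1), OR_TABLE))     # weights from the text
print(implements((-0.25, 0.5, 2), OR_TABLE))  # another valid separator
print(implements((-0.5, 0.25, 1), OR_TABLE))  # W1 < 0.5 breaks the (1, 0) row
```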

AND Function

Image 12

Image 13

Similarly, this is a 2D problem, and the Perceptron shall be

Image 14

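By following the same procedure as for OR above, we can derive values for W0, W1 & W2

One possible combination is h(x) = -1.5 + X1 + X2

To verify the truth table:

| X1 | X2 | Desired | h(x) = -1.5 + X1 + X2 | Y |
|----|----|---------|-----------------------|---|
| 1  | 1  | 1       | 0.5                   | 1 |
| 1  | 0  | 0       | -0.5                  | 0 |
| 0  | 1  | 0       | -0.5                  | 0 |
| 0  | 0  | 0       | -1.5                  | 0 |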
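Comparing the OR and AND results, both Perceptrons use W1 = W2 = 1 and differ only in the bias W0. The following Python sketch (illustrative only; `gate_for_bias` is a made-up name, and the step function is assumed to fire at h >= 0) sweeps a few bias values to show where the same Perceptron behaves as OR, as AND, or as neither.

```python
def step(h):
    """Step activation (assumed convention): 1 when h >= 0, otherwise 0."""
    return 1 if h >= 0 else 0


ROWS = [(1, 1), (1, 0), (0, 1), (0, 0)]  # same row order as the tables above
OR_OUT = (1, 1, 1, 0)
AND_OUT = (1, 0, 0, 0)


def gate_for_bias(w0, w1=1.0, w2=1.0):
    """Outputs of h(x) = W0 + W1 * X1 + W2 * X2 for every row in ROWS."""
    return tuple(step(w0 + w1 * x1 + w2 * x2) for x1, x2 in ROWS)


for w0 in (-0.5, -1.0, -1.5, -2.0, -2.5):
    out = gate_for_bias(w0)
    name = "OR" if out == OR_OUT else "AND" if out == AND_OUT else "neither"
    print(f"W0={w0:+.1f}  outputs={out}  -> {name}")
```

With W1 = W2 = 1, the inequalities give OR for -1 <= W0 < 0 and AND for -2 <= W0 < -1, which is why -0.5 and -1.5 are natural picks.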

So, the final Perceptron shall be

Image 15

 


XOR Function

Image 16

Image 17

This is a problem: this function cannot be linearly separated; there is no single line that can separate the 2 groups.

Therefore, a single Perceptron cannot solve this problem, and this is the major limitation of the Perceptron (only binary linear classification)
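The limitation can be seen from the inequalities themselves. With a step function that outputs 1 for h(x) >= 0, XOR would need W0 < 0, W0 + W1 >= 0, W0 + W2 >= 0 and W0 + W1 + W2 < 0. Adding the two middle conditions gives W0 + W1 + W2 >= -W0 > 0, which contradicts the last one. As a rough numerical illustration (a Python sketch, not library code; the weight grid from -4 to 4 in steps of 0.5 is an arbitrary assumption), a brute-force search finds weights for OR and AND but none for XOR:

```python
from itertools import product


def step(h):
    """Step activation (assumed convention): 1 when h >= 0, otherwise 0."""
    return 1 if h >= 0 else 0


ROWS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TABLES = {
    "OR": (0, 1, 1, 1),
    "AND": (0, 0, 0, 1),
    "XOR": (0, 1, 1, 0),
}

# Candidate weights -4.0, -3.5, ..., 4.0 (the grid is an arbitrary choice).
GRID = [k / 2 for k in range(-8, 9)]


def count_solutions(target):
    """Count (W0, W1, W2) triples on the grid whose outputs match target."""
    count = 0
    for w0, w1, w2 in product(GRID, repeat=3):
        outputs = tuple(step(w0 + w1 * x1 + w2 * x2) for x1, x2 in ROWS)
        if outputs == target:
            count += 1
    return count


for name, target in TABLES.items():
    print(name, count_solutions(target))
```

The search only covers the chosen grid; it is the inequality argument that rules out every possible weight combination.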

Yet the Perceptron is a powerful algorithm and may be used in other formations to optimize complicated problems.

Let's go back to the XOR function and try to understand it better. We will use Venn diagrams to help with that. Venn diagrams are graphical representations of different logical operations.

The Venn diagram for the OR gate is

Image 18

and here is the one for AND

Image 19

Here is XOR

Image 20

From the Venn diagrams, we can read the XOR gate as the result of the UNION (OR) excluding the INTERSECTION (AND) area. In other words:

A XOR B = (A + B) - (A.B)
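Reading "+" as OR, "." as AND and "-" as removing the intersection from the union, the identity can be checked on all four input combinations. This is a small Python sketch with a hypothetical function name:

```python
def xor_from_or_and(a, b):
    """XOR built as (A OR B) with the (A AND B) part removed."""
    union = a | b            # A + B  (OR)
    intersection = a & b     # A . B  (AND)
    return union - intersection


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_from_or_and(a, b), a ^ b)  # last two columns agree
```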

We already used Perceptrons to implement the AND & OR functions above, so why not use more than one Perceptron to implement the above function? One possible implementation could be

Image 21

AND function

The 2D AND function is already implemented, and we may reuse it

Image 22

 

OR Function

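We did not implement a 3D (3-input) OR function. To do so, let's start by simplifying the XOR function truth table

| X1 | X2 | X1 AND X2 | Desired |
|----|----|-----------|---------|
| 1  | 1  | 1         | 0       |
| 1  | 0  | 0         | 1       |
| 0  | 1  | 0         | 1       |
| 0  | 0  | 0         | 0       |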

So we need to find the weights of h(x) for the OR function Perceptron that fulfill the above table, where

h(x) = W0 +W1 * X1 + W2 * X2 + W3 * X3   (X3 = X1 AND X2)

Activation function will be Step as well.

Let's start with the last combination, X1 = 0, X2 = 0 & X3 = X1 AND X2 = 0, for which Y = 0

h(x) = W0 + W1 * X1 + W2 * X2 + W3 * X3 < 0 for X1 = 0, X2 = 0 & X3 = 0

W0 < 0, so let's select W0 = -1

For the combination X1 = 1, X2 = 0 & X3 = X1 AND X2 = 0, Y = 1

h(x) = -1 + W1 * X1 + W2 * X2 + W3 * X3 >= 0 for X1 = 1, X2 = 0 & X3 = 0

-1 + W1 >= 0

W1 >= 1, so let's select W1 = 2

For the combination X1 = 0, X2 = 1 & X3 = X1 AND X2 = 0, Y = 1

h(x) = -1 + 2 * X1 + W2 * X2 + W3 * X3 >= 0 for X1 = 0, X2 = 1 & X3 = 0

-1 + W2 >= 0

W2 >= 1, so let's select W2 = 2

For the combination X1 = 1, X2 = 1 & X3 = X1 AND X2 = 1, Y = 0

h(x) = -1 + 2 * X1 + 2 * X2 + W3 * X3 < 0 for X1 = 1, X2 = 1 & X3 = 1

-1 + 2 + 2 + W3 < 0

3 + W3 < 0

W3 < -3, so let's select W3 = -4

Final h(x) = -1 + 2 * X1 + 2 * X2 - 4 * X3   (X3 = X1 AND X2)
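Chaining the two Perceptrons gives a working XOR. The sketch below is Python and only illustrates the arithmetic; the function names are made up here, and the step function is assumed to output 1 for h(x) >= 0. The AND Perceptron feeds its output as X3 into the 3-input Perceptron derived above.

```python
def step(h):
    """Step activation (assumed convention): 1 when h >= 0, otherwise 0."""
    return 1 if h >= 0 else 0


def and_perceptron(x1, x2):
    """h(x) = -1.5 + X1 + X2, the AND Perceptron from earlier."""
    return step(-1.5 + x1 + x2)


def output_perceptron(x1, x2, x3):
    """h(x) = -1 + 2 * X1 + 2 * X2 - 4 * X3, the 3-input Perceptron."""
    return step(-1 + 2 * x1 + 2 * x2 - 4 * x3)


def xor_network(x1, x2):
    """Two Perceptrons in sequence: AND first, then the output Perceptron."""
    x3 = and_perceptron(x1, x2)
    return output_perceptron(x1, x2, x3)


for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(x1, x2, and_perceptron(x1, x2), xor_network(x1, x2))
```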

Image 23

Final Perceptron Network shall be

Image 24

Now, let's reformulate the graphical representation of the above layout. Each Perceptron shall be denoted by its function.

Instead of having only the AND Perceptron at that level, let's add one Perceptron that generates X1 and another one that generates X2:

Image 25

Now, let's add dummy Perceptrons that receive the inputs and just pass them to the next level of Perceptrons

Image 26

Clearly, the above is a better representation, and it is called an MLP (Multi-Layer Perceptron network). This is exactly the common layout of an ANN (Artificial Neural Network)

Inputs are received by a set of Perceptrons equal in number to the inputs; this is called the Input Layer

Output is generated by Perceptrons, 1 Perceptron per output; this is called the Output Layer

The processing Perceptrons in the middle, between the input layer and the output layer, are called the Hidden Layer

Image 27

 

Each ANN can have only 1 Input Layer and 1 Output Layer but may have one or multiple hidden layers. The number of hidden layers depends on the complexity of the problem being optimized.
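The layered layout can be written as a generic forward pass over a list of layers. This is a minimal Python sketch under stated assumptions, not the library's design: each layer is a list of weight rows (W0, W1, ..., Wn), the input layer simply passes values on, and the two pass-through rows in the hidden layer use assumed weights (-0.5, 1, 0) and (-0.5, 0, 1), since the article's figures do not state them. The AND and output weights are the ones derived in the text.

```python
def step(h):
    """Step activation (assumed convention): 1 when h >= 0, otherwise 0."""
    return 1 if h >= 0 else 0


def layer_forward(layer, inputs):
    """Evaluate every Perceptron in a layer; each row is (W0, W1, ..., Wn)."""
    return [
        step(row[0] + sum(w * x for w, x in zip(row[1:], inputs)))
        for row in layer
    ]


def mlp_forward(layers, inputs):
    """Feed the inputs through each layer in order and return the outputs."""
    activations = list(inputs)  # input layer: values are passed on unchanged
    for layer in layers:
        activations = layer_forward(layer, activations)
    return activations


HIDDEN = [
    (-0.5, 1, 0),  # passes X1 through (assumed weights)
    (-0.5, 0, 1),  # passes X2 through (assumed weights)
    (-1.5, 1, 1),  # AND Perceptron from the text
]
OUTPUT = [(-1, 2, 2, -4)]  # 3-input output Perceptron from the text
XOR_MLP = [HIDDEN, OUTPUT]

for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(x1, x2, mlp_forward(XOR_MLP, (x1, x2)))
```

Adding another hidden layer in this representation is just one more entry in the `layers` list, which is why the list-of-layers shape is a convenient way to think about an MLP.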

We have demonstrated that adding 1 Perceptron in addition to the output Perceptron adds power to the network

There are many algorithms for ANN training, and the ANN itself has many types as well. We will discuss the most common types and algorithms in the next articles

However, it all starts from the Perceptron concept and builds upon it; hence, it was important to get as many details as possible about the Perceptron, although the examples might not seem complicated.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Gamil Yassin
Engineer
Egypt
Electrical engineer, programmer in my free time.

Comments and Discussions

 
Question: Do you have a code framework to demo your ideas? (Southmountain, 30-Sep-17 6:27)
