TensorFlow Tutorial 4: Creating a Neural Net

16 Oct 2018

By Dante Sblendorio

What is a neural net, and how can you create one? Keep reading to find out. Below, I explain the basics of setting up a neural net using TensorFlow.

A neural net consists of three key components: the input layer, the hidden layer(s), and the output layer. Each layer has an arbitrary number of nodes (or neurons). In the example in the previous section, the input layer is const1 and const2. The matrix addition can be thought of as the hidden layer, and the output layer is the output. In the case of our wine data, the input data is the 13 chemical features and the output layer is the Class. The hidden layer can be thought of as a sophisticated ensemble of mathematical functions that behave as filters, extracting the relevant features for determining the correct Class. The structure of the neural net is inspired by biology, specifically the neural connections in the human brain. For the sake of this article, I won’t go into depth about the mathematical structure of the hidden layer(s). It is sufficient to think of it as a mathematical “black box” that extracts hidden meaning from the data. (However, if you want to learn more, this is a thorough online textbook on neural networks and deep learning.)

We split the dataset into a training set and a test set. The training set lets us “train” the mathematical operators within the hidden layers so that they converge on values that predict the correct Class from the 13 features of each observation. We then feed the test set through the neural net and evaluate the accuracy to determine how well the net has been trained. Because the net never sees the test set during training, the test accuracy reveals overfitting or underfitting and gives a more honest estimate of how accurate the net is. To prepare the data for TensorFlow, we perform some slight manipulation: the Class label is converted to a one-hot vector, and the arrays are transposed so that each column holds one observation:

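import numpy as np
from sklearn.model_selection import train_test_split

# wine_df is the DataFrame of wine data created in the previous tutorial

# one-hot encode the Class variable (1, 2 or 3) for the NN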
def convertClass(val):
   if val == 1:
       return [1, 0, 0]
   elif val == 2:
       return [0, 1, 0]
   else:
       return [0, 0, 1]
   
Y = wine_df.loc[:,'Class'].values
Y = np.array([convertClass(i) for i in Y])
X = wine_df.loc[:,'Alcohol':'Proline'].values

# split the dataset into a training set (70%) and a test set (30%)
train_x, test_x, train_y, test_y = train_test_split(X, Y, test_size=0.3, random_state=0)
# transpose so that each column is one observation
train_x = train_x.transpose()
train_y = train_y.transpose()
test_x = test_x.transpose()
test_y = test_y.transpose()
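
To check what the preparation step produces, here is a minimal sketch that runs the same one-hot encoding and transpose on a small synthetic array. The random data and the name convert_class are stand-ins for illustration only; the real wine_df comes from the previous tutorial.

```python
import numpy as np

def convert_class(val):
    """One-hot encode a Class label (1, 2 or 3), as convertClass does."""
    if val == 1:
        return [1, 0, 0]
    elif val == 2:
        return [0, 1, 0]
    else:
        return [0, 0, 1]

# Hypothetical stand-in for the wine data: 6 observations, 13 features each.
rng = np.random.RandomState(0)
X = rng.rand(6, 13)
labels = np.array([1, 2, 3, 1, 2, 3])

Y = np.array([convert_class(v) for v in labels])  # shape (6, 3)

# After transposing, each column is one observation.
X_t = X.transpose()  # shape (13, 6)
Y_t = Y.transpose()  # shape (3, 6)

print(X_t.shape, Y_t.shape)
```

The features end up with shape (13, observations) and the labels with shape (3, observations), which is the column-per-observation layout the rest of the code expects.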

Now that the data is prepped, we define several functions that form the foundation of the neural net. First, we define a function that establishes the initial parameters of our net. Here we also define how many nodes are in each of the hidden layers (I chose two hidden layers, with 32 and 16 nodes). Since we have three possible values for Class, we have three nodes in the output layer. Next we define a forward propagation function. All this does is send the 13 features through the net and subject them to the mathematical operations within the hidden layers.
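
As a rough illustration of what forward propagation does to the shapes, here is a NumPy sketch using the layer sizes from this article (13 inputs, 32 and 16 hidden nodes, 3 outputs). The random weights, the batch of 5 observations, and the relu helper are assumptions for the sake of the example, not the trained network.

```python
import numpy as np

rng = np.random.RandomState(1)
n_features, n_h1, n_h2, n_out = 13, 32, 16, 3
m = 5  # hypothetical number of observations in the batch

# Small random weights stand in for the real initial parameters; biases are zero.
W1, b1 = rng.randn(n_h1, n_features) * 0.1, np.zeros((n_h1, 1))
W2, b2 = rng.randn(n_h2, n_h1) * 0.1, np.zeros((n_h2, 1))
W3, b3 = rng.randn(n_out, n_h2) * 0.1, np.zeros((n_out, 1))

def relu(z):
    """Rectified Linear Unit: negative values become zero."""
    return np.maximum(z, 0)

X = rng.rand(n_features, m)   # one column per observation
A1 = relu(W1 @ X + b1)        # first hidden layer, shape (32, m)
A2 = relu(W2 @ A1 + b2)       # second hidden layer, shape (16, m)
Z3 = W3 @ A2 + b3             # output layer, shape (3, m): one raw score per class

print(A1.shape, A2.shape, Z3.shape)
```

Each layer is a matrix multiplication plus a bias, and the output holds three raw scores for every observation, one per Class.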

We also need to define a cost function. This is a critical function that allows us to “train” the network. It returns a single value that describes how well the net does at predicting the correct Class: the lower the cost, the better the predictions. While the cost is high, the mathematical operators are adjusted and the data is fed through the net again. This repeats until the cost value converges.

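import tensorflow as tf  # this code uses the TensorFlow 1.x API (tf.contrib, tf.get_variable)

def init_parameters(num_input_features):
   # weights use Xavier initialization; biases start at zero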

   num_hidden_layer =  32
   num_hidden_layer_2 = 16
   num_output_layer_1 = 3
   
   tf.set_random_seed(1)
   W1 = tf.get_variable("W1", [num_hidden_layer, num_input_features], initializer = tf.contrib.layers.xavier_initializer(seed=1))
   b1 = tf.get_variable("b1", [num_hidden_layer, 1], initializer = tf.zeros_initializer())
   W2 = tf.get_variable("W2", [num_hidden_layer_2, num_hidden_layer], initializer = tf.contrib.layers.xavier_initializer(seed=1))
   b2 = tf.get_variable("b2", [num_hidden_layer_2, 1], initializer = tf.zeros_initializer())
   W3 = tf.get_variable("W3", [num_output_layer_1, num_hidden_layer_2], initializer = tf.contrib.layers.xavier_initializer(seed=1))
   b3 = tf.get_variable("b3", [num_output_layer_1, 1], initializer = tf.zeros_initializer())
   
   parameters = {"W1": W1,
                 "b1": b1,
                 "W2": W2,
                 "b2": b2,
                 "W3": W3,
                 "b3": b3}
   
   return parameters
			
def for_prop(X, parameters):
   # X has shape (num_input_features, num_observations)
   W1 = parameters['W1']
   b1 = parameters['b1']
   W2 = parameters['W2']
   b2 = parameters['b2']
   W3 = parameters['W3']
   b3 = parameters['b3']
   
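   # propagate values through the NN, using the Rectified Linear Unit (ReLU) as activation function
   Z1 = tf.add(tf.matmul(W1, X), b1)     # first hidden layer
   A1 = tf.nn.relu(Z1)
   Z2 = tf.add(tf.matmul(W2, A1), b2)    # second hidden layer
   A2 = tf.nn.relu(Z2)
   Z3 = tf.add(tf.matmul(W3, A2), b3)    # output layer: raw scores (logits); softmax is applied in the cost
   return Z3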

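def c(Z3, Y):
   # cost function: mean softmax cross-entropy between the net's output and the one-hot labels
   # the TensorFlow op expects shape (num_observations, num_classes), hence the transposes
   logits = tf.transpose(Z3)
   labels = tf.transpose(Y)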
   cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=labels))
   return cost
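
To make the cost value more concrete, here is a minimal NumPy sketch of mean softmax cross-entropy, the quantity the cost function above computes. It is not the TensorFlow implementation; the function name and the toy scores are assumptions chosen for illustration.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy; rows are observations, columns are classes."""
    # Subtract the row maximum before exponentiating, for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

# Two observations with true classes 1 and 2, one-hot encoded.
labels = np.array([[1, 0, 0], [0, 1, 0]], dtype=float)

# Scores that strongly favour the correct class, and scores with no preference.
confident = np.array([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]])
uniform = np.zeros((2, 3))

cost_confident = softmax_cross_entropy(confident, labels)
cost_uniform = softmax_cross_entropy(uniform, labels)
print(cost_confident, cost_uniform)
```

With no preference between the three classes the cost equals ln 3 (about 1.0986), while confident and correct scores give a cost close to zero. Training pushes the cost from the first kind of value toward the second.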

We also need to define a function that produces random subsets of observations from the training set. I mentioned previously that data is fed into the neural net until the cost value converges; on each iteration a random subsample is picked out of the total training set and injected into the net. In this function we shuffle the training set and partition it into batches of subsamples:

import math

def rand_batches(X, Y, batch_size = 128, seed = 0):
   # X and Y hold one observation per column
   m = X.shape[1]   # number of observations
   batches = []
   np.random.seed(seed)
   
   # shuffle
   permutation = list(np.random.permutation(m))
   shuffled_X = X[:, permutation]
   shuffled_Y = Y[:, permutation].reshape((Y.shape[0],m))

   # partition the shuffled data
   num_batches = math.floor(m/batch_size)
   for k in range(0, num_batches):
       batch_X = shuffled_X[:, k * batch_size : k * batch_size + batch_size]
       batch_Y = shuffled_Y[:, k * batch_size : k * batch_size + batch_size]
       batch = (batch_X, batch_Y)
       batches.append(batch)
   
   # handle end case
   if m % batch_size != 0:
       batch_X = shuffled_X[:, num_batches * batch_size : m]
       batch_Y = shuffled_Y[:, num_batches * batch_size : m]
       batch = (batch_X, batch_Y)
       batches.append(batch)
   
   return batches
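
The partitioning logic can be checked in isolation. The sketch below uses a hypothetical helper, batch_sizes, that applies the same arithmetic as rand_batches (full batches first, then one smaller batch for the remainder) and returns only the batch sizes; the observation counts are made up for the example.

```python
import math

def batch_sizes(m, batch_size=128):
    """Sizes of the batches produced for m observations (hypothetical helper)."""
    # Full batches of exactly batch_size observations.
    sizes = [batch_size] * math.floor(m / batch_size)
    # End case: one smaller batch holding the leftover observations.
    if m % batch_size != 0:
        sizes.append(m % batch_size)
    return sizes

print(batch_sizes(300))  # [128, 128, 44]
print(batch_sizes(256))  # [128, 128]
print(batch_sizes(100))  # [100]
```

Note that when the training set has fewer observations than batch_size, the whole set ends up in a single batch.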

To generate your entry code for challenge 4, paste the following code in your Jupyter notebook:

member_number = 12345678
one = [member_number, int(member_number/5), int(member_number/100)]
two = [0.02, 0.05, 0.08]

a = tf.placeholder(tf.float32, shape=(3))
b = tf.placeholder(tf.float32, shape=(3))
result = tf.tensordot(a, b, 1)

with tf.Session() as sess:
   print(int(result.eval(feed_dict={a: one, b: two})))

Replace 12345678 with your CodeProject member number before you run the code. The number that is printed is your entry code for this challenge. Please click here to enter the contest entry code.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Comments and Discussions

 
Question: issue with first code block.
Jesse Zwerling, 22-Oct-18 8:06
In that first block of code, wine_df is not defined, so this line:
Y = wine_df.loc[:,'Class'].values
will throw an error.
wine_df is defined in some code from the previous tutorial, but even if these are taken together, the next line has an issue:
Y = np.array([convertClass(i) for i in Y])
as np has not been defined.
It might be helpful to list the full code at the beginning to give some context, and then break it into bite-sized chunks for the various sections of the tutorial.