Posted 22 Nov 2017
Licenced CPOL

Deep Learning

Mahsa Hassankashi, 22 Nov 2017
A complete, easy-to-follow introduction to deep learning and convolutional neural networks using TensorFlow in Python.

What is Deep Learning?

Deep learning is a branch of machine learning. Classical machine learning covers several families of algorithms that learn from data, often a few thousand samples, in order to predict future events. Deep learning instead builds on neural networks in extended or variant forms, and it has the capacity to handle millions of data points.

The most fundamental strength of deep learning is its ability to pick the best features by itself. In effect, deep learning summarizes the data and computes its result from that compressed representation. This is exactly what artificial intelligence needs, especially when we have a huge data set that demands heavy computation.

Deep learning uses sequential layers, an idea inspired by neural networks. Each layer applies a nonlinear function and is responsible for feature selection, and the output of each layer is used as the input of the next. Applications of deep learning include computer vision (such as face or object recognition), speech recognition, natural language processing (NLP) and cyber threat detection.
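
The idea of stacked layers, where each output feeds the next layer, can be sketched in a few lines of NumPy. This is only an illustration: the layer sizes, the random weights and the choice of ReLU as the nonlinear function are assumptions, not something taken from the article's model.

```python
import numpy as np

def relu(z):
    # Nonlinear activation: negative values become zero.
    return np.maximum(z, 0.0)

def forward(x, layers):
    # The output of each layer is used as the input of the next one.
    for weights, bias in layers:
        x = relu(x @ weights + bias)
    return x

rng = np.random.default_rng(0)
sizes = [8, 6, 4, 2]  # hypothetical sizes: 8 inputs -> 6 -> 4 -> 2 outputs
layers = [(rng.normal(size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

batch = rng.normal(size=(5, 8))  # 5 made-up samples with 8 features each
out = forward(batch, layers)
print(out.shape)
```

Each layer shrinks the representation here (8, then 6, then 4, then 2 values per sample), which mirrors the idea of summarizing data into a compressed form.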

Deep Learning vs Machine Learning

The major difference between machine learning and deep learning is that in ML a human must intervene manually to select and extract features, while in DL this is done by knowledge embedded in the architecture itself. This difference has a dramatic influence on performance, in both precision and speed. Because manual feature detection is always subject to human error, DL can be the better option for computation on gigantic data sets.

What DL and ML have in common is that both work in supervised and unsupervised settings. DL is based on neural networks, whose shape and operation change across variants such as CNN and RNN, whereas ML has many different algorithms grounded in statistics and mathematics. This does not mean that DL relies on neural networks alone: DL can also use various ML algorithms to build hybrid models that increase performance. For instance, DL can use a Support Vector Machine (SVM) in place of softmax at the output layer. [1]
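
To make the softmax versus SVM remark concrete, here is a rough sketch comparing the two losses on the same output scores. The squared hinge (one-vs-rest) form used here is one common way to put an SVM on top of a network and is an assumption of this sketch; the scores are made-up numbers.

```python
import numpy as np

def softmax_cross_entropy(scores, true_class):
    # Numerically stable log-softmax followed by the negative log-likelihood.
    shifted = scores - np.max(scores)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[true_class])

def squared_hinge_loss(scores, true_class):
    # One-vs-rest targets: +1 for the true class, -1 for every other class.
    targets = -np.ones_like(scores)
    targets[true_class] = 1.0
    margins = np.maximum(0.0, 1.0 - targets * scores)
    return float(np.sum(margins ** 2))

scores = np.array([2.0, -1.0, 0.5])  # hypothetical outputs of the last layer
ce = softmax_cross_entropy(scores, true_class=0)
hinge = squared_hinge_loss(scores, true_class=0)
print(ce, hinge)
```

The hinge loss only penalizes classes that violate the margin (here the third score), while the softmax loss is never exactly zero.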

Feature Engineering Importance

In artificial intelligence we try to make the machine an independent tool that thinks with less programmer intervention. The defining characteristic of an automated machine is the way it thinks: the closer its way of thinking is to the human brain, the better its chances in the race for the best machine. So let's see what the key attribute in making accurate decisions is. Remember our childhood, when we saw objects but had no idea about their properties such as name, exact size, weight and so on. Still, we could categorize them quickly by noticing one important thing. For example, looking at an animal we knew it was a "dog" as soon as we heard it barking, or a "cat" when we heard it meowing. Here the animal's sound is far more influential than its size, because experience tells us that when two animals are similar in size, the brain turns to the most distinguishing feature, which is sound. On the other hand, when we see the tallest animal in the zoo we ignore all the other features and say “Yes, it is a giraffe”.

This is a miracle of the brain: it can infer the situation and, under different conditions of the same problem such as “animal detection”, choose one feature as the final key for its decision, and the result is both accurate and fast. Another story that makes the importance of feature engineering clear is the “Twenty Questions Game”. If you have not played it yet, please look at: here

The player wins by asking proper questions, building and improving each new question on the answers received so far. The questions are sequential, and the next question depends entirely on the previous answer. Previous answers filter and clarify the possibilities so that the player can reach the goal. Each question is like a hidden layer in a neural network: it is connected to the next layers, and its output is used as their input. The first question is always “Is it alive?”, and with it we remove about half of the possibilities. This omitting and dropping leads us to ask a better question in a new category; obviously we cannot ask the next question without the previous answer, which has clarified and filtered the possibilities in our mind. Something similar happens in a deep learning convolutional neural network.

Deep Learning and human brain

Deep learning imitates the human brain, almost matching it in precision and speed. Convolutional Neural Networks (CNN) are inspired by the brain's visual cortex. As you see in the picture below, the cells of the visual cortex cover the entire visual field. These sensitive cells play the role of the kernel or filter matrix, which we will look at later in this article. God created these cells to extract the important data coming from the eyes.

Suppose students have an exam and are preparing for it. As they read the book, they pick out the important parts and write them in notes or highlight them. In both cases they reduce the volume of the book, summarizing 100 pages into two pages that are easy to use as a reference and to review. A similar scenario happens in a DL CNN, where we use a smaller matrix to filter and remove data.

Requirements:

I strongly recommend that you carefully read the first and second articles below, because their concepts will be needed; I assume that you are familiar with linear regression and neural networks.

1. Machine Learning - Linear Regression Gradient Descent
2. Machine Learning - Neural Network
3. Machine Learning - Naive Bayes Classifier
4. Machine Learning - Random Forest - Decision Tree
5. Machine Learning - Markov Chain - Monte Carlo
6. Big Data MapReduce Hadoop Scala on Ubuntu Linux by Maven intellj idea

How Does a Deep Learning Convolutional Neural Network Work?

A deep learning model is a neural network with more than two hidden layers. If you are new to neural networks, please study this link. More layers mean more parameters, which can cause overfitting. Overfitting happens when we fit the model so completely to the training data set that it matches those samples exactly, as if the model held one stored answer for each of them, and it then performs poorly on new data. A good model should generalize rather than coincide perfectly with its training data.

We cannot build a perfect model, and even if we could it would be wrong to do so. Let's see what happens when we want to assign a new “Y” inside our model. We must avoid being too idealistic when building the model and make it general rather than specific; to reach this point we can apply cross validation. Cross validation is a model evaluation method. A good choice is K-fold cross validation, which divides the training set into k parts; in each iteration one part serves as the test set and the remaining k-1 parts form the training set, so the chance of overfitting to a single split decreases. In convolutional neural networks there are also specific techniques, besides K-fold cross validation, for avoiding overfitting, such as dropout and regularization.
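
The K-fold procedure described above can be sketched with NumPy alone. The function name `k_fold_indices` and the sample count are hypothetical, chosen only for this illustration; in practice a library routine would normally be used instead.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    # Shuffle the sample indices once, then cut them into k folds.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        # Fold i is the test set; the other k-1 folds form the training set.
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Every sample ends up in the test set exactly once, so the evaluation does not depend on one lucky or unlucky split.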

Fully connected in DL means that each neuron in one hidden layer is connected to all neurons in the next layer. When dropout is applied, some of the neurons are turned off during training, and after training, at prediction time, all neurons are turned on again. In this way DL tries to remove redundant data and weaken its role, while strengthening the role of important features. In the picture below, for example, the left image has high resolution, but as it passes through the network the DL CNN keeps only the important pixels and makes the image smaller.
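
Here is a minimal sketch of dropout, assuming the common "inverted dropout" variant, in which the surviving activations are rescaled during training so that nothing has to change at prediction time. The keep probability of 0.8 mirrors the value passed to `dropout` in the digit recognition code of this article; the helper itself is hypothetical.

```python
import numpy as np

def dropout(activations, keep_prob, training, rng):
    # At prediction time all neurons stay on and values pass through unchanged.
    if not training:
        return activations
    # During training each neuron is kept with probability keep_prob;
    # the survivors are scaled up so the expected activation stays the same.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
h = np.ones((4, 5))  # made-up hidden-layer activations
train_out = dropout(h, keep_prob=0.8, training=True, rng=rng)
test_out = dropout(h, keep_prob=0.8, training=False, rng=rng)
print(train_out)
print(test_out)
```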

We can transform the data into smaller data, which is easier to rely on for making decisions, by sliding a smaller matrix all over the original matrix. We do some mathematical calculation at each position as the filter matrix moves over the original matrix. For example, in the picture below 12 data points are reduced to just 3 by placing one matrix at 3 positions across the original matrix. The computation can be taking the maximum or taking the average of the covered data.
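
As a rough illustration of reducing 12 data points to 3, the sketch below applies non-overlapping 2x2 max and average pooling to a 2x6 matrix. The matrix values and the helper name `pool_2x2` are made up for this example and are not the ones from the picture.

```python
import numpy as np

def pool_2x2(matrix, reduce_fn):
    # Group the matrix into non-overlapping 2x2 blocks, then reduce each block.
    rows, cols = matrix.shape
    blocks = matrix.reshape(rows // 2, 2, cols // 2, 2)
    return reduce_fn(blocks, axis=(1, 3))

data = np.array([[1, 3, 2, 0, 5, 1],
                 [4, 2, 1, 1, 0, 2]], dtype=float)  # 12 data points

max_pooled = pool_2x2(data, np.max)   # keep the largest value per block
avg_pooled = pool_2x2(data, np.mean)  # or keep the average per block
print(max_pooled)  # [[4. 2. 5.]]
print(avg_pooled)  # [[2.5 1.  2. ]]
```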

One-Dimensional CNN

Real-world image data is rarely a one-dimensional matrix, but for ease of presentation I prefer to start with a 1D matrix. I want to reduce the dimension of the blue matrix with the aid of the red matrix. The blue matrix is the real data set and the red one is the filter matrix. I want to transform the blue matrix with 5 elements into 3 elements. I push the red matrix from left to right, one element per step. Wherever elements overlap, I multiply each pair of overlapping elements, and if more than one pair overlaps I sum the products. Note that the red matrix was [2 -1 1] and after flipping it (the kernel) becomes [1 -1 2].

To reduce the matrix, I look for the valid results, which occur when all elements of the red (filter) matrix are covered by the blue one. For the 4-element input in the code below, the valid results are [3 5].

import numpy as np

x = np.array([0,1,2,3])
 
w = np.array([2,-1,1])

result = np.convolve(x,w)
result_Valid = np.convolve(x,w, "valid")
print(result)
print(result_Valid)
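
To show what np.convolve does internally in "valid" mode, the flip-and-slide procedure can be written out by hand. The helper name `conv1d_valid` is hypothetical; the input and kernel are the ones used above.

```python
import numpy as np

def conv1d_valid(x, w):
    # Flip the kernel, then slide it over x one element at a time,
    # keeping only positions where the kernel is fully covered by x.
    flipped = w[::-1]
    n_out = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], flipped) for i in range(n_out)])

x = np.array([0, 1, 2, 3])
w = np.array([2, -1, 1])

manual = conv1d_valid(x, w)
print(manual)            # [3 5]
print(np.convolve(x, w)) # full result: [ 0  2  3  5 -1  3]
```

The two valid values are the middle of the full result, where the flipped kernel [1 -1 2] overlaps three input elements.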

Two-Dimensional CNN

The story is similar for two-dimensional matrices. The kernel matrix [[-1, 0], [2, 1]] becomes [[1, 2], [0, -1]] after flipping. Because the filter matrix lies inside the original matrix in all steps of the pictures below, all of the computed elements are valid.

from scipy import signal as sg

print(sg.convolve([[2, 1, 3],
                   [5, -2, 1],
                   [0, 2, -4]], [[-1, 0],[2, 1]]))

print(sg.convolve([[2, 1, 3],
                   [5, -2, 1],
                   [0, 2, -4]], [[-1, 0],[2, 1]], "valid"))
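
The same flip-and-slide idea in two dimensions can be sketched with plain NumPy loops. The helper name `conv2d_valid` is hypothetical; the input matrix and kernel are the ones from the scipy example above.

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Flip the kernel along both axes, then slide it over the image,
    # keeping only positions where the kernel lies fully inside the image.
    flipped = kernel[::-1, ::-1]
    kh, kw = flipped.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

image = np.array([[2, 1, 3],
                  [5, -2, 1],
                  [0, 2, -4]])
kernel = np.array([[-1, 0],
                   [2, 1]])

result = conv2d_valid(image, kernel)
print(result)  # [[ 6  6]
               #  [-1  4]]
```

For example, the top-left output is 2*1 + 1*2 + 5*0 + (-2)*(-1) = 6, using the flipped kernel [[1, 2], [0, -1]].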

Deep Learning Code Sample: Digit Recognition

I want to introduce you to KAGGLE, a competition community that is famous among data scientists. It hosts many competitions that are worth entering to practice your abilities in machine learning and deep learning. There are also awards for whoever can complete the code for recent challenges. There are kernels written by other authors, to which you can also contribute, and they are good sources for learning artificial intelligence in R and Python. Moreover, you can use its data sets as references and test your code with prepared data.

To practice convolutional neural networks on digit recognition, please click here.

Download training and test data set

Please go to this link to get the training and testing data sets. You must sign up on the Kaggle site and then join the competition.

# -*- coding: utf-8 -*-
"""
Created on Sun Nov 19 05:59:50 2017

@author: Mahsa
"""
import numpy as np
from numpy.random import permutation
import pandas as pd
import tflearn
from tflearn.layers.core import input_data,dropout,fully_connected,flatten
from tflearn.layers.conv import conv_2d,max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in newer scikit-learn


train_Path = r'D:\digit\train.csv'
test_Path = r'D:\digit\test.csv'
  
#Split arrays or matrices into random train and test subsets
#http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html

def split_matrices_into_random_train_test_subsets(train_Path):
    train = pd.read_csv(train_Path)
    train = np.array(train)
    train = permutation(train)
    X = train[:,1:785].astype(np.float32) #feature
    y = train[:,0].astype(np.float32) #label
    return train_test_split(X, y, test_size=0.33, random_state=42)

def reshape_data(Data,Labels):
    Data = Data.reshape(-1,28,28,1).astype(np.float32)
    Labels = (np.arange(10) == Labels[:,None]).astype(np.float32)
    return Data,Labels

X_train, X_test, y_train, y_test = split_matrices_into_random_train_test_subsets(train_Path)

X_train,y_train = reshape_data(X_train,y_train)
X_test,y_test = reshape_data(X_test,y_test)

test_x = np.array(pd.read_csv(test_Path))
test_x = test_x.reshape(-1,28,28,1)
  
def Convolutional_neural_network():
    network  = input_data(shape=[None,28,28,1],name='input_layer')
    network  = conv_2d(network, nb_filter=6,  filter_size=6, strides=1, activation='relu', regularizer='L2')  
    network  = local_response_normalization(network)
    network  = conv_2d(network, nb_filter=12, filter_size=5, strides=2, activation='relu', regularizer='L2') 
    network  = local_response_normalization(network)
    network  = conv_2d(network, nb_filter=24, filter_size=4, strides=2, activation='relu', regularizer='L2')
    network  = local_response_normalization(network)    
 
    network = fully_connected(network, 128, activation='tanh')
    network = dropout(network, 0.8)
    network = fully_connected(network, 256, activation='tanh')
    network = dropout(network, 0.8) 
    network = fully_connected(network, 10, activation='softmax') 
    
    sgd   = tflearn.SGD(learning_rate=0.1,lr_decay=0.096,decay_step=100)
    top_k = tflearn.metrics.top_k(3) #Top-k mean accuracy ,  Number of top elements to look at for computing precision
    
    network = regression(network, optimizer=sgd, metric=top_k, loss='categorical_crossentropy')
    return tflearn.DNN(network, tensorboard_dir='tf_CNN_board', tensorboard_verbose=3)
    
model = Convolutional_neural_network()
model.fit(X_train, y_train, batch_size=128, validation_set=(X_test,y_test), n_epoch=1, show_metric=True)

P = model.predict(test_x)

index = [i for i in range(1,len(P)+1)]
result = []
for i in range(len(P)):
    result.append(int(np.argmax(P[i])))  # np.int was removed in newer NumPy

res = pd.DataFrame({'ImageId':index,'Label':result})
res.to_csv("sample_submission.csv", index=False)
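
Two small steps in the code above are easy to miss: `reshape_data` turns each digit label into a one-hot row by broadcasting, and the final loop turns each row of predictions back into a digit with argmax. The sketch below isolates that round trip on three made-up labels.

```python
import numpy as np

labels = np.array([3, 0, 9], dtype=np.float32)  # hypothetical digit labels

# Compare every label against 0..9; the single True per row marks the digit.
one_hot = (np.arange(10) == labels[:, None]).astype(np.float32)
print(one_hot.shape)  # (3, 10)

# argmax picks the position of the largest value, recovering the digit.
decoded = np.argmax(one_hot, axis=1)
print(decoded)  # [3 0 9]
```

The softmax output of the network has the same shape as a one-hot row, so the same argmax call converts predicted probabilities into a label.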

Increase deep learning performance with hardware by GPU

One important thing that game developers, graphic designers and data scientists have in common is matrices. Every data point, whether in images, video or other complex data, is a value in a matrix element. Whatever we do involves mathematical operations that transform matrices.

For ordinary processing the Central Processing Unit (CPU) is a good answer, but for advanced mathematical and statistical operations on huge data the CPU cannot keep up, and we have to use a Graphics Processing Unit (GPU), which was designed for heavy mathematical functions. The parts of deep learning that need complex computation, such as convolutional neural networks, activation functions (sigmoid, softmax) and the Fourier transform, are processed on the GPU, and the remaining 95% of the code, which is mostly I/O procedures, runs on the CPU.

GPU Activation

  1. Open Start and bring up the Windows Command Prompt (cmd).
  2. Type "dxdiag".
  3. In the window that opens, look at the "Display" tab.
  4. If the name is "NVIDIA" or another vendor (NVIDIA GPU - AMD GPU - Intel Xeon Phi), there is a GPU card on the board.
  5. Let's set up the .theanorc configuration file at "C:\users\<yourname>\.theanorc".
  6. Set { device = gpu or cuda0 , floatX = float32 } in the [global] section, and preallocate = 1 in [gpuarray].
  7. If you want to know more about it, please look here.
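
The settings from steps 5 and 6 can also be written to a file from Python with the standard configparser module. This is only a sketch: the helper name `write_theanorc` is hypothetical, it writes to a temporary path rather than to your real home directory, and it assumes the `cuda0` variant of the device setting.

```python
import configparser
import os
import tempfile

def write_theanorc(path, device="cuda0"):
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the case of option names such as floatX
    parser["global"] = {"device": device, "floatX": "float32"}
    parser["gpuarray"] = {"preallocate": "1"}
    with open(path, "w") as handle:
        parser.write(handle)

# Demo target: a temporary directory, so no existing configuration is touched.
demo_path = os.path.join(tempfile.mkdtemp(), ".theanorc")
write_theanorc(demo_path)

with open(demo_path) as handle:
    content = handle.read()
print(content)
```

To use it for real, pass the path from step 5 instead of the temporary one.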

GPU Test Code

import os
import shutil

destfile = "/home/ubuntu/.theanorc"
open(destfile, 'a').close()
shutil.copyfile("/mnt/.theanorc", destfile) # copy the prepared .theanorc into the home directory

from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time
 
vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))

f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):  # xrange exists only in Python 2
    r = f()
t1 = time.time()

print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r))


if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

Increase deep learning performance with software libraries 

To enhance CNN performance, and because it is not possible to load the CPU or even the GPU with gigantic data of more than a terabyte at once, we must use some strategy to break the data down into chunks for processing. I have used DASK to prevent out-of-memory crashes. It is responsible for task scheduling.

# X, Y, X_test and Y_test are assumed to be existing 4-D arrays
import numpy as np
import dask.array as da

X = da.from_array(np.asarray(X), chunks=(1000, 1000, 1000, 1000))

Y = da.from_array(np.asarray(Y), chunks=(1000, 1000, 1000, 1000))

X_test = da.from_array(np.asarray(X_test), chunks=(1000, 1000, 1000, 1000))

Y_test = da.from_array(np.asarray(Y_test), chunks=(1000, 1000, 1000, 1000))
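
The idea behind chunking can be shown without Dask. The sketch below is a plain NumPy stand-in, not Dask's API: it walks over an array a few rows at a time and combines the partial results, so only one chunk has to be handled at once. The helper names and the tiny array are made up for illustration.

```python
import numpy as np

def iter_chunks(array, chunk_rows):
    # Yield consecutive row blocks of at most chunk_rows rows each.
    for start in range(0, array.shape[0], chunk_rows):
        yield array[start:start + chunk_rows]

def chunked_mean(array, chunk_rows):
    # Accumulate the sum and the element count chunk by chunk.
    total = 0.0
    count = 0
    for chunk in iter_chunks(array, chunk_rows):
        total += float(chunk.sum())
        count += chunk.size
    return total / count

data = np.arange(12, dtype=np.float64).reshape(6, 2)
result = chunked_mean(data, chunk_rows=4)
print(result)  # 5.5
```

Dask applies the same principle automatically and schedules the per-chunk work for you.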

References

[1] http://deeplearning.net/wp-content/uploads/2013/03/dlsvm.pdf

[2] https://leonardoaraujosantos.gitbooks.io

[3] https://github.com/Hassankashi?tab=repositories

[4] http://timdettmers.com/2015/07/27/brain-vs-deep-learning-singularity/

[5] https://blog.dominodatalab.com/gpu-computing-and-deep-learning/

[6] http://deeplearning.net/software_links/

[7] https://www.codeproject.com/Articles/1158306/Theano-Machine-Learning-on-a-GPU-on-Windows

[8] https://www.analyticsvidhya.com/blog/2015/02/avoid-over-fitting-regularization/

[9] https://github.com/tflearn/tflearn/tree/master/examples

Feedback

Feel free to leave any feedback on this article; it is a pleasure to see your opinions and vote about this code. If you have any questions, please do not hesitate to ask me here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Mahsa Hassankashi
Doctorandin Technische Universität Berlin
Iran (Islamic Republic of)
I have been working with different technologies and data more than 10 years.
I`d like to challenge with complex problem, then make it easy for using everyone. This is the best joy.

ICT Master in Norway 2013
Doctorandin at Technische Universität Berlin in Data Scientist ( currently )
-------------------------------------------------------------
Diamond is nothing except the pieces of the coal which have continued their activities finally they have become Diamond.

*Article of The Community Spotlight in Microsoft ASP.NET, Wednesday, February 11, 2015, www.asp.net
*Article of The Day in Microsoft ASP.NET Tuesday, February 3, 2015, www.asp.net/community/articles
*1 Jan 2015: CodeProject MVP 2015
*22 Mar 2014: Best Web Dev Article of February 2014 - Second Prize


Article Copyright 2017 by Mahsa Hassankashi