Posted 20 Nov 2018
Licenced CPOL

IEI Tank AIoT Developer Kit and AWS Greengrass: Running Machine Learning Prediction on the Edge


Editorial Note

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.


In this tutorial, we will set up a basic machine learning prediction model to run as an Amazon Web Services (AWS)* Lambda function in an AWS Greengrass* group. We will use basic K-means clustering to train the model for motor fault prediction. The Lambda function will use the resources of the Greengrass Core, which will be set up on an IEI Tank* AIoT Developer Kit. The IEI Tank AIoT Developer Kit comes with preinstalled developer tools and SDKs, such as the OpenVINO™ toolkit, Intel® Media SDK, and Intel® System Studio 2018, to help accelerate your path to deployment. The Lambda function will send status updates of its ML prediction process to the Greengrass group using MQTT messages.


IEI TANK with Ubuntu* 16.04 OS

AWS account

AWS Greengrass

AWS Greengrass* Setup

First, we need to set up the Greengrass Core on the IEI TANK. Follow the instructions in modules 1 and 2 of the linked documentation, Environment Setup for Greengrass and Installing the Greengrass Core Software in AWS Greengrass.

Go to the AWS console, select Services from the top-left ribbon, enter IoT in the search bar, and select IoT Core. On the IoT Core page, select Software at the bottom left. Download the AWS Greengrass Core SDK by clicking Configure Download. Choose Python* 2.7 and click Download Greengrass Core SDK. After the package has downloaded, untar it:

tar -xzvf greengrass-core-python-sdk-1.0.0.tar.gz
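The same extraction can be scripted with Python's standard tarfile module. The following is a minimal sketch, assuming the archive name shown above; extract_sdk and its dest_dir parameter are illustrative names and are not part of the Greengrass SDK.

```python
import os
import tarfile


def extract_sdk(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir and return its top-level names."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    # Keep only the first path component of each archive member.
    return sorted({name.split("/")[0] for name in names})


if __name__ == "__main__":
    archive = "greengrass-core-python-sdk-1.0.0.tar.gz"
    if os.path.exists(archive):
        print(extract_sdk(archive))
```

The returned list makes it easy to confirm which folder the archive created before moving on to the HelloWorld example.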

Go to the HelloWorld folder and unzip the file:

cd aws_greengrass_core_sdk/examples/HelloWorld

Contents of the unzipped folder will be used later in the tutorial to create a zip folder for AWS Lambda.

IEI Tank* Setup

Because AWS Greengrass needs Python* 2.7, we need to install packages specifically for Python 2.7:

sudo apt install python-pip
sudo pip2 install pandas numpy matplotlib scipy sklearn
sudo pip2 install -U pandas numpy matplotlib scipy sklearn

Clone the Motor-Defect-Detector GitHub* repository and go to the Kmeans folder:

git clone
cd motor-defect-detector/Kmeans/

We will be using the Bearing Data Set for K-means basic model training and prediction. Download the Bearing Data Set by going to the website.

Install the apps to extract the files:

sudo apt-get install p7zip-full unrar

Unzip the data set:

7za x IMS.7z

Extract the rar files (only the first and second test sets are used in this tutorial):

unrar x 1st_test.rar 
unrar x 2nd_test.rar

Downgrading Code to Python* 2.7

Before we can use the GitHub repository code, we need to implement some changes to downgrade it from Python* 3.5 to Python 2.7, and run the training script. To modify the script on your own, follow these two steps.

In the Kmeans folder, open the script and add this as the first line:

from __future__ import print_function

Replace input with raw_input throughout the file, as in the following line:

filedir_testset1 = raw_input("enter the complete directory path for the testset1")

Alternatively, you can also get the completely modified training script from the Sample Code section of this article.
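If the script needs to keep working under both interpreters, the prompt call can be wrapped once instead of being renamed everywhere. This is a small sketch under that assumption; read_line and ask_for_directory are hypothetical names and do not appear in the repository.

```python
try:
    read_line = raw_input  # Python 2.7: raw_input returns the typed text
except NameError:
    read_line = input      # Python 3: input already returns the typed text


def ask_for_directory(prompt, reader=read_line):
    """Prompt for a directory path and return it with a trailing slash."""
    path = reader(prompt).strip()
    if not path.endswith("/"):
        path += "/"
    return path
```

Calling ask_for_directory("enter the complete directory path for the testset1 ") then behaves the same under Python 2.7 and Python 3. The reader parameter exists only so the function can be exercised without a terminal.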

Training the Model

In the Kmeans folder, train the K-means model and follow the prompts:

enter the complete directory path for the testset1 /<path-to>/motor-defect-detector/Kmeans/1st_test/
enter the complete directory path for the testset2 /<path-to>/motor-defect-detector/Kmeans/2nd_test/

Training is done on the Bearing Data Set to improve prediction of motor defects. The script outputs the kmeanModel.npy file, which is used later for the actual prediction of motor defects.
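To make the training and prediction steps concrete, here is a toy version of K-means (Lloyd's algorithm) in plain Python. It is an illustration only and not the scikit-learn implementation used by the training script: the function names are hypothetical, the sample points are made up, and the centroids are seeded with the first k points, which is a simplification.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def fit_kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(max_iter):
        # Assignment step: group points by their nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its group.
        updated = []
        for centroid, group in zip(centroids, groups):
            if group:
                updated.append([sum(col) / float(len(group)) for col in zip(*group)])
            else:
                updated.append(centroid)  # an empty cluster keeps its centroid
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids


def predict(points, centroids):
    """Cluster index for each point, as model.predict does for real data."""
    return [nearest(p, centroids) for p in points]


if __name__ == "__main__":
    samples = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1)]
    model = fit_kmeans(samples, k=2)
    print(predict(samples, model))
```

In the tutorial, the fitted model is what gets saved to kmeanModel.npy, and the Lambda function later calls predict on new frequency features.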

AWS* Lambda Setup

In this section, we will create a compressed folder and create the AWS Lambda function with it. Then, we will deploy the Lambda in our Greengrass group.

Copy the Greengrass files into the Kmeans folder:

cp -r <path-to>/aws_greengrass_core_sdk/examples/HelloWorld/greengrasssdk .

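Create the Lambda script in the Kmeans folder, using the second listing in the Sample Code section of this article. The handler name used later, kmeans_test.function_handler, requires the file to be named kmeans_test.py.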

Compress files into a zip folder:

zip -r greengrasssdk/ kmeanModel.npy
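The packaging step can also be done with Python's standard zipfile module, which makes the archive contents explicit. This is a sketch under assumptions: build_package is a hypothetical helper, and the archive name kmeans_lambda.zip in the usage note is a placeholder, not a name taken from this article.

```python
import os
import zipfile


def build_package(archive_path, sources, base_dir="."):
    """Zip files and directory trees, storing paths relative to base_dir."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as package:
        for source in sources:
            full = os.path.join(base_dir, source)
            if os.path.isdir(full):
                # Add every file below the directory, keeping its relative path.
                for root, _dirs, files in os.walk(full):
                    for name in files:
                        path = os.path.join(root, name)
                        package.write(path, os.path.relpath(path, base_dir))
            else:
                package.write(full, os.path.relpath(full, base_dir))
    return archive_path
```

For example, build_package("kmeans_lambda.zip", ["greengrasssdk", "kmeanModel.npy"]) run from the Kmeans folder would bundle the SDK folder and the trained model. Any scripts the Lambda function imports need to be listed as well.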

Go to the AWS console, click Services at the top left, enter Lambda in the search bar, and select it. The Lambda Management Console will open. Click Create function:

If not selected, select Author from scratch and fill out outlined fields:

Click Create function.

Upload the zip file. Change the handler name to kmeans_test.function_handler. Click Save:

Click on Actions, select Create new version and add a version description. Click Publish:

Go to the IoT Core console. Choose Greengrass from left-side menu, select Groups underneath it, and select your group from the main window:

Select Lambdas from the left-side menu. Click Add Lambda on right top corner of the screen:

Select Use Existing Lambda:

Select kmeans_test from the menu and click Next:

Choose the version and click Finish:

Click on the dotted area and select Edit Configuration:

Change Memory Limit to 1024 MB, Timeout to 25 seconds, and choose Lambda lifecycle to be a long-lived function:

Find the paths that the environment variables need to point to. For example, to locate Python packages like numpy, run this command:

locate 2.7/dist-packages/numpy
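If locate is not available or its database is stale, the interpreter itself can report where a package lives. A minimal sketch follows; package_directory is a hypothetical helper, and the demo uses standard-library modules so it runs anywhere. Substitute numpy, pandas, and so on, and run it with the Python 2.7 interpreter so the paths match what Greengrass will use.

```python
import importlib
import os


def package_directory(module_name):
    """Return the directory that holds an importable module or package."""
    module = importlib.import_module(module_name)
    return os.path.dirname(os.path.abspath(module.__file__))


if __name__ == "__main__":
    for name in ("json", "os"):
        print("%s -> %s" % (name, package_directory(name)))
```

Built-in modules that have no __file__ attribute (sys, for example) cannot be located this way.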

Add environment variables, with the paths to the packages and the 2nd_test folder as values:
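Inside the Lambda function these values are read with os.environ.get, which silently returns None when a variable is missing. The sketch below shows one way to fail early instead. read_required_env is a hypothetical helper; TESTSET2 and TESTSET2FOLDER are the variable names used by the Lambda listing in the Sample Code section.

```python
import os


def read_required_env(names, environ=None):
    """Return {name: value} for each name, raising if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


if __name__ == "__main__":
    try:
        settings = read_required_env(["TESTSET2", "TESTSET2FOLDER"])
        print(settings)
    except KeyError as error:
        print(error)
```

Passing a plain dictionary as environ allows the check to be tried locally before the function is deployed.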

Click Update on the bottom of the page.

Click the small grey back button and select Resources. Click the blue Add a local resource button:

Create a local resource to access the Kmeans folder on your IEI Tank. Attach kmeans_test Lambda to it with read and write access:

Create two more local resources for the Python packages folder and the 2nd_test folder, with read-only access. You should see a similar screen when you’re done:

Go to Subscriptions. Click Add Subscription or Add your first Subscription:

For the source, choose from the Lambdas tab, and select kmeans_test. For the target, select IoT Cloud:

Click Next. Add hello/world for the topic and click Next:

Click Finish.

On the group header, click Actions, select Deploy and wait until it is successfully completed:

Go to the AWS IoT console. Select Test from the left-side menu. Type hello/world in the topic field, change MQTT payload display to display it as strings, and click Subscribe to topic:

After some time, messages should display on the bottom of the screen:
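The sequence of status messages can be checked on a development machine before deployment by swapping in a stand-in for the Greengrass client. In this sketch, RecordingClient and report are hypothetical names; only the publish(topic=..., payload=...) call shape and the hello/world topic are taken from the Lambda listing in the Sample Code section.

```python
class RecordingClient(object):
    """Stand-in for the Greengrass client: stores messages instead of sending them."""

    def __init__(self):
        self.messages = []

    def publish(self, topic, payload):
        self.messages.append((topic, payload))


def report(client, text, topic="hello/world"):
    """Publish one status line on the topic used in this tutorial."""
    client.publish(topic=topic, payload=text)


if __name__ == "__main__":
    client = RecordingClient()
    report(client, "Started kmeans test run.")
    report(client, "Loaded K-means model.")
    for topic, payload in client.messages:
        print("%s: %s" % (topic, payload))
```

On the device, the object returned by greengrasssdk.client('iot-data') would be passed to report in place of the stand-in.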


We have successfully set up the basic K-means model for motor defect detection as a Lambda function. As a next step, you can explore the capability for automatic updates: one Lambda function is set up to look for new test sets and, once it finds them, triggers the automatic download of the new sets and creates a new learning script based on those sets. The model is then updated to give new, improved predictions.

Sample Code

from __future__ import print_function
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import cluster
from utils import cal_max_freq,create_dataframe,elbow_method
import os

try:
    # reading all the files from the testset1, and testset2
    filedir_testset1 = raw_input("enter the complete directory path for the testset1 ")
    filedir_testset2 = raw_input("enter the complete directory path for the testset2 ")
    all_files_testset1 = os.listdir(filedir_testset1)
    all_files_testset2 = os.listdir(filedir_testset2)

    # relative path of the dataset, after the current working directory
    path_testset2 = "2nd_test/"
    path_testset1 = "1st_test/"

    testset1_freq_max1,testset1_freq_max2,testset1_freq_max3,testset1_freq_max4,testset1_freq_max5 = cal_max_freq(all_files_testset1,path_testset1)
    testset2_freq_max1,testset2_freq_max2,testset2_freq_max3,testset2_freq_max4,testset2_freq_max5 = cal_max_freq(all_files_testset2,path_testset2)

except IOError:
    print("you have entered either the wrong data directory path for either testset1 or testset2")

result1 = create_dataframe(testset1_freq_max1,testset1_freq_max2,testset1_freq_max3,testset1_freq_max4,testset1_freq_max5,7)
result2 = create_dataframe(testset2_freq_max1,testset2_freq_max2,testset2_freq_max3,testset2_freq_max4,testset2_freq_max5,0)

result3 = create_dataframe(testset1_freq_max1,testset1_freq_max2,testset1_freq_max3,testset1_freq_max4,testset1_freq_max5,2)
result3 = result3[:1800]

result4 = create_dataframe(testset2_freq_max1,testset2_freq_max2,testset2_freq_max3,testset2_freq_max4,testset2_freq_max5,1)
result4 = result4[:800]

#creating the final result
print("creating the final result")
frames = [result1,result3,result2,result4]
result = pd.concat(frames)

X = result[["fmax1","fmax2","fmax3","fmax4","fmax5"]]

#elbow method: to calculate the optimal no of cluster

k_means = cluster.KMeans(n_clusters = 8,n_init = 10,max_iter = 1000,n_jobs = -1,random_state = 42)
kmeans_model = k_means.fit(X)
label = kmeans_model.labels_

#plot the labels
print("plotting the labels")

#save the model
print("saving the model")
filename = "kmeanModel.npy"
np.save(filename,kmeans_model)

from __future__ import print_function

import time
from threading import Timer
import os
import greengrasssdk
import platform

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from utils import cal_max_freq, plotlabels

# Creating a greengrass core sdk client
client = greengrasssdk.client('iot-data')

# Retrieving platform information to send from Greengrass Core
my_platform = platform.platform()

def kmeans_test_run():
    client.publish(topic='hello/world', payload='Started kmeans test run.')
    try:
        filedir = os.environ.get("TESTSET2")
        client.publish(topic='hello/world', payload='Got data dir.')
        #filepath ="2nd_test/"
        filepath = os.environ.get("TESTSET2FOLDER")
        client.publish(topic='hello/world', payload='Got data folder.')
        # load the files
        all_files = os.listdir(filedir)
        client.publish(topic='hello/world', payload='Got all files.')
        freq_max1, freq_max2, freq_max3, freq_max4, freq_max5  =  cal_max_freq(all_files, filedir)
        client.publish(topic='hello/world', payload='Got all frequencies.')
    except IOError:
        print("you have entered either the wrong data directory path or filepath")
        client.publish(topic='hello/world', payload='Wrong data dir or folder.')

    # load the model
    filename = "kmeanModel.npy"
    model = np.load(filename).item()
    client.publish(topic='hello/world', payload='Loaded K-means model.')
    # checking the iteration
    if (filepath == "1st_test/"):
        rhigh = 8
    else:
        rhigh = 4
    testlabels = []
    for i in range(0,rhigh):
        print("Checking for the bearing",i+1)
        result = pd.DataFrame()
        result['freq_max1'] = list((np.array(freq_max1))[:,i])
        result['freq_max2'] = list((np.array(freq_max2))[:,i])
        result['freq_max3'] = list((np.array(freq_max3))[:,i])
        result['freq_max4'] = list((np.array(freq_max4))[:,i])
        result['freq_max5'] = list((np.array(freq_max5))[:,i])

        X = result[["freq_max1","freq_max2","freq_max3","freq_max4","freq_max5"]]

        label = model.predict(X)
        labelfive = list(label[-100:]).count(5)
        labelsix = list(label[-100:]).count(6)
        labelseven = list(label[-100:]).count(7)
        totalfailur = labelfive+labelsix+labelseven#+labelfour
        # float division: under Python 2.7, totalfailur/100 would truncate to 0
        ratio = (totalfailur/100.0)*100
        if(ratio >= 25):
            client.publish(topic='hello/world', payload='Bearing is suspected to fail.')
        else:
            client.publish(topic='hello/world', payload='Bearing is in normal condition.')

    # Asynchronously schedule this function to be run again in 5 seconds
    Timer(5, kmeans_test_run).start()

# Start executing the function above
kmeans_test_run()

# This is a dummy handler and will not be invoked
# Instead the code above will be executed in an infinite loop for our example
def function_handler(event, context):
    return
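
The failure check in the Lambda listing counts how many of the last 100 predicted labels fall in clusters 5, 6, and 7 and flags the bearing when that share reaches 25 percent. The sketch below isolates that rule as a pure function so it can be verified separately. The names failure_percentage and is_suspect are hypothetical, and dividing by the actual window length (rather than a fixed 100) is a deliberate deviation that keeps the result meaningful when fewer than 100 labels are available.

```python
def failure_percentage(labels, failure_labels=(5, 6, 7), window=100):
    """Percentage of the last `window` labels that fall in failure_labels."""
    recent = list(labels)[-window:]
    if not recent:
        return 0.0
    hits = sum(1 for label in recent if label in failure_labels)
    # float() avoids Python 2.7 integer division, which would truncate the result.
    return 100.0 * hits / float(len(recent))


def is_suspect(labels, threshold=25.0):
    """True when the failure share reaches the threshold used in the tutorial."""
    return failure_percentage(labels) >= threshold


if __name__ == "__main__":
    healthy = [0] * 90 + [7] * 10
    failing = [0] * 70 + [5] * 10 + [6] * 10 + [7] * 10
    print(failure_percentage(healthy), is_suspect(healthy))
    print(failure_percentage(failing), is_suspect(failing))
```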


About the Author

Rosalia Nyurguhun is a software engineer at Intel in the Core and Visual Computing Group, working on scale enabling projects for the Internet of Things.


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Article Copyright 2018 by Intel Corporation