This tutorial will walk you through the basics of using the Deep Learning Deployment Toolkit's Inference Engine (included in the Intel® Computer Vision SDK). Here, inference is the process of using a trained neural network to infer meaning from data (e.g., images). In the code sample that follows, a video is fed frame by frame to the Inference Engine, which runs our trained neural network and outputs a result for each frame (the objects found in the image and their classification). Inference can be done using various neural network architectures (AlexNet*, GoogleNet*, etc.). This example uses a Single Shot MultiBox Detector (SSD) on a GoogleNet model. For an example of how SSD is used, see this article on the Intel® Developer Zone.
The Inference Engine requires that the model be converted to IR (Intermediate Representation) files. This tutorial will walk you through the basics of taking an existing model (GoogleNet) and converting it to IR files using the Model Optimizer.
So what's different about running a neural network on the Inference Engine versus an out-of-the-box framework?
- The Inference Engine optimizes inference, allowing a user to run deep learning deployments significantly faster on Intel® architecture. For more information on the performance on Intel® Processor Graphics, see this article.
- Inference can run on hardware other than the CPU such as the built-in Intel® GPU or Intel® FPGA accelerator card.
What you’ll learn
- How to install the OpenCL™ Runtime Package
- How to install the Intel® Computer Vision SDK
- How to generate the .bin and .xml (IR files) needed for the Inference Engine from a Caffe model
- How to run the Inference Engine using the generated IR files in a C++ application
- How to compare the performance of CPU vs GPU
Gather your materials
- 5th generation or later Intel® Core™ processor. You can find the product name in Linux* by running the ‘lscpu’ command. The ‘Model name:’ field contains the information about the processor.
Note: The generation number is embedded into the product name, right after the ‘i3’, ‘i5’, or ‘i7’. For example, the Intel® Core™ i5-5200U processor and the Intel® Core™ i5-5675R processor are both 5th generation, and the Intel® Core™ i5-6600K processor and the Intel® Core™ i5-6360U processor are both 6th generation.
- Ubuntu* 16.04.3 LTS
- In order to run inference on the integrated GPU:
  - A processor with Intel® Iris® Pro graphics or HD Graphics
  - No discrete graphics card installed (required by the OpenCL™ platform). If you have one, make sure to disable it in BIOS before going through this installation process.
  - No drivers for other GPUs installed, or libraries built with support for other GPUs
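The note about reading the generation number from the product name can be sketched as a small check. This is an illustration only: `coreGeneration` is a hypothetical helper, not part of the SDK, and it assumes the four-digit numbering used in the examples above (‘i3’, ‘i5’, or ‘i7’, then a dash or space, then four digits whose first digit is the generation). Product names that follow a different scheme return 0.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the generation digit from a model name such as
// "Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz", or 0 if the pattern is not found.
// Assumption: four-digit processor numbers, as in the examples in the note.
int coreGeneration(const std::string& modelName) {
    for (std::size_t i = 0; i + 2 < modelName.size(); ++i) {
        // Look for "i3", "i5", or "i7" followed by '-' or ' '.
        const bool isBrand = modelName[i] == 'i' &&
            (modelName[i + 1] == '3' || modelName[i + 1] == '5' ||
             modelName[i + 1] == '7');
        const char sep = modelName[i + 2];
        if (!isBrand || (sep != '-' && sep != ' ')) {
            continue;
        }

        // Count the digits that follow the separator.
        const std::size_t start = i + 3;
        std::size_t end = start;
        while (end < modelName.size() &&
               std::isdigit(static_cast<unsigned char>(modelName[end]))) {
            ++end;
        }

        if (end - start == 4) {
            const int generation = modelName[start] - '0';
            assert(generation >= 0 && generation <= 9);
            return generation;  // first of the four digits
        }
    }
    return 0;  // pattern not recognized
}
```

Passing the ‘Model name:’ value printed by ‘lscpu’ to this function should give 5 or higher on a supported processor, under the naming assumption stated above.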
Install OpenCL™ Runtime Package and other dependencies
In order to run inference on the GPU, you need to first install the OpenCL™ Runtime package. These commands install the OpenCL™ Runtime package, as well as some package dependencies required by the Intel® CV SDK.
Note: These steps are for Ubuntu 16.04.3 or later. If you have a version older than 16.04.3, you still need to install the package dependencies below, then skip down to the Install Intel® CV SDK section for instructions on installing the OpenCL™ Driver.
Intel® CV SDK Package dependencies:
sudo apt-get update
sudo apt-get install build-essential ffmpeg cmake checkinstall pkg-config yasm libjpeg-dev curl imagemagick gedit mplayer unzip libpng12-dev libcairo2-dev libpango1.0-dev libgtk2.0-dev libgstreamer0.10-dev libswscale-dev libavcodec-dev libavformat-dev
OpenCL™ Runtime package:
unzip SRB5.0_linux64.zip -d SRB5.0_linux64
cd SRB5.0_linux64
sudo apt-get install xz-utils
mkdir -p intel-opencl
tar -C intel-opencl -Jxf intel-opencl-r5.0-63503.x86_64.tar.xz
tar -C intel-opencl -Jxf intel-opencl-devel-r5.0-63503.x86_64.tar.xz
tar -C intel-opencl -Jxf intel-opencl-cpu-r5.0-63503.x86_64.tar.xz
sudo cp -R intel-opencl/* /
Install Intel® CV SDK
- Go to https://software.seek.intel.com/computer-vision-software
- Register, then wait for a confirmation email. It can take several hours to get the email, so go take a break and come back once you've received it.
If you've already registered for the Intel® CV SDK before, you should get access to download almost immediately.
From the link in the email, download the Ubuntu* package
Unzip the contents (to a folder in your directory of choice)
tar zxvf intel_cv_sdk_ubuntu_r3_2017.1.163.tgz
Note: If you are running a version older than Ubuntu 16.04.3, then to install the OpenCL™ Driver you need to run the install_OCL_driver.sh script in the downloaded folder before running the Intel® CV SDK installation. This script can take over half an hour to complete. It will rebuild the kernel with the updated driver, and your computer will restart during the process. Make sure to back up your data before running this script. If you prefer not to continue with this kernel rebuild, we recommend you install Ubuntu 16.04.3 or later, which only needs a few files installed (instructions above) and does not need a kernel rebuild.
- In the Intel® CV SDK folder:
Enter super user mode
Then run the installation wizard
and follow the instructions.
Get the code
Clone this repository
Note: Make sure to exit super user mode before cloning this repository.
git clone https:
Caffe is required to convert a Caffe model using the Model Optimizer. This script will install a version of Caffe to the opt/intel/ssdcaffe folder that is compatible with the Model Optimizer.
From the computer-vision-inference-tutorials/1-run-model-optimizer directory
Enter in a terminal:
This should take somewhere between 10 and 20 minutes depending on your system.
Generate the .bin and .xml (IR files) for the Inference Engine
The Caffe model consists of two files: SSD_GoogleNetV2_Deploy.prototxt and SSD_GoogleNetV2.caffemodel. You will convert them to IR files by running the Model Optimizer using the runMO.py script.
First download the files:
Go back into super user mode and run:
python runMO.py -w SSD_GoogleNetV2_caffe/SSD_GoogleNetV2.caffemodel -d SSD_GoogleNetV2_caffe/SSD_GoogleNetV2_Deploy.prototxt
Verify the creation of the IR files
Look in the /artifacts/VGG_VOC0712_SSD_300x300_deploy folder.
Check that VGG_VOC0712_SSD_300x300_deploy.xml and VGG_VOC0712_SSD_300x300_deploy.bin were created in that directory
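The same check can be done in code before handing the files to an application. The following is a minimal sketch, not part of the tutorial's sample: the helper names `binPathFor`, `fileExists`, and `irFilesExist` are hypothetical, and it assumes the .bin file sits next to the .xml file with the same base name, as in the folder described above.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: given the path to the IR .xml file, returns the path
// of the matching .bin file (same folder, same base name).
// Precondition: xmlPath ends with ".xml".
std::string binPathFor(const std::string& xmlPath) {
    const std::string ext = ".xml";
    assert(xmlPath.size() >= ext.size() &&
           xmlPath.compare(xmlPath.size() - ext.size(), ext.size(), ext) == 0);
    return xmlPath.substr(0, xmlPath.size() - ext.size()) + ".bin";
}

// True if the file can be opened for reading.
bool fileExists(const std::string& path) {
    std::ifstream file(path.c_str(), std::ios::binary);
    return file.good();
}

// True only if both halves of the IR pair are present.
bool irFilesExist(const std::string& xmlPath) {
    return fileExists(xmlPath) && fileExists(binPathFor(xmlPath));
}
```

Calling `irFilesExist` with the path to VGG_VOC0712_SSD_300x300_deploy.xml should return true once the Model Optimizer has finished successfully.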
Make sure to exit super user mode before continuing
Run the Inference Engine using the IR files in a C++ application
Make sure to exit super user mode before building the application
First set the paths:
Note: If you get an error related to "undefined reference to 'google::FlagRegisterer...", try uninstalling libgflags-dev:
sudo apt-get remove libgflags-dev
Before running, download the test video file to a new
./IEobjectdetection -i videos/vtest.avi -fr 200 -m artifacts/VGG_VOC0712_SSD_300x300_deploy/VGG_VOC0712_SSD_300x300_deploy.xml -d CPU -l pascal_voc_classes.txt
You should see a video play with people walking across and red bounding boxes around them.
You should also see the output in the console showing the objects found and the confidence level.
Here is what the flags mean for running the application. This can also be found by running:
-h              Print a usage message
-i <path>       Required. Path to input video file
-fr <val>       Number of frames from stream to process
-m <path>       Required. Path to IR .xml file
-l <path>       Required. Path to labels file
-d <device>     Infer target device (CPU or GPU)
-t <type>       Infer type (SSD, etc.)
-pc             Enables per-layer performance report
-thresh <val>   Confidence threshold for bounding boxes (0-1)
-b <val>        Batch size
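To make the flag list concrete, here is a minimal sketch of how such flags could be parsed with the standard library alone. It is not the sample's own parser (the sample reads its flags through variables such as FLAGS_m). The `Options` struct, the `parseArgs` function, and the default values are all assumptions made for illustration, and only a subset of the flags is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical container for a subset of the flags listed above.
struct Options {
    std::string input;           // -i      (required)
    std::string model;           // -m      (required)
    std::string labels;          // -l      (required)
    std::string device = "CPU";  // -d      (default value is an assumption)
    int frames = 0;              // -fr     (0 here means "not set")
    float threshold = 0.5f;      // -thresh (default value is an assumption)
    bool perfCounts = false;     // -pc
    bool valid = false;          // true once all required flags were seen
};

// Parses the arguments that follow the program name.
Options parseArgs(const std::vector<std::string>& args) {
    Options opt;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& flag = args[i];
        if (flag == "-pc") {  // the only flag handled here without a value
            opt.perfCounts = true;
            continue;
        }
        if (i + 1 >= args.size()) {
            return opt;  // flag without a value: valid stays false
        }
        ++i;
        assert(i < args.size());
        const std::string& value = args[i];
        if (flag == "-i") opt.input = value;
        else if (flag == "-m") opt.model = value;
        else if (flag == "-l") opt.labels = value;
        else if (flag == "-d") opt.device = value;
        else if (flag == "-fr") opt.frames = std::atoi(value.c_str());
        else if (flag == "-thresh") opt.threshold = static_cast<float>(std::atof(value.c_str()));
        else return opt;  // unknown flag: valid stays false
    }
    opt.valid = !opt.input.empty() && !opt.model.empty() && !opt.labels.empty();
    return opt;
}
```

From `main`, this could be called as `parseArgs(std::vector<std::string>(argv + 1, argv + argc))`, printing the usage message whenever `valid` is false.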
In this sample video, there are approximately 790 frames, so by setting the number of frames to 790 you'll see the whole video.
If you change the threshold of the confidence level to 0.1, you'll see a lot more bounding boxes around the people, but also many false ones.
./IEobjectdetection -i videos/vtest.avi -fr 200 -m SSD_GoogleNet_v2_fp32.xml -d CPU -l pascal_voc_classes.txt -thresh 0.1
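The effect of the threshold can be pictured as a simple filter over the detections. The sketch below is an assumption about how such a filter might look, not the sample's actual code: the `Detection` struct and `filterByConfidence` function are hypothetical, and whether the sample keeps detections exactly at the threshold is not stated in this tutorial.

```cpp
#include <cassert>
#include <vector>

// Hypothetical detection record: class label, confidence in [0, 1], and a
// bounding box in normalized coordinates.
struct Detection {
    int label;
    float confidence;
    float xmin, ymin, xmax, ymax;
};

// Keeps only the detections whose confidence reaches the threshold.
std::vector<Detection> filterByConfidence(const std::vector<Detection>& all,
                                          float threshold) {
    assert(threshold >= 0.0f && threshold <= 1.0f);
    std::vector<Detection> kept;
    for (const Detection& d : all) {
        if (d.confidence >= threshold) {
            kept.push_back(d);
        }
    }
    return kept;
}
```

Lowering the threshold lets weaker detections through, which is why a value of 0.1 draws more boxes, including false ones.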
Compare the performance of CPU vs GPU
You can enable the output of performance data to the console by using the -pc flag:
./IEobjectdetection -i videos/vtest.avi -fr 200 -m SSD_GoogleNet_v2_fp32.xml -d CPU -l pascal_voc_classes.txt -pc
You'll see the Total time it took to run.
Since you installed the OpenCL™ drivers to use the GPU, you can try running inference on the GPU and compare the difference.
Note: It's best to open a new terminal window so you can compare the results.
Make sure to source your environment variables first:
The GPU is selected with the -d GPU flag:
./IEobjectdetection -i videos/vtest.avi -fr 200 -m SSD_GoogleNet_v2_fp32.xml -d GPU -l pascal_voc_classes.txt -pc
The difference in Total time between CPU and GPU will vary depending on your system.
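If you want to turn the two Total time values into a single comparison, a sketch like the following may help. It is an illustration under assumptions: the function names are hypothetical, and the way the sample itself measures Total time is not described in this tutorial, so `timeMs` simply shows one common way to measure wall-clock time with std::chrono.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename Work>
double timeMs(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Average frames per second for `frames` frames processed in `totalMs`.
double framesPerSecond(int frames, double totalMs) {
    assert(frames >= 0 && totalMs > 0.0);
    return frames * 1000.0 / totalMs;
}

// How many times faster run B was compared to run A.
double speedup(double totalMsA, double totalMsB) {
    assert(totalMsA >= 0.0 && totalMsB > 0.0);
    return totalMsA / totalMsB;
}
```

For example, with -fr 200, a run that reports 4000 ms corresponds to 50 frames per second, and a second run at 2000 ms would be a 2x speedup. These figures are made up for the arithmetic and are not measurements.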
How it works
How does the Inference Engine work?
The Inference Engine takes a representation of a neural network model and optimizes it to take advantage of advanced Intel® instruction sets in the CPU, and also makes it compatible with the other hardware accelerators (GPU and FPGA). To do this, the model files (e.g., .caffemodel, .prototxt) are given to the Model Optimizer which then processes the files and outputs two new files: a .bin and .xml. These two files are used instead of the original model files when you run your application. In this example, the .bin and .xml files are provided.
In the above diagram, IR stands for Intermediate Representation, which is just a name for the .xml and .bin files that are inputs to the Inference Engine.
When you want to use another model in the Inference Engine, refer to this tutorial on how to get the necessary files using the Model Optimizer. https://software.intel.com/en-us/inference-trained-models-with-intel-dl-deployment-toolkit-beta-2017r3
If you run into issues, don't hesitate to contact us on the forum https://software.intel.com/en-us/forums/computer-vision
High level steps in the code:
- Check arguments
- Video pre-processing
- Load model to the Inference Engine
- Run Inference
- Parse the results
- Render the frame
The video pre-processing step takes the current frame and resizes it using the cv2.resize() function (https://docs.opencv.org/3.0-beta/modules/imgproc/doc/geometric_transformations.html?highlight=resize#cv2.resize). It also converts the frame to a planar format (by default it is in packed format). Packed format is arranged like RGBRGBRGB; planar is arranged like RRRGGGBBB.
This format is required by the Inference Engine because that is the format used by the Intel® Math Kernel Library (Intel® MKL). If you want to know more see
https://software.intel.com/en-us/ipp-dev-reference-pixel-and-planar-image-formats and https://software.intel.com/en-us/ipp-dev-guide-channel-and-planar-image-data-layouts
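The packed-to-planar conversion described above can be sketched in a few lines. This is a minimal illustration using only the standard library, not the sample's actual pre-processing code. The function name `packedToPlanar` is hypothetical, and the sketch works on 8-bit data regardless of channel order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts an interleaved (packed) image, e.g. RGBRGBRGB..., into planar
// layout, e.g. RRR...GGG...BBB...
std::vector<std::uint8_t> packedToPlanar(const std::vector<std::uint8_t>& packed,
                                         std::size_t width,
                                         std::size_t height,
                                         std::size_t channels) {
    assert(packed.size() == width * height * channels);
    const std::size_t pixels = width * height;
    std::vector<std::uint8_t> planar(packed.size());
    for (std::size_t p = 0; p < pixels; ++p) {
        for (std::size_t c = 0; c < channels; ++c) {
            // Channel c of pixel p moves into plane c at position p.
            planar[c * pixels + p] = packed[p * channels + c];
        }
    }
    return planar;
}
```

For a two-pixel image with packed values 1 2 3 4 5 6, the planar result is 1 4 2 5 3 6: each channel's values end up contiguous in memory.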
Loading a model into the Inference Engine
There are three steps to loading a model into the Inference Engine: load the plugin, read the network, and load the model into the plugin.
Load the plugin
This is done by using the InferenceEnginePluginPtr _plugin() function.
Read the network
The network object is initialized
and then a network is read using the
FLAGS_m means the
-m flag, which in this case is
The weights are then added to the network
Note: The .xml file contains the network, and the .bin file contains the weights.
Load the model into the plugin
Refer to the code for how to set up the input and output blobs.
Inference is run using
_plugin->Infer(inputBlobs, outputBlobs, &dsc)
Note: Every frame is resized to the input size expected by the model .xml file before inference is run.
For more information on how to generate IR files from Caffe and TensorFlow models see:
If you run into issues, don't hesitate to ask on our forum