Using the Model Optimizer to Convert MXNet Models

Intel

Jan 23, 2019

CPOL

The Model Optimizer is a cross-platform command-line tool that facilitates the transition between the training and deployment environment, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices.

Introduction

The Model Optimizer process assumes you have a network model trained using a supported framework. The scheme below illustrates the typical workflow for deploying a trained deep learning model:

A summary of the steps for optimizing and deploying a model that was trained with the MXNet* framework:

Configure the Model Optimizer for MXNet* (MXNet was used to train your model).
Convert an MXNet model to produce an optimized Intermediate Representation (IR) of the model based on the trained network topology, weights, and bias values.
Test the model in the Intermediate Representation format using the Inference Engine in the target environment via the provided Inference Engine validation application or sample applications.
Integrate the Inference Engine in your application to deploy the model in the target environment.

Model Optimizer Workflow

The Model Optimizer process assumes you have a network model that was trained with one of the supported frameworks. The workflow is:

Configure Model Optimizer for the MXNet* framework by running a configuration bash script for Linux* OS or a batch file for Windows* OS from the <INSTALL_DIR>/deployment_tools/model_optimizer/install_prerequisites folder:
- For Linux* OS:
```
install_prerequisites_mxnet.sh
```
- For Windows* OS:
```
install_prerequisites_mxnet.bat
```
For more details on configuring the Model Optimizer, see Configure the Model Optimizer.
Provide as input a trained model that contains a specific topology, described in the .json file, and the adjusted weights and biases, described in the .params file.
Convert the MXNet* model to an optimized Intermediate Representation.

The Model Optimizer produces as output an Intermediate Representation (IR) of the network that can be read, loaded, and inferred with the Inference Engine. The Inference Engine offers a unified API across a number of supported Intel® platforms. The Intermediate Representation is a pair of files that describe the whole model:

.xml: Describes the network topology
.bin: Contains the weights and biases binary data
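
Because the IR is always a pair of files, it can be useful to check that both exist before handing a model to the Inference Engine. The following is a minimal sketch, assuming the two files share the model name as their base name (as the --model_name option described later suggests). The helper names ir_paths and ir_is_complete are hypothetical and are not part of the Model Optimizer:

```python
from pathlib import Path


def ir_paths(output_dir, model_name):
    """Return the expected (.xml, .bin) paths for an IR called model_name.

    The suffix is appended by hand because model names such as
    "SqueezeNet_v1.1" already contain a dot.
    """
    directory = Path(output_dir)
    return directory / (model_name + ".xml"), directory / (model_name + ".bin")


def ir_is_complete(output_dir, model_name):
    """Return True only if both the topology and the weights file exist."""
    xml_path, bin_path = ir_paths(output_dir, model_name)
    return xml_path.is_file() and bin_path.is_file()


# Hypothetical usage: check an IR named "result" in ../../models/
print(ir_is_complete("../../models", "result"))
```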

Supported Topologies

The table below shows the supported models, with the links to the model repository, symbol file, and parameters file:

Model Name	Model File
VGG-16	Repo, Symbol, Params
VGG-19	Repo, Symbol, Params
ResNet-152 v1	Repo, Symbol, Params
SqueezeNet_v1.1	Repo, Symbol, Params
Inception BN	Repo, Symbol, Params
CaffeNet	Repo, Symbol, Params
DenseNet-121	Repo, Symbol, Params
DenseNet-161	Repo, Symbol, Params
DenseNet-169	Repo, Symbol, Params
DenseNet-201	Repo, Symbol, Params
MobileNet	Repo, Symbol, Params
SSD-ResNet-50	Repo, Symbol + Params
SSD-VGG-16-300	Repo, Symbol + Params
SSD-Inception v3	Repo, Symbol + Params
FCN8 (Semantic Segmentation)	Repo, Symbol, Params

Other supported topologies

A style transfer model can be converted using the instructions from the Convert a Style Transfer Model from MXNet* section.

Convert an MXNet* Model

To convert an MXNet model:

Go to the <INSTALL_DIR>/deployment_tools/model_optimizer directory.
To convert an MXNet* model contained in a model-file-symbol.json and model-file-0000.params, run the Model Optimizer launch script mo_mxnet.py, specifying a path to the input model file:
```
python3 mo_mxnet.py --input_model model-file-0000.params
```
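
The command above names only the .params file, so the matching symbol file has to be found by its name. The sketch below illustrates the naming convention used in this example (prefix-symbol.json next to prefix-0000.params, where the four digits are the epoch number). It is an illustration of the convention only, not the Model Optimizer's own lookup code, and the helper name symbol_path_for is hypothetical:

```python
import re
from pathlib import Path


def symbol_path_for(params_path):
    """Derive the '<prefix>-symbol.json' path from '<prefix>-NNNN.params'."""
    params = Path(params_path)
    match = re.fullmatch(r"(?P<prefix>.+)-(?P<epoch>\d{4})\.params", params.name)
    if match is None:
        raise ValueError("expected a name like '<prefix>-0000.params'")
    # Keep the directory, swap only the file name.
    return params.with_name(match.group("prefix") + "-symbol.json")


print(symbol_path_for("model-file-0000.params"))
```

If the symbol file does not follow this convention, the --input_symbol option described later in this document can be used to point to it explicitly.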

Two groups of parameters are available to convert your model:

Framework-agnostic parameters: Parameters used to convert any model trained in any supported framework
MXNet-specific parameters: Parameters used to convert only MXNet models

Using Framework-Agnostic Conversion Parameters

To adjust the conversion process, you can use the general (framework-agnostic) parameters:

	optional arguments:
  -h, --help            show this help message and exit
  --framework {tf,caffe,mxnet,kaldi,onnx}
                        Name of the framework used to train the input model.

Framework-agnostic parameters:
  --input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
                        Tensorflow*: a file with a pre-trained model (binary
                        or text .pb file after freezing). Caffe*: a model
                        proto file with model weights
  --model_name MODEL_NAME, -n MODEL_NAME
                        Model_name parameter passed to the final create_ir
                        transform. This parameter is used to name a network in
                        a generated IR and output .xml/.bin files.
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                        Directory that stores the generated IR. By default, it
                        is the directory from where the Model Optimizer is
                        launched.
  --input_shape INPUT_SHAPE
                        Input shape(s) that should be fed to an input node(s)
                        of the model. Shape is defined as a comma-separated
                        list of integer numbers enclosed in parentheses or
                        square brackets, for example [1,3,227,227] or
                        (1,227,227,3), where the order of dimensions depends
                        on the framework input layout of the model. For
                        example, [N,C,H,W] is used for Caffe* models and
                        [N,H,W,C] for TensorFlow* models. Model Optimizer
                        performs necessary transformations to convert the
                        shape to the layout required by Inference Engine
                        (N,C,H,W). The shape should not contain undefined
                        dimensions (? or -1) and should fit the dimensions
                        defined in the input operation of the graph. If there
                        are multiple inputs in the model, --input_shape should
                        contain definition of shape for each input separated
                        by a comma, for example: [1,3,227,227],[2,4] for a
                        model with two inputs with 4D and 2D shapes.
  --scale SCALE, -s SCALE
                        All input values coming from original network inputs
                        will be divided by this value. When a list of inputs
                        is overridden by the --input parameter, this scale is
                        not applied for any input that does not match with the
                        original input of the model.
  --reverse_input_channels
                        Switch the input channels order from RGB to BGR (or
                        vice versa). Applied to original inputs of the model
                        if and only if a number of channels equals 3. Applied
                        after application of --mean_values and --scale_values
                        options, so numbers in --mean_values and
                        --scale_values go in the order of channels used in the
                        original model.
  --log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Logger level
  --input INPUT         The name of the input operation of the given model.
                        Usually this is a name of the input placeholder of the
                        model.
  --output OUTPUT       The name of the output operation of the model. For
                        TensorFlow*, do not add :0 to this name.
  --mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean values to be used for the input image per
                        channel. Values to be provided in the (R,G,B) or
                        [R,G,B] format. Can be defined for desired input of
                        the model, for example: "--mean_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --scale_values SCALE_VALUES
                        Scale values to be used for the input image per
                        channel. Values are provided in the (R,G,B) or [R,G,B]
                        format. Can be defined for desired input of the model,
                        for example: "--scale_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --data_type {FP16,FP32,half,float}
                        Data type for all intermediate tensors and weights. If
                        original model is in FP32 and --data_type=FP16 is
                        specified, all model weights and biases are quantized
                        to FP16.
  --disable_fusing      Turn off fusing of linear operations to Convolution
  --disable_resnet_optimization
                        Turn off resnet optimization
  --finegrain_fusing FINEGRAIN_FUSING
                        Regex for layers/operations that won't be fused.
                        Example: --finegrain_fusing Convolution1,.*Scale.*
  --disable_gfusing     Turn off fusing of grouped convolutions
  --move_to_preprocess  Move mean values to IR preprocess section
  --extensions EXTENSIONS
                        Directory or a comma separated list of directories
                        with extensions. To disable all extensions including
                        those that are placed at the default location, pass an
                        empty string.
  --batch BATCH, -b BATCH
                        Input batch size
  --version             Version of Model Optimizer
  --silent              Prevent any output messages except those that
                        correspond to log level equals ERROR, that can be set
                        with the following option: --log_level. By default,
                        log level is already ERROR.
  --freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
                        Replaces input layer with constant node with provided
                        value, e.g.: "node_name->True"
  --generate_deprecated_IR_V2
                        Force to generate legacy/deprecated IR V2 to work with
                        previous versions of the Inference Engine. The
                        resulting IR may or may not be correctly loaded by
                        Inference Engine API (including the most recent and
                        old versions of Inference Engine) and provided as a
                        partially-validated backup option for specific
                        deployment scenarios. Use it at your own discretion.
                        By default, without this option, the Model Optimizer
                        generates IR V3.

NOTE: Unlike the 2017 R3 Beta release, the Model Optimizer does not reverse input channels from RGB to BGR by default. Specify the --reverse_input_channels command-line parameter manually to perform the reversal. For details, refer to the When to Reverse Input Channels section.

The sections below provide details on using particular parameters and examples of CLI commands.

When to Specify Mean and Scale Values

Neural network models are usually trained with normalized input data. This means that the input data values are converted to be in a specific range, for example, [0, 1] or [-1, 1]. Sometimes the mean values (mean images) are subtracted from the input data values as part of the pre-processing. Input data pre-processing can be implemented in one of two ways:

The input pre-processing operations are a part of a topology. In this case, the application that uses the framework to infer the topology does not pre-process the input.
The input pre-processing operations are not a part of a topology and the pre-processing is performed within the application which feeds the model with an input data.

In the first case, the Model Optimizer generates the IR with required pre-processing layers and Inference Engine samples may be used to infer the model.

In the second case, information about the mean/scale values should be provided to the Model Optimizer so that it can embed them in the generated IR. The Model Optimizer provides a number of command-line parameters to specify them: --scale, --scale_values, --mean_values, --mean_file.

If both mean and scale values are specified, the mean is subtracted first and then scale is applied. Input values are divided by the scale value(s).
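
The order of operations described above can be written per channel as (value - mean) / scale. The following is a minimal sketch of that arithmetic for a single pixel. The function name preprocess_pixel and the sample values are assumptions chosen for illustration; this is not code from the Model Optimizer:

```python
def preprocess_pixel(pixel, mean_values=(0.0, 0.0, 0.0), scale_values=(1.0, 1.0, 1.0)):
    """Subtract the per-channel mean first, then divide by the per-channel scale."""
    return tuple(
        (value - mean) / scale
        for value, mean, scale in zip(pixel, mean_values, scale_values)
    )


# Map the range [0, 255] to [-1, 1] with mean 127.5 and scale 127.5.
pixel = (255, 128, 0)
print(preprocess_pixel(pixel, (127.5, 127.5, 127.5), (127.5, 127.5, 127.5)))
```

With these sample values the first channel maps to 1.0 and the last to -1.0, which is the [-1, 1] range mentioned earlier in this section.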

There is no universal recipe for determining the mean/scale values for a particular model. The steps below can help to determine them:

Read the model documentation. Usually the documentation describes the mean/scale values if pre-processing is required.
Open the example script/application executing the model and track how the input data is read and passed to the framework.
Open the model in a visualization tool and check for layers performing subtraction or multiplication (like Sub, Mul, ScaleShift, Eltwise, etc.) of the input data. If such layers exist, the pre-processing is most likely part of the model.

When to Specify Input Shapes

There are situations when the input data shape for the model is not fixed, as with fully convolutional neural networks. In this case, for example, TensorFlow* models contain -1 values in the shape attribute of the Placeholder operation. The Inference Engine does not support input layers with undefined size, so if the input shapes are not defined in the model, the Model Optimizer fails to convert the model.

The solution is to provide the input shape(s) using the --input_shape command line parameter for all inputs of the model or provide the batch size using the -b command line parameter if the model contains just one input with undefined batch size only. In the latter case, the Placeholder shape for the TensorFlow* model looks like this [-1, 224, 224, 3].
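
The two options above differ in what they can fix: a batch size alone is enough only when the batch dimension is the single undefined one, while any other undefined dimension needs a full shape. The sketch below illustrates that decision. The function resolve_input_shape is a hypothetical helper, not part of the Model Optimizer:

```python
def resolve_input_shape(shape, batch=None):
    """Return a fully defined shape, or raise if a batch size is not enough.

    Undefined dimensions are written as -1 or None.
    """
    undefined = [i for i, dim in enumerate(shape) if dim is None or dim < 0]
    if not undefined:
        return list(shape)
    if undefined == [0] and batch is not None:
        # Only the batch dimension is missing, so a batch size resolves it.
        return [batch] + list(shape[1:])
    raise ValueError("a complete input shape is required for %r" % (shape,))


print(resolve_input_shape([-1, 224, 224, 3], batch=1))  # [1, 224, 224, 3]
```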

When to Reverse Input Channels

Inference Engine samples load input images in BGR channel order, but the model may have been trained on images loaded in RGB channel order. In that case, inference results from the Inference Engine samples will be incorrect. The solution is to specify the --reverse_input_channels command-line parameter. The Model Optimizer then modifies the weights of the first convolution (or other channel-dependent operation) so that the operation produces the same output as if the image had been passed in RGB channel order.
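
To see why changing the weights compensates for the channel order, consider a toy model: a single 1x1 convolution with one weight per input channel. This is a simplified sketch with made-up integer weights, not the Model Optimizer's implementation. Reversing the weights along the input-channel axis gives the same result on a BGR pixel as the original weights give on the RGB pixel:

```python
def channel_dot(weights, pixel):
    """Toy 1x1 convolution: one weight per input channel, no bias."""
    return sum(w * p for w, p in zip(weights, pixel))


rgb_weights = [3, 5, 7]        # made-up weights trained for R, G, B order
rgb_pixel = [200, 100, 50]     # the pixel as the training code saw it
bgr_pixel = rgb_pixel[::-1]    # the same pixel as a BGR loader delivers it

expected = channel_dot(rgb_weights, rgb_pixel)        # 1450
wrong = channel_dot(rgb_weights, bgr_pixel)           # 2050, channels mismatched
fixed = channel_dot(rgb_weights[::-1], bgr_pixel)     # 1450, weights reversed

print(expected, wrong, fixed)
```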

Command-Line Interface (CLI) Examples Using Framework-Agnostic Parameters

Launching the Model Optimizer for <model>.params with debug log level: Use this to better understand what is happening internally when a model is converted:
```
python3 mo_mxnet.py --input_model <model>.params --log_level DEBUG
```
Launching the Model Optimizer for <model>.params with the output Intermediate Representation called result.xml and result.bin, placed in the specified ../../models/ directory:
```
python3 mo_mxnet.py --input_model <model>.params --model_name result --output_dir ../../models/
```
Launching the Model Optimizer for <model>.params and providing scale values for a single input:
```
python3 mo_mxnet.py --input_model <model>.params --scale_values [59,59,59]
```
Launching the Model Optimizer for <model>.params with two inputs and a separate set of scale values for each input. The number of sets of scale/mean values must be exactly the same as the number of inputs of the given model:
```
python3 mo_mxnet.py --input_model <model>.params --input data,rois --scale_values [59,59,59],[5,5,5]
```
Launching the Model Optimizer for <model>.params with specified input layer (data), changing the shape of the input layer to [1,3,224,224], and specifying the name of the output layer:
```
python3 mo_mxnet.py --input_model <model>.params --input data --input_shape [1,3,224,224] --output pool5
```
Launching the Model Optimizer for <model>.params with disabled fusing for linear operations with convolution, set by the --disable_fusing flag, and grouped convolutions, set by the --disable_gfusing flag:
```
python3 mo_mxnet.py --input_model <model>.params --disable_fusing --disable_gfusing
```
Launching the Model Optimizer for <model>.params, reversing the channels order between RGB and BGR, specifying mean values for the input and the precision of the Intermediate Representation to be FP16:
```
python3 mo_mxnet.py --input_model <model>.params --reverse_input_channels --mean_values [255,255,255] --data_type FP16
```
Launching the Model Optimizer for <model>.params with extensions from the specified directories, /home/ and /some/other/path/.
In addition, the following command shows how to pass the mean file to the Intermediate Representation. The mean file must be in a binaryproto format:
```
python3 mo_mxnet.py --input_model <model>.params --extensions /home/,/some/other/path/ --mean_file mean_file.binaryproto
```
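
When conversions are scripted, commands like the ones above can be assembled as an argument list and run without a shell, which avoids any shell interpretation of the square brackets in values such as [1,3,224,224]. The following is a minimal sketch that uses only flags shown in this document. The helper build_mo_command is hypothetical, and the sketch only builds the list; running it is left to the caller:

```python
def build_mo_command(input_model, **options):
    """Build an argument list for mo_mxnet.py.

    Keyword names map to '--name'. A value of True adds a bare flag,
    False or None skips the option, anything else is passed as a string.
    """
    command = ["python3", "mo_mxnet.py", "--input_model", input_model]
    for name, value in options.items():
        flag = "--" + name
        if value is True:
            command.append(flag)
        elif value is not False and value is not None:
            command.extend([flag, str(value)])
    return command


command = build_mo_command(
    "model-file-0000.params",
    input="data",
    input_shape="[1,3,224,224]",
    reverse_input_channels=True,
)
print(command)
# To execute it from the model_optimizer directory, one option is:
# subprocess.run(command, check=True)
```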

Use MXNet*-Specific Conversion Parameters

The following list provides the MXNet*-specific parameters.

MXNet-specific parameters:
  --input_symbol <symbol_file_name>
                        Symbol file (for example, "model-symbol.json") that contains a topology structure and layer attributes
  --nd_prefix_name <nd_prefix_name>
                        Prefix name for args.nd and argx.nd files
  --pretrained_model_name <pretrained_model_name>
                        Name of a pretrained MXNet model without extension and epoch number. This model will be merged with args.nd and argx.nd files
  --save_params_from_nd
                        Enable saving built parameters file from .nd files
  --legacy_mxnet_model
                        Enable MXNet loader to make a model compatible with the latest MXNet version. Use only if your model was trained with MXNet version lower than 1.0.0

NOTE: By default, the Model Optimizer does not use the MXNet loader. The loader transforms the topology into a format compatible with the latest version of MXNet, which is required only for models trained with an earlier version of MXNet. If your model was trained with an MXNet version lower than 1.0.0, specify the --legacy_mxnet_model key to enable the MXNet loader. However, the loader does not support models with custom layers. In this case, you must manually recompile MXNet with custom layers and install it to your environment.
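
The version rule in the note can be expressed as a small check. This sketch assumes you know the MXNet version string that was used for training (for example, from your training logs) and that it starts with a numeric major version. The helper legacy_args is hypothetical:

```python
def legacy_args(training_mxnet_version):
    """Return the extra Model Optimizer arguments for a given MXNet version.

    Any version lower than 1.0.0 has a major version of 0.
    """
    major = int(training_mxnet_version.split(".")[0])
    return ["--legacy_mxnet_model"] if major < 1 else []


print(legacy_args("0.11.0"))  # ['--legacy_mxnet_model']
print(legacy_args("1.3.1"))   # []
```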

Convert a Style Transfer Model from MXNet*

This tutorial explains how to generate a model for style transfer using the public MXNet* neural style transfer sample. No public pre-trained style transfer model is provided with the Intel® Distribution of OpenVINO™ toolkit, so to use the style transfer sample from the toolkit, follow the steps below:

Download or clone the repository with an MXNet neural style transfer sample: Zhaw's Neural Style Transfer repository.
Prepare the environment required to work with the cloned repository:
1. Install package dependencies:
```
sudo apt-get install python-tk
```
2. Install Python* requirements:
```
pip3 install --user mxnet
pip3 install --user matplotlib
pip3 install --user scikit-image
```
Download the pre-trained VGG19 model and save it to the root directory of the cloned repository because the sample expects the model vgg19.params file to be in that directory.

Modify the source code files of the style transfer sample in the cloned repository:

Go to the fast_mrf_cnn subdirectory:
```
cd ./fast_mrf_cnn
```

Open the symbol.py file and modify the decoder_symbol() function. Replace:

def decoder_symbol():
    data = mx.sym.Variable('data')
    data = mx.sym.Convolution(data=data, num_filter=256, kernel=(3,3), pad=(1,1), stride=(1, 1), name='deco_conv1')

with the following code:

def decoder_symbol_with_vgg(vgg_symbol):
    data = mx.sym.Convolution(data=vgg_symbol, num_filter=256, kernel=(3,3), pad=(1,1), stride=(1, 1), name='deco_conv1')

Save and close the symbol.py file.
Open and edit the make_image.py file: Modify the __init__() function in the Maker class. Replace:
```
decoder = symbol.decoder_symbol()
```
with the following code:
```
decoder = symbol.decoder_symbol_with_vgg(vgg_symbol)
```
To join the pre-trained weights with the decoder weights, make the following changes: After the code lines for loading the decoder weights:
```
args = mx.nd.load('%s_decoder_args.nd'%model_prefix)
auxs = mx.nd.load('%s_decoder_auxs.nd'%model_prefix)
```
add the following line:
```
arg_dict.update(args)
```

Use arg_dict instead of args as a parameter of the decoder.bind() function. Replace the line:

self.deco_executor = decoder.bind(ctx=mx.cpu(), args=args, aux_states=auxs)

with the following:

self.deco_executor = decoder.bind(ctx=mx.cpu(), args=arg_dict, aux_states=auxs)
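
The reason for binding with arg_dict is that the decoder symbol now starts from the VGG symbol, so the bound executor needs the parameters of both parts. The sketch below shows the merge with plain dictionaries standing in for the MXNet arrays. It assumes that arg_dict already holds the VGG19 parameters at this point in make_image.py; the parameter names other than the deco_conv1 prefix are hypothetical:

```python
# Stand-ins for the parameter dictionaries (real values would be MXNet arrays).
arg_dict = {"conv1_1_weight": "vgg weight", "conv1_1_bias": "vgg bias"}  # hypothetical names
args = {"deco_conv1_weight": "decoder weight", "deco_conv1_bias": "decoder bias"}

# dict.update() adds the decoder entries in place, so arg_dict
# afterwards covers both the VGG part and the decoder part.
arg_dict.update(args)

print(sorted(arg_dict))
```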

Replace all mx.gpu with mx.cpu in the decoder.bind() function.

To save the result model as a .json file, add the following code to the end of the generate() function in the Maker class:

self.vgg_executor._symbol.save('{}-symbol.json'.format('vgg19'))
self.deco_executor._symbol.save('{}-symbol.json'.format('nst_vgg19'))

Save and close the make_image.py file.

Run the sample with a decoder model according to the instructions from the README.md file in the cloned repository.
For example, to run the sample with the pre-trained decoder weights from the models folder and output shape, use the following code:
```
import make_image
maker = make_image.Maker('models/13', (1024, 768))
maker.generate('output.jpg', '../images/tubingen.jpg')
```
Where the 'models/13' string is composed of the following sub-strings:
- 'models/': the path to the folder that contains the .nd files with pre-trained style weights
- '13': the decoder prefix. The repository contains a default decoder, which is the 13_decoder.
You can choose any style from the collection of pre-trained weights. The generate() function generates the nst_vgg19-symbol.json and vgg19-symbol.json files for the specified shape. In the code, it is [1024 x 768] for a 4:3 ratio, and you can specify another, for example, [224,224] for a square ratio.
Run the Model Optimizer to generate an Intermediate Representation (IR):
1. Create a new directory. For example:
```
mkdir nst_model
```
2. Copy the initial and generated model files to the created directory. For example, to copy the pre-trained decoder weights from the models folder to the nst_model directory, run the following commands:
```
cp nst_vgg19-symbol.json nst_model
cp vgg19-symbol.json nst_model
cp ../vgg19.params nst_model/vgg19-0000.params
cp models/13_decoder_args.nd nst_model
cp models/13_decoder_auxs.nd nst_model
```
  NOTE: Make sure that all the .params and .json files are in the same directory as the .nd files. Otherwise, the conversion process fails.
3. Run the Model Optimizer for MXNet. Use the --nd_prefix_name option to specify the decoder prefix and --input_shape to specify input shapes in [N,C,W,H] order. For example:
```
python3 mo.py --input_symbol <path/to/nst_model>/nst_vgg19-symbol.json --framework mxnet --output_dir <path/to/output_dir> --input_shape [1,3,224,224] --nd_prefix_name 13_decoder --pretrained_model_name <path/to/nst_model>/vgg19-0000.params
```
4. The IR is generated (.bin, .xml, and .mapping files) in the specified output directory and ready to be consumed by the Inference Engine.
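
Because the conversion fails when the .params, .json, and .nd files are not all in the same directory (see the NOTE in step 2), a quick check before running the Model Optimizer can save a failed run. The following is a minimal sketch based on the file names copied in step 2. The helper missing_model_files is hypothetical:

```python
from pathlib import Path


def missing_model_files(model_dir, decoder_prefix="13_decoder"):
    """Return the names of the expected files that are absent from model_dir."""
    expected = [
        "nst_vgg19-symbol.json",
        "vgg19-symbol.json",
        "vgg19-0000.params",
        decoder_prefix + "_args.nd",
        decoder_prefix + "_auxs.nd",
    ]
    directory = Path(model_dir)
    return [name for name in expected if not (directory / name).is_file()]


missing = missing_model_files("nst_model")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files are in place.")
```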

Supported Layers and the Mapping to Intermediate Representation Layers

Number	Symbol Name in MXNet*	Layer Name in the Intermediate Representation
1	BatchNorm	BatchNormalization
2	Crop	Crop
3	ScaleShift	ScaleShift
4	Pooling	Pooling
5	SoftmaxOutput	SoftMax
6	SoftmaxActivation	SoftMax
7	null	Ignored, does not appear in IR
8	Convolution	Convolution
9	Deconvolution	Deconvolution
10	Activation(act_type = relu)	ReLU
11	ReLU	ReLU
12	LeakyReLU	ReLU (negative_slope = 0.25)
13	Concat	Concat
14	elemwise_add	Eltwise(operation = sum)
15	_Plus	Eltwise(operation = sum)
16	Flatten	Flatten
17	Reshape	Reshape
18	FullyConnected	FullyConnected
19	UpSampling	Resample
20	transpose	Permute
21	LRN	Norm
22	L2Normalization	Normalize
23	Dropout	Ignored, does not appear in IR
24	_copy	Ignored, does not appear in IR
25	_contrib_MultiBoxPrior	PriorBox
26	_contrib_MultiBoxDetection	DetectionOutput
27	broadcast_mul	ScaleShift
28	sigmoid	sigmoid
29	Activation (act_type = tanh)	Activation (operation = tanh)
30	LeakyReLU (act_type = prelu)	PReLU
31	LeakyReLU (act_type = elu)	Activation (operation = elu)
32	elemwise_mul	Eltwise (operation = mul)
33	add_n	Eltwise (operation = sum)
34	ElementWiseSum	Eltwise (operation = sum)
35	_mul_scalar	Power
36	broadcast_add	Eltwise (operation = sum)
37	slice_axis	Crop
38	Custom	See Custom Layers in the Model Optimizer
39	_minus_scalar	Power
40	Pad	Pooling
41	_contrib_Proposal	Proposal
42	ROIPooling	ROIPooling

MXNet* Models with Custom Layers

Internally, when you run the Model Optimizer, it loads the model, goes through the topology, and tries to find each layer type in a list of known layers. Custom layers are layers that are not included in the list of known layers. If your topology contains any layers that are not in this list of known layers, the Model Optimizer classifies them as custom.
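
The classification described above can be approximated outside the Model Optimizer by reading the symbol file, which is a JSON document with a "nodes" list where each node has an "op" field. The sketch below is an illustration only, not the Model Optimizer's implementation. KNOWN_OPS is a small subset of the table above, and the op name my_custom_op in the example graph is made up:

```python
import json

# A small subset of the supported symbols from the table above.
KNOWN_OPS = {
    "null", "Convolution", "Activation", "Pooling",
    "Flatten", "FullyConnected", "SoftmaxOutput",
}


def find_unknown_ops(symbol_json, known_ops=KNOWN_OPS):
    """Return the sorted op names in the symbol JSON that are not in known_ops."""
    graph = json.loads(symbol_json)
    return sorted({node["op"] for node in graph["nodes"]} - set(known_ops))


# A tiny hand-written graph in the symbol-file layout.
EXAMPLE_SYMBOL = json.dumps({"nodes": [
    {"op": "null", "name": "data", "inputs": []},
    {"op": "Convolution", "name": "conv1", "inputs": [[0, 0, 0]]},
    {"op": "my_custom_op", "name": "custom1", "inputs": [[1, 0, 0]]},
]})

print(find_unknown_ops(EXAMPLE_SYMBOL))  # ['my_custom_op']
```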

To learn how to create extensions or a custom layer from your MXNet* model, see the MXNet* Models With Custom Layers section in the Model Optimizer Developer Guide.

Frequently Asked Questions (FAQ)

The Model Optimizer provides explanatory messages if it is unable to run to completion due to issues like typographical errors, incorrectly used options, or other issues. The message describes the potential cause of the problem and gives a link to the Model Optimizer FAQ. The FAQ has instructions on how to resolve most issues. The FAQ also includes links to relevant sections in the Model Optimizer Developer Guide to help you understand what went wrong.

Summary

In this document, you learned:

Basic information about how the Model Optimizer works with MXNet* models
Which MXNet* models are supported
How to convert a trained MXNet* model using the Model Optimizer with both framework-agnostic and MXNet-specific command-line options

Legal Information

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at http://www.intel.com/ or from the OEM or retailer.

No computer system can be absolutely secure.

Intel, Arria, Core, Movidius, Pentium, Xeon, and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used with permission by Khronos.

*Other names and brands may be claimed as the property of others.