Here I show how to load and run an ONNX model using C# in ONNX Runtime. The code sample for this article contains a working console application that demonstrates all the techniques shown here.
In this article in our series about using portable neural networks in 2020, you’ll learn how to install ONNX Runtime on an x64 architecture and use it in C#.
Microsoft co-developed ONNX with Facebook and AWS. Both the ONNX format and ONNX Runtime have industry support to make sure that all the important frameworks are capable of exporting their graphs to ONNX and that these models can run on any hardware configuration.
The ONNX Runtime is an engine for running machine learning models that have been converted to the ONNX format. Both traditional machine learning models and deep learning models (neural networks) can be exported to the ONNX format. The runtime can run on Linux, Windows, and Mac, and can run on a variety of chip architectures. It can also take advantage of hardware accelerators such as GPUs and TPUs. However, there is not an install package for every combination of OS, chip architecture, and accelerator, so you may need to build the runtime from source if you are not using one of the common combinations. Check the ONNX Runtime website to get installation instructions for the combination you need. This article will show how to install ONNX Runtime on an x64 architecture with a default CPU and an x64 architecture with a GPU.
In addition to being able to run on many hardware configurations, the runtime can be called from most popular programming languages. The purpose of this article is to show how to use ONNX Runtime in C#. I’ll show how to install the onnxruntime package. Once ONNX Runtime is installed, I’ll load a previously exported MNIST model into ONNX Runtime and use it to make predictions.
Installing and Importing the ONNX Runtime
Before using the ONNX Runtime, you will need to install Microsoft.ML.OnnxRuntime, which is a NuGet package. You will also need to install the .NET CLI if you do not already have it. The following command installs the runtime on an x64 architecture with a default CPU:
dotnet add package microsoft.ml.onnxruntime
To install the runtime on an x64 architecture with a GPU, use this command:
dotnet add package microsoft.ml.onnxruntime.gpu
Once the runtime has been installed, it can be imported into your C# code files with the following using statements:
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
The using statement that pulls in the Tensor tools will help us create inputs for ONNX models and interpret the output (prediction) of an ONNX model.
Loading ONNX Models
The snippet below shows how to load an ONNX model into ONNX Runtime running in C#. This code creates a session object that can be used to make predictions. The model being used here is the ONNX model that was exported from PyTorch.
There are a few things worth noting here. First, you need to query the session to get its inputs. This is done using the session’s InputMetadata property. Our MNIST model only has one input parameter: an array of 784 floats that represent one image from the MNIST dataset. If your model has more than one input parameter, then InputMetadata will have an entry for each parameter.
// Load the MNIST images and labels (see the Utilities class in the code sample).
Utilities.LoadTensorData();
string modelPath = Path.Combine(Directory.GetCurrentDirectory(), "pytorch_mnist.onnx");
using (var session = new InferenceSession(modelPath))
{
    // imageIndex is assumed to be supplied by the enclosing method, which is not shown here.
    float[] inputData = Utilities.ImageData[imageIndex];
    string label = Utilities.ImageLabels[imageIndex];
    Console.WriteLine("Selected image is the number: " + label);

    // Build one named input tensor for each input the model declares.
    var inputMeta = session.InputMetadata;
    var container = new List<NamedOnnxValue>();
    foreach (var name in inputMeta.Keys)
    {
        var tensor = new DenseTensor<float>(inputData, inputMeta[name].Dimensions);
        container.Add(NamedOnnxValue.CreateFromTensor<float>(name, tensor));
    }
    // Run code omitted for brevity.
}
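One caveat about passing the Dimensions value from InputMetadata straight to DenseTensor: ONNX Runtime reports a dynamic dimension, such as a variable batch size, as -1, and a tensor cannot be created with a negative dimension. Whether the exported MNIST model has a dynamic dimension is not stated in this article, so treat the following as a sketch under that assumption. The class and method names (TensorShapeHelper, ResolveDimensions) are hypothetical and are not part of ONNX Runtime or the article's code sample.

```csharp
using System;

public static class TensorShapeHelper
{
    // Hypothetical helper: returns a copy of the dimensions reported by the model
    // in which every dynamic entry (a negative value) is replaced by batchSize.
    public static int[] ResolveDimensions(int[] modelDimensions, int batchSize = 1)
    {
        if (modelDimensions == null)
        {
            throw new ArgumentNullException(nameof(modelDimensions));
        }
        if (batchSize < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(batchSize), "Batch size must be at least 1.");
        }

        var resolved = new int[modelDimensions.Length];
        for (int i = 0; i < modelDimensions.Length; i++)
        {
            // Fixed dimensions are kept as-is; dynamic ones take the batch size.
            resolved[i] = modelDimensions[i] < 0 ? batchSize : modelDimensions[i];
        }
        return resolved;
    }
}
```

In the loading code, the result of TensorShapeHelper.ResolveDimensions(inputMeta[name].Dimensions) could be passed to the DenseTensor constructor in place of the raw Dimensions array. If the model's dimensions are all fixed, the helper returns them unchanged.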
Not shown in the model-loading code above are the utilities that read the raw MNIST images and convert each image to an array of 784 floats. The label for each image is also read from the MNIST dataset so that the accuracy of predictions can be determined. This is standard .NET code, but it is worth a look: it will save you time if you need to read images stored in a format similar to the MNIST dataset.
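For readers who want a feel for what such a utility involves, here is a minimal sketch. It is not the Utilities class from the code sample. It assumes the images are stored in the standard MNIST IDX file format (a big-endian header holding a magic number of 2051, the image count, the row count, and the column count, followed by one unsigned byte per pixel). Scaling each pixel to the range 0 to 1 is also an assumption; the preprocessing must match whatever was used when the model was trained. The class name MnistIdxReader is hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class MnistIdxReader
{
    private const int ImageFileMagicNumber = 2051; // 0x00000803 in the IDX image header

    // Reads every image in an IDX image stream and returns one float array per image.
    // For 28 x 28 MNIST images, each array has 784 entries.
    public static float[][] ReadImages(Stream stream)
    {
        // leaveOpen: true so the caller stays in control of the stream's lifetime.
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            int magic = ReadBigEndianInt32(reader);
            if (magic != ImageFileMagicNumber)
            {
                throw new InvalidDataException("Not an IDX image file (magic number " + magic + ").");
            }

            int count = ReadBigEndianInt32(reader);
            int rows = ReadBigEndianInt32(reader);
            int columns = ReadBigEndianInt32(reader);
            int pixelsPerImage = rows * columns;

            var images = new float[count][];
            for (int i = 0; i < count; i++)
            {
                byte[] raw = reader.ReadBytes(pixelsPerImage);
                if (raw.Length != pixelsPerImage)
                {
                    throw new EndOfStreamException("Image " + i + " is truncated.");
                }

                var pixels = new float[pixelsPerImage];
                for (int p = 0; p < pixelsPerImage; p++)
                {
                    // Assumption: scale 0..255 to 0..1. Adjust to match the model's training.
                    pixels[p] = raw[p] / 255f;
                }
                images[i] = pixels;
            }
            return images;
        }
    }

    // IDX headers are big-endian, whereas BinaryReader.ReadInt32 reads little-endian.
    private static int ReadBigEndianInt32(BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(4);
        if (bytes.Length != 4)
        {
            throw new EndOfStreamException("Unexpected end of IDX header.");
        }
        return (bytes[0] << 24) | (bytes[1] << 16) | (bytes[2] << 8) | bytes[3];
    }
}
```

A label file can be read the same way: in the IDX format it has a magic number of 2049 and a count, followed by one byte per label.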
Using ONNX Runtime for Predictions
The function below shows how to use the ONNX session that was created when we loaded our ONNX model:
{
    // Load code not shown for brevity.
    // Run the inference
    using (var results = session.Run(container))
    {
        // Get the results
        foreach (var r in results)
        {
            Console.WriteLine("Output Name: {0}", r.Name);
            int prediction = MaxProbability(r.AsTensor<float>());
            Console.WriteLine("Prediction: " + prediction.ToString());
        }
    }
}
Most neural networks do not return a prediction directly. They return a list of probabilities, one for each output class. In the case of our MNIST model, the return value for each image is a list of 10 probabilities, and the entry with the highest probability is the prediction. An interesting test is to compare the probabilities the ONNX model returns with the probabilities returned by the original model when it is run within the framework that created it. Ideally, the change in model format and runtime should not change any of the probabilities, although small floating-point differences are to be expected. This would make a good unit test to run every time the model changes.
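The MaxProbability function called in the inference code is not listed in this article. The sketch below shows one way such a function could be written, along with a comparison helper for the unit test idea described above. Both are illustrations and not the code sample's actual implementation: the class name PredictionHelpers, the AllClose helper, and the default tolerance of 1e-5 are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class PredictionHelpers
{
    // Returns the index of the largest score. For MNIST, the index is the predicted digit.
    public static int MaxProbability(IEnumerable<float> scores)
    {
        if (scores == null)
        {
            throw new ArgumentNullException(nameof(scores));
        }

        int bestIndex = -1;
        float bestScore = float.NegativeInfinity;
        int index = 0;
        foreach (float score in scores)
        {
            // The first element always becomes the initial best; ties keep the earliest index.
            if (bestIndex < 0 || score > bestScore)
            {
                bestScore = score;
                bestIndex = index;
            }
            index++;
        }

        if (bestIndex < 0)
        {
            throw new ArgumentException("At least one score is required.", nameof(scores));
        }
        return bestIndex;
    }

    // Returns true when both lists have the same length and every pair of
    // values differs by no more than the given tolerance.
    public static bool AllClose(IReadOnlyList<float> expected, IReadOnlyList<float> actual, float tolerance = 1e-5f)
    {
        if (expected == null) throw new ArgumentNullException(nameof(expected));
        if (actual == null) throw new ArgumentNullException(nameof(actual));
        if (expected.Count != actual.Count)
        {
            return false;
        }

        for (int i = 0; i < expected.Count; i++)
        {
            if (Math.Abs(expected[i] - actual[i]) > tolerance)
            {
                return false;
            }
        }
        return true;
    }
}
```

Because a Tensor<float> can be enumerated as a sequence of floats, a function with this signature should accept the value returned by r.AsTensor<float>(). In a unit test, the expected values would be probabilities saved from the original framework for a fixed set of images, and the tolerance would be chosen to suit the model.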
Summary and Next Steps
In this article, I provided a brief overview of the ONNX Runtime and the ONNX format. I then showed how to load and run an ONNX model using C# in ONNX Runtime.
The code sample for this article contains a working console application that demonstrates all the techniques shown here. This code sample is part of a GitHub repository that explores the use of Neural Networks for predicting the numbers found in the MNIST dataset. Specifically, there are samples that show how to create Neural Networks in Keras, PyTorch, TensorFlow 1.0, and TensorFlow 2.0.
If you want to learn more about exporting to the ONNX format and using ONNX Runtime, check out the other articles in this series.
Keith is a sojourner in the software industry. He has over 30 years of experience building and bringing applications to market. He has worked for startups and large enterprises in roles ranging from tech lead to business development manager. He is currently a senior engineer on BNY Mellon's Distribution Analytics team where he is building data pipelines from on-premise data sources to the cloud.