Half a year ago, this post announced the new open source project ANNdotNET, which had been the ANN part of GPdotNET v4, an artificial intelligence tool. On the same day, I finished my new book about machine learning and genetic programming and released GPdotNET v5.0, a genetic programming tool without the ANN and other non-GP modules. Now my second big open source project has reached its first stable release.
ANNdotNET (http://github.com/bhrnjica/anndotnet) is a deep learning tool on the .NET platform with a workflow similar to that of GPdotNET. The two projects share several modules, mostly for data preparation and model evaluation, since those parts work the same way in both.
ANNdotNET is more than a GUI tool: it also contains a CMD tool, which can be part of a bigger cloud solution. Several key concepts of the project are worth mentioning here.
1. Machine Learning Configuration mlconfig File
ANNdotNET is based on the so-called machine learning configuration file, or mlconfig file, which stores everything about the data, the training and learning parameters, and the neural network layers. Along with the mlconfig file, several other file types are generated during the development of an ML solution. The mlconfig file can be shared between cloud services in order to prepare and transform data, or to train, evaluate or export ML models. For more information about the files in ANNdotNET, see the wiki page of the project. Since the mlconfig file is independent of the tool, it can be executed with the GUI or CMD tool, or with any other custom tool implemented on the ANNdotNET API.
2. Machine Learning Project Explorer
To start developing an ML solution with ANNdotNET, first create an annproject file by selecting the New option from the Application command. Under an annproject, the user can create as many mlconfig files as needed. The annproject and its related mlconfig files are presented in the ML Project Explorer, where the user can manage them as ordinary list items.
3. ANNdotNET MLEngine – Machine Learning Engine
ANNdotNET introduces the ANNdotNET Machine Learning Engine (MLEngine), which is responsible for training and evaluating the models defined in mlconfig files. MLEngine relies on the Microsoft Cognitive Toolkit (CNTK), an open source library that has proved to be one of the best open source libraries for deep learning. Through all of the application’s components, MLEngine exposes the features of CNTK, e.g. GPU support for training and evaluation and different kinds of learners, and extends them with more evaluation functions (RMSE, MSE, Classification Accuracy, Coefficient of Determination, etc.), extended mini-batch sources, and Trainer and Evaluation models.
MLEngine is built on top of CNTK and .NET, and can serve as a backend component for any cloud or on-premise ML solution.
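The evaluation functions named above have standard textbook definitions. The following is a minimal Python sketch of those definitions only; it is not ANNdotNET's or CNTK's implementation, and every function name in it is made up for illustration.

```python
import math

def mse(actual, predicted):
    """Mean squared error: the average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: square root of the MSE, in the target's units."""
    return math.sqrt(mse(actual, predicted))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def classification_accuracy(actual_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the actual label."""
    hits = sum(1 for a, p in zip(actual_labels, predicted_labels) if a == p)
    return hits / len(actual_labels)

# Toy data, invented for this example.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
print("MSE:", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R^2:", r_squared(y_true, y_pred))
print("Accuracy:", classification_accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
```

RMSE and MSE apply to regression models, while classification accuracy applies to binary and multi-class models, which is why a tool of this kind typically offers several evaluation functions side by side.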
4. Visual Neural Network Designer
MLEngine also contains an implementation of neural network layers intended as a high-level CNTK API, very similar to the layer implementations in Keras and other Python-based deep learning APIs. On top of this implementation, ANNdotNET provides a visual neural network designer called ANNdotNET NNDesigner, which allows the user to design a neural network configuration of any size with any type of layer. In the first release, the following layers are implemented:
- Normalization Layer – takes the numerical features and normalizes their values before they enter the network. More information can be found here.
- Dense – classic neural network layer with activation function
- LSTM – LSTM layer with option for peephole and self-stabilization
- Embedding – embedding layer
- Drop – dropout layer
More layer types will be added in future releases. More information about the Visual Neural Network Designer can be found in the previous blog post.
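To make the first two layer types more concrete, here is a small Python sketch of what a normalization step followed by a dense layer computes. It assumes z-score normalization (subtract the training mean, divide by the training standard deviation), which is one common choice and may differ from what ANNdotNET actually uses. All names are hypothetical, and the code does not call the ANNdotNET or CNTK API.

```python
import math

def fit_normalizer(rows):
    """Compute per-feature mean and (population) standard deviation from training rows."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n)
            for col, m in zip(columns, means)]
    return means, stds

def normalize(row, means, stds):
    """Z-score each feature; constant features (std == 0) are mapped to 0.0."""
    return [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]

def relu(z):
    """Rectified linear activation."""
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """Dense layer: activation(w . x + b) for each output unit.

    weights holds one list of input weights per output unit.
    """
    return [activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
            for unit_weights, b in zip(weights, biases)]

# Toy data and weights, invented for this example.
training_rows = [[1.0, 10.0], [3.0, 10.0]]
means, stds = fit_normalizer(training_rows)
x = normalize([3.0, 10.0], means, stds)
print(dense(x, [[2.0, 1.0], [-1.0, 0.5]], [0.5, 0.0], relu))
```

In this sketch the statistics are computed once from the training rows and then applied unchanged to every input row, so validation and test data are scaled the same way as the training data.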
5. Data Transformation
Alongside the ML-related components, ANNdotNET implements a set of components for transforming raw datasets into mlready datasets. The user doesn’t have to worry about the complex CNTK file format, one-hot encoding, or other data and variable transformations such as handling missing values and data normalization. Data transformation starts with loading a raw data file into ANNdotNET; then, with a set of GUI options, the data can be completely prepared as an mlready dataset. A set of short videos shows how to quickly transform a raw dataset into an mlready dataset.
6. Model Evaluation, Saving Good Models & Retraining Trained Models
Once the model is trained, ANNdotNET provides a basic tool for evaluating it. The MLEvaluator contains a set of basic options for evaluating regression, binary classification and multi-class classification models. Without leaving ANNdotNET, the user can decide whether the model is good by computing a set of statistical measures for the model against the related datasets (training, validation and test). Besides evaluation after training, ANNdotNET evaluates the model during the training phase and offers an option to save good models as training progresses. In this way, ANNdotNET can select the best trained model regardless of the number of iterations. Different strategies for selecting the best model among a set of saved models will be implemented in the future. Also, any previously trained model can be trained again from the last checkpoint. This is important in various scenarios, for example to change some parameters and continue training. It also makes it possible to start training a model on one machine or environment and then continue training on a different one.
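The transformations mentioned above can be illustrated with a short Python sketch. It shows one simple missing-value strategy (mean imputation), one-hot encoding of a categorical column, and writing a row in the style of the CNTK text format, where each stream is written as `|name value value ...`. The stream names `features` and `label` and all function names are assumptions made for this example; this is not the code ANNdotNET runs.

```python
def fill_missing(values):
    """Replace None with the mean of the observed values (one simple strategy)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def one_hot(values):
    """Encode categorical values as 0/1 vectors, one position per category.

    Categories are sorted so the encoding is deterministic.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = [[1 if index[v] == i else 0 for i in range(len(categories))]
               for v in values]
    return categories, encoded

def to_ctf_line(features, label):
    """Format one sample as a CNTK-text-format-style line with two named streams."""
    feature_text = " ".join(str(x) for x in features)
    label_text = " ".join(str(x) for x in label)
    return f"|features {feature_text} |label {label_text}"

# Toy raw data, invented for this example.
temperature = fill_missing([1.0, None, 3.0])
categories, colour = one_hot(["red", "green", "red"])
for t, c in zip(temperature, colour):
    print(to_ctf_line([t], c))
```

Doing these steps by hand for every dataset is error-prone, which is the motivation for wrapping them in GUI options.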
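The idea of keeping the best model seen during training, and of continuing training from the last checkpoint, can be sketched in a few lines of Python. This is only an illustration of the general technique under simplifying assumptions: the "model" is a plain value, the checkpoint is an in-memory dictionary rather than a file, and the function and key names are hypothetical, not part of ANNdotNET.

```python
def train(run_iteration, evaluate, iterations, checkpoint=None):
    """Run training iterations, remembering the best-scoring model so far.

    run_iteration(model, iteration) returns the updated model.
    evaluate(model) returns a validation score where higher is better.
    checkpoint is the dictionary returned by an earlier call; if given,
    training resumes from it instead of starting from scratch.
    """
    state = dict(checkpoint) if checkpoint else {
        "iteration": 0,
        "model": None,
        "best_score": float("-inf"),
        "best_model": None,
        "best_iteration": 0,
    }
    for _ in range(iterations):
        state["iteration"] += 1
        state["model"] = run_iteration(state["model"], state["iteration"])
        score = evaluate(state["model"])
        if score > state["best_score"]:
            # Keep a reference to the best model seen so far.
            state["best_score"] = score
            state["best_model"] = state["model"]
            state["best_iteration"] = state["iteration"]
    return state

# Toy setup, invented for this example: the "model" is just the iteration
# number, and its validation score is looked up in a fixed list.
validation_scores = [0.60, 0.72, 0.81, 0.78, 0.75]
step = lambda model, iteration: iteration
score_of = lambda model: validation_scores[model - 1]

first_run = train(step, score_of, iterations=3)
resumed = train(step, score_of, iterations=2, checkpoint=first_run)
print(resumed["iteration"], resumed["best_iteration"], resumed["best_score"])
```

Because the checkpoint carries both the latest model and the best one, a resumed run on another machine can continue counting iterations without losing the best result found earlier.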
ANNdotNET is an open source project for deep learning on the .NET platform. It is a complete GUI solution for data preparation, training, evaluation and deployment of ML models. ANNdotNET introduces the ANNdotNET Machine Learning Engine (MLEngine), which is responsible for training and evaluating the models defined in mlconfig files. MLEngine relies on the Microsoft Cognitive Toolkit (CNTK), an open source library that has proved to be one of the best open source libraries for deep learning. Through all of the application’s components, MLEngine exposes the features of CNTK, e.g. GPU support for training and evaluation and different kinds of learners. MLEngine also extends CNTK with more evaluation functions (RMSE, MSE, Classification Accuracy, Coefficient of Determination, etc.), extended mini-batch sources, and Trainer and Evaluation models.
Creating, training, evaluating and exporting models is done from the GUI application and does not require knowledge of a programming language.
ANNdotNET is ideal in several scenarios:
- more focus on network development and the training process using a classic desktop approach, instead of on coding
- less time spent debugging source code, more time spent on different configurations and parameter variants
- ideal for engineers and users who are not familiar with programming languages
- when the problem requires coding custom models or a custom training process, ANNdotNET CMD provides a high-level API for such an implementation
- all ML configurations developed with the GUI tool can be handled with the CMD tool, and vice versa
If you like this project, star it on GitHub at http://github.com/bhrnjica/anndotnet. If you want to use it in an academic paper, please cite it as specified at this link: https://doi.org/10.5281/zenodo.1461722
Bahrudin Hrnjica holds a Ph.D. degree in Technical Science/Engineering from the University of Bihać.
Besides teaching at the university, he has been in the software industry for more than two decades, focusing on development technologies, e.g. .NET, Visual Studio, and Desktop/Web/Cloud solutions.
He works on the development and application of different ML algorithms, and has more than 10 years of experience in developing ML-oriented solutions and models. His interests include developing predictive models with ML.NET and Keras, and he actively develops two ML-based .NET open source projects: GPdotNET, a genetic programming tool, and ANNdotNET, a deep learning tool on the .NET platform. He works in multidisciplinary teams with the mission of optimizing and selecting ML algorithms to build ML models.
He is the author of several books and many online articles, writes a blog at http://bhrnjica.net, regularly holds lectures at local and regional conferences, User groups and Code Camp gatherings, and is also the founder of the Bihac Developer Meetup Group. Microsoft recognized his work and awarded him the prestigious Microsoft MVP title for the first time in 2011, which he still holds today.