Integrate Windows Azure Face APIs in a C++ application

Marius Bancila, 8 May 2015
Learn how to integrate the new Windows Azure machine-learning APIs in a C++ application using the C++ REST SDK

Microsoft has recently launched a new set of machine-learning APIs called "Project Oxford" that includes functionality for face detection and recognition, speech recognition and synthesis, vision, and natural language understanding. The face API has been demonstrated with a small application that went viral: how-old.net. In a few days, tens of millions of users tried it on hundreds of millions of pictures. Even though the age guessing was not that good, the demo showed how easily these APIs can be integrated into any application.

The Project Oxford services are exposed as RESTful APIs, and the SDK includes .NET and Java (for Android) REST wrappers. The documentation provides samples in a variety of languages: JavaScript, C#, PHP, Python, Ruby, cURL, Java, and Objective-C. Since consuming RESTful services is also possible in C++ using the C++ REST SDK, I have decided to demonstrate how to integrate the face APIs into an MFC application. A similar approach can be taken to integrate speech, vision, or natural language understanding.

Signing up for the service

The Project Oxford APIs are Windows Azure services available for free (and currently in beta), but all calls must be signed with a subscription key assigned to each Windows Azure account. In order to get this key you must sign up for each service individually (face, speech, vision, etc.).

In the Windows Azure portal, go to Marketplace, choose New, and then select Marketplace again.

You will be able to select an app service. Search for Face API and select it.

You must select a plan (currently only a free plan is available), a name for the service, and a few other settings.

On the last page, review the purchase and accept it.

Managing your subscription keys

Once the service is provisioned, you can view it under the Marketplace.

Use the Manage command to view and (if necessary) regenerate your subscription keys for the different app services.

Copy the subscription key from here to use it with the service calls.
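To avoid pasting the key directly into source code, the demo could read it from an environment variable. A minimal sketch under that assumption (the variable name FACE_API_KEY is chosen here for illustration; nothing in the service requires it):

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Reads the subscription key from an environment variable so it is not
// hard-coded in the source or checked into version control.
std::string get_subscription_key(const char* var_name = "FACE_API_KEY")
{
   const char* key = std::getenv(var_name);
   if (key == nullptr || *key == '\0')
      throw std::runtime_error("subscription key not set");
   return std::string(key);
}
```

The returned string can then be converted to utility::string_t and passed to the service calls shown later.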

Face APIs

The Project Oxford APIs are documented here and the reference documentation for the Face API is available here.

The face API provides more than just face detection capabilities. It allows associating faces to persons, grouping persons into groups, identifying a person in a group based on one or more input faces, etc. Face services include:

  • Detection of human faces in an image
  • Verification that two faces represent the same person
  • Identification of a person in a group by faces
  • Dividing a list of faces into groups based on face similarities
  • Finding similar faces in a list of faces

In this article we will look only at face detection. Face detection means identifying human faces in an image. It returns the position of the face; the positions of the eyes, nose, and mouth (referred to as face landmarks); and additional attributes such as head pose, gender, and age. The last two are experimental features and, as how-old.net has shown, guessing the age is not very accurate at the moment.

Face detection APIs have several limitations, including:

  • Supported image formats are BMP, PNG, JPEG, and GIF
  • Image size must not exceed 4MB
  • Faces are detected only if they are larger than 36x36 pixels and smaller than 4096x4096 pixels; at most 64 faces are returned per image, and for various technical reasons some faces may not be detected
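Since the service rejects out-of-range images with an error, a client can pre-check the obvious limits before uploading. A minimal sketch of such a check (the face-size limits can only be verified server-side, so only the format, inferred from the extension, and the file size are checked here):

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Client-side sanity check against the documented limits: a supported
// image format (by extension) and a file size of at most 4MB. This
// avoids a round trip that would fail with an InvalidImageSize error.
bool is_acceptable_image(const std::string& filename, std::size_t size_in_bytes)
{
   static const std::size_t max_size = 4 * 1024 * 1024; // 4MB limit

   if (size_in_bytes == 0 || size_in_bytes > max_size)
      return false;

   auto pos = filename.rfind('.');
   if (pos == std::string::npos)
      return false;

   std::string ext = filename.substr(pos + 1);
   std::transform(ext.begin(), ext.end(), ext.begin(),
                  [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

   return ext == "bmp" || ext == "png" || ext == "jpg" ||
          ext == "jpeg" || ext == "gif";
}
```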

It is possible to detect faces in an image either specified by a URL (with a JSON content type) or uploaded as part of the request (with the content type application/octet-stream). In this article we will use the second option.
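For the URL variant, the request body is just a small JSON object naming the image. A hand-rolled sketch of that body, for illustration only (production code would build it with the JSON library and escape the URL properly):

```cpp
#include <string>

// Builds the JSON body for the URL-based variant of the detection call.
// The octet-stream variant sends the raw image bytes instead.
// Note: this sketch assumes a plain URL with no characters that need
// JSON escaping; a real implementation should escape the value.
std::string make_url_body(const std::string& image_url)
{
   return "{\"url\":\"" + image_url + "\"}";
}
```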

The detection API is documented here. For the following image, the service returns the JSON data shown below when requesting analysis of age, gender, and head pose (ignoring the face landmarks for simplicity).

[
  {
    "faceId":"4ad67da7-c86b-4dc8-8565-a224cda71253",
    "faceRectangle":{
      "top":47,
      "left":53,
      "width":58,
      "height":58
    },
    "attributes":{
      "headPose":{
        "pitch":0.0,
        "roll":2.4,
        "yaw":-3.4
      },
      "gender":"male",
      "age":32
    }
  }
]

As a side note, in this case the face analysis got the age wrong by only a few years (higher than the actual age), which I consider within a reasonable error margin. For a smaller version of the picture, though, the analysis returns a different result, this time much closer to the actual age (at the time the picture was taken).

Demo C++ App

To demonstrate the use of these APIs in a C++ application, I have prototyped a simple MFC application where you can load an image, run the face detection, and then show the detected faces, age, and gender on the image. Male faces are identified with a blue rectangle and female faces with a red rectangle.
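The color choice can be isolated in a small helper. A self-contained sketch that packs colors the way the Win32 RGB macro does (0x00BBGGRR), so it needs no <windows.h> to compile:

```cpp
#include <cstdint>

enum class gender { female, male };

// Packs an RGB triple into the 0x00BBGGRR layout used by the Win32
// COLORREF type, so the value can be passed to GDI pens directly.
constexpr std::uint32_t rgb(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
   return static_cast<std::uint32_t>(r) |
          (static_cast<std::uint32_t>(g) << 8) |
          (static_cast<std::uint32_t>(b) << 16);
}

// Blue rectangles for male faces, red for female, as in the demo app.
constexpr std::uint32_t rectangle_color(gender g)
{
   return g == gender::male ? rgb(0, 0, 255) : rgb(255, 0, 0);
}
```

In the MFC view, the returned value would be used to construct the CPen that draws the face rectangle.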

The application is very simplistic: it allows you to open a BMP or JPEG image which is then painted in the window's client area. There is no resizing or scrolling going on as this is only a demo. If you are interested, you can take a look at the attached source code to see how the loading and painting is done.

In order to consume the face REST APIs we need to use the C++ REST SDK. This is available as a NuGet package, so I used Visual Studio's NuGet package manager to search for and install it.

Notice that cpprest is an aggregate package that bundles all the individual packages targeting different platforms. The total size exceeds 1GB, so you probably want to download only cpprestsdk.v120.windesktop.msvcstl.dyn.rt-dyn, which is what you need to develop for Windows with Visual Studio 2013. (See the C++ REST SDK 2.5.0 release notes for more details.)

There are several components from the C++ REST SDK that we'll use: the http_client (used to connect to an HTTP service), JSON support, asynchronous file streams, and tasks. In order to use them we have to include several headers.

#include "cpprest\json.h"
#include "cpprest\http_client.h"
#include "cpprest\filestream.h"

using namespace concurrency;
using namespace concurrency::streams;

using namespace web;
using namespace web::http;
using namespace web::http::client;

In order to get the analysis of the faces in an image we have to do the following:

  • Load the image in a file stream
  • When the stream is available make the HTTP POST request to the detection API using an http_client object
  • When the response is available extract the json content from it (if successful)
  • When the result json is available parse it and use the result to draw rectangles and the age on faces

The description above indicates asynchronous processing, which is possible with the PPL task programming model. We start an operation that returns a task and set a continuation on each task that executes when the previous one completes. Put together, the code looks like this:

void detect_faces(
  std::function<void(web::json::value)> success,
  std::function<void(const char*)> error,
  utility::string_t const & filename,
  utility::string_t const & subscriptionKey,
  bool const analyzesFaceLandmarks,
  bool const analyzesAge,
  bool const analyzesGender,
  bool const analyzesHeadPose)
{
  file_stream<unsigned char>::open_istream(filename)
     .then([=](pplx::task<basic_istream<unsigned char>> previousTask)
     {
        try
        {
           auto fileStream = previousTask.get();

           auto client = http_client{U("https://api.projectoxford.ai/face/v0/detections")};

           auto query = uri_builder()
              .append_query(U("analyzesFaceLandmarks"), analyzesFaceLandmarks ? "true" : "false")
              .append_query(U("analyzesAge"), analyzesAge ? "true" : "false")
              .append_query(U("analyzesGender"), analyzesGender ? "true" : "false")
              .append_query(U("analyzesHeadPose"), analyzesHeadPose ? "true" : "false")
              .append_query(U("subscription-key"), subscriptionKey)
              .to_string();

           client
              .request(methods::POST, query, fileStream)
              .then([fileStream, success](pplx::task<http_response> previousTask)
              {
                 fileStream.close();

                 return previousTask.get().extract_json();
              })
              .then([success, error](pplx::task<json::value> previousTask)
              {
                 try
                 {
                    success(previousTask.get());
                 }
                 catch(http_exception const & e)
                 {
                    error(e.what());
                 }
              });
        }
        catch(std::system_error const & e)
        {
           error(e.what());
        }
     });
}

The detect_faces() function takes several parameters:

  • Two callbacks, one for success, when we pass the json value that we got back, and one in case of error, when we pass a string representing the error message
  • The path on disk of the image to analyze
  • The subscription key
  • Optional arguments that indicate what additional analysis to be performed

The function can be called as shown below:

auto doc = GetDocument();
auto path = doc->GetImagePath();
auto stdpath = path.GetBuffer(path.GetLength());
path.ReleaseBuffer();

auto error = [](const char* error){
  std::wostringstream ss;
  ss << error << std::endl;
  AfxMessageBox(ss.str().c_str()); };

auto werror = [](const wchar_t* error){
  std::wostringstream ss;
  ss << error << std::endl;
  AfxMessageBox(ss.str().c_str()); };

auto success = [this, werror](web::json::value object) {
  m_faces = faceapi::parse_face_result(object, werror);
  this->Invalidate();
};

faceapi::detect_faces(success, error, stdpath, U("your-subscription-key"), false, true, true, true);  

Notice that you have to use the subscription key you got when you signed up for the app service.

If the call is successful, we get a response with a JSON body that looks like the example shown above. If the call fails, we also get a JSON value back, with an error code and message. Such a message may look like this:

{
  "code":"InvalidImageSize",
  "message":"Image size is too small or too big."
}

The parse_face_result() function parses the JSON value we get back, extracting either the result of the analysis or the error message, and returns a collection of face objects.

std::vector<faceapi::face> parse_face_result(
  web::json::value object,
  std::function<void(wchar_t const *)> error)
{
  std::vector<faceapi::face> faces;

  if(!object.is_null())
  {
     if(object.has_field(U("code")))
     {
        auto message = object.at(U("message")).as_string();
        error(message.c_str());
     }
     else 
     {
        auto arr = object.as_array();
        for(auto const & obj : arr)
        {
           try
           {
              auto face = faceapi::face{};
              face.faceId = obj.at(U("faceId")).as_string();

              auto const & fr = obj.at(U("faceRectangle"));
              face.faceRectangle.width = fr.at(U("width")).as_integer();
              face.faceRectangle.height = fr.at(U("height")).as_integer();
              face.faceRectangle.top = fr.at(U("top")).as_integer();
              face.faceRectangle.left = fr.at(U("left")).as_integer();

              auto const & attr = obj.at(U("attributes")).as_object();
              if(!attr.empty())
              {
                 face.attributes.age = attr.at(U("age")).as_integer();
                 face.attributes.gender = (attr.at(U("gender")).as_string() == U("male")) ? faceapi::gender::male : faceapi::gender::female;

                 auto const & hpose = attr.at(U("headPose")).as_object();
                 if(!hpose.empty())
                 {
                    face.attributes.headPose.pitch = hpose.at(U("pitch")).as_double();
                    face.attributes.headPose.roll = hpose.at(U("roll")).as_double();
                    face.attributes.headPose.yaw = hpose.at(U("yaw")).as_double();
                 }
              }

              faces.push_back(face);
           }
           catch(std::exception const &)
           {
           }
        }
     }
  }

  return faces;
}

The face type and other types are defined as follows:

namespace faceapi
{
   struct face_rectangle
   {
      int width = 0;
      int height = 0;
      int left = 0;
      int top = 0;
   };

   struct face_landmark
   {
      double x = 0;
      double y = 0;
   };

   struct face_landmarks
   {
      face_landmark pupilLeft;
      face_landmark pupilRight;
      face_landmark noseTip;
      face_landmark mouthLeft;
      face_landmark mouthRight;
      face_landmark eyebrowLeftOuter;
      face_landmark eyebrowLeftInner;
      face_landmark eyeLeftOuter;
      face_landmark eyeLeftTop;
      face_landmark eyeLeftBottom;
      face_landmark eyeLeftInner;
      face_landmark eyebrowRightInner;
      face_landmark eyebrowRightOuter;
      face_landmark eyeRightInner;
      face_landmark eyeRightTop;
      face_landmark eyeRightBottom;
      face_landmark eyeRightOuter;
      face_landmark noseRootLeft;
      face_landmark noseRootRight;
      face_landmark noseLeftAlarTop;
      face_landmark noseRightAlarTop;
      face_landmark noseLeftAlarOutTip;
      face_landmark noseRightAlarOutTip;
      face_landmark upperLipTop;
      face_landmark upperLipBottom;
      face_landmark underLipTop;
      face_landmark underLipBottom;
   };

   struct head_pose 
   {
      double roll = 0;
      double yaw = 0;
      double pitch = 0;
   };

   enum class gender 
   {
      female,
      male
   };

   struct face_attributes 
   {
      int age = 0;
      gender gender = gender::female;
      head_pose headPose;
   };

   struct face
   {
      std::wstring faceId;
      face_rectangle faceRectangle;
      face_attributes attributes;
   };

   void detect_faces(
      std::function<void(web::json::value)> success, 
      std::function<void(char const *)> error,
      utility::string_t const & filename, 
      utility::string_t const & subscriptionKey,
      bool const analyzesFaceLandmarks = false,
      bool const analyzesAge = false,
      bool const analyzesGender = false,
      bool const analyzesHeadPose = false);

   std::vector<face> parse_face_result(
      web::json::value object,
      std::function<void(wchar_t const *)> error);
}

With the face information available, we can draw the rectangle and the age text on the image. This is done in the OnDraw() method of the view; if you want to see how it's done, look at the attached source code, as it is not that important for the purpose of this article.

The following image shows how the detection works on an image with a group of people (image source: Wikipedia):

Reworking the code

The detect_faces() function takes some functions as parameters that it calls back on success or failure. This can be reworked so that it actually returns a task, on which we then set a continuation to do something when the result is available. Also, exception handling can be moved out of this task to the last continuation, as any exception escaping from a task's body is caught and re-thrown from a wait() or get() call on the last task. The detect_faces() function can thus be re-implemented like this:
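This exception-propagation behavior can be illustrated without the SDK: pplx tasks store an exception thrown in the task body and re-throw it when the result is consumed, the same way std::future does. A small self-contained analogy using std::packaged_task (names chosen here for illustration):

```cpp
#include <future>
#include <stdexcept>
#include <string>

// Demonstrates deferred exception propagation: the exception thrown in
// the task body is captured in the future and only re-thrown when
// get() is called, which is where a single try/catch can handle the
// whole chain - just like the last continuation in detect_faces_async().
std::string run_and_report()
{
   std::packaged_task<std::string()> work([]() -> std::string {
      throw std::runtime_error("detection failed"); // thrown inside the body
   });

   auto result = work.get_future();
   work(); // run the body; the exception is captured, not propagated yet

   try
   {
      return result.get(); // re-thrown here, at the point of consumption
   }
   catch (const std::exception& e)
   {
      return std::string("error: ") + e.what();
   }
}
```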

pplx::task<web::json::value> detect_faces_async(
  utility::string_t const & filename,
  utility::string_t const & subscriptionKey,
  bool const analyzesFaceLandmarks,
  bool const analyzesAge,
  bool const analyzesGender,
  bool const analyzesHeadPose,
  pplx::cancellation_token const & token)
{
  return file_stream<unsigned char>::open_istream(filename)
     .then([=](pplx::task<basic_istream<unsigned char>> previousTask)
     {
        if(!token.is_canceled())
        {
           auto fileStream = previousTask.get();

           auto client = http_client{U("https://api.projectoxford.ai/face/v0/detections")};

           auto query = uri_builder()
              .append_query(U("analyzesFaceLandmarks"), analyzesFaceLandmarks ? "true" : "false")
              .append_query(U("analyzesAge"), analyzesAge ? "true" : "false")
              .append_query(U("analyzesGender"), analyzesGender ? "true" : "false")
              .append_query(U("analyzesHeadPose"), analyzesHeadPose ? "true" : "false")
              .append_query(U("subscription-key"), subscriptionKey)
              .to_string();

           return client
              .request(methods::POST, query, fileStream, token)
              .then([fileStream](pplx::task<http_response> previousTask)
           {
              fileStream.close();

              return previousTask.get().extract_json();
           });
        }
        
        return pplx::task_from_result(json::value());
     });
}

In this case we also have to rework the calling code to the following:

auto doc = GetDocument();
auto path = doc->GetImagePath();
auto stdpath = path.GetBuffer(path.GetLength());
path.ReleaseBuffer();

auto error = [](const char* error){
  std::wostringstream ss;
  ss << error << std::endl;
  AfxMessageBox(ss.str().c_str()); };

auto werror = [](const wchar_t* error){
  std::wostringstream ss;
  ss << error << std::endl;
  AfxMessageBox(ss.str().c_str()); };

auto success = [this, werror](web::json::value object) {
  m_faces = faceapi::parse_face_result(object, werror);
  this->Invalidate();
};

faceapi::detect_faces_async(stdpath, U("your-subscription-key"), false, true, true, true)
  .then([this, werror, error](pplx::task<web::json::value> previousTask) {
     try 
     {
        m_faces = faceapi::parse_face_result(previousTask.get(), werror);
        this->Invalidate();
     }
     catch(std::exception const & e)
     {
        error(e.what());
     }
  });

A similar implementation can be put in place for detecting faces in an image specified by a URL. In this case we no longer have to load a file from disk; instead, we pass a JSON value in the body of the request.

pplx::task<web::json::value> detect_faces_from_url_async(
  utility::string_t const & url,
  utility::string_t const & subscriptionKey,
  bool const analyzesFaceLandmarks,
  bool const analyzesAge,
  bool const analyzesGender,
  bool const analyzesHeadPose,
  pplx::cancellation_token const & token)
{
  auto client = http_client{U("https://api.projectoxford.ai/face/v0/detections")};

  auto query = uri_builder()
     .append_query(U("analyzesFaceLandmarks"), analyzesFaceLandmarks ? "true" : "false")
     .append_query(U("analyzesAge"), analyzesAge ? "true" : "false")
     .append_query(U("analyzesGender"), analyzesGender ? "true" : "false")
     .append_query(U("analyzesHeadPose"), analyzesHeadPose ? "true" : "false")
     .append_query(U("subscription-key"), subscriptionKey)
     .to_string();

  auto content = web::json::value {};
  content[U("url")] = web::json::value(url);

  return client
     .request(methods::POST, query, content, token)
     .then([](pplx::task<http_response> previousTask)
  {
     return previousTask.get().extract_json();
  });
}

Conclusions

Project Oxford provides a series of machine-learning APIs that are available for free (though in beta for now) and can be easily integrated into your applications. In this article I have shown what you have to do to start using the APIs and how you can consume some of the face APIs in a C++ application (with MFC) by using the C++ REST SDK.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Marius Bancila
Architect Visma Software
Romania
Marius Bancila is the author of Modern C++ Programming Cookbook and The Modern C++ Challenge. He used to be a Microsoft MVP for VC++ and later Visual Studio and Development Technologies for 11 years. He works as a system architect for Visma, a Norwegian-based company. He is mainly focused on building desktop applications with VC++ and VC#. He keeps a blog at http://www.mariusbancila.ro/blog, focused on Windows programming. He is the co-founder of codexpert.ro, a community for Romanian C++ programmers. You can follow Marius on Twitter at @mariusbancila.


Article Copyright 2015 by Marius Bancila