Tracking Objects with Pixy Visual Sensor on Raspberry Pi using Windows 10 IoT Core

Victor Dashevsky

Apr 28, 2018

CPOL

Processing the positioning information of visual objects detected by Pixy camera and received on Raspberry Pi via I2C, and using common design patterns in a C# program parsing robotics sensor data

Download source - 74 KB

Introduction

The release of Microsoft Windows 10 IoT Core in 2015 created new opportunities for C# developers to explore the world of robotics using Visual Studio and one of the most popular single-board computers - Raspberry Pi. In this article, we will go over my C# code that integrates the RPi with Pixy - a vision sensor geared for object tracking, designed by Charmed Labs.

Based on Pixy's simplified data retrieval protocol, I'll show you how to receive and parse the information about visual object size and position on the RPi over the I2C bus using a Windows 10 IoT Core application. In addition to implementing the technical side of the solution, I'll share my approach to architecting the codebase. We will break the project into distinct layers, leverage the power of LINQ to Objects for processing data received from Pixy and put common design patterns to work for you. I am hoping that both robotics enthusiasts learning .NET and seasoned C# developers new to robotics will find something of interest here.

Background

When we first dive into robotics, many of us have to pick our very first sensor to play with. Although I did not start with Pixy, it makes a reasonable first choice for moderately experienced programmers. It crunches lots of visual information and delivers object positioning data to you in a compact format 50 times per second. Being able to track a visual object in your program with an investment of only $69 is pretty cool!

I completed this project over a year ago on a Raspberry Pi 2 with Visual Studio 2015, but these days you can use an RPi 3 Model B+ and VS 2017. At the time of this writing, the Pixy CMUcam5 that I used remains the latest version of the device.

For robotics enthusiasts new to Windows 10 IoT Core and C#, I'd like to add that the freely available development framework provided by Microsoft lets you master the same technology that numerous professional programmers use to build enterprise software and commercial web sites. Using Visual Studio and applying Object Oriented Programming principles, you can build a large, well organized system positioned for growth. Standard design patterns, NuGet packages, code libraries and ready-to-use solutions are available to us, allowing us to extend an experimental app way beyond its original scope. If you consider separation of concerns, segregation of logic within layers and loose coupling between them early in your design, you will be enjoying your growing project for years to come. This remains true whether you build robotics apps professionally or as a hobby.

Using Pixy Visual Sensor

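Pixy delivers the coordinates of several visual objects with preset color signatures straight to the RPi in the format explained in the Pixy Serial Protocol documentation (http://cmucam.org/projects/cmucam5/wiki/Pixy_Serial_Protocol).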

There are several ways to use this information. You can find code samples that display object boxes on the screen and samples that make Pixy follow your object using two servo motors. I built the latter but servo control goes beyond the scope of this article.

The source code included with this article is intended for feeding the coordinates and size of an object to an autonomous robot. Based on the preset object size, the distance to the object captured by Pixy and the object angle translated from its coordinates, the RPi could then send signals to the motor drive to approach the object. Therefore, we will only be tracking a single object, but you can alter this logic to your needs.

Here is a picture of Pixy attached to Pan and Tilt mechanism mounted on my robot:

Pixy can store up to 7 different color signatures, in effect enabling tracking of 7 different objects with unique colors. Since in this application we are only interested in a single object, and because significant changes in lighting have an adverse effect on Pixy's color filtering algorithm, I am using all 7 signatures for training Pixy on the same object under 7 different light conditions.

Prerequisites

You will need the following:

Pixy camera - you can buy it on Amazon, RobotShop or SparkFun for $69
Raspberry Pi 2 or 3 with a power supply and connection wires
Visual Studio 2015 or 2017

The attached source code is wrapped into a ready-to-build Visual Studio solution, however, it is not expected to be your first Windows 10 IoT Core project. Those willing to experiment would already have a working project containing Universal Windows Platform (UWP) application built for ARM processor architecture and verified to work on Raspberry Pi. Note that my code assumes a headed application (see comments on the Timer type).

If you haven't played with RPi and Windows IoT Core yet, then "Hello, Blinky" is a popular first time project. If you don't find it at this link, look it up at https://developer.microsoft.com/en-us/windows/iot/samples.

There are many other examples guiding developers through creating their very first Windows 10 IoT Core application for RPi. For example, check out the following article - Building Your First App for Windows 10 IoT using C#.

Connecting Pixy to Raspberry Pi

I strongly recommend using a ribbon cable for connecting your Pixy as opposed to breadboard jumper wires. It is far more secure when it comes to placing your Pixy on Pan and Tilt mechanism. Uxcell IDC Socket 10 Pins Flat Ribbon Cable works quite well for me.

In my robot, I soldered a header to a prototype board where I created an I2C hub along with 5V and ground for all my I2C devices. SDA and SCL are connected to the RPi's GPIO 2 and 3 via jumper wires soldered to the prototype board. Power is supplied by a separate NiMH battery with a 5V voltage regulator, although for playing with just Pixy, you can simply use the RPi's own power.

Pixy I2C connection pinouts:

1	2 Power
3	4
5 SCL	6 Ground
7	8
9 SDA	10

Layered Design

The implementation is split into 3 layers:

Data Access Layer - receives raw data from data source. This layer hosts PixyDataReaderI2C class.
Repository - translates data received from the source into object block entity data model. This is accomplished via PixyObjectFinder.
App Logic - finds biggest objects of interest and determines the target object using CameraTargetFinder.

In a large system involving many different sensors, I’d segregate these layers into separate projects or, at least, into separate project folders.

Control Flow

The Data Access Layer does the hard work of handling every timer event looking for any matching objects. The App Logic on the other hand is only interested in successful matches when such occur.

Information flows from the bottom layer to the top via 2 callbacks implemented via generic delegates. The following flow sums it up but you'd want to revisit this section once you review the details of each layer. Note that higher level objects do not directly create objects at a lower level but use interfaces instead.

Camera Target Finder passes m_dlgtFindTarget delegate to Pixy Object Finder via fFindTarget
Pixy Object Finder passes m_dlgtExtractBlocks delegate to Pixy Data Reader via fParseBlocks
Camera Target Finder creates instances of the reader and object finder, then starts Pixy Object Finder
Pixy Object Finder calls Pixy Data Reader to create a timer and starts listening to the device
When data has been read, Pixy Data Reader invokes m_dlgtExtractBlocks in Pixy Object Finder (via fParseBlocks) to translate data into color signature objects
m_dlgtExtractBlocks invokes m_dlgtFindTarget in Camera Target Finder (via fFindTarget) to extract the biggest objects of each color signature and determine the target coordinates.

When used in combination with interfaces, this type of flow decouples our classes from their dependencies so that the dependencies can be replaced or updated with minimal or no changes to our classes' source code. More on this below.
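The callback chain can be sketched in isolation as follows. This is a minimal illustration, not the article's classes: SketchReader, SketchFinder and ControlFlowSketch are hypothetical stand-ins, the "translation" simply turns bytes into integers, and Deliver plays the role of a single timer tick.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the data access layer.
public class SketchReader
{
    private Func<byte[], int> m_fParseBlocks;

    // The lower layer receives the parser as a delegate instead of referencing the finder.
    public void Init(Func<byte[], int> fParseBlocks) { m_fParseBlocks = fParseBlocks; }

    // Stands in for one timer tick that delivers a raw buffer.
    public int Deliver(byte[] raw) { return m_fParseBlocks(raw); }
}

// Hypothetical stand-in for the repository layer.
public class SketchFinder
{
    private readonly SketchReader m_reader;
    private readonly Func<List<int>, bool> m_fFindTarget;

    public SketchFinder(SketchReader reader, Func<List<int>, bool> fFindTarget)
    {
        m_reader = reader;
        m_fFindTarget = fFindTarget;
    }

    public void Start()
    {
        // "Translate" the bytes, then hand the result to the top layer.
        m_reader.Init(raw =>
        {
            List<int> values = raw.Select(b => (int)b).ToList();
            m_fFindTarget(values);
            return values.Count;
        });
    }
}

// Hypothetical stand-in for the app logic layer.
public static class ControlFlowSketch
{
    // Returns the largest value seen by the top layer, or -1 if nothing arrived.
    public static int Run(byte[] raw)
    {
        int target = -1;
        var reader = new SketchReader();
        var finder = new SketchFinder(reader, values =>
        {
            if (values.Count == 0) return false;
            target = values.Max();
            return true;
        });
        finder.Start();
        reader.Deliver(raw);
        return target;
    }
}
```

Neither lower-level class knows who consumes its output, so either one can be swapped without touching the other.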

Data Model and Interfaces

An object block carries the x/y coordinates of the object's center along with its width and height. In addition, we want to keep track of the time when it was detected:

    public class CameraObjectBlock
    {
        public int signature = -1;
        public int x = -1;
        public int y = -1;
        public int width = -1;
        public int height = -1;
        public DateTime dt;
    }

The Camera Data Reader interface defines the contract that decouples the higher level from the Reader implementation. While we have no intention of using other readers here, this leaves room for expansion: if we ever decide to use another Reader, the higher level logic will not have to change, because that other Reader would still conform to the established interface.

Next, we define an interface for Pixy Object Finder. It's a good idea to keep all interfaces together separate from their implementation. That way, you can have a distinct domain consisting of data model and operations in effect showing what functions the application performs and what type of data it is dealing with:

    public interface ICameraDataReader
    {
        void Init(Func<byte[], int> fParseBlocks);
        Task Start();
        void Stop();
        void Listen();
        int GetBlockSize();    // bytes
        int GetMaxXPosition(); // pixels
        int GetMaxYPosition(); // pixels
    }

    public interface ICameraObjectFinder
    {
        void Start();
        void Stop();
        List<CameraObjectBlock> GetVisualObjects();
    }

    public abstract class CameraDataReader
    {
         protected CameraDataReader(ILogger lf)
         {}
    }

    public abstract class CameraObjectFinder
    {
        protected CameraObjectFinder(ICameraDataReader iCameraReader,
                       Func<List<CameraObjectBlock>, bool> fFindTarget,
                       ILogger lf)
        { }
    }

    public interface ILogger
    {
        void LogError(string s);
    }

Two abstract classes have been created to enforce particular constructor parameters.
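To illustrate the room for expansion that the interface leaves, here is a minimal sketch of an alternative reader that replays a recorded buffer instead of talking to hardware, which can be handy for testing the upper layers on a desktop. ReplayDataReader is a hypothetical class that is not part of the attached source; the interface is repeated so that the sketch compiles on its own, the abstract base class and logger are omitted for brevity, and the 320x200 frame size is an assumption based on the value ranges Pixy reports.

```csharp
using System;
using System.Threading.Tasks;

// Interface repeated from the article so that the sketch compiles on its own.
public interface ICameraDataReader
{
    void Init(Func<byte[], int> fParseBlocks);
    Task Start();
    void Stop();
    void Listen();
    int GetBlockSize();    // bytes
    int GetMaxXPosition(); // pixels
    int GetMaxYPosition(); // pixels
}

// Hypothetical reader that replays a canned buffer instead of reading a device.
public class ReplayDataReader : ICameraDataReader
{
    private readonly byte[] m_recorded;
    private Func<byte[], int> m_fParseBlocks;
    private bool m_started;

    public ReplayDataReader(byte[] recorded) { m_recorded = recorded; }

    public void Init(Func<byte[], int> fParseBlocks) { m_fParseBlocks = fParseBlocks; }

    // Nothing to open here, so the task completes immediately.
    public Task Start() { m_started = true; return Task.CompletedTask; }

    public void Stop() { m_started = false; }

    // Delivers the recorded buffer once; a real reader would do this on every timer tick.
    public void Listen()
    {
        if (m_started && m_fParseBlocks != null)
            m_fParseBlocks((byte[])m_recorded.Clone());
    }

    public int GetBlockSize() { return 14; }     // bytes per object block
    public int GetMaxXPosition() { return 320; } // assumed frame width in pixels
    public int GetMaxYPosition() { return 200; } // assumed frame height in pixels
}
```

Because the higher layers only see ICameraDataReader, such a reader could be dropped in without changing them.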

Data Access Layer

Pixy processes an image frame every 1/50th of a second. This means that you get a full update of all detected objects' positions every 20 ms (PIXY_INTERVAL_MS = 20). See http://cmucam.org/projects/cmucam5 for more information.

The PixyDataReaderI2C class implements the ICameraDataReader interface:

    public class PixyDataReaderI2C : CameraDataReader, ICameraDataReader
    {
        // We are creating DispatcherTimer so that you could add some UI should you choose to do so.
        // If you are creating a headless background application then use ThreadPoolTimer.
        private DispatcherTimer m_timerRead = null;
        private Windows.Devices.I2c.I2cDevice m_deviceI2cPixy = null;
        private Func<byte[], int> m_fParseBlocks = null;

        private const int PIXY_INTERVAL_MS = 20; // PIXY runs every 20 milliseconds
        private const int BLOCK_SIZE_BYTES = 14;

        private int m_maxNumberOfExpectedObjects = 50;
        public int MaxNumberOfExpectedObjects
        {
            get { return m_maxNumberOfExpectedObjects; }
            set { m_maxNumberOfExpectedObjects = value; }
        }
        public int m_sizeLeadingZeroesBuffer = 100; // PIXY data buffer may contain leading zeroes
        // Lens field-of-view: 75 degrees horizontal, 47 degrees vertical
        // The numbers for x, y, width and height are in pixels on Pixy's camera.
        // The values range from 0 to 319 for x and 0 to 199 for y.
        public int GetMaxXPosition() { return 320; }
        public int GetMaxYPosition() { return 200; }

        private ILogger m_lf = null;

        public PixyDataReaderI2C(ILogger lf) : base(lf)
        {
            m_lf = lf;
        }

        public void Init(Func<byte[], int> fParseBlocks)
        {
            // This method is required because the reader is created by the factory - at the higher
            // level where the Parse Blocks method is unavailable.
            m_fParseBlocks = fParseBlocks;
        }

The data reader takes the generic delegate fParseBlocks so that it can invoke the higher level translation method without requiring any change to the lower level logic, should the translator ever change.

Since my RPi is communicating with Pixy via I2C, we are first retrieving a device selector from the OS and then using it to enumerate I2C controllers. Finally, using the device settings object, we obtain a handle to our device:

        public async Task Start()
        {
            try
            {
                string deviceSelector = Windows.Devices.I2c.I2cDevice.GetDeviceSelector();

                // Get all I2C bus controller devices 
                var devicesI2C = await DeviceInformation.FindAllAsync(deviceSelector).AsTask();
                if (devicesI2C == null || devicesI2C.Count == 0)
                    return;

                // Create settings for the device address configured via PixyMon.
                var settingsPixy = new Windows.Devices.I2c.I2cConnectionSettings(0x54);
                settingsPixy.BusSpeed = Windows.Devices.I2c.I2cBusSpeed.FastMode;

                // Create PIXY I2C Device
                m_deviceI2cPixy = await Windows.Devices.I2c.I2cDevice
                    .FromIdAsync(devicesI2C.First().Id, settingsPixy);
            }
            catch (Exception ex)
            {
                m_lf.LogError(ex.Message);
            }
        }

Next, we set up a timer and a handler that reads raw data from Pixy into dataArray and calls m_fParseBlocks to translate it:

        public void Listen()
        {
            if (m_timerRead != null)
                m_timerRead.Stop();

            m_timerRead = new DispatcherTimer();
            m_timerRead.Interval = TimeSpan.FromMilliseconds(PIXY_INTERVAL_MS);
            m_timerRead.Tick += TimerRead_Tick;
            m_timerRead.Start();
        }

        private void TimerRead_Tick(object sender, object e)
        {
            try
            {
                if (m_deviceI2cPixy == null)
                    return;

                byte[] dataArray = new byte[MaxNumberOfExpectedObjects * BLOCK_SIZE_BYTES
                                            + m_sizeLeadingZeroesBuffer];
                m_deviceI2cPixy.Read(dataArray);
                m_fParseBlocks(dataArray);
            }
            catch (Exception ex)
            {
                m_lf.LogError(ex.Message);
            }
        }

Note that instead of timers, we could utilize async/await - asynchronous design pattern - to build an alternative Reader. Such Reader could have been injected into the flow via Class Factory as explained in the App Logic Layer section.
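A minimal sketch of what such an async/await polling loop could look like is shown here. It is not part of the attached source: AsyncPollingSketch and PollAsync are hypothetical names, readDevice stands in for the I2C read call, and parseBlocks plays the role of m_fParseBlocks.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncPollingSketch
{
    // Polls the device until the token is cancelled and returns the number of reads performed.
    // readDevice is a hypothetical stand-in for reading the I2C buffer.
    public static async Task<int> PollAsync(Func<byte[]> readDevice,
                                            Func<byte[], int> parseBlocks,
                                            int intervalMs,
                                            CancellationToken token)
    {
        int reads = 0;
        while (!token.IsCancellationRequested)
        {
            parseBlocks(readDevice());
            reads++;
            try
            {
                // Wait without blocking a thread; replaces the timer interval.
                await Task.Delay(intervalMs, token);
            }
            catch (OperationCanceledException)
            {
                break; // cancellation arrived while waiting
            }
        }
        return reads;
    }
}
```

A caller would keep a CancellationTokenSource and call Cancel() from its Stop method, which ends the loop at the next iteration or during the delay.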

My code assumes headed application but if you are going to run it within a headless app, then change the timer type from DispatcherTimer to ThreadPoolTimer. Please see the corresponding note in the source code.

Repository Layer

Generally speaking, we use a Repository to separate data retrieval logic from the business or application logic through translating source data into the entity model - data structure utilized by the business logic. This additional encapsulation layer is known as the Repository Pattern. In our use case, the translator processes raw data from the data source to extract visual objects of interest. This is accomplished in PixyObjectFinder that converts Pixy byte stream into objects with x/y/w/h properties:

    public class PixyObjectFinder : CameraObjectFinder, ICameraObjectFinder
    {
        const UInt16 PixySyncWord = 0xaa55;
        const int BlockRetentionSeconds = 3;

        private ICameraDataReader m_pixy = null;
        private ILogger m_lf = null;
        public Object m_lockPixy = new Object();
        private Func<List<CameraObjectBlock>, bool> m_fFindTarget;
        private Func<byte[], int> m_dlgtExtractBlocks;

        private List<CameraObjectBlock> m_pixyObjectBlocks = new List<CameraObjectBlock>();
        public List<CameraObjectBlock> GetVisualObjects() { return m_pixyObjectBlocks;  }

Pixy Object Finder

PixyObjectFinder translates the buffer from the Pixy object blocks format into our entity model of detected objects so that the App Logic would only deal with its own format and remain agnostic to an underlying source.

PixyObjectFinder uses the Start method for initializing Pixy and launching its timer within the data access layer.

        public void Start()
        {
            m_pixy.Init(m_dlgtExtractBlocks);
            // Initialize Pixy I2C device.
            Task.Run(async () => await m_pixy.Start());
            // Launch Pixy listener
            m_pixy.Listen();
        }

The translation is essentially implemented in m_dlgtExtractBlocks that is being passed to Pixy data reader as a parameter via m_pixy.Init(m_dlgtExtractBlocks).

        public PixyObjectFinder(ICameraDataReader ipixy,
                                Func<List<CameraObjectBlock>, bool> fFindTarget,
                                ILogger lf) : base(ipixy, fFindTarget, lf)
        {
            m_pixy = ipixy;
            m_fFindTarget = fFindTarget;
            m_lf = lf;

            m_dlgtExtractBlocks = delegate (byte[] byteBuffer)
            {
                lock (m_lockPixy)
                {
                    if (byteBuffer == null || byteBuffer.Length == 0)
                        return 0;

                    try
                    {
                        // Convert bytes to words
                        int blockSize = ipixy.GetBlockSize();
                        int lengthWords = 0;
                        int[] wordBuffer = ConvertByteArrayToWords(byteBuffer, ref lengthWords);
                        if (wordBuffer == null)
                            return 0;

                        // Bytes    16-bit word    Description
                        // 0, 1     0              sync (0xaa55)
                        // 2, 3     1              checksum (sum of all 16-bit words 2-6)
                        // 4, 5     2              signature number
                        // 6, 7     3              x center of object
                        // 8, 9     4              y center of object
                        // 10, 11   5              width of object
                        // 12, 13   6              height of object

                        // Find the beginning of each block
                        List<int> blockStartingMarkers = Enumerable.Range(0, wordBuffer.Length)
                            .Where(i => wordBuffer[i] == PixySyncWord)
                            .ToList<int>();

                        // Drop blocks that are more than BlockRetentionSeconds old
                        m_pixyObjectBlocks = m_pixyObjectBlocks
                            .SkipWhile(p => (DateTime.Now - p.dt).TotalSeconds > BlockRetentionSeconds)
                            .ToList();

                        // Extract object blocks from the stream
                        blockStartingMarkers.ForEach(blockStart =>
                        {
                            if (blockStart < lengthWords - blockSize / 2)
                                m_pixyObjectBlocks.Add(new CameraObjectBlock()
                                {
                                    signature = wordBuffer[blockStart + 2],
                                    x = wordBuffer[blockStart + 3],
                                    y = wordBuffer[blockStart + 4],
                                    width = wordBuffer[blockStart + 5],
                                    height = wordBuffer[blockStart + 6],
                                    dt = DateTime.Now
                                });
                        });

                        m_fFindTarget(m_pixyObjectBlocks);
                        // Reset the blocks buffer
                        m_pixyObjectBlocks.Clear();
                    }
                    catch (Exception e)
                    {
                        m_lf.LogError(e.Message);
                    }
                }
                return m_pixyObjectBlocks.Count;
            };
        }

PixyObjectFinder takes the fFindTarget generic delegate to invoke the higher level processor, which converts detected objects to target coordinates.

The m_pixyObjectBlocks list contains the detected objects. The conversion follows the Pixy stream format specified in the above code snippet.

For more details on Pixy data format, see Pixy Serial Protocol. Note that Serial and I2C deliver Pixy data in the same stream format.

In addition, I am accumulating blocks in the list beyond a single read operation to smooth target detection over a longer time period, i.e., longer than 20 ms. This is done by SkipWhile dropping objects older than BlockRetentionSeconds.
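The retention idea can be sketched on its own as shown here. This is an illustration under stated assumptions rather than the attached implementation: TimedBlock and RetentionSketch are hypothetical names, the current time is passed in so that the logic can be tested, and List.RemoveAll is used in place of SkipWhile because it does not depend on the list being ordered from oldest to newest.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal block type; only the timestamp matters for this sketch.
public class TimedBlock
{
    public int Signature;
    public DateTime Detected;
}

public static class RetentionSketch
{
    // Removes every block older than retentionSeconds relative to 'now'
    // and returns the number of blocks removed.
    public static int Prune(List<TimedBlock> blocks, DateTime now, double retentionSeconds)
    {
        // TotalSeconds is the full elapsed time; the Seconds property is only the 0-59 component.
        return blocks.RemoveAll(b => (now - b.Detected).TotalSeconds > retentionSeconds);
    }
}
```

Calling Prune at the start of each read keeps a rolling window of recent detections for the target finder to work with.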

Parsing the Data Stream

The above finder method must first convert the input stream of bytes into 16-bit words and place them in an array of integers. It is that array where we are going to find x, y, width and height of the detected objects.

ConvertByteArrayToWords - PixyObjectFinder private method - converts the byte stream received from Pixy I2C device into 16-bit words:

        private int[] ConvertByteArrayToWords(byte[] byteBuffer, ref int lengthWords)
        {
            // http://cmucam.org/projects/cmucam5/wiki/Pixy_Serial_Protocol
            // All values in the object block are 16-bit words, sent least-significant byte 
            // first (little endian). So, for example, to send the sync word 0xaa55, you 
            // need to send 0x55 (first byte) then 0xaa (second byte).
            try
            {
                // Skip leading zeroes
                byteBuffer = byteBuffer.SkipWhile(s => s == 0).ToArray();
                if (byteBuffer.Length == 0)
                    return new int[0];

                // Convert bytes to words
                int length = byteBuffer.Length;
                lengthWords = length / 2 + 1;
                int[] wordBuffer = new int[lengthWords];
                int ndxWord = 0;
                for (int i = 0; i < length - 1; i += 2)
                {
                    if (byteBuffer[i] == 0 && byteBuffer[i + 1] == 0)
                        continue;
		            
                    // convert from little endian
                    int word = ((int)(byteBuffer[i + 1])) << 8 | ((int)byteBuffer[i]);

                    if (word == PixySyncWord && ndxWord > 0 && PixySyncWord == wordBuffer[ndxWord - 1])
                        wordBuffer[ndxWord - 1] = 0; // suppress Pixy sync word marker duplicates

                    wordBuffer[ndxWord++] = word;
                }
                if (ndxWord == 0)
                    return null;

                return wordBuffer;
            }
            catch (Exception e)
            {
                m_lf.LogError(e.Message);
                return null;
            }
        }

As you can see, I had to tweak the parser to skip potential leading zeroes and duplicate sync words. If you are using RPi 3 and/or a newer UWP Tools SDK, you may not have to deal with these issues.

Here is an example of a single object block byte sequence received from Pixy into the byte buffer:

00-00-00-00-55-AA-55-AA-BB-01-01-00-3D-01-73-00-04-00-06-00-00-00-

You should be able to review the byte buffer in your debugger via BitConverter.ToString(byteBuffer).
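As a worked example, the sample sequence above can be decoded by hand with a few lines of code. This is a self-contained sketch, separate from the attached parser: SampleBlockDecoder and its methods are hypothetical names, and it assumes the dump format produced by BitConverter.ToString.

```csharp
using System;
using System.Linq;

public static class SampleBlockDecoder
{
    // Parses a BitConverter.ToString-style dump ("55-AA-...") back into bytes.
    public static byte[] FromHex(string dump)
    {
        return dump.Split(new[] { '-' }, StringSplitOptions.RemoveEmptyEntries)
                   .Select(h => Convert.ToByte(h, 16))
                   .ToArray();
    }

    // Reads the little-endian 16-bit word that starts at byte offset i.
    public static int Word(byte[] b, int i)
    {
        return (b[i + 1] << 8) | b[i];
    }

    // Returns { checksum, signature, x, y, width, height } of the first block,
    // or null when no sync word followed by a complete block is found.
    public static int[] DecodeFirstBlock(byte[] b)
    {
        const int Sync = 0xaa55;
        int i = 0;
        while (i < b.Length && b[i] == 0) i++;      // skip leading zeroes
        for (; i + 13 < b.Length; i += 2)
        {
            if (Word(b, i) != Sync) continue;
            if (Word(b, i + 2) == Sync) continue;   // repeated sync word: the block starts at the later one
            return new[]
            {
                Word(b, i + 2), Word(b, i + 4), Word(b, i + 6),
                Word(b, i + 8), Word(b, i + 10), Word(b, i + 12)
            };
        }
        return null;
    }

    // The checksum is the 16-bit sum of signature, x, y, width and height.
    public static bool ChecksumOk(int[] block)
    {
        int sum = block[1] + block[2] + block[3] + block[4] + block[5];
        return (sum & 0xFFFF) == block[0];
    }
}
```

For the sequence shown above, DecodeFirstBlock yields checksum 443 (0x01BB), signature 1, x 317, y 115, width 4 and height 6, and the checksum matches since 1 + 317 + 115 + 4 + 6 = 443.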

App Logic Layer

The Target Finder determines the target based on the selected objects provided by the Repository layer. It is here in this layer that we apply a creational design pattern called Factory to create and retain instances of lower level objects.

Class Factory

This pattern helps decouple our classes from being responsible for locating and managing the lifetime of dependencies. Note how our class factory only exposes interfaces while calling constructors internally. Both the Data Reader and the Object Finder are created and stored here. We instantiate them using constructor dependency injection which gives us flexibility of dropping in other implementations of readers and finders by creating them in the Class Factory.

By using a Factory, we apply the principle of Inversion of Control which replaces direct dependencies between objects with dependencies on abstraction, i.e., interfaces. While this concept goes way beyond my example, quite often, a simple class factory is all you need.

The Create function takes the method for calculating the target, which is passed in via the delegate Func<List<CameraObjectBlock>, bool> fFindTarget:

    public class MyDeviceClassFactory
    {
        private ICameraDataReader m_cameraDataReaderI2C = null;
        private ICameraObjectFinder m_cameraObjectFinder = null;

        private ILogger m_lf = new LoggingFacility();
        public ILogger LF { get { return m_lf; } }

        public void Create(Func<List<CameraObjectBlock>, bool> fFindTarget)
        {
            if (m_cameraObjectFinder != null)
                return;

            m_cameraDataReaderI2C = new PixyDataReaderI2C(m_lf);
            m_cameraObjectFinder = new PixyObjectFinder(m_cameraDataReaderI2C, fFindTarget, m_lf);
        }

        public ICameraDataReader CameraDataReader { get { return m_cameraDataReaderI2C; } }
        public ICameraObjectFinder CameraObjectFinder { get { return m_cameraObjectFinder; } }
    }

Target Finder

At the top of this project is CameraTargetFinder class that filters the pre-selected objects looking for a single object - the target. It ignores objects with an area smaller than minAcceptableAreaPixels, orders the remaining objects by size and takes one from the top. It can potentially apply other filters. Finally, it calls SetTargetPosition with the target position and size in pixels.

    public class CameraTargetFinder
    {
        private const int minAcceptableAreaPixels = 400;

        private MyDeviceClassFactory cf = new MyDeviceClassFactory();
        private Func<List<CameraObjectBlock>, bool> m_dlgtFindTarget;
        private Action<int, int, int, int> m_fSetTarget;

        public CameraTargetFinder(Action<int, int, int, int> fSetTarget)
        {
            m_dlgtFindTarget = delegate (List<CameraObjectBlock> objectsInView)
            {
                try
                {
                    if (objectsInView.Count == 0)
                        return false;

                    objectsInView = GetBiggestObjects(objectsInView);

                    // Select the biggest signature. We are only interested in a single object 
                    // because all signatures represent same object under different light conditions.
                    CameraObjectBlock biggestMatch = (from o in objectsInView
                                                      where o.width * o.height > minAcceptableAreaPixels
                                                      select o)
                                                    .OrderByDescending(s => s.width * s.height)
                                                    .FirstOrDefault();

                    if (biggestMatch == null || biggestMatch.signature < 0)
                        return false;

                    m_fSetTarget(biggestMatch.x, cf.CameraDataReader.GetMaxYPosition() - biggestMatch.y,
                                 biggestMatch.width, biggestMatch.height);
                    return true;
                }
                catch (Exception e)
                {
                    cf.LF.LogError(e.Message);
                    return false;
                }
            };

            m_fSetTarget = fSetTarget;
        }

The resulting visual object list often contains a lot of false positives, i.e., tiny objects with the same color signature as the desired target. Besides making adjustments to improve accuracy, we drop them by calling GetBiggestObjects() to only retain the max size objects for each color signature. This method first groups them by color signature, then finds the maximum size within each and returns these objects only.

        private List<CameraObjectBlock> GetBiggestObjects(List<CameraObjectBlock> objectsInView)
        {
            // Find the biggest occurrence of each signature, the one with the maximum area
            List<CameraObjectBlock> bestMatches = (from o in objectsInView
                             group o by o.signature into grpSignatures
                             let biggest = grpSignatures.Max(t => t.height * t.width)
                             select grpSignatures.First(p => p.height * p.width == biggest))
                     .ToList();

            return bestMatches;
        }

The GetBiggestObjects method is a great example of using LINQ for processing data in robotics apps. Note how compact and clean the query code is compared to the nested loops often found in robotics sample code. Python developers would want to comment here that the power of integrated queries is available to them too, albeit with different syntax/predicates.
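For comparison, here is a sketch that places a LINQ version next to a hand-written loop producing the same result. Blk and BiggestPerSignature are hypothetical names used only for this illustration, and the LINQ version uses method syntax rather than the query syntax of GetBiggestObjects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal block type with just the fields this sketch needs.
public class Blk
{
    public int signature;
    public int width;
    public int height;
}

public static class BiggestPerSignature
{
    // LINQ: group by signature, then keep the block with the largest area in each group.
    public static List<Blk> WithLinq(List<Blk> blocks)
    {
        return blocks.GroupBy(o => o.signature)
                     .Select(g => g.OrderByDescending(o => o.width * o.height).First())
                     .ToList();
    }

    // Loop: track the best block seen so far for each signature.
    public static List<Blk> WithLoop(List<Blk> blocks)
    {
        var best = new Dictionary<int, Blk>();
        foreach (Blk o in blocks)
        {
            Blk current;
            if (!best.TryGetValue(o.signature, out current) ||
                o.width * o.height > current.width * current.height)
            {
                best[o.signature] = o;
            }
        }
        return best.Values.ToList();
    }
}
```

Both versions keep the first block among equally sized candidates; the LINQ version states the intent in three chained calls.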

The App Logic starts the camera and initiates target tracking via the StartCamera method:

        public void StartCamera()
        {
            try
            {
                cf.Create(m_dlgtFindTarget);
                cf.CameraObjectFinder.Start();
            }
            catch (Exception e)
            {
                cf.LF.LogError(e.Message);
                throw; // rethrow without resetting the stack trace
            }
        }

Using the Code in Your Project

First off, you have to teach Pixy an object.

Next, create an instance of CameraTargetFinder, passing in a handler for processing target coordinates. Here is an example:

    // Test
    public class BizLogic
    {
        CameraTargetFinder ctf = new CameraTargetFinder(delegate (int x, int y, int w, int h)
        {
            Debug.WriteLine("x: {0}, y: {1}, w: {2}, h: {3}", x, y, w, h);
        });
        public void Test()
        {
            ctf.StartCamera();
        }
    }

If you know the actual size of your target object, you can convert height and width to the distance-to-target while converting x and y to angles between the camera and the target so that your controller could turn servo motors accordingly to always point the camera to the target.
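A minimal sketch of that conversion is shown here. It rests on several assumptions that are not part of the attached source: an ideal pinhole camera without lens distortion, a 320x200 pixel frame, and the 75 by 47 degree field of view quoted in the reader's comments. TargetGeometry and its members are hypothetical names.

```csharp
using System;

public static class TargetGeometry
{
    // Assumed frame size in pixels and lens field of view in degrees.
    public const double FrameWidth = 320, FrameHeight = 200;
    public const double FovHorizontalDeg = 75, FovVerticalDeg = 47;

    // Focal length expressed in pixels under the pinhole model.
    private static double FocalPixels(double framePixels, double fovDeg)
    {
        return (framePixels / 2.0) / Math.Tan(fovDeg * Math.PI / 360.0);
    }

    // Horizontal angle to the target in degrees: 0 is straight ahead, positive is right of center.
    public static double PanAngleDeg(int x)
    {
        double f = FocalPixels(FrameWidth, FovHorizontalDeg);
        return Math.Atan((x - FrameWidth / 2.0) / f) * 180.0 / Math.PI;
    }

    // Vertical angle in degrees: y is measured from the bottom of the frame, positive is above center.
    public static double TiltAngleDeg(int y)
    {
        double f = FocalPixels(FrameHeight, FovVerticalDeg);
        return Math.Atan((y - FrameHeight / 2.0) / f) * 180.0 / Math.PI;
    }

    // Distance to the target in the same unit as realWidth, given the apparent width in pixels.
    public static double DistanceTo(double realWidth, int widthPixels)
    {
        if (widthPixels <= 0) return double.PositiveInfinity;
        return realWidth * FocalPixels(FrameWidth, FovHorizontalDeg) / widthPixels;
    }
}
```

These functions could be called from the handler passed to CameraTargetFinder; in practice the constants would need calibration against measured distances.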

In order to run my source code, you could simply add PixyCamera.cs file to your project and - for testing the code - work the above sample into the MainPage function of your application.

If you'd rather use the attached solution, set the target platform to ARM in Visual Studio, build it, deploy it to the RPi and run it in debug mode. Once the Pixy camera initializes, bring your preset object in front of the camera. When Pixy detects the object, its LED indicator will light up and the object positioning data will appear in the Visual Studio Output window in the "x: ..., y: ..., w: ..., h: ..." format produced by the handler above.

Conclusion

Tracking an object using Pixy and the familiar Visual Studio environment is a very rewarding project, especially when it runs in an autonomous system on a small computer like RPi. It's even more fun when the underlying program is well organized and follows design patterns recognized by other developers. It's worth our time to properly structure and continuously refactor a solution keeping up with project growth.

Feel free to use the code in your personal or commercial projects. It has been well tested by my 20-pound 6-wheeler being guided by Pixy.