Computer Vision Sandbox

Table of Contents

Introduction
Project Features
- Video Sources
- Image/Video Processing
- Virtual Video Sources
- Scripting Plug-In
- Device Plug-Ins
- Sandbox Scripting Threads
Project's Code
Conclusion

Introduction

Computer Vision Sandbox is an open source software package aimed at solving different problems in computer vision, such as video surveillance, vision-based automation/robotics, different sorts of image/video processing, etc. Initially, the project was closed source, taking some time to settle its core architecture and to develop a set of features which would allow applying it to a variety of tasks. Starting with version 2.0, however, the project migrated to an open source repository, making its code available to the public.

From the very beginning, the project was designed to be highly modular and to allow extending its features by developing different plug-ins. The main application taken on its own is of little use. It knows how to load plug-ins and stick them together to get some video processing going. But if those plug-ins are missing, there is not much to do other than opening and closing the About box. That should not be the case though, as the official installation package comes with a variety of plug-ins, which can be applied to a number of tasks.

Plug-ins are the core idea of Computer Vision Sandbox and provide most of its features. They are like building blocks - the end result depends on which blocks are taken and how they are combined. What sort of plug-ins are there, you wonder? Well, there are a few different types available. The first and main type is video sources - plug-ins which generate the video to be processed. The video may come from a USB camera (or a laptop's integrated camera), from an IP video surveillance camera, from a video file or from any other source providing continuous images (video frames). On top of those, there are different types of plug-ins aimed at image and video processing. These can do image enhancement, add visual effects, detect certain objects, save a video archive, etc. To make things more interesting and allow more advanced video processing, there is a scripting plug-in, which allows writing video processing scripts in the Lua programming language. Finally, there are also device plug-ins, which allow talking to I/O boards, robotics controllers, devices connected over a serial port, etc. – this allows developing more interactive applications, where video processing can be affected by real world events and the other way around.

The code base of the project has grown to the point where describing all its details in a single article would be quite a big job. So, this article concentrates only on a few key concepts and features to demonstrate some of the possible use cases. To give a bit of an idea/direction, here is a screenshot demonstrating some of the applications. Those lean towards the computer vision side of things, but other areas are described further on as well.

Computer vision examples

Project Features

When the Computer Vision Sandbox project emerged, the idea was to make a sort of Lego building puzzle. It was not aimed at video surveillance only, nor at adding different imaging/video effects, nor at purely computer vision applications, nor at robotics/automation tasks, etc. Instead, it was aimed at everything mentioned above and more. The idea was to make every feature a plug-in and then let users combine those to get whatever result they want. The main application does not have any idea about any specific camera (video source) or image/video processing routine. It only knows about certain types of plug-ins: some plug-ins provide video, others provide image processing, some others may provide scripting capabilities, etc. But what is actually provided, and how, depends purely on the plug-ins. All we need from the main application is to know how to talk to those plug-ins.

In subsequent sections, we are going to review the main types of plug-ins and how to use those from the Computer Vision Sandbox application.

Video Sources

Video source plug-ins are the foundation of the Computer Vision Sandbox. You may have any number of other plug-ins, but if there are no video sources, nothing else can be done. What is a video source? Anything that generates images continuously. It can be a camera attached over a USB or IP interface, a video file, a screen capture device, a collection of image files in a folder, etc. The main application does not care where images come from, as long as they come.

Video source plug-ins

When adding a video source, it is required to select the plug-in to use for communication with a particular device and then configure its properties. The list of properties is specific to the chosen plug-in type - it can be the IP address of a camera, the URL of an MJPEG stream, the name/id of a USB connected camera, etc. Once configuration is complete, the video source can be opened and watched.

Video source properties

After configuring a number of video sources and making sure all of them work as expected, we may create a sandbox to show up to 16 cameras in one view. Any type of video source can be put into a sandbox, so cameras of different brands and makes can be opened together.

Cameras view

Camera views don't need to have a regular grid structure. If we want to make one of the cells larger than the others, we can merge a few of them into a bigger one. This allows creating views of various shapes and assigning bigger cells to cameras which may have something more interesting/important to show.

Cameras view

Finally, a sandbox may have multiple views defined, which can then be switched either manually or at certain time intervals. For example, we may have a default view showing all sandbox video sources and then some other views showing particular video sources at a larger size.

Cameras view

For every running video source (individually or within a sandbox), it is possible to take a snapshot, which can then be exported to an image file or copied to the clipboard.

Camera snapshot

Image/Video Processing

Watching different types of cameras is nice, but it would be even better to do something with their video. For example, do some image enhancement/processing, add different effects, implement some computer vision applications, save video into files, etc. This is where image and video processing plug-ins come into play.

To add video processing steps for a video source, it is required to put it into a sandbox, in the same way as when combining several video sources into a view. So, a sandbox is a sort of container, which may run multiple video sources, execute different image/video processing routines, run scripts (more on this later), etc.

When the initial sandbox configuration is done, video processing steps can be added to its video sources (cameras) by running the Sandbox Wizard, which provides a list of available plug-ins that can be added as video processing steps. For example, the screenshot below shows 5 image processing plug-ins added as video processing steps for a selected camera. Together, those processing steps create the effect of an old-style photo: the picture is first turned into sepia colors (shades of brown), then vertical grain noise is added, then a vignetting effect makes the picture darker towards its edges, and finally two borders are added: a fuzzy border and a rounded border.

Video processing steps

Running a sandbox with a configuration similar to the above may result in the picture below (provided the plug-ins are configured appropriately):

Video processing steps

In some cases, it may be useful to check the performance of the configured video processing graph and/or change properties of some of its steps. This can be done from the Video processing information form, available from a running camera's context menu. This form shows the average time taken by each video processing step and its percentage of the total time taken by the graph. This information may help in troubleshooting the configured graph and finding which steps take most of the CPU time and potentially cause video processing delays (we usually don't want the graph's total time to be greater than the video source's frame interval). The same form also allows changing properties of image processing plug-ins (if they have any) and seeing the effect on the running video.

Video processing steps
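
The check described above is simple arithmetic, and a minimal Lua sketch may make it more concrete. The step timings below are made-up numbers standing in for what the form would report; the function names are hypothetical helpers and not part of the sandbox API.

```lua
-- Hypothetical per-step timings in milliseconds (illustrative values only),
-- similar to what the Video processing information form reports
local steps = {
    { name = 'Sepia',      ms = 4.0 },
    { name = 'Grain',      ms = 6.0 },
    { name = 'Vignetting', ms = 10.0 },
}

-- Time between frames, in milliseconds, for a source at the given frame rate
local function FrameIntervalMs( fps )
    return 1000 / fps
end

-- Total graph time and each step's share of it, in percent
local function SummarizeGraph( list )
    local total = 0
    for _, step in ipairs( list ) do
        total = total + step.ms
    end
    local shares = { }
    for i, step in ipairs( list ) do
        shares[i] = ( total > 0 ) and ( step.ms * 100 / total ) or 0
    end
    return total, shares
end

local total, shares = SummarizeGraph( steps )
local interval = FrameIntervalMs( 30 )

for i, step in ipairs( steps ) do
    print( string.format( '%-12s %5.1f ms  %5.1f%%', step.name, step.ms, shares[i] ) )
end
print( string.format( 'total %.1f ms, frame interval %.1f ms, keeps up: %s',
                      total, interval, tostring( total <= interval ) ) )
```

If the total exceeds the frame interval, the most expensive steps are the first candidates for tuning or for moving elsewhere.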

In addition to image processing plug-ins, there are also video processing plug-ins, which can be put into the video processing graph. The difference between these two types of plug-ins may sound a bit subtle though. Image processing plug-ins usually take an input image and apply some image processing routine, which changes the image (or provides a new image as a result). These plug-ins usually have no state other than their configured properties, and most of the time they apply the same routine to all images. Video processing plug-ins, on the contrary, may have some internal state, which may affect the way an image is processed. Also, these plug-ins may or may not make any changes to source images – it depends on what the plug-in implements.

A good example of a video processing plug-in is the Video File Writer plug-in, which writes incoming images into a video file. This plug-in can either write all images into a single file or be configured to split video into fragments of a certain length (in minutes). In addition to that, it can monitor the size of the destination folder and clean up old files, which allows creating video archives. For example, the configuration below tells the video writing plug-in to write images into files prefixed with "street view" and suffixed with a time stamp. Each video file should be 10 minutes long, and the size of the destination folder should not exceed 10000 MB (~10 GB). Running a sandbox with the video writing plug-in configured that way ensures that we always keep up to about 10 GB worth of the most recent video archive for our camera.

Video writer properties
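
How much recording time a 10000 MB folder actually holds depends on the encoder's bitrate, which the article does not state. The sketch below is only a back-of-the-envelope estimate: the 2000 kbit/s figure is an assumed example value, megabytes are taken as decimal (1 MB = 1000 kB), and the helper names are hypothetical.

```lua
-- Approximate size of one video fragment in MB.
-- bitrateKbps is an ASSUMED average encoder bitrate in kilobits per second.
local function FragmentSizeMB( fragmentMinutes, bitrateKbps )
    return bitrateKbps * fragmentMinutes * 60 / 8 / 1000
end

-- Rough number of hours of video fitting into a folder of the given size
local function ArchiveHours( folderLimitMB, bitrateKbps )
    local megabytesPerHour = bitrateKbps * 3600 / 8 / 1000
    return folderLimitMB / megabytesPerHour
end

local bitrate = 2000 -- kbit/s, illustrative only

print( string.format( '10 minute fragment: ~%.0f MB', FragmentSizeMB( 10, bitrate ) ) )
print( string.format( '10000 MB archive:   ~%.1f hours', ArchiveHours( 10000, bitrate ) ) )
```

Measuring the size of a few real fragments and plugging that in gives a far better estimate than any assumed bitrate.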

Another video processing plug-in to mention is the Image Folder Writer plug-in. Unlike the video writing plug-in mentioned above, this one saves individual images as JPEG or PNG files into the specified folder at the configured time intervals. This can be used to create time-lapse images. For example, the configuration below makes the plug-in write an image every 5 seconds (5000 ms) and skip all other images.

Image writer properties

Virtual Video Sources

Although we have already described video source plug-ins, it is worth mentioning their sub-category aimed at virtual video sources. The sub-category does not introduce a new type of plug-in. We still deal with video sources, which provide new images using the same interface as any other plug-in of this type. The sub-category is more for grouping similar plug-ins in the UI, etc. The idea was to have some way of separating plug-ins which deal with video generated by cameras/devices from those which provide "virtual" video sources like files, images, etc.

The first plug-in to mention in this category is Image Folder Video Source. As mentioned above, using the Image Folder Writer plug-in, we can save images coming from some video source at certain time intervals. As a result, we get a folder full of time stamped images. Now, suppose we want to play them back, but with a different interval. For example, we saved a collection of time lapse images at 10 second intervals, but now we want to play them at 30 frames per second. This is what the Image Folder Video Source does. When configuring this plug-in, it is required to specify the source folder containing image files and the desired time interval between frames. The plug-in will then read image files out of that folder and provide them as if they were video frames coming from some camera. Adding Video File Writer as a video processing step for this video source allows stitching all those images into a single video file. This gives us a proper time lapse video file!
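
To get a feeling for the numbers in this example, here is a small Lua sketch of the time lapse arithmetic. The function names are hypothetical helpers, not plug-in properties; they only relate capture interval, playback rate and the resulting video length.

```lua
-- Speed-up factor of a time lapse: one second of playback shows
-- playbackFps frames, each captured captureIntervalSec apart
local function SpeedUp( captureIntervalSec, playbackFps )
    return captureIntervalSec * playbackFps
end

-- Playback length in seconds for the given amount of recorded real time
local function PlaybackSeconds( recordedHours, captureIntervalSec, playbackFps )
    local frames = recordedHours * 3600 / captureIntervalSec
    return frames / playbackFps
end

-- Interval between frames, in milliseconds, for a given playback rate
local function PlaybackIntervalMs( playbackFps )
    return 1000 / playbackFps
end

print( 'speed-up factor: ' .. SpeedUp( 10, 30 ) )
print( 'a day of capture plays in (s): ' .. PlaybackSeconds( 24, 10, 30 ) )
print( string.format( 'frame interval: %.1f ms', PlaybackIntervalMs( 30 ) ) )
```

So images saved every 10 seconds and played at 30 frames per second run 300 times faster than real time, and a full day of capture fits into under five minutes of video.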

Another virtual video source plug-in worth mentioning is Video Repeater. It is a really useful plug-in if used the right way. All the plug-in does is relay the images pushed into it. Using it together with the Video Repeater Push plug-in allows splitting a video processing chain into multiple branches, which may apply different video processing steps to the original video frames. Let's see how we can use this plug-in.

First, we need to configure a few video sources using the Video Repeater plug-in. When configuring those, it is required to specify a Repeater ID - something unique, which will be used later by the Video Repeater Push plug-in to link with the video source. Opening these video sources on their own will not produce any result, just a "Waiting for video source ..." message. That is fine, as we have nothing pushing images into them yet. The next step is to configure a sandbox which has a video source from some camera and a few repeaters as well, three for example.

Video repeater sandbox configuration

Now, using the Sandbox Wizard, let's put three Video Repeater Push plug-ins as video processing steps for the camera video source, each configured with one of the IDs we used previously when configuring the Video Repeater plug-ins.

Video repeater pushing

Opening a sandbox configured that way will show four video sources with exactly the same video - one coming from the camera and the other three relaying it. However, since Computer Vision Sandbox treats them as individual video sources, we can put any video processing we like on all four of them. For example, the screenshot below demonstrates a running sandbox showing the original camera at the top left and then three repeaters, which apply different video processing steps to get different effects.

Video repeater pushing

The above example demonstrates how to implement video processing branching using the Video Repeater plug-in. But it is not the only use case for it. Suppose we've configured a lengthy video processing chain of several steps for some video source. If we put Video Repeater Push plug-ins in between those steps and put a few video repeaters into the sandbox, we will be able to see intermediate results of the video processing. In this use case, we use video repeaters only for display, instead of running video processing on top of them. This becomes especially useful for debugging more complicated video processing done with the help of scripting.

Finally, video repeaters can be used to address performance issues when dealing with heavy video processing chains. Suppose we have a video source providing images at 30 frames per second. Also suppose we add a number of video processing steps which together take more time than the interval between incoming frames (~33 ms). This may not be the configuration we want, as the resulting frame rate will drop due to the time-consuming video processing. A way around it is to do only half of the video processing on the original video, push the intermediate result to a video repeater and put the rest of the video processing chain there. The repeater runs on a different thread, so it does not keep the original video source blocked. And so the performance issue is sorted, giving us the frame rate we want! Just remember to use the Video processing information form mentioned earlier to look for potential video processing bottlenecks.
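
Deciding where to cut a chain can be sketched in a few lines of Lua. This is only an illustration of the reasoning, under the assumption that per-step times are known (for example from the Video processing information form); SplitChain is a hypothetical helper, and each branch boundary it produces corresponds to a place where a Video Repeater Push step would be inserted.

```lua
-- Split an ordered list of step times (ms) into consecutive branches,
-- starting a new branch whenever the next step would exceed the budget.
-- A single step longer than the budget still gets a branch of its own.
local function SplitChain( stepTimesMs, budgetMs )
    local branches = { { } }
    local used = 0
    for _, ms in ipairs( stepTimesMs ) do
        local current = branches[#branches]
        if #current > 0 and used + ms > budgetMs then
            current = { }
            branches[#branches + 1] = current
            used = 0
        end
        current[#current + 1] = ms
        used = used + ms
    end
    return branches
end

-- Illustrative timings: 54 ms in total, too slow for a 30 fps source
local branches = SplitChain( { 10, 12, 9, 15, 8 }, 1000 / 30 )

for i, branch in ipairs( branches ) do
    print( 'branch ' .. i .. ': ' .. table.concat( branch, ' + ' ) .. ' ms' )
end
```

Here the first three steps stay on the camera and the last two move to a repeater, so each thread stays within the ~33 ms frame interval. Splitting adds latency between branches, so it trades delay for frame rate.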

The last virtual video source to mention is the Screen Capture plug-in. As the name implies, the plug-in captures screen content at a certain rate and provides it as images coming from a video source. It can be configured to capture a specific screen available in the system, an area of a certain size or a window with a certain title, "Paint" for example.

Screen capture

Scripting Plug-In

As was already demonstrated, adding different image/video processing plug-ins into the video processing graph of a video source lets us implement different imaging effects, etc. However, there are limits to how much can be done with a sequential pre-configured video processing graph. It does not let us change the configuration of plug-ins while the sandbox is running based on some logic. It also does not allow implementing more advanced video processing, where images are analysed and something is done based on the result.

To allow more advanced video processing, Computer Vision Sandbox provides a Lua Scripting plug-in, which makes it possible to implement custom logic using the Lua programming language. The scripting plug-in can be added into a video processing graph in the same way as other plug-ins and then configured with the script to run.

Screen capture

The project's web site provides complete documentation of the APIs provided by the Lua Scripting plug-in, as well as a number of tutorials covering different use cases. Here, we'll briefly demonstrate a few scripting examples to give some idea of what can be done.

To start, let's have a look at the simplest script, which uses the Colorize image processing plug-in to change the hue and saturation of an image's pixels. If that plug-in were added directly into the video processing graph, the user would need to configure its properties manually, and those would not change while the sandbox is running unless the user came back and reconfigured it. Using scripting, however, we can change a plug-in's properties based on whatever logic we wish. Running the script below as a video processing step will keep changing the hue value for the camera's images.

-- Create instance of Colorize plug-in
setHuePlugin = Host.CreatePluginInstance( 'Colorize' )
-- Set Saturation to maximum
setHuePlugin:SetProperty( 'saturation', 100 )
-- Start with Hue set to 0
hue = 0

-- Main function to be executed for every frame
function Main( )
    -- Get image to process
    image = Host.GetImage( )
    -- Set Hue value to set for the image
    setHuePlugin:SetProperty( 'hue', hue )
    -- Process the image
    setHuePlugin:ProcessImageInPlace( image )
    -- Move to the next Hue value
    hue = ( hue + 1 ) % 360
end

As can be seen from the code above, the script has two parts: a global part and the Main() function. The global part performs whatever initialization is needed and is executed only once, when the sandbox is started. The Main() function is then executed for every new frame generated by the video source. Using the Host.GetImage() API, we can get access to the image currently handled by the video processing graph and then apply different image processing routines to it.
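
The two-part structure can be tried outside the sandbox with a stand-alone Lua sketch. Nothing here uses the real Host API: RunFrames is a hypothetical driver playing the role of the host application, and Main( ) only returns the hue it would have applied, to show that state set up in the global part persists between frames and wraps around at 360.

```lua
-- Global part: executed once, when the script is loaded
local hue = 0

-- Stand-in for Main( ): called once per frame; returns the hue
-- that would be applied to the current frame
local function Main( )
    local current = hue
    -- Move to the next hue value, wrapping back to 0 after 359
    hue = ( hue + 1 ) % 360
    return current
end

-- Hypothetical driver standing in for the host application,
-- which calls Main( ) for every new video frame
local function RunFrames( count )
    local last
    for _ = 1, count do
        last = Main( )
    end
    return last
end

print( 'hue on frame 360: ' .. RunFrames( 360 ) )
print( 'hue on frame 361: ' .. RunFrames( 1 ) )
```

At 30 frames per second, this means the hue cycles through the full range every 12 seconds.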

A slightly larger script to take a look at uses five different plug-ins to create the effect of an old video. It was already demonstrated how to create such an effect by using those plug-ins directly within the video processing graph. But now we may want to make it more dynamic, so that the amount of vignetting and added noise changes between video frames.

local math = require "math"

-- Create instances of plug-ins to use
sepiaPlugin      = Host.CreatePluginInstance( 'Sepia' )
vignettingPlugin = Host.CreatePluginInstance( 'Vignetting' )
grainPlugin      = Host.CreatePluginInstance( 'Grain' )
noisePlugin      = Host.CreatePluginInstance( 'UniformAdditiveNoise' )
borderPlugin     = Host.CreatePluginInstance( 'FuzzyBorder' )

-- Start values of some properties
vignettingStartFactor = 80
grainSpacing          = 40
noiseAmplitude        = 20

vignettingPlugin:SetProperty( 'decreaseSaturation', false )
vignettingPlugin:SetProperty( 'startFactor', vignettingStartFactor )
vignettingPlugin:SetProperty( 'endFactor', 150 )
grainPlugin:SetProperty( 'staticSeed', true )
grainPlugin:SetProperty( 'density', 0.5 )
borderPlugin:SetProperty( 'borderColor', '000000' )
borderPlugin:SetProperty( 'borderWidth', 32 )
borderPlugin:SetProperty( 'waviness', 8 )
borderPlugin:SetProperty( 'gradientWidth', 16 )

-- Other variables
seed    = 0
counter = 0

-- Main function to be executed for every frame
function Main( )
    -- Randomize some properties of the plug-ins in use
    RandomizeIt( )
    -- Get image to process
    image = Host.GetImage( )
    -- Apply image processing routines
    sepiaPlugin:ProcessImageInPlace( image )
    vignettingPlugin:ProcessImageInPlace( image )
    grainPlugin:ProcessImageInPlace( image )
    noisePlugin:ProcessImageInPlace( image )
    borderPlugin:ProcessImageInPlace( image )
end

-- Make sure the specified value is in the specified range
function CheckRange( value, min, max )
    if value < min then value = min end
    if value > max then value = max end
    return value
end

-- Modify plug-ins' properties randomly
function RandomizeIt( )
    -- change vignetting start factor
    vignettingStartFactor = CheckRange( vignettingStartFactor +
                                        math.random( 3 ) - 2, 60, 100 )
    vignettingPlugin:SetProperty( 'startFactor', vignettingStartFactor )
    -- change noise level
    noiseAmplitude = CheckRange( noiseAmplitude +
                                 math.random( 5 ) - 3, 10, 30 )
    noisePlugin:SetProperty( 'amplitude', noiseAmplitude )

    -- change grain every 5th frame
    counter = ( counter + 1 ) % 5
    if counter == 0 then
        -- grain's seed value
        seed = seed + 1
        grainPlugin:SetProperty( 'seedValue', seed )
        -- grain's spacing
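        grainSpacing = CheckRange( grainSpacing + math.random( 5 ) - 3, 30, 50 )
        -- apply the new spacing (property name 'spacing' is assumed here)
        grainPlugin:SetProperty( 'spacing', grainSpacing )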
    end
end
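
The randomization above is a bounded random walk: each frame the value moves by a small random step and is then clamped to a range. The stand-alone sketch below isolates that idea; RandomStep is a hypothetical helper that generalizes the `math.random( 3 ) - 2` and `math.random( 5 ) - 3` expressions used in the script.

```lua
-- Make sure the specified value is in the specified range
local function CheckRange( value, min, max )
    if value < min then value = min end
    if value > max then value = max end
    return value
end

-- One step of a bounded random walk. math.random( n ) returns an integer
-- in [1, n], so math.random( 2 * maxStep + 1 ) - ( maxStep + 1 ) gives an
-- integer in [-maxStep, maxStep]: maxStep = 1 matches math.random( 3 ) - 2
-- and maxStep = 2 matches math.random( 5 ) - 3.
local function RandomStep( value, maxStep, min, max )
    local delta = math.random( 2 * maxStep + 1 ) - ( maxStep + 1 )
    return CheckRange( value + delta, min, max )
end

-- Simulate the vignetting start factor over a few frames
local factor = 80
for frame = 1, 5 do
    factor = RandomStep( factor, 1, 60, 100 )
    print( 'frame ' .. frame .. ': startFactor = ' .. factor )
end
```

Because steps are small and symmetric, the property drifts smoothly instead of jumping, which is what gives the flickering old-film look rather than random noise.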

OK, enough with imaging effects. Let's try something different instead. For example, let's try making a simple motion detector. The script below uses the Diff Images Thresholded plug-in to find the number of pixels which differ by a certain amount between two consecutive images. If the share of such pixels is higher than a certain threshold, it is treated as motion: the changed area is highlighted and a red rectangle is drawn around the image. A logical extension to the script would be to start writing a video file when motion is detected, instead of saving everything into the video archive as demonstrated before.

-- Create instances of plug-ins to use
diffImages   = Host.CreatePluginInstance( 'DiffImagesThresholded' )
addImages    = Host.CreatePluginInstance( 'AddImages' )
imageDrawing = Host.CreatePluginInstance( 'ImageDrawing' )

-- Since we deal with RGB images, set threshold to 60 for the sum
-- of RGB differences
diffImages:SetProperty( 'threshold', 60 )
-- Highlight motion area with red color
diffImages:SetProperty( 'hiColor', 'FF0000' )

-- Amount of difference image to add to the source image for motion highlighting
addImages:SetProperty( 'factor', 0.3 )

-- Motion alarm threshold (percent of pixels that changed)
motionThreshold = 0.1

-- Highlight motion areas or not
highlightMotion = true

function Main( )
    image = Host.GetImage( )

    if oldImage ~= nil then
        -- Calculate difference between current and the previous frames
        diff = diffImages:ProcessImage( image, oldImage )

        -- Set previous frame to the current one
        oldImage:Release( )
        oldImage = image:Clone( )

        -- Get the difference amount
        diffPixels  = diffImages:GetProperty( 'diffPixels' )
        diffPercent = diffPixels * 100 / ( image:Width( ) * image:Height( ) )

        -- Output the difference value
        imageDrawing:CallFunction( 'DrawText', image, tostring( diffPercent ),
                                   { 1, 1 }, 'FFFFFF', '00000000' )

        -- Check if alarm has to be raised
        if diffPercent > motionThreshold then
            imageDrawing:CallFunction( 'DrawRectangle', image,
                { 0, 0 }, { image:Width( ) - 1, image:Height( ) - 1 }, 'FF0000' )

            -- Highlight motion areas
            if highlightMotion then
                addImages:ProcessImageInPlace( image, diff )
            end
        end

        diff:Release( )
    else
        oldImage = image:Clone( )
    end
end

And here is an example of what it may look like when motion is detected.

Motion detection
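
The decision logic of the script can be illustrated without any plug-ins. The sketch below is a stand-alone approximation, not the plug-in's implementation: it assumes a pixel counts as changed when the sum of its absolute R, G and B differences reaches the threshold (whether the real plug-in uses "greater than" or "greater than or equal" is not stated), and it represents frames as plain Lua tables.

```lua
-- Percentage of pixels whose summed RGB difference reaches the threshold.
-- Frames are arrays of { r, g, b } tables of equal length (an assumption
-- made for this sketch; real images are handled by the plug-in).
local function DiffPercent( current, previous, threshold )
    if #current == 0 then return 0 end
    local changed = 0
    for i = 1, #current do
        local a, b = current[i], previous[i]
        local diff = math.abs( a[1] - b[1] ) +
                     math.abs( a[2] - b[2] ) +
                     math.abs( a[3] - b[3] )
        if diff >= threshold then
            changed = changed + 1
        end
    end
    return changed * 100 / #current
end

-- Tiny 4-pixel "frames": one pixel changes a lot, one only slightly
local previous = { { 10, 10, 10 }, { 50, 50, 50 }, { 90, 90, 90 }, { 0, 0, 0 } }
local current  = { { 40, 40, 40 }, { 60, 60, 60 }, { 90, 90, 90 }, { 0, 0, 0 } }

local motionThreshold = 0.1
local diffPercent = DiffPercent( current, previous, 60 )

print( 'changed pixels: ' .. diffPercent .. '%' )
print( 'motion detected: ' .. tostring( diffPercent > motionThreshold ) )
```

The per-pixel threshold filters out sensor noise, while the percentage threshold decides how much of the frame has to change before an alarm is raised.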

Another interesting example to show is a script which looks for round objects. It uses a plug-in which finds individual blobs (objects) in an image and checks if they have a circular shape. Before processing blobs, we need to do segmentation though - separate the background from the foreground. In this case, the script uses a simple thresholding technique. This puts a restriction on our image: it must have a dark, even background and brighter objects, which is fine for this example.

local math   = require "math"
local string = require "string"

-- Create instances of plug-ins to use
grayscalePlugin     = Host.CreatePluginInstance( 'Grayscale' )
thresholdPlugin     = Host.CreatePluginInstance( 'Threshold' )
circlesFilterPlugin = Host.CreatePluginInstance( 'FilterCircleBlobs' )
drawingPlugin       = Host.CreatePluginInstance( 'ImageDrawing' )

-- Set threshold to separate background and objects
thresholdPlugin:SetProperty( 'threshold', 64 )

-- Don't do image filtering, only collect information about circles
circlesFilterPlugin:SetProperty( 'filterImage', false )
-- Set minimum radius of circles to collect
circlesFilterPlugin:SetProperty( 'minRadius', 5 )

-- Color used for drawing
drawingColor = '00FF00'

function Main( )
    image = Host.GetImage( )

    -- Pre-process image by grayscaling and thresholding it
    grayImage = grayscalePlugin:ProcessImage( image )
    thresholdPlugin:ProcessImageInPlace( grayImage )

    -- Apply circles filter
    circlesFilterPlugin:ProcessImageInPlace( grayImage )

    circlesFound    = circlesFilterPlugin:GetProperty( 'circlesFound' )
    circlesCenters  = circlesFilterPlugin:GetProperty( 'circlesCenters' )
    circlesRadiuses = circlesFilterPlugin:GetProperty( 'circlesRadiuses' )

    -- Tell how many circles are detected
    drawingPlugin:CallFunction( 'DrawText', image, 'Circles: ' .. tostring( circlesFound ),
                                { 5, 5 }, drawingColor, '00000000' )

    -- Highlight each detected circle
    for i = 1, circlesFound do
        center = { math.floor( circlesCenters[i][1] ), math.floor( circlesCenters[i][2] ) }
        radius = math.floor( circlesRadiuses[i] )
        dist   = math.floor( math.sqrt( radius * radius / 2 ) )

        lineStart = { center[1] + radius, center[2] - radius }
        lineEnd   = { center[1] + dist, center[2] - dist }

        drawingPlugin:CallFunction( 'FillRing', image, center, radius + 2, radius, drawingColor )

        drawingPlugin:CallFunction( 'DrawLine', image, lineStart, lineEnd, drawingColor )
        drawingPlugin:CallFunction( 'DrawLine', image, lineStart,
                                    { lineStart[1] + 20, lineStart[2] }, drawingColor )

        -- Tell radius of the circle
        drawingPlugin:CallFunction( 'DrawText', image, tostring( radius ),
                                    { lineStart[1] + 2, lineStart[2] - 12 },
                                    drawingColor, '00000000' )
    end

    grayImage:Release( )
end

Circles detection
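
The `dist` value in the script comes from simple geometry: the point of a circle at 45 degrees is offset from the center by radius / sqrt( 2 ) along each axis, which equals sqrt( radius * radius / 2 ). The stand-alone sketch below reproduces just that calculation; CircleLabelLine is a hypothetical helper, and image coordinates are assumed to have Y growing downwards, so subtracting from Y moves up.

```lua
-- Compute the leader line used to label a circle: it starts at the top
-- right corner of the circle's bounding square and ends on the circle
-- itself, at the 45 degree point towards that corner.
local function CircleLabelLine( center, radius )
    -- Offset of the 45 degree point: radius / sqrt( 2 ), rounded down
    local dist = math.floor( math.sqrt( radius * radius / 2 ) )

    local lineStart = { center[1] + radius, center[2] - radius }
    local lineEnd   = { center[1] + dist, center[2] - dist }
    return lineStart, lineEnd, dist
end

local lineStart, lineEnd, dist = CircleLabelLine( { 100, 100 }, 10 )

print( 'offset: ' .. dist )
print( 'line from ' .. lineStart[1] .. ',' .. lineStart[2] ..
       ' to ' .. lineEnd[1] .. ',' .. lineEnd[2] )
```

Rounding down keeps the end point on or just inside the circle, so the leader line always touches the drawn ring.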

The final scripting example shows how to use image exporting plug-ins to implement time lapse image writing as a Lua script. Yes, it was already demonstrated how to do that without scripting – there is a dedicated plug-in which can be put directly into the video processing graph. However, when custom image saving logic is needed, a script can still be of use.

local os = require "os"

-- Folder to write images to
folder = 'C:\\Temp\\images\\'

-- Create instance of plug-in for saving images
imageWriter = Host.CreatePluginInstance( 'PngExporter' )
--imageWriter = Host.CreatePluginInstance( 'JpegExporter' )
--imageWriter:SetProperty( 'quality', 100 )
ext = '.' .. imageWriter:SupportedExtensions( )[1]

-- Interval between images in seconds
imageInterval = 10
lastClock     = -imageInterval

function Main( )
    image = Host.GetImage( )

    -- Get elapsed time in seconds (os.clock( ) measures CPU time on many
    -- platforms, so treat it as an approximation of running time)
    now = os.clock( )

    if now - lastClock >= imageInterval then
        lastClock = now
        SaveImage( image )
    end
end

-- Save image to file
function SaveImage( image )
    dateTime = os.date( '%Y-%m-%d %H-%M-%S' )
    fileName = folder .. dateTime .. ext
    imageWriter:ExportImage( fileName, image )
end
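
The "save at most once per interval" logic can be factored out and tested on its own. The sketch below is one possible way to do it and not part of the sandbox API: MakeIntervalGate is a hypothetical helper, and it uses os.time( ) (wall-clock seconds) by default, because the Lua manual defines os.clock( ) as an approximation of CPU time used by the program. The clock is passed in as a parameter so the logic can be checked with a fake one.

```lua
-- Returns a function which reports true at most once per interval.
-- 'clock' is any function returning seconds; os.time is the default.
local function MakeIntervalGate( intervalSec, clock )
    clock = clock or os.time
    local last = nil
    return function( )
        local now = clock( )
        -- The first call always passes, later ones only after the interval
        if last == nil or now - last >= intervalSec then
            last = now
            return true
        end
        return false
    end
end

-- Demonstration with a fake clock, advanced manually
local fakeNow = 0
local shouldSave = MakeIntervalGate( 10, function( ) return fakeNow end )

for _, t in ipairs( { 0, 5, 10, 19, 20 } ) do
    fakeNow = t
    print( 't = ' .. t .. ' s, save image: ' .. tostring( shouldSave( ) ) )
end
```

Inside a script, the gate would be created once in the global part and called from Main( ), with the image saved only when it returns true. Note that os.time( ) has one second resolution, which is enough for intervals measured in seconds.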

There are more scripting examples available; they are included in the official installation package of Computer Vision Sandbox and can also be found on the project's web page. Together with the Lua scripting API description and different tutorials, they provide in-depth coverage of the available features.

Device Plug-Ins

Once the scripting plug-in was introduced, allowing more advanced video processing, the next step forward was to implement support for interaction with different devices. The idea was to let scripts interact with the real world - change the video processing routine based on some device's inputs, or set a device's outputs/actuators based on the results of an image processing algorithm. As a result, two new plug-in types were added – device plug-ins and communication device plug-ins. These allow adding support for communication with external devices, like different I/O boards, robotics controllers, devices attached to a serial port, etc. As this is one of the recently added features, there are not many plug-ins of these types available so far. More will be added as the project evolves.

Although both new plug-in types are aimed at communication with external devices, they provide slightly different APIs, which allow device interaction in different ways. Device plug-ins hide all communication details/protocols and allow talking to devices by setting/getting the plug-in's properties. For example, if we have some digital I/O board, setting its outputs can be implemented by setting some properties of the plug-in, and querying the state of its pins can be implemented by reading properties. In some cases, such an interface may be too limited though, and more flexibility is needed. Communication device plug-ins extend the API and provide Read/Write methods, which allow exchanging raw data with a device using whatever protocol it supports. Let's have a look at a few examples of interaction with some devices.

The first plug-in to demonstrate is the Gamepad device, which can be used in a number of applications. For example, if some camera is mounted on a pan/tilt device, that could be controlled with the help of a gamepad. Or it can be used to control some robot, video processing sequence, etc.

local math = require 'math'

gamepad = Host.CreatePluginInstance( 'Gamepad' )

-- Connect to the first game pad device
gamepad:SetProperty( 'deviceId', 0 )
if not gamepad:Connect( ) then
    error( 'Failed connecting to game pad' )
end

-- Query name of the device, number of axes and buttons
deviceName   = gamepad:GetProperty( 'deviceName' )
axesCount    = gamepad:GetProperty( 'axesCount' )
buttonsCount = gamepad:GetProperty( 'buttonsCount' )

function Main( )
    -- Query value of all axes and buttons as arrays
    axesValues   = gamepad:GetProperty( 'axesValues' )
    buttonsState = gamepad:GetProperty( 'buttonsState' )
    
    print( 'X: ' .. tostring( math.floor( axesValues[1] * 100 ) / 100 ) )
    print( 'Y: ' .. tostring( math.floor( axesValues[2] * 100 ) / 100 ) )

    -- Query value of the X axis only
    x = gamepad:GetProperty( 'axesValues', 1 )
    
    -- Query status of the first button only
    buttonState1 = gamepad:GetProperty( 'buttonsState', 1 )
    
    if buttonState1 then
        print( "Button 1 is ON" )
    else
        print( "Button 1 is OFF" )
    end
end

To craft your own pan/tilt device, a Phidget Advanced Servo board can be used. A plug-in for it is not included in the official installation package, but can be obtained separately from GitHub. Once added to Computer Vision Sandbox, it can be used either on its own to control servos or together with the gamepad device plug-in mentioned above.

servos = Host.CreatePluginInstance( 'PhidgetAdvancedServo' )

-- Connect to the Phidget servo board attached to the system
if not servos:Connect( ) then
    error( 'Failed connecting to servo board' )
end

-- Check number of supported servos
motorCount = servos:GetProperty( 'motorCount' )

-- Configure velocity limit, acceleration and position range
servos:SetProperty( 'velocityLimit', { 2, 2 } )
servos:SetProperty( 'acceleration', { 20, 20 } )
servos:SetProperty( 'positionRange', { { 105, 115 }, { 135, 145 } } )

-- Engage both servos
servos:SetProperty( 'engaged', { true, true } )

-- Set target position of servo 1 and 2
servos:SetProperty( 'targetPosition', { 110, 140 } )

function Main( )
    -- Check actual position of servos and whether they have stopped moving
    actualPosition = servos:GetProperty( 'actualPosition' )
    stopped        = servos:GetProperty( 'stopped' )
    
    -- Set new target positions
    servos:SetProperty( 'targetPosition', 1, 115 )
    servos:SetProperty( 'targetPosition', 2, 135 )
end
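
When servos are driven from a gamepad, the axis value has to be converted into a target position first. The sketch below shows one possible linear mapping. It assumes axis values in the [-1, 1] range and reuses the position ranges from the script above only as an example; AxisToPosition is a hypothetical helper, not part of either plug-in.

```lua
-- Hypothetical helper, not part of the servo or gamepad plug-ins.
-- Maps an axis value (assumed to be in [-1, 1]) onto the servo
-- position range [minPos, maxPos], clamping out-of-range input.
function AxisToPosition( axis, minPos, maxPos )
    if axis < -1 then axis = -1 end
    if axis >  1 then axis =  1 end

    return minPos + ( axis + 1 ) / 2 * ( maxPos - minPos )
end
```

With the ranges configured above, the result of AxisToPosition( x, 105, 115 ) could be passed as the value in servos:SetProperty( 'targetPosition', 1, ... ).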

Another supported device from the same manufacturer is the Phidget Interface Kit, which allows interacting with digital inputs/outputs and with analog inputs. For example, it is possible to control a video processing routine depending on the state of the inputs, or to control devices connected to the digital outputs depending on what is detected in the video stream.

kit = Host.CreatePluginInstance( 'PhidgetInterfaceKit' )

-- Connect to the Phidget Interface Kit board plugged into the system
if not kit:Connect( ) then
    error( 'Failed connecting to interface kit board' )
end

-- Check number of available digital/analog I/O
digitalInputCount  = kit:GetProperty( 'digitalInputCount' )
digitalOutputCount = kit:GetProperty( 'digitalOutputCount' )
analogInputCount   = kit:GetProperty( 'analogInputCount' )

-- Switch OFF all digital outputs (assuming 8 outputs available)
kit:SetProperty( 'digitalOutputs', { false, false, false, false,
                                     false, false, false, false } )

function Main( )
    -- Switch ON 1st and 2nd digital outputs
    kit:SetProperty( 'digitalOutputs', { true, true } )
    -- Also switch ON the 7th output
    kit:SetProperty( 'digitalOutputs', 7, true )
    
    -- Read digital/analog inputs
    analogInputs  = kit:GetProperty( 'analogInputs' )
    digitalInputs = kit:GetProperty( 'digitalInputs' )
    
    for i = 1, #analogInputs do
        print( 'Analog input', i, 'is', analogInputs[i] )
    end

    for i = 1, #digitalInputs do
        print( 'Digital input', i, 'is', digitalInputs[i] )
    end
end

Now, suppose we have something connected over a serial port which implements a specific communication protocol. For example, it can be an Arduino board running a sketch which allows controlling some of its electronics by sending commands over the serial interface. For this, we can use the Serial Port communication device plug-in and implement the supported protocol using the Read/Write API. The script below demonstrates communication with an Arduino device to switch an LED on/off and query a push button's state (it is assumed the Arduino board is running the accompanying sample sketch).

local string = require 'string'

serialPort = Host.CreatePluginInstance( 'SerialPort' )
serialPort:SetProperty( 'portName', 'COM8' )

-- Use blocking input, read operations wait up to the configured
-- timeout value
serialPort:SetProperty( 'blockingInput', true )
-- Total Read Timeout = ioTimeoutConstant + ioTimeoutMultiplier * bytesRequested
serialPort:SetProperty( 'ioTimeoutConstant', 50 )
serialPort:SetProperty( 'ioTimeoutMultiplier', 0 )

function Main()
    
    if serialPort:Connect( ) then
        print( 'Connected' )
        
        -- Test IsConnected() method
        print( 'IsConnected: ' .. tostring( serialPort:IsConnected( ) ) )

        -- Let Arduino board reset and get ready
        sleep( 1500 )
        
        -- Switch LED on - send command as string
        sent, status = serialPort:WriteString( 'led_on\n' )
        
        print( 'status: ' .. tostring( status ) )
        print( 'sent  : ' .. tostring( sent ) )

        strRead, status = serialPort:ReadString( 10 )

        print( 'status  : ' .. tostring( status ) )
        print( 'str read: ' .. strRead )

        -- Switch LED off - send command as table of bytes
        sent, status = serialPort:Write( { 0x6C, 0x65, 0x64, 0x5F, 0x6F, 0x66, 0x66, 0x0A } )
        
        print( 'status: ' .. tostring( status ) )
        print( 'sent  : ' .. tostring( sent ) )

        readBuffer, status = serialPort:Read( 10 )
        
        print( 'status    : ' .. tostring( status ) )
        print( 'bytes read: ' )
        for i=1, #readBuffer do
            print( '[', i, ']=', readBuffer[i] )
        end
        
        -- Check button state
        sent, status = serialPort:WriteString( 'btn_state\n' )
        
        print( 'status: ' .. tostring( status ) )
        print( 'sent  : ' .. tostring( sent ) )

        strRead, status = serialPort:ReadString( 10 )

        print( 'status  : ' .. tostring( status ) )
        print( 'str read: ' .. strRead )
        
        if string.sub( strRead, 1, 1 ) == '1' then
            print( 'button is ON' )
        else
            print( 'button is OFF' )
        end

        -- Test that a read with no data pending returns after the timeout
        print( 'Testing timeout' )
        strRead, status = serialPort:ReadString( 10 )

        print( 'status  : ' .. tostring( status ) )
        print( 'str read: ' .. strRead )
        
        serialPort:Disconnect( )
    end
end
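
The table of bytes passed to Write() in the script above is simply the 'led_off\n' command spelled out byte by byte. The sketch below shows a helper which builds such a table from a string, together with the timeout formula quoted in the script's comment. Both function names are hypothetical and not part of the Serial Port plug-in.

```lua
local string = require 'string'

-- Hypothetical helper, not part of the Serial Port plug-in.
-- Converts a string into a table of byte values, which is the form
-- of data passed to Write() in the script above.
function StringToBytes( str )
    local bytes = { }

    for i = 1, #str do
        bytes[i] = string.byte( str, i )
    end
    return bytes
end

-- Total read timeout, following the formula from the script's comment:
-- ioTimeoutConstant + ioTimeoutMultiplier * bytesRequested
function TotalReadTimeout( constant, multiplier, bytesRequested )
    return constant + multiplier * bytesRequested
end
```

With the settings used in the script (constant of 50 and multiplier of 0), a request for 10 bytes gets a total timeout of 50, no matter how many bytes are requested.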

As we can see, adding support for device plug-ins expands the range of applications for the Computer Vision Sandbox. Indeed, there are many interesting ways of combining video processing and computer vision with different available devices.

Sandbox Scripting Threads

As demonstrated in the previous chapter, device plug-ins allow talking to a variety of different devices, making it possible to create more interactive applications. Interaction with devices can be done from the same scripts as those used to perform video processing. In many cases, however, it is preferable to put communication with devices into separate scripts instead of doing it from video processing scripts. There are a number of reasons for that. First, there may be no relation to the performed video processing at all. For example, a pan/tilt device may move the camera at certain time intervals, which don't depend on the results of video processing algorithms. Or a robot's movement can be controlled based on inputs from another device. Second, it is usually preferable to complete video processing as soon as possible, so that the video source does not get blocked. Communication with some devices, however, may involve delays caused by connection speed, protocols in use, etc. Another reason could be the requirement to interact with certain devices at time intervals which are not based on the video source's frame rate, i.e., to have more frequent interactions with some devices and less frequent with others.

To address the need to run some scripts independently of video processing, Computer Vision Sandbox has a concept of sandbox scripting threads. The sandbox wizard allows not only configuring which video processing steps to run for each camera within a sandbox, but also creating additional threads, which run specified scripts at set time intervals. For example, the screenshot below demonstrates a possible set-up for controlling the PiRex robot. The first thread runs a script for controlling the robot's motors based on the gamepad's input. To make the robot responsive enough, the thread runs the control script at 10 millisecond intervals. The second thread runs a different script, which queries distance measurements provided by the robot's ultrasonic sensor. As it is mostly informational, we chose to run it 10 times a second, i.e., at 100 millisecond intervals.

Sandbox threads

The scripts running within sandbox threads have a very similar structure to those used to perform video processing on a camera's images. They have a global section and a Main() function. The global section is executed once, when the sandbox is started (before starting any video sources). The Main() function is then executed again and again at the configured time intervals.

Let's have a look at a potential implementation of the scripts used for the sandbox threads shown above. The first script controls the robot: it changes the motors' power based on the gamepad's input. It has nothing to do with the video coming from the robot's camera and we want to run it at a higher rate than the camera's FPS, so it looks like a perfect candidate to run on its own in a sandbox thread. All it does is read the values of the gamepad's axes, convert those into motors' power values and send them to the robot so it performs the desired movement.

local math = require 'math'

gamepadPlugin = Host.CreatePluginInstance( 'Gamepad' )
pirexPlugin   = Host.CreatePluginInstance( 'PiRexBot' )

prevLeftPower  = 1000
prevRightPower = 1000

-- Configure gamepad and connect to it
gamepadPlugin:SetProperty( 'deviceId', 0 )
gamepadPlugin:Connect( )

-- Configure PiRex Bot and connect to it
pirexPlugin:SetProperty( 'address', '192.168.0.12' )
pirexPlugin:Connect( )

function Main( )
    axesValues = gamepadPlugin:GetProperty( 'axesValues' )

    -- Pushing the gamepad's axis up results in -100, down in 100,
    -- so we need to invert it here to get something that makes sense
    leftPower  = 0 - math.floor( axesValues[2] * 100 )
    rightPower = 0 - math.floor( axesValues[3] * 100 )

    -- Set motors' power, but only when it has changed
    -- (in Lua any number, including 0, is treated as true,
    -- so the values must be compared explicitly)
    if leftPower ~= prevLeftPower then
        pirexPlugin:SetProperty( 'leftMotor', leftPower )
    end

    if rightPower ~= prevRightPower then
        pirexPlugin:SetProperty( 'rightMotor', rightPower )
    end

    -- Remember motors' power
    prevLeftPower  = leftPower
    prevRightPower = rightPower
end
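
Since the control script runs every 10 milliseconds, it makes sense to send new power values to the robot only when they actually change. Note that Lua treats every number, including 0, as true, so a change test needs an explicit comparison. Below is a minimal sketch of such a test with an optional tolerance. The HasChanged name and the threshold parameter are hypothetical and not part of any plug-in.

```lua
local math = require 'math'

-- Hypothetical helper, not part of any plug-in.
-- Returns true when the current value differs from the previous one
-- by at least the given threshold. With power values rounded to
-- integers, a threshold of 1 means "any change".
function HasChanged( previous, current, threshold )
    return math.abs( previous - current ) >= threshold
end
```

Starting with a previous value of 1000, which is outside of the -100 to 100 power range, guarantees that the first call reports a change and the initial power value is sent to the robot.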

The second script runs at 100 millisecond intervals and is used to read distance measurements provided by the robot's ultrasonic sensor. We will not do much with the measurement, just display it to the user directly on the video coming from the robot's camera. This requires some image processing (drawing) to display the distance to obstacles, which means we could put the code for reading the sensor into the script doing the camera's video processing. However, as mentioned before, sensor reading may cause certain delays and we don't really want to introduce those into video processing. So, we'll separate sensor reading and measurement displaying into two scripts, which communicate by using host variables.

local string = require 'string'

pirexPlugin = Host.CreatePluginInstance( 'PiRexBot' )

-- Configure PiRex Bot and connect to it
pirexPlugin:SetProperty( 'address', '192.168.0.12' )
pirexPlugin:Connect( )

function Main( )
    -- Get distance to obstacles in front of the robot
    distance = pirexPlugin:GetProperty( 'obstacleDistance' )

    Host.SetVariable( 'obstacleDistance', string.format( '%.2f', distance ) )
end

As we can see from the above, the script only reads distance measurements and puts those into a host variable - nothing more. Obviously, this will not display anything to the user, but this is where the video processing script comes into play. Among other things we may want to do with images coming from the robot's camera, we can also output the distance measurement, which can be retrieved from the host variable.

drawing = Host.CreatePluginInstance( 'ImageDrawing' )

function Main( )
    image    = Host.GetImage( )
    distance = Host.GetVariable( 'obstacleDistance' )
    
    -- ... Perform any image processing we wish to ...
    
    -- Distance to obstacles in front of robot
    drawing:CallFunction( 'DrawText', image, 'Distance : ' .. distance,
                          { 10, 10 }, '00FF00', '00000000' )
end

The above use case demonstrates how sandbox threads can be used to perform certain actions at configured time intervals. All scripts (threading or video processing) running within a sandbox can communicate by setting/reading host variables. This allows different scenarios. A video processing routine can be driven by reading some sensors. The opposite can be done as well, i.e., a video processing script may set some variables based on the results of applied algorithms and then a threading script can read those variables and drive some device's actuators.
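
In the example above, the distance is stored in the host variable as formatted text, so a script which wants to make decisions based on it needs to convert it back into a number. Below is a minimal sketch, which picks a text color depending on how close the obstacle is. The DistanceColor name, the warning distance and the chosen colors are all hypothetical; the color strings only follow the hex format used in the DrawText call above.

```lua
-- Hypothetical helper, not part of any plug-in.
-- Takes the distance as text (as stored in the host variable) and
-- returns a hex color string for displaying it.
function DistanceColor( distanceText, warningDistance )
    local distance = tonumber( distanceText )

    if distance == nil then
        -- No valid reading available (yet)
        return 'FFFFFF'
    end

    if distance < warningDistance then
        return 'FF0000'
    end
    return '00FF00'
end
```

The check for a missing value matters because the video processing script may run before the sensor reading thread has stored its first measurement.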

To demonstrate all the above in action with some video processing on top, here is a short video of the PiRex robot controlled with gamepad to hunt for hidden glyphs.

Project's Code

The entire project's source code is available in its GitHub repository. The code is primarily developed in C/C++ to get the most out of the available resources and provide reasonable performance. It was also developed with the idea of being portable, so that eventually it could be built for platforms other than Windows. In the early stages this was really the case, with tests running on both Windows and Linux, but then more effort was put into getting something out and running. So, for now, only a Windows installation package is provided, while support for other platforms may come in future releases.

The Computer Vision Sandbox project uses a number of open source components to do image/video decoding/encoding, provide scripting capabilities, a built-in editor, etc. In addition to those, it also uses the Qt Framework to get a cross-platform user interface.

The project is built in two stages. The first is to build all external components. This usually needs to be done once to get the required libraries and binaries of all dependencies. Then the project's code itself can be built. It is possible to either build everything by running a single script or build individual components as needed, which is the common case when developing new features. Two tool chains are currently supported by the project: Visual Studio 2015 (Community Edition will work fine) and MinGW. VS is mostly used for development/debugging, while all official releases so far have been built with MinGW.

The source code of the project has grown quite substantially over the last few years, so describing its details in a single article is no longer feasible. Its foundation is the "afx" libraries, which provide common types and functions, including image processing algorithms and access to some video sources. Then a set of core libraries defines interfaces for plug-ins, their management, scripting and the backbone for running video processing sandboxes. A good collection of plug-ins implements those interfaces, providing a variety of video sources, image/video processing routines, image importing/exporting, devices, etc. Finally, some applications are provided. The main one is the Computer Vision Sandbox, which has been described in this article. Another useful one is Computer Vision Sandbox Scripts Runner (cvssr), which is a command line tool to run simple scripts for image processing, interaction with devices, etc. The provided collection of plug-ins can potentially be re-used in other applications as well, as the project provides a C++ library for their loading and management.

Conclusion

Well, this is it for now about the Computer Vision Sandbox project. Although the article does not provide a detailed description of every single feature implemented in the project, it does provide a good review of the key features and how to use them for different applications. The project's web site provides additional tutorials describing the rest of the features in detail and giving more examples of how to use them.

As stated in the beginning, the idea was to build a project which allows implementing different applications from various areas of computer vision. It was made very modular, so that individual features are delivered as plug-ins. Depending on the type of plug-ins and the way those are combined, very different results can be achieved. And if a new camera, image processing routine, device, etc. needs to be supported - just add a new plug-in, no need to go deep into the main application's code.

While doing different computer vision related projects in the past, I usually ended up making a new application for each new project. Now, however, I try to do it just as a script. And if I find something missing, I develop a new plug-in. No more additional applications for different things, one is enough.

Having been available for a number of years (although not in open source form), the project has been used successfully to implement different applications. Some of them are hobby projects, but some are in the areas of process automation used in labs/production. I hope the Computer Vision Sandbox project will continue to evolve and that more and more interesting applications will be developed based on it.