Real-Time Computer Vision on Android using BoofCV
A simple tutorial on how to create an Android application that processes video in real time.
Introduction
In this article, a step-by-step tutorial will be given for writing a simple computer vision application on Android devices using BoofCV. By the end of the tutorial, you will know how to process a video feed, compute the image gradient, visualize the gradient, and display the results. For those of you who don't know, BoofCV is an open source computer vision library written in Java, making it a natural fit for Android devices.
Most of this article is actually going to deal with the Android API, which is a bit convoluted when it comes to video streams. Using BoofCV on Android is very easy and requires no modification or native code. It has become even easier to use since BoofCV v0.13 was released, with its improved Android integration package.
Before we start, the BoofCV project website is a good place to familiarize yourself with the library and its capabilities. I'm assuming that you are already familiar with Android and the basics of Android development.
The source code for this tutorial can be downloaded from the link at the top. This project is entirely self contained and has all the libraries you will need.
Changes
Since this article was first posted (February 2013), it has been modified (January 2014) to address the issues raised by "Member 4367060"; see the discussion below. While the effects of these changes are not readily apparent due to how the project is configured, the code is now more correct and can be applied to other configurations with fewer problems.
- Application can now be loaded onto Nexus 7 devices
- Front facing images are flipped for correct viewing
- Fixed a bug where camera capture doesn't start again if surfaceCreated isn't called
BoofCV on Android
As previously mentioned, BoofCV is a Java library, which means the library does not need to be recompiled for Android. Jars found on its download page can be used without modification. For Android-specific functions, make sure you include BoofCVAndroid.jar, which is part of the standard jar download or can be compiled yourself. See the project website for additional instructions.
The key to writing fast computer vision code on Android is efficient conversion between image formats. Using the RGB accessor functions in Bitmap is painfully slow, and there is no good built-in way to convert NV21 (the video image format). This is where the Android integration package comes in. It contains two classes which will make your life much easier.
- ConvertBitmap
- ConvertNV21
Use those classes to convert from Android image types into BoofCV image types. Here are some usage examples:
// Easiest way to convert a Bitmap into a BoofCV type
ImageUInt8 image = ConvertBitmap.bitmapToGray(bitmap, (ImageUInt8)null, null);
// From NV21 to gray scale
ConvertNV21.nv21ToGray(bytes,width,height,gray);
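Going the other direction for display purposes is just as simple. As a sketch, assuming the grayToBitmap() overload in the Android integration package (check the BoofCV javadoc for the exact signature):
// Render a processed gray image back into a Bitmap for display.
// Passing null for storage lets the function allocate its own working memory.
Bitmap bitmap = Bitmap.createBitmap(gray.width, gray.height, Bitmap.Config.ARGB_8888);
ConvertBitmap.grayToBitmap(gray, bitmap, null);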
Capturing Video on Android
On Android, you capture a video stream by listening in on the camera preview. To make matters more interesting, they try to force you to display a preview at all times. Far from the best API that I've seen for capturing video streams, but it's what we have to work with. If you haven't downloaded the example code, now would be a good time to do so.
Video on Android Steps:
- Open and configure camera
- Create a SurfaceHolder to display the camera preview
- Add a view on top of the camera preview view for display purposes
- Provide a listener for the camera's preview
- Start camera preview
- Perform expensive calculations in a separate thread
- Render results
Before you can access the camera, you must first add the following to AndroidManifest.xml.
<uses-permission android:name="android.permission.CAMERA" />
<uses-feature android:name="android.hardware.camera" android:required="false" />
<uses-feature android:name="android.hardware.camera.autofocus" android:required="false" />
If this is a computer vision application that must use a camera to work, why do we declare that a camera is not required? It turns out that if you declare the camera as required, you will exclude devices with only a forward-facing camera (such as some tablets) from the Play store! In newer versions of the Android OS, there is apparently some way to get around this issue.
Take a look at VideoActivity.java. Several important activities take place in onCreate() and onResume():
- View for displaying the camera preview is created and configured.
- View for rendering the output is added.
- The camera is opened and configured.
@Override
public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);

    requestWindowFeature(Window.FEATURE_NO_TITLE);
    setContentView(R.layout.video);

    // Used to visualize the results
    mDraw = new Visualization(this);

    // Create our Preview view and set it as the content of our activity.
    mPreview = new CameraPreview(this,this,true);

    FrameLayout preview = (FrameLayout) findViewById(R.id.camera_preview);
    preview.addView(mPreview);
    preview.addView(mDraw);
}

@Override
protected void onResume() {
    super.onResume();
    setUpAndConfigureCamera();
}
The variable mPreview, which is an instance of CameraPreview (discussed below), is required to capture video images. mDraw draws our output on the screen. FrameLayout allows two views to be placed on top of each other, which is exactly what's being done above.
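For reference, a minimal sketch of what res/layout/video.xml might contain; the actual layout file in the example project may differ:
<!-- A FrameLayout stacks its children, so the mPreview and mDraw views
     added in onCreate() end up sitting on top of each other. -->
<FrameLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:id="@+id/camera_preview"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent" />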
Configuring the Camera
VideoActivity.onResume() invokes the setUpAndConfigureCamera() function. This function opens a camera using selectAndOpenCamera(), configures it to capture a smaller preview image, starts a thread for processing the video, and passes the camera to mPreview.
private void setUpAndConfigureCamera() {
    // Open and configure the camera
    mCamera = selectAndOpenCamera();

    Camera.Parameters param = mCamera.getParameters();

    // Select the preview size closest to 320x240
    // Smaller images are recommended because some computer vision operations are very expensive
    List<Camera.Size> sizes = param.getSupportedPreviewSizes();
    Camera.Size s = sizes.get(closest(sizes,320,240));
    param.setPreviewSize(s.width,s.height);
    mCamera.setParameters(param);

    // declare image data
    ....

    // start image processing thread
    thread = new ThreadProcess();
    thread.start();

    // Start the video feed by passing it to mPreview
    mPreview.setCamera(mCamera);
}
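The closest() helper used above isn't shown in this article. A minimal sketch of how such a helper might be written (the version in the downloadable project may differ):
// Hypothetical helper: returns the index of the supported preview size
// whose dimensions are closest to the requested width and height.
private static int closest( List<Camera.Size> sizes , int width , int height ) {
    int best = -1;
    int bestScore = Integer.MAX_VALUE;

    for( int i = 0; i < sizes.size(); i++ ) {
        Camera.Size s = sizes.get(i);

        // score a candidate by its squared distance from the target size
        int dx = s.width - width;
        int dy = s.height - height;
        int score = dx*dx + dy*dy;

        if( best == -1 || score < bestScore ) {
            best = i;
            bestScore = score;
        }
    }

    return best;
}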
A good practice when dealing with video images on Android is to minimize the amount of time spent in the preview callback. If your processing takes too long, it will cause a backlog and eventually a crash, which is why we start a new thread in the above function. The very last line in the function passes the camera to mPreview so that the preview can be displayed and the video stream started.
Why not just call Camera.open() instead of selectAndOpenCamera()? Camera.open() will only return the first back-facing camera on the device. In order to support tablets with only a forward-facing camera, we examine all the cameras and return the first back-facing camera, or any forward-facing one we can find. Also note that flipHorizontal is set to true for forward-facing cameras. This is required for them to be viewed correctly.
private Camera selectAndOpenCamera() {
    Camera.CameraInfo info = new Camera.CameraInfo();
    int numberOfCameras = Camera.getNumberOfCameras();

    int selected = -1;

    for (int i = 0; i < numberOfCameras; i++) {
        Camera.getCameraInfo(i, info);

        if( info.facing == Camera.CameraInfo.CAMERA_FACING_BACK ) {
            selected = i;
            flipHorizontal = false;
            break;
        } else {
            // default to a front facing camera if a back facing one can't be found
            selected = i;
            flipHorizontal = true;
        }
    }

    if( selected == -1 ) {
        dialogNoCamera();
        return null; // won't ever be called
    } else {
        return Camera.open(selected);
    }
}
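dialogNoCamera() simply tells the user there is no camera and exits. Something along these lines would work (a sketch, not the exact code from the project):
// Sketch: display a blocking dialog and kill the app when it is dismissed.
// Since the dialog never returns control to the caller, the "return null"
// above is effectively unreachable.
private void dialogNoCamera() {
    AlertDialog.Builder builder = new AlertDialog.Builder(this);
    builder.setMessage("Your device has no cameras!")
            .setCancelable(false)
            .setPositiveButton("OK", new DialogInterface.OnClickListener() {
                public void onClick(DialogInterface dialog, int id) {
                    System.exit(0);
                }
            });
    builder.create().show();
}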
Camera Preview View
CameraPreview.java's task is to placate Android and "display" the camera preview so that it will start streaming. Android requires that the camera preview be displayed no matter what. CameraPreview can display the preview or hide it by making it really small. It's also smart enough to adjust the display size so that the original camera image's aspect ratio is maintained. For the sake of brevity, a skeleton of CameraPreview is shown below. See the code for details.
public class CameraPreview extends ViewGroup implements SurfaceHolder.Callback {

    CameraPreview(Context context, Camera.PreviewCallback previewCallback, boolean hidden ) {
        // provide context, camera callback function and specify if the preview should be hidden
        ...

        // Create the surface for displaying the preview
        mSurfaceView = new SurfaceView(context);
        addView(mSurfaceView);

        // Install a SurfaceHolder.Callback so we get notified when the
        // underlying surface is created and destroyed.
        mHolder = mSurfaceView.getHolder();
        mHolder.addCallback(this);
        // deprecated setting, but required on Android versions prior to 3.0
        mHolder.setType(SurfaceHolder.SURFACE_TYPE_PUSH_BUFFERS);
    }

    public void setCamera(Camera camera) {
        ...
        if (mCamera != null) {
            // Without calling startPreview() here the video will not
            // wake up under certain conditions
            startPreview();
            requestLayout();
        }
    }

    protected void onMeasure(int widthMeasureSpec, int heightMeasureSpec) {
        // adjusts the setMeasuredDimension to hide the preview
        // or ensure that it has the correct aspect ratio
        ...
    }

    protected void onLayout(boolean changed, int l, int t, int r, int b) {
        // adjust the size of the layout so that the aspect ratio is maintained
        ...
    }

    public void surfaceCreated(SurfaceHolder holder) {
        startPreview();
    }

    protected void startPreview() {
        try {
            // setPreviewDisplay() throws a checked IOException
            mCamera.setPreviewDisplay(mHolder);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        mCamera.setPreviewCallback(previewCallback);
        mCamera.startPreview();
    }
}
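The aspect-ratio logic hidden behind the onMeasure() stub above boils down to shrinking one dimension until the view matches the preview's shape. A hedged sketch, where 'hidden' and 'mPreviewSize' are assumed fields of CameraPreview and not names from the article:
// Sketch of onMeasure(): collapse the view to almost nothing when the
// preview should be hidden, otherwise match the preview's aspect ratio.
@Override
protected void onMeasure(int widthMeasureSpec, int heightMeasureSpec) {
    int width = resolveSize(getSuggestedMinimumWidth(), widthMeasureSpec);
    int height = resolveSize(getSuggestedMinimumHeight(), heightMeasureSpec);

    if( hidden ) {
        // a tiny view keeps Android happy while being invisible in practice
        setMeasuredDimension(2, 2);
    } else if( mPreviewSize != null ) {
        double ratio = mPreviewSize.width/(double)mPreviewSize.height;
        if( width > height*ratio ) {
            width = (int)(height*ratio);
        } else {
            height = (int)(width/ratio);
        }
        setMeasuredDimension(width, height);
    } else {
        setMeasuredDimension(width, height);
    }
}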
Processing the Camera Preview
Each time a new frame has been captured by the camera, the function below will be called. The function is defined in
VideoActivity
,
but a reference is passed to CameraPreview
since it handles initialization of the preview. The amount of processing done in this function is kept to the bare minimum to avoid causing a backlog.
/**
 * Called each time a new image arrives in the data stream.
 */
@Override
public void onPreviewFrame(byte[] bytes, Camera camera) {

    // convert from NV21 format into gray scale
    synchronized (lockGray) {
        ConvertNV21.nv21ToGray(bytes,gray1.width,gray1.height,gray1);
    }

    // Can only do trivial amounts of image processing inside this function or else bad stuff happens.
    // To work around this issue most of the processing has been pushed onto a thread and the call below
    // tells the thread to wake up and process another image
    thread.interrupt();
}
The last line in onPreviewFrame() invokes thread.interrupt(), which wakes up the image processing thread; see the next code block. Note the care that is taken to avoid having onPreviewFrame() and run() manipulate the same image data at the same time, since they run in different threads.
@Override
public void run() {
    while( !stopRequested ) {

        // Sleep until it has been told to wake up
        synchronized ( Thread.currentThread() ) {
            try {
                wait();
            } catch (InterruptedException ignored) {}
        }

        // process the most recently converted image by swapping image buffers
        synchronized (lockGray) {
            ImageUInt8 tmp = gray1;
            gray1 = gray2;
            gray2 = tmp;
        }

        if( flipHorizontal )
            GImageMiscOps.flipHorizontal(gray2);

        // process the image and compute its gradient
        gradient.process(gray2,derivX,derivY);

        // render the output in a synthetic color image
        synchronized ( lockOutput ) {
            VisualizeImageData.colorizeGradient(derivX,derivY,-1,output,storage);
        }
        mDraw.postInvalidate();
    }
    running = false;
}
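When the activity pauses, the thread and camera need to be shut down cleanly. A hedged sketch of how onPause() might do this, assuming 'stopRequested' and 'running' are volatile fields of ThreadProcess matching the flags used in run() above:
// Sketch of onPause(): stop the processing thread and release the camera.
@Override
protected void onPause() {
    super.onPause();

    // tell the thread to stop, then wake it up so it can exit its loop
    if( thread != null ) {
        thread.stopRequested = true;
        thread.interrupt();
        thread = null;
    }

    // release the camera so that other applications can use it
    if( mCamera != null ) {
        mPreview.setCamera(null);
        mCamera.setPreviewCallback(null);
        mCamera.stopPreview();
        mCamera.release();
        mCamera = null;
    }
}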
As mentioned above, all of the more expensive image processing operations are done in this thread. The computations in this example are actually minimal, but they are done in their own thread for the sake of demonstrating best practices. After the image is done being processed, it will inform the GUI that it should update the display by calling mDraw.postInvalidate()
. The GUI thread will then wake up and draw our image on top of the camera preview.
This function is also where BoofCV does its work. The gradient object, declared earlier as shown below, computes the image gradient. After the image gradient has been computed, it's visualized using BoofCV's VisualizeImageData class. That's it for BoofCV in this example.
ImageGradient<ImageUInt8,ImageSInt16> gradient =
FactoryDerivative.three(ImageUInt8.class, ImageSInt16.class);
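For completeness, here is a hedged sketch of the "// declare image data" portion elided from setUpAndConfigureCamera() earlier, using the preview size s selected there. The names match the snippets above, but check the downloaded source for the exact code:
// gray scale images that onPreviewFrame() and run() swap between
gray1 = new ImageUInt8(s.width, s.height);
gray2 = new ImageUInt8(s.width, s.height);
// storage for the image gradient along the x and y axes
derivX = new ImageSInt16(s.width, s.height);
derivY = new ImageSInt16(s.width, s.height);
// Bitmap the visualization is rendered into, plus working memory for conversion
output = Bitmap.createBitmap(s.width, s.height, Bitmap.Config.ARGB_8888);
storage = ConvertBitmap.declareStorage(output, storage);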
Visualization Display
After the preview has been processed, the results are displayed. The thread discussed above updates
a Bitmap
image 'output' which is displayed in the view below. Note how the threads are careful to avoid stepping on each others feet when reading/writing to 'output'.
/**
 * Draws on top of the video stream for visualizing computer vision results
 */
private class Visualization extends SurfaceView {

    Activity activity;

    public Visualization(Activity context ) {
        super(context);
        this.activity = context;

        // This call is necessary, or else the
        // draw method will not be called.
        setWillNotDraw(false);
    }

    @Override
    protected void onDraw(Canvas canvas){

        synchronized ( lockOutput ) {
            int w = canvas.getWidth();
            int h = canvas.getHeight();

            // fill the window and center it
            double scaleX = w/(double)output.getWidth();
            double scaleY = h/(double)output.getHeight();

            double scale = Math.min(scaleX,scaleY);
            double tranX = (w-scale*output.getWidth())/2;
            double tranY = (h-scale*output.getHeight())/2;

            canvas.translate((float)tranX,(float)tranY);
            canvas.scale((float)scale,(float)scale);

            // draw the image
            canvas.drawBitmap(output,0,0,null);
        }
    }
}
Conclusion
Now you know how to perform computer vision using a video stream on the Android platform with BoofCV! Let me know if you have questions or comments.