
Sony’s N – The Smart Wireless Headset that You Can Program

23 Feb 2017 · CPOL · 12 min read
Writing your own segments for Nigel with the Future Lab Program

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

I’ve spent a lot of time using wireless headsets with my mobile phone. Let's face it: no one wants to drive, walk their dog, run on a treadmill, or work in the kitchen while holding a phone up to their ear. Even worse, you don’t want a wire hanging out of your ear, catching on things and disturbing whatever you may be doing on your phone. On top of that, most wireless headsets today are simple Bluetooth speakers with no real functionality beyond amplifying what your phone is doing.

As a consumer, I’m looking for a smartphone accessory that fundamentally upgrades my mobile communication and computing experience. With Sony’s N, I think they’re on to something very promising.

N – What’s in a Name?

N? Sony is shipping a device called "N"? Well, sort of ... In an age where companies are betting billions on devices and need to make the launch of the next phone or tablet successful, how do you succeed without a little market research? Better yet, the software industry has embraced agile methodologies, so why not take a similar approach with hardware? That is exactly what Sony is doing.

N is a fully realized prototype that software developers can get their hands on to start testing and exploring, in the name of feedback and market research for Sony. It's not a fully functional device, and it's clearly short of a few features that could be added to make it a complete phone accessory. It's no surprise that Sony omitted them, so that its engineers can iterate and continue to improve the device.

To complete the concept of agile hardware development, Sony has not even given the product a full codename for the public. As technologists we’ve heard memorable codenames like Chicago over the years, but this device is only labeled ‘N’, just a few characters short of a cool codename. That’s just the point though ... if the feedback isn’t strong and developers aren’t eager to work with the device, it can be quietly discontinued and Sony can quickly move on to constructing the next innovative device.

What does ‘N’ look like?

Here is what my unboxing experience felt like when I received the device in the mail. It arrived in a simple unmarked white box, and when I opened it I found this cylindrical box with a label on top:

Image 1

That is a cool initial impression, but what’s inside? I took off the top, or what I thought was the top, and found this:

Image 2

The N device is a horseshoe-shaped headset that rests on your neck. I’ve been calling it a neckband. I’ve seen wireless headsets like this before, so I was worried that there were no wired earphones included to connect to the horseshoe. That was when I discovered that there was another layer to the round box. When I lifted it, I found the following:

Image 3

There’s a USB power adapter and some strange looking earbuds. I’ve been using Apple EarPods on and off for the last few years, and these looked NOTHING like EarPods:

Image 4

Umm, someone forgot to close the back of the earphone, because I can see right through it. I read through the instructions, plugged the headphones into the neckband and powered it up. After about 60 seconds, N started and I placed it on my neck. The earphones were an interesting fit, clipping around my earlobe and allowing me to hear everything around me while no sound was being played through the device.

I downloaded and installed the Future Lab Program N application on my iPhone and was walked through a simple tutorial that showed me some of the cool features of the device.

Image 5    Image 6

What Makes N Different?

There is an app to install on your mobile phone to connect to and work with the headset. I downloaded the app to my iPhone 6s Plus and started through its tutorial to learn how to use the device. I tried some of the built-in news stations that let me get weather and up-to-date information from internet sources, accessed through my phone’s wireless connection. I could control the headset by saying "Listen Up Nigel" and then issuing various voice commands. My wife and kids started to find it amusing that I would start talking to ‘Nigel’ while walking around the kitchen, walking the dog, or even in the car while driving.

I soon realized that N was more than just a neckband with cool headphones: it’s a complete computing experience with a camera and an on-board drive that I could access. I could write extensions in JavaScript that could be loaded onto the device to enable more functionality. This is what I was looking for: a visionary piece of gear that I could extend with software to make my own and significantly improve my mobile experience.

I would start playing music or a podcast on my phone, and I could listen to it directly through the headset, as you would expect. Those pass-through earbuds are very handy when you’re listening to music and someone wants to start a conversation. You can hear perfectly fine, and the comparison I would draw is that it feels like there is a loud stereo turned on near you and you can still hear the folks in the room with you.

On the other hand, N doesn’t know how to start or manage phone calls. As a daily phone call headset, it doesn’t offer any features for the phone app on your mobile device ... yet. It does have a small drive available that stores the pictures that you instruct Nigel to capture by issuing the ‘photo’ command. You can also instruct Nigel to "start interval photo shooting" and the camera will capture photos every minute until instructed to stop. That’s an interesting concept: take pictures of everything I’m doing and at the end of the session I can review what Nigel captured by downloading the pictures from the headset’s on-board drive.

In order to access the content of the drive with anything other than your voice, you need to connect the headset to your PC or Mac with the provided micro-USB cable. The headset appears to your computer as another drive, and you can perform normal file management operations on it. Besides photos, the drive can also store local copies of your favorite music files in MP3 format for easy playback without needing to be connected to the phone. You can, of course, stream music from your phone like a normal Bluetooth headset.

Get Started with Development

This is the neat part of the device: writing your own segments for Nigel to obey and do some work for you. First, you will need to sign up for the Future Lab Program and download the SDK from their Getting Started page. The SDK is a Node.js package that you will need to download and configure using npm. The instructions on the Getting Started page walk you through placing the libraries on disk and building their Electron-based developer tool, called the "Segment Developer Tools".

Image 7

Once this is installed, you can start writing the JSON and JavaScript files that will define your segment. For this sample, I’m going to write a segment that reads the latest three headlines from the MSDN Web Development blog. To get started, create a new folder for your segment with two child folders: app and res. The app folder will contain the code that runs your application, and res will contain JSON configuration files describing the segment.

At the root level of the folder, you need to create three files to define your application:

  • LaunchRules.json – a set of definitions of how your segment will be started by an N user
  • Manifest.json – segment metadata such as the name of the segment, author, description, and version information
  • index.html – contains pointers to the JavaScript files that your segment needs to run

The LaunchRules.json file has a very simple format, and for this segment where I just want a simple "check web dev blog" command to start the segment, I can write:

{
    "voicePattern": [{
            "domain": "CUSTOMAPP",
            "name": ["web dev blog"]
    }]
}

The domain is required to be "CUSTOMAPP", and the name array contains the name of the command that launches the segment. This value should be all lowercase.
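Since name is an array, it appears that you can register more than one launch phrase for the same segment. I have not verified this against the SDK documentation, so treat it as an assumption, but a variant with an extra phrase would look like this (again, all lowercase):

{
    "voicePattern": [{
            "domain": "CUSTOMAPP",
            "name": ["web dev blog", "web development news"]
    }]
}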

Next, the Manifest.json needs to contain some simple information about the segment. For this blog reader, I wrote the following:

{
    "package": "MSDN.Blog.WebDev",
    "manifestVersion": 1,
    "name": "MSDN WebDev Blog Latest",
    "shortName": "MSDN_WebDev_Latest",
    "author": {
        "name": "Jeffrey T. Fritz"
    },
    "description": "The latest news from the MSDN Web Development Blog",
    "version": 1,
    "versionName": "1.0.2"

}

This is the minimal amount of information to provide about the segment. Of note: "manifestVersion" is required to be 1 and "version" is required to be an integer. I’ve incremented the "versionName" string because I’ve already found two bugs and patched them in my demo code.

Let’s configure the index.html file to contain pointers to my application’s JavaScript files. This is a simple HTML file containing only script elements that reference my JavaScript:

HTML
<html>
    <head>
        <script type="text/javascript" src="app/lib/jquery-2.1.4.min.js"></script>
        <script type="text/javascript" src="app/blog.js"></script>
    </head>
    <body>
        <!-- Not used -->
    </body>
</html>

Next, we need to start writing some resources for configuring the segment. In the res folder, we need to create a "scripts" child folder and place a "scripts.json" file in that folder. This file defines the music, the sounds, or the "script" of the segment as it is launched and concluded. A simple default file contains the following entries which trigger some nice music for intro and outro on the device:

{
  "segmentStart": {
    "common": "start segment"
  },
  "segmentEnd": {
    "common": "stop segment"
  }
}

The final two pieces of configuration go in the res/settings folder. These two files are schema.json and defaultValues.json. schema.json defines the various settings of the segment, including the description to show in the mobile application, and defaultValues.json defines the default values of those settings. My schema.json contained the following:

{
  "schema": {
    "scope": "MSDN.Blog.WebDev",
    "version": "1.0",
    "settings": [{
        "key": "description",
        "displayName": "A summary of the latest news from the MSDN Web Development Blog.\nThis segment starts with the following voice commands:\n - Start web dev blog\n - Check web dev blog\n\njeff@jeffreyfritz.com",
        "layout": "description"
    }]
  }
}

There are additional settings that you can apply to this file in order to capture other settings from the segment’s configuration page in the N mobile application. The defaultValues.json that corresponds to this schema is:

{
  "defaultValues": {
    "scope": "MSDN.Blog.WebDev",
    "version": "1.0",
    "settings": { }
  }
}
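To illustrate the additional settings mentioned above, here is a purely hypothetical example: a setting with the key itemCount that a user could adjust from the segment’s configuration page. The key, display name, layout value, and default are all mine; check the Future Lab documentation for the layout types the schema actually supports. The schema.json settings array would gain an entry like:

{
    "key": "itemCount",
    "displayName": "Number of headlines to read",
    "layout": "text"
}

and the settings object in defaultValues.json would supply its starting value:

"settings": {
    "itemCount": 3
}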

The scope in both files should match the package name defined in the Manifest.json. With this metadata configured and the launch information saved, I can start writing some code in my app folder. In order to simplify fetching data using AJAX and promises, I grabbed a copy of jQuery and placed it in my app folder. Next, I started writing a script file called blog.js that would contain all of my business logic for fetching and reading blog posts.
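At this point, the full layout of my segment folder looks roughly like this (the root folder name is whatever you created it as, and lib is simply where I dropped jQuery to match the reference in index.html):

web-dev-blog/
├── LaunchRules.json
├── Manifest.json
├── index.html
├── app/
│   ├── blog.js
│   └── lib/
│       └── jquery-2.1.4.min.js
└── res/
    ├── scripts/
    │   └── scripts.json
    └── settings/
        ├── schema.json
        └── defaultValues.json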

In the blog.js file, we can reference and use the main object exposed by the API, called "da". It exposes two primary event handlers that I configured: segment.onpreprocess and segment.onstart. In the onpreprocess event, I wrote some code to reach out to the RSS feed for the blog and parse the entries into JavaScript objects that I could instruct Nigel to read during the onstart event. My onpreprocess handler looks similar to this:

da.segment.onpreprocess = function(trigger, args) {
    
    var blogData = new BlogData(); 
    blogData.load().done(function(errorCode) {

        if (errorCode == 0) {

            da.startSegment(null, {
                args: {
                    blog: blogData.entries
                }
            });

        }

    });

}

My BlogData object knows how to use jQuery AJAX to fetch the RSS feed and return the results as a series of items in the blogData.entries property. You can review that code in the attached sample code download. The onstart handler is where things get a little more interesting, because we can start to use the speech synthesizer to read text to our user:
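For reference, here is a minimal sketch of what a BlogData object along these lines could look like. The feed URL and the three-item limit are my assumptions (the real implementation is the one in the download); it resolves with 0 on success and a non-zero error code on failure, matching the check in the onpreprocess handler:

// Minimal sketch of BlogData – assumes jQuery is already loaded via index.html
function BlogData() {
    this.entries = [];
}

BlogData.prototype.load = function () {
    var self = this;
    var result = $.Deferred();

    $.ajax({
        url: "https://blogs.msdn.microsoft.com/webdev/feed/", // assumed RSS feed URL
        dataType: "xml"
    }).done(function (xml) {
        // Keep the latest three <item> elements from the RSS feed
        $(xml).find("item").slice(0, 3).each(function () {
            self.entries.push({
                title: $(this).find("title").text(),
                description: $(this).find("description").text()
            });
        });
        result.resolve(0);   // 0 = success, as expected by onpreprocess
    }).fail(function () {
        result.resolve(1);   // non-zero = the feed could not be fetched
    });

    return result.promise();
};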

da.segment.onstart = function (trigger, args) {
    var synthesis = da.SpeechSynthesis.getInstance();

    synthesis.speak(args.blog.length + " new items from the M S D N Web Development Blog", {
        onend: function() {
            SpeakNextStory(synthesis, args.blog);
        }
    });

};

var storyCounter = 0;

function SpeakNextStory(synthesis, blogEntries) {

    if (storyCounter >= blogEntries.length) { 
        synthesis.speak("That's all from the M S D N Web Development Blog", {
            onend: function(){
                console.log("Done reading stories");
                da.stopSegment();
            }
        });
    } else {
        synthesis.speak(blogEntries[storyCounter].title + blogEntries[storyCounter].description, {
            onend: function(){
                console.log("Story " + storyCounter);
                storyCounter++;
                SpeakNextStory(synthesis, blogEntries);
            }
        });
         
    }

}

I grabbed an instance of the speech synthesizer with the da.SpeechSynthesis.getInstance() call and used the onend callbacks exposed by that API to ensure that my code runs at the end of each sentence being read. I chained calls to the SpeakNextStory method recursively so that each story is read only after the previous one finishes. Once the final story has been read, a wrap-up sentence is spoken and the stopSegment method is called.

stopSegment signals to the device that this segment has completed processing and should be deallocated.

Testing the Segment

Using the Segment Developer Tools application, I can click the blue ‘Add’ button above the Segment List section to browse to my folder of code and add my segment to the list. You can see in the image below what it looks like when I add my segment:

Image 8

I’ve drawn an arrow to the Launch button for this segment, because I can click it and listen to how the device would interpret and run my application. Under the ‘Debug’ menu item is a ‘Segment DevTools’ option that lets me use the familiar Chrome JavaScript debugging tools to test and step through my segment’s code.

Once my code is working the way I want it to, I can click the ‘Export’ button in the middle of the screen to compile my code and configuration into a CPK file that can be uploaded to the Future Lab Program website and activated on my device.

Image 9

Click the ‘Developer Profile’ link on the left side and then choose ‘Upload Applications’ to locate and upload your CPK file to the Future Lab Program website. Once it is uploaded, you will see it in the Manage Applications area, as shown in the image above. You can click the ‘Manage’ button next to your segment name to request that your segment be published for the public to use. With my segment in the ‘Test Applications’ area, I can find my MSDN Blog segment in the ‘Segments’ area of the mobile app.

From there, I can activate it and start asking my N device "check web dev blog" to report on the last three articles on the blog.

Image 10

Summary

We learned about Sony’s exciting new Future Lab Program and how to sign up to get involved with the program. It’s easy to download the Node.js SDK and start using the emulator to debug code that would run on the device. If you’re really interested in getting involved, you can request loaner hardware from Sony through the developer site. Sign up for Sony's Future Lab Program and get started building voice-activated applications to improve your mobile computing experience.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Program Manager
United States
Jeff Fritz is a senior program manager in Microsoft’s Developer Division working on the .NET Community Team. As a long time web developer and application architect with experience in large and small applications across a variety of verticals, he knows how to build for performance and practicality. Four days a week, you can catch Jeff hosting a live video stream called 'Fritz and Friends' at twitch.tv/csharpfritz. You can also learn from Jeff on WintellectNow and Pluralsight, follow him on twitter @csharpfritz, and read his blog at jeffreyfritz.com
