Advanced JSON Form Specification - Chapter 1: Introduction

Don Fizachi

Jul 18, 2016

LGPL3

A JSON form specification

Introduction

This 8-chapter article presents a JSON specification for defining dynamic forms that can be opened and filled out on any platform that understands JSON. These advanced JSON forms can collect data ranging from simple alphanumeric values to multimedia files and location-based information. Constraints can be built into the forms to ensure that only valid values are accepted as inputs, and skip logic allows inputs to be skipped where necessary. Many more features of the specification are better described through examples. To this end, this multi-chapter article uses an Android app to describe the specification and, among other features, demonstrates the following:

20 different input screen types
Metadata collection
Form screen skip logic
Repeat screens
Forms associated with encryption keys
Multi-language support

Note that the reader should be familiar with the JSON Schema specification to understand the rest of this work. See here for a quick introduction.

Background (Skip this Section If You Want To)

Consider a scenario where a large-scale data survey has to be conducted. In this scenario, each data collector or enumerator is equipped with a handheld device, e.g., a mobile phone or tablet computer. The enumerators download the survey forms from a remote server to their devices and then go into the field to collect data as defined by the parameters of the downloaded form. When an instance of the form is complete, it is uploaded to the server immediately or as soon as device connectivity is established. At the end of the survey, the data is analyzed. This scenario describes, in a nutshell, my primary preoccupation over the last couple of years.

A number of features associated with electronic data collection helped my team and me assure an acceptable level of data quality. For example, the constraints built into the electronic forms prevented enumerators from entering data that would typically be considered invalid. Because we were not collecting data with pen and paper and then converting it to an electronic equivalent, the number of errors in the data was reduced further.

The ODK community, more than any other entity, has done a lot to advance the field of large-scale electronic data collection. Indeed, our primary tool for data collection until recently was ODK Collect. The ODK tools are open source, which means they can be built upon or extended, as a number of people have done in the past.

We needed more features from our data collection tools, and two options came to mind: build on top of ODK or start from scratch. We decided to start afresh for the following reasons, among others:

We wanted the format in which the data is collected to match the format in which it is stored, transported, presented, queried and analyzed. As noted in this article, JavaScript Object Notation (JSON) was chosen as the preferred format. It turned out that, in most of our use cases, JSON forms were much smaller than similar XML forms implemented by ODK. To be fair to the ODK community, at the inception of ODK, JSON was still in its infancy and XML (by way of ODK XForms) was the only viable platform-agnostic format for form definition.
We sought to use the built-in constraint parameters of JSON Schema (version 4) for form input validation.
For scenarios that required a very high degree of confidentiality, we wanted each completed form instance to be encrypted with a unique ephemeral key and a robust symmetric encryption protocol. In our case, the Diffie–Hellman (D–H) key exchange mechanism is used to compute the disposable keys and the Advanced Encryption Standard (AES) is used for encryption.
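
To illustrate the general idea behind the last point, the following is a minimal Python sketch of a Diffie–Hellman key agreement that yields a 32-byte symmetric key. It is not the mechanism used by the specification (that is covered in chapter 8). The group parameters P and G, the function names and the use of SHA-256 to turn the shared secret into a key are all assumptions made for illustration, and the toy prime is far too small for real use. The AES step is omitted because the Python standard library does not provide it.

```python
import hashlib
import secrets

# Toy group parameters, for illustration only (assumption of this sketch).
# 2**127 - 1 is a known Mersenne prime, but it is far too small to be secure.
P = 2**127 - 1
G = 3


def make_keypair():
    """Return a (private, public) pair for the toy Diffie-Hellman group."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def derive_key(own_private, peer_public):
    """Hash the shared secret into 32 bytes, the size of an AES-256 key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()
```

In such a scheme, the form would carry one party's public value, and the device would generate a fresh key pair for every completed instance, so that each instance is encrypted under a different key.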

Taking inspiration from the works of the ODK community, within context, we set out to build the novel JSON advanced form specification described in this 8-chapter article.

Using the Android App

As mentioned in the introductory section, this advanced JSON form specification is engineered primarily for display on graphical user interfaces (GUIs). An Android app called CCA Mobile, courtesy of CCA System, is provided alongside this article to demonstrate how forms are displayed on a GUI.

To get started, download and install the app on an Android device. Also, download the file named City Census Form.zip. Extract the JSON form from the ZIP file and copy it to your Android device.

Click on the "Forms" button in the app and then click on the "Click to get new form" button to open the "City Census.json" form that you copied to your device. With that done, click on the form and fill it out by swiping left or right and entering values. Save the form (or not) when you get to the end.

To see what a completed form instance looks like, go back to the home screen of CCA Mobile and click on the "Backup" button. Click on the "City Census" form. Select the completed instances to back up. Click on the Backup button at the bottom of the screen. The selected instances will be written to the directory "/CcaMobile/Backup/" on your SD card. Now open the backed-up JSON instance file using a text editor or a similar tool.
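
If the backed-up instance file is written on a single line, it can be easier to read after re-indenting it. The following is a small Python sketch that does this once the file has been copied to a computer. The function name is hypothetical, and the sketch assumes nothing about the instance beyond it being valid, unencrypted JSON.

```python
import json
from pathlib import Path


def pretty_print_instance(path):
    """Load a JSON file and return its content re-serialized with indentation."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # ensure_ascii=False keeps accented characters readable in the output.
    return json.dumps(data, indent=2, ensure_ascii=False)
```

Calling print(pretty_print_instance(path)) with the path of a backed-up instance file displays the indented version.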

Please note that the Android app provided, i.e., CCA Mobile, is a stripped-down version of the main app. The attached app is provided strictly to demonstrate the power of advanced JSON forms. All rights, stated and unstated, to this app belong to CCA System.

The JSON Advanced Form Schema

This section assumes that the reader has installed the Android app and has played around with the City Census form.

Basic Anatomy of a Form

The block of code below shows an extract from a form instance example:

/*
{
    "formName": "City Census",
    "formID": "76ce800b-acf9-4bb0-9f9d-b1b41acdb606",
    "formDescription": "This form captures the population of cities in 1950, 2000 and projected population in 2050",
    "canSavePartial": true,
    "formScreens": [
        {
            ...
        }
    ]
}
*/

From the above, it can be seen that a form consists of a number of JSON variables. These variables are described in the bullet points below:

formName: A JSON string that holds the human-readable name of the form.
formID: A JSON string that holds a globally unique value identifying the form.
formDescription: A JSON string that holds a human-readable description of the form.
formPublicKey (not shown here): An optional JSON string that contains a public encryption key in Base64 format. This public key is used to compute a symmetric encryption key, which is then used to encrypt every completed instance of the form. This JSON variable is discussed in chapter 8.
canSavePartial: A JSON Boolean that indicates whether a partially completed form can be saved and completed at a more convenient time.
formScreens: A JSON array that holds the definitions of all the form input screens that users enter values into. The structure of a form screen is discussed in the next section.
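
The variables above can be checked with a few lines of Python, as in the minimal sketch below. The attached schema files remain the authoritative definition of a valid form. The names REQUIRED, OPTIONAL and check_form are hypothetical, and treating every variable except formPublicKey as required is an assumption. The empty formScreens array is only a placeholder, because the extract above elides the screens.

```python
import json

# Top-level variables and the Python types their JSON values map to.
# Assumption: everything except formPublicKey is required.
REQUIRED = {
    "formName": str,
    "formID": str,
    "formDescription": str,
    "canSavePartial": bool,
    "formScreens": list,
}
OPTIONAL = {"formPublicKey": str}


def check_form(form):
    """Return a list of problems found in the form's top-level variables."""
    problems = []
    for name, expected in REQUIRED.items():
        if name not in form:
            problems.append(f"missing required variable: {name}")
        elif not isinstance(form[name], expected):
            problems.append(f"{name} should be of type {expected.__name__}")
    for name, expected in OPTIONAL.items():
        if name in form and not isinstance(form[name], expected):
            problems.append(f"{name} should be of type {expected.__name__}")
    return problems


form = json.loads("""
{
    "formName": "City Census",
    "formID": "76ce800b-acf9-4bb0-9f9d-b1b41acdb606",
    "formDescription": "This form captures the population of cities in 1950, 2000 and projected population in 2050",
    "canSavePartial": true,
    "formScreens": []
}
""")
print(check_form(form))  # prints [] because no problems are found
```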

The schema that defines a valid form is attached to this article in two versions: Form Schema Verbose and Form Schema Compact. Both files define the same schema, except that the Form Schema Compact file uses the JSON Schema $ref keyword to avoid repeating definitions, which makes it smaller and more readable.
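
To show how $ref shortens a schema, the following Python sketch expands local references in a small, made-up schema. The schema content and the function names are hypothetical and are not taken from the attached files. The sketch handles only local references of the form "#/...", and it does not handle recursive references or escaped characters in reference paths.

```python
import copy


def resolve_pointer(root, ref):
    """Follow a local reference such as '#/definitions/label' inside root."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled: {ref}")
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def expand_refs(node, root):
    """Return a copy of node with every local $ref replaced by its target."""
    if isinstance(node, dict):
        if "$ref" in node:
            return expand_refs(resolve_pointer(root, node["$ref"]), root)
        return {key: expand_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_refs(item, root) for item in node]
    return copy.deepcopy(node)


# Hypothetical compact schema: the shared definition is written only once.
compact = {
    "definitions": {"label": {"type": "string", "minLength": 1}},
    "properties": {
        "screenLabel": {"$ref": "#/definitions/label"},
        "screenHint": {"$ref": "#/definitions/label"},
    },
}
verbose = expand_refs(compact, compact)
```

After expansion, both screenLabel and screenHint in verbose carry a full copy of the shared definition, which is what a verbose schema would spell out by hand.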

An interested reader can follow the form definition sections of these walkthroughs to learn how to design forms using this GUI tool.

Basic Anatomy of a Form Input Screen

The image above should be familiar to the reader as this is the third input screen on the City Census form. Note that this form was designed such that if the display device language is set to English, then the screen on the left is shown. If the device’s language is set to French, then the screen on the right is shown. To understand how input screens are defined, consider the JSON code block below:

/*
{
    "mainScreen": {
        "screenID": "City",
        "screenDisplayArray": [
            {
                "localeCode": "en",
                "screenLabel": "3. What is the name of this city?",
                "screenHint": "Do not leave blank."
            },
            {
                "localeCode": "fr",
                "screenLabel": "3. Quel est le nom de cette ville?",
                "screenHint": "Ne pas laisser en blanc."
            }
        ],
        "screenwidgetType": "textInput",
        "inputRequired": true,
        "widgetSchema": "\"City\": {\"type\": \"string\"}"
    }
}
*/

The example code block above shows parameters that are common to all JSON input screens. These parameters are defined below:

mainScreen: This is a JSON object that contains parameters that define how the form screen is displayed and the constraints that the input of the screen must conform to.
screenID: This is a JSON string value that uniquely identifies the screen within a form.
screenwidgetType: This is a JSON string value that identifies the type of input screen, e.g., text input, numeric input, multi-choice option, single-choice option, location coordinates capture, photo capture, etc.
inputRequired: This is a JSON Boolean value that indicates whether a screen input is mandatory or optional.
widgetSchema: This is a JSON string value that contains the escaped JSON schema for the screen input.
screenDisplayArray: This is a JSON array that holds the text of the instructions displayed for a screen, one item per language. The instructions guide the user/enumerator on entering or capturing the input value. Each item of this array is a JSON object that has the following three properties:
- screenLabel: This is a JSON string whose value contains the instruction to be displayed.
- screenHint: This is an optional JSON string that holds an additional instruction to be displayed.
- localeCode: This is a JSON string that contains an ISO 639-1 two-letter language code. This code determines whether the associated screenLabel and screenHint are displayed, based on the language settings of the device.
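
The language selection described for screenDisplayArray can be sketched in Python as follows. The function name pick_display is hypothetical, and falling back to the first item when no localeCode matches the device language is an assumption of this sketch, not a rule stated by the specification.

```python
def pick_display(screen_display_array, device_language):
    """Return the display item whose localeCode matches the device language.

    Falling back to the first item when nothing matches is an assumption
    made for this sketch.
    """
    for item in screen_display_array:
        if item["localeCode"] == device_language:
            return item
    return screen_display_array[0]


# The screenDisplayArray of the example screen above.
display = [
    {
        "localeCode": "en",
        "screenLabel": "3. What is the name of this city?",
        "screenHint": "Do not leave blank.",
    },
    {
        "localeCode": "fr",
        "screenLabel": "3. Quel est le nom de cette ville?",
        "screenHint": "Ne pas laisser en blanc.",
    },
]

item = pick_display(display, "fr")
print(item["screenLabel"])         # 3. Quel est le nom de cette ville?
print(item.get("screenHint", ""))  # screenHint is optional, so use get()
```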

There are many other parameters included in a typical form input screen. These parameters are discussed in the relevant sections of the subsequent parts of this article.

See the schema of an input screen in these attached files: Form Schema Verbose or Form Schema Compact.
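
To make the widgetSchema parameter more concrete, the Python sketch below parses the escaped schema of the example screen and checks an input against it. It assumes that the widgetSchema fragment is a single property entry, so that wrapping it in braces yields a parseable JSON object. The names PYTHON_TYPES and input_is_valid are hypothetical, and only the JSON Schema "type" keyword is checked for a few types. A real implementation would more likely hand the schema to a complete JSON Schema (version 4) validator.

```python
import json

# The widgetSchema value of the example screen, as it looks once the outer
# JSON document has been parsed (the backslash escapes are gone).
widget_schema = '"City": {"type": "string"}'

# Assumption: the fragment is one property entry, so braces make it an object.
schema = json.loads("{" + widget_schema + "}")

# JSON Schema type names mapped to Python types (only a few are handled).
PYTHON_TYPES = {"string": str, "boolean": bool, "array": list, "object": dict}


def input_is_valid(schema, screen_id, value, input_required):
    """Check presence and the JSON Schema "type" keyword only."""
    if value is None:
        # A missing input is acceptable only when the screen is optional.
        return not input_required
    expected = PYTHON_TYPES[schema[screen_id]["type"]]
    return isinstance(value, expected)
```

For the example screen, whose inputRequired value is true, a call such as input_is_valid(schema, "City", None, True) rejects a missing input, while any string value passes the type check.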

Next Chapter

The next chapter of this article, i.e., Chapter 2, deals with the formats of basic input screens.

History

18th July, 2016: First version
3rd November, 2016: Made corrections to this work