Recursive VBA JSON Parser for Excel
JSON parser for VBA class module, allows recursive data
Introduction
This is a VBA class that can parse recursive JSON data.
Background
I needed to parse JSON data that was highly recursive (Arrays in objects buried in other objects). I didn't find anything suitable already in VBA so I put something together. I thought the result was worthy of sharing.
The class can be used to load JSON data directly from the web, although the sample code loads data to a file first to allow the JSON data to be compared with the parsed data. Parsed data is returned in arrays.
Using the Code
The parser is passed the JSON text in a string
through the "loadstring
" method. The text is then parsed into two arrays; one containing keys, the other containing values. A ">
" character is used to indicate a level of key recursion. Each time a "{
" or "[
" control character is detected, the level is increased. Each time a "}
" or "]
" control character is detected, the level decreases.
The following JSON string
:
{
"Category": "Famous Pets",
"Pet": {
"Size": "Little",
"Type": "Lamb",
"Coat": {
"Fur": "Fleece",
"Color": "White",
"Texture": "Like snow"
}
}
}
Is then parsed into a set of keys and values:
Key(1) = ">Category" |
Value(1) = "Famous Pets" |
Key(2) = ">Pet>Size" |
Value(2) = "Little" |
Key(3) = ">Pet>Type" |
Value(3) = "Lamb" |
Key(4) = ">Pet>Coat>Fur" |
Value(4) = "Fleece" |
Key(5) = ">Pet>Coat>Color" |
Value(5) = "White" |
Key(6) = ">Pet>Coat>Texture" |
Value(6) = "Like snow" |
Once the JSON text has been parsed, a number of properties are used to read the parsed text. The number of key/value pairs is read using the "NumElements
" property (in the example above, NumElements = 6
). The indexed "Key
" and "Value
" properties then hold the parsed information. Keys and values are returned from the class as string
s. Null
values are returned as zero length string
s.
The class also contains an "err
" (status) property to give some indication as to what's going on inside the class. Err
is a long value:
1 = JSON string
has been successfully parsed
-1 = JSON string
has not been loaded, no results are available
-2 = JSON string
cannot be correctly parsed (JSON text not fully or incorrectly formed)
The following code then makes up the class:
Global variables allow the parsed JSON text to be persistent and read out through properties of the class:
Private strKey As Variant
Private strVal As Variant
Private intHMax As Integer
Private lngStatus As Long
The following function initializes the class. A status
variable is set to indicate that no JSON data has been parsed, this variable is then updated during the parsing of the text.
Private Sub Class_Initialize()
lngStatus = -1
End Sub
The "NumElements
" property allows the user to determine the number of key/value pairs that have been extracted from the JSON text:
Public Property Get NumElements() As Integer
NumElements = intHMax
End Property
The keys and values are then made available as array elements. VBA uses a variant type to pass an array value. Array elements that are 'out of bounds' are returned as a zero length string
:
Public Property Get Key(Index As Integer) As Variant
If Index > UBound(strKey) Or Index < LBound(strKey) Then
Key = ""
Else
Key = strKey(Index)
End If
End Property
Public Property Get Value(Index As Integer) As Variant
If Index > UBound(strVal) Or Index < LBound(strVal) Then
Value = ""
Else
Value = strVal(Index)
End If
End Property
The heart of the class is the code that parses the JSON string
. The code looks for the next available control character in the JSON text, captures the text before the control character and shortens the initial string
to the remaining JSON text. The captured text is then used to populate the keys and values. The process repeats until the initial string
has been reduced to nothing. The parser also allows verbose debug data sent to the immediate window by setting: blDebug = True
.
Public Sub LoadString(JSONText As String)
'Load the JSON text into an array
Const cLongMax = (2 ^ 31) - 1 'Maximum Value for Long type
Dim lngIndex As Long
Dim lngContLoc As Long
Dim lngLoc As Long
Dim lngDelimitOffset As Long
Dim lngASize As Long
Dim intNoOfChecks As Integer 'Number of different control characters in JSON
Dim intCheck As Integer
Dim intCtrlChr As Integer
Dim intObJLvl As Integer
Dim intAryElement As Integer
Dim intLvl As Integer
Dim strID As String
Dim strChr As String
Dim strKeyValue As String
Dim strValue As String
Dim strPChar As String
Dim strFoundVal As String
Dim strTempString As String
Dim strAKey() As String
Dim strAVal() As String
Dim strALvlKey(100) As String
Dim blArray As Boolean 'Flag to indicate that an array has been found
Dim blStringArray As Boolean 'Flag to indicate that the element in the array is a string (added v1.1)
Dim BlArrayEnd As Boolean 'Flag to indicate that the end of an array is found (added v1.1)
Dim blValue As Boolean 'Falg to indicate that a value has been found
Dim blKeyAndValue As Boolean 'Found a key and value pair
Dim blDebug As Boolean
'Set the flag to true if you want to see debug information
'during the loading process
blDebug = True
On Error GoTo ErrHandler:
lngASize = 10
ReDim strAKey(lngASize)
ReDim strAVal(lngASize)
'intArrayElement = 1 'initialize value
'initialize values
blArray = False
BlArrayEnd = False '(added v1.1)
blStringArray = False '(added v1.1)
'Generate a string of control characters
'String is {[:,]}"
strID = ""
strID = strID & Chr(123) 'The '{' character
strID = strID & Chr(91) 'The '[' character
strID = strID & Chr(58) 'The ':' character
strID = strID & Chr(44) 'The ',' character
strID = strID & Chr(93) 'The ']' character
strID = strID & Chr(125) 'The '}' character
strID = strID & Chr(34) 'The '"' character
intNoOfChecks = Len(strID)
intObJLvl = 0
lngIndex = 1 'First element in the array will be strKey(1) and strVal(1)
'As we process the JSON string it becomes shorter and shorter, until
'its all been processed
Do While Len(JSONText) > 0
'Set to maximum value as default
lngContLoc = cLongMax
'Find Next control character:
'Scan the text for the closest control character
'to the beginning of the remaining JSON text
For intCheck = 1 To intNoOfChecks
strChr = Mid(strID, intCheck, 1)
lngLoc = InStr(1, JSONText, strChr, vbBinaryCompare)
If (lngLoc > 0) And (lngLoc < lngContLoc) Then
lngContLoc = lngLoc
intCtrlChr = intCheck
strPChar = strChr
End If
Next intCheck
'When the above for next loop ends we will have found the closest control character
'stored in intCtrlChr - an index (1 to 8) to the found character in strChr
'stored in lngContLoc - position of the next control character
'stored in strPChar - the closest next control character
If blDebug = True Then
Debug.Print "Parse Character: " & strPChar
End If
'A control character has been found, figure out what to do by the found character
If lngContLoc < cLongMax Then
'Capture the information before the control character
strValue = Mid(JSONText, 1, lngContLoc - 1)
'Capture everything after the control character (the remaining JSON string)
JSONText = Mid(JSONText, lngContLoc + 1, Len(JSONText))
Else
'We found the end of the JSON string
Exit Do
End If
'Found a number or boolean value or key (the comma)
'Updated in v1.1 to handle number types in array (process value as string or number; not both)
If (intCtrlChr = 4) Then
If ((blValue = True) Or (blArray = True)) And (blStringArray = False) Then
'Found a value, and we already have key
strFoundVal = fnStringToVal(strValue)
blKeyAndValue = True 'Set the "Key and value found" flag
End If
'Finding a comma resets the string found in the array
blStringArray = False
End If
'Start of object (The "{" character)
If intCtrlChr = 1 Then
intObJLvl = intObJLvl + 1
blArray = False 'An object, not an array
blValue = False 'Need to find a key first
If blDebug = True Then
Debug.Print "Start of Object, Moved up to level" & intObJLvl
End If
End If
'End of of object (The "}" character)
If intCtrlChr = 6 Then
'Updated in Revision 1.1
'Numbers preceded by the "}" character
If blValue = True Then
'Get the found value and set a flag
strFoundVal = fnStringToVal(strValue)
blKeyAndValue = True 'Set the "Key and value found" flag
'Add back a "}" character to the string so that the level can be decremented properly
JSONText = "}" & JSONText
Else
'No value was found, the "}" character indicates the end of this level
intObJLvl = intObJLvl - 1
blValue = False 'Need to find a key first
End If
If blDebug = True Then
Debug.Print "End of Object, Moved down to level" & intObJLvl
End If
End If
'Start of array (The "[" character)
If intCtrlChr = 2 Then
'intObJLvl = intObJLvl + 1
'strALvlKey(intObJLvl) = intArrayElement
blArray = True
blValue = True 'Next thing should be a value
intAryElement = 1
If blDebug = True Then
Debug.Print "Start of Array, Moved up to level" & intObJLvl
End If
End If
'End of of array (The "]" character)
If intCtrlChr = 5 Then
'Updated v1.1 parse last numeric or boolean value of an array
If (blArray = True) And (blStringArray = False) Then
'Get the found value and set a flag
strFoundVal = fnStringToVal(strValue)
blKeyAndValue = True 'Set the "Key and value found" flag
End If
BlArrayEnd = True 'Mark that the end of the array is found
blArray = False
blValue = False 'Need to find a key first
If blDebug = True Then
Debug.Print "End of Array, Moved down to level" & intObJLvl
End If
End If
'Object Value start is found (The ":" character)
If intCtrlChr = 3 Then
blValue = True
BlArrayEnd = False 'Added v1.1, start of an object value is not the end of an array
If blDebug = True Then
Debug.Print "ready to get value"
End If
End If
'Start of a string (the quote " character)
'Can be a key or value
If intCtrlChr = 7 Then
'The start of the key or value has been found
'The next quote will end the key or value
'(unless the quote has an escape character in front of it "\")
lngDelimitOffset = 1
Do
'Look for the next quote character
lngLoc = InStr(lngDelimitOffset, JSONText, Chr(34), vbBinaryCompare)
'If the string is zero length "" then exit the loop
If lngLoc = 1 Then
Exit Do
End If
'Check to see if there is a delimter just before the quote
'if there is then quote is part of the string and not the end of
'the string.
If Mid(JSONText, lngLoc - 1, 1) = Chr(92) Then
' The quote character has an escape character in front of it
'so this quote doesn't count. Remove the escape character.
JSONText = Mid(JSONText, 1, lngLoc - 2) & Mid(JSONText, lngLoc, Len(JSONText))
'and move the start of the check past the delimited quote
lngDelimitOffset = lngLoc
'If we have a boogered JSON string where there is no valid closing quotes
'the above "if" will cause an error (the MID statement will attempt to check
'the string starting at a position of -1) and the code will jump to the error
'handling section. If this error didn't occur the do..loop would get stuck.
Else
Exit Do
End If
Loop
'We now have a string, find any other delimiters
'(any delimited " characters have already been fixed)
strTempString = fnStringFix(Mid(JSONText, 1, lngLoc - 1))
If (blValue = True) Or (blArray = True) Then
'The key has been previously found and this is the value for the key
strFoundVal = strTempString
blKeyAndValue = True 'Set the "Key and value found" flag
If blArray = True Then
blStringArray = True 'Added v1.1, mark that the value is a string
End If
Else
If lngLoc > 0 Then
'We've found a key
strALvlKey(intObJLvl) = strTempString
If blDebug = True Then
Debug.Print "Found Key:" & strALvlKey(intObJLvl) & _
" for Level: " & intObJLvl
End If
End If
End If
JSONText = Mid(JSONText, lngLoc + 1, Len(JSONText))
End If
'Found a key and value, move it to the array
If blKeyAndValue = True Then
If lngIndex > lngASize Then
lngASize = lngASize + 100
ReDim Preserve strAKey(lngASize)
ReDim Preserve strAVal(lngASize)
End If
strAKey(lngIndex) = ""
For intLvl = 1 To intObJLvl
strAKey(lngIndex) = strAKey(lngIndex) & ">" & strALvlKey(intLvl)
Next intLvl
'Updated v1.1 - save last element of an array
If (blArray = True) Or (BlArrayEnd = True) Then
'add the array element to the key
strAKey(lngIndex) = strAKey(lngIndex) & ">" & Trim(str(intAryElement))
'increment the array element
intAryElement = intAryElement + 1
'Reset end of array flag (set again when array end is found)
BlArrayEnd = False
End If
strAVal(lngIndex) = strFoundVal
If blDebug = True Then
Debug.Print "Added Key:" & strAKey(lngIndex) & _
" Value: " & strAVal(lngIndex) & " index: " & lngIndex
End If
lngIndex = lngIndex + 1 'Increment the array
blKeyAndValue = False 'Reset the "found" flag
blValue = False 'Reset the "Value Found" flag
End If
DoEvents
Loop
'Number of items found
intHMax = lngIndex - 1
strKey = strAKey
strVal = strAVal
lngStatus = 1 'JSON sucessfully parsed
Exit Sub
ErrHandler:
'Error handling code
lngStatus = -2 'JSON Parse error
'Uncomment the next line to figure out the cause of the issue
'Debug.Print VBA.err.Number
'Debug.Print VBA.err.Description
'Resume
End Sub
The values are stripped of any non-text formatting. Values should be numbers (integer, floating point, or "null
"). Values read as 'null
' are further converted into a zero length string
.
Private Function fnStringToVal(strInStr As String) As String
'Converts a string that contains formatting information into a string that only
'contains a value. Values can be text, integer, or floating point values.
'null is passed back as a zero length string: "".
Dim intStrPos As Integer
Dim strTemp As String
Dim intChar As Integer
'default value
strTemp = ""
'Make sure that the string does not have a zero length
strInStr = " " & strInStr
'Loop through each character in the string and remove anything
'that is not alphanumeric.
For intStrPos = 1 To Len(strInStr)
intChar = Asc(Mid(strInStr, intStrPos, 1))
If ((intChar >= Asc("a")) And (intChar <= Asc("z"))) Or _
((intChar >= Asc("A")) And (intChar <= Asc("Z"))) Or _
((intChar >= Asc("0")) And (intChar <= Asc("9"))) Or _
(intChar = Asc(".")) Or (intChar = Asc("+")) Or (intChar = Asc("-")) Then
strTemp = strTemp & Chr(intChar)
End If
Next intStrPos
'Values that are listed as 'null' are converted to a zero length string
If InStr(1, "null", strTemp, vbTextCompare) > 0 Then
strTemp = ""
End If
fnStringToVal = strTemp
End Function
Finally, JSON supports a number of escape codes. This function looks through the passed string
and performs the requested escape sequence. While VBA string
s support Unicode characters, other parts of Microsoft Excel are more random. Parsed text that is sent to cells or message boxes may not behave as expected and may require further processing.
Private Function fnStringFix(strInput As String) As String
'This function goes through a JSON string and corrects delimited characters
Dim blParseComplete As Boolean
Dim lngStartPos As Long
Dim lngCurrentPos As Long
blParseComplete = False
lngStartPos = 1
Do While blParseComplete = False
blParseComplete = True 'If we don't find any escape sequences then allo the loop to end
'Escaped sequence: replace \\ with \
'look for the specific escape sequence
lngCurrentPos = InStr(lngStartPos, strInput, "\\", vbTextCompare)
If lngCurrentPos > 0 Then
strInput = Mid(strInput, 1, lngCurrentPos - 1) & "\" & _
Mid(strInput, lngCurrentPos + 2, Len(strInput))
blParseComplete = False 'set the status to check for another escape
End If
'Escaped sequence: replace \/ with /
'look for the specific escape sequence
lngCurrentPos = InStr(lngStartPos, strInput, "\/", vbTextCompare)
If lngCurrentPos > 0 Then
strInput = Mid(strInput, 1, lngCurrentPos - 1) & "/" & _
Mid(strInput, lngCurrentPos + 2, Len(strInput))
blParseComplete = False 'set the status to check for another escape
End If
'Escaped sequence: replace \b with a backspace
'look for the specific escape sequence
lngCurrentPos = InStr(lngStartPos, strInput, "\b", vbTextCompare)
If lngCurrentPos > 0 Then
strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(8) & _
Mid(strInput, lngCurrentPos + 2, Len(strInput))
blParseComplete = False 'set the status to check for another escape
End If
'Escaped sequence: replace \f with a formfeed
'look for the specific escape sequence
lngCurrentPos = InStr(lngStartPos, strInput, "\f", vbTextCompare)
If lngCurrentPos > 0 Then
strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(12) & _
Mid(strInput, lngCurrentPos + 2, Len(strInput))
blParseComplete = False 'set the status to check for another escape
End If
'Escaped sequence: replace \n with a newline
'look for the specific escape sequence
lngCurrentPos = InStr(lngStartPos, strInput, "\n", vbTextCompare)
If lngCurrentPos > 0 Then
strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(10) & _
Mid(strInput, lngCurrentPos + 2, Len(strInput))
blParseComplete = False 'set the status to check for another escape
End If
'Escaped sequence: replace \r with a carriage return
'look for the specific escape sequence
lngCurrentPos = InStr(lngStartPos, strInput, "\r", vbTextCompare)
If lngCurrentPos > 0 Then
strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(13) & _
Mid(strInput, lngCurrentPos + 2, Len(strInput))
blParseComplete = False 'set the status to check for another escape
End If
'Escaped sequence: replace \t with a horizontal tab
'look for the specific escape sequence
lngCurrentPos = InStr(lngStartPos, strInput, "\t", vbTextCompare)
If lngCurrentPos > 0 Then
strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(9) & _
Mid(strInput, lngCurrentPos + 2, Len(strInput))
blParseComplete = False 'set the status to check for another escape
End If
'Escaped sequence: replace \uXXXX with a unicode character
'look for the specific escape sequence
lngCurrentPos = InStr(lngStartPos, strInput, "\u", vbTextCompare)
If lngCurrentPos > 0 Then
strInput = Mid(strInput, 1, lngCurrentPos - 1) & _
ChrW$(CLng("&h" & Mid(strInput, lngCurrentPos + 2, 4))) & _
Mid(strInput, lngCurrentPos + 6, Len(strInput))
blParseComplete = False 'set the status to check for another escape
End If
Loop
fnStringFix = strInput
End Function
Points of Interest
Keys for array elements are numbered when they generate a value. In the example below, the keys indicate the position of the value within the array.
{
"prices": {
"USD": [
[1,"1.25"],
[25,"1.17"],
[50,"0.95"]
],
"EUR": [
[1,"0.98"],
[25,"0.92"],
[50,"0.74"]
]
}
}
The JSON above is parsed into the following keys and values:
Key(1) = ">prices>USD>1" |
Value(1) = "1" |
Key(2) = ">prices>USD>2" |
Value(2) = "1.25" |
Key(3) = ">prices>USD>1" |
Value(3) = "25" |
Key(4) = ">prices>USD>2" |
Value(4) = "1.17" |
Key(5) = ">prices>USD>1" |
Value(5) = "50" |
Key(6) = ">prices>USD>2" |
Value(6) = "0.95" |
Key(7) = ">prices>EUR>1" |
Value(7) = "1" |
Key(8) = ">prices>EUR>2" |
Value(8) = "0.98" |
Key(9) = ">prices>EUR>1" |
Value(9) = "25" |
Key(10) = ">prices>EUR>2" |
Value(10) = "0.92" |
Key(11) = ">prices>EUR>1" |
Value(11) = "50" |
Key(12) = ">prices>EUR>2" |
Value(12) = "0.74" |
The downloadable demo is setup to load one of three URLs to a text file using the “URL to File” buttons. I’ve added some JSON sources as reference. The “Parse File” button then loads these text files and parses the JSON data into Sheet2 of the spreadsheet.
History
- Rev 1.0 - Initial release
- Rev 1.1 - Updated to correctly parse numeric values that are followed by the '
]
' or '}
' control characters. These values are ignored in the initial release.