Tokenizing strings in VBScript






A simple function that tokenizes a string using multiple token separators
This is a very simple function for tokenizing strings. You supply the string that you wish to tokenize and an array of separators that delimit the tokens.
For example, suppose you have the string "Tom, Dick and Harry" and you'd like to break it up into "Tom", "Dick" and "Harry". You pass that string to the function along with an array containing the "," and "and" separators:
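' Seps has one spare slot: Tokenize treats UBound(Seps) as the separator count
Dim Str, Seps(2)
Str = "Tom, Dick and Harry"
Seps(0) = ","
Seps(1) = "and"

Dim i, a
a = Tokenize(Str, Seps)

' The returned array also has one spare slot, so UBound(a) is the token count
Response.Write "<p>Found " & UBound(a) & " tokens</p>"
Response.Write "<ol>"
For i = 1 To UBound(a)
    Response.Write "<li>Keyword " & i & " = " & a(i-1) & "</li>"
Next
Response.Write "</ol>"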
The results will be:
Found 3 tokens
1. Keyword 1 = Tom
2. Keyword 2 = Dick
3. Keyword 3 = Harry
The function is as follows:
Function Tokenize(ByVal TokenString, ByRef TokenSeparators())
    Dim NumWords, NumSeps, a()
    Dim i, pos, SepIndex, SepPosition
    Dim substr, TrimPosition

    NumWords = 0
    ' The separators occupy indexes 0 to UBound-1; the last slot is unused
    NumSeps = UBound(TokenSeparators)

    Do
        SepPosition = 0
        SepIndex = -1

        For i = 0 To NumSeps - 1
            ' Find location of separator in the string
            pos = InStr(TokenString, TokenSeparators(i))

            ' Is the separator present, and is it closest to the beginning of the string?
            If pos > 0 And ((SepPosition = 0) Or (pos < SepPosition)) Then
                SepPosition = pos
                SepIndex = i
            End If
        Next

        ' Did we find any separators?
        If SepIndex < 0 Then
            ' None found - so the token is the remaining string
            ReDim Preserve a(NumWords + 1)
            a(NumWords) = TokenString
        Else
            ' Found a token - pull out the substring
            substr = Trim(Left(TokenString, SepPosition - 1))

            ' Add the token to the list
            ReDim Preserve a(NumWords + 1)
            a(NumWords) = substr

            ' Cut off the token and the separator we just found
            TrimPosition = SepPosition + Len(TokenSeparators(SepIndex))
            TokenString = Trim(Mid(TokenString, TrimPosition))
        End If

        NumWords = NumWords + 1
    Loop While (SepIndex >= 0)

    ' The array keeps one spare trailing slot, so UBound(a) equals the token count
    Tokenize = a
End Function
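The example earlier writes its output with Response.Write, which is only available inside an ASP page. As a rough sketch of how the same function might be called from a standalone script run with cscript.exe, the snippet below assumes the Tokenize function has been pasted into the same .vbs file. The sample string, the separators and the variable names are made up for illustration.

```vbscript
' Hypothetical usage sketch for Windows Script Host (cscript.exe).
' Assumes the Tokenize function from this article is in the same .vbs file.
Dim Text, Seps(2), Tokens, n

Text = "red; green | blue"
Seps(0) = ";"
Seps(1) = "|"
' Seps(2) is left unused on purpose: Tokenize only reads indexes 0 to UBound-1

Tokens = Tokenize(Text, Seps)

' UBound(Tokens) is the token count because of the spare trailing slot
WScript.Echo "Found " & UBound(Tokens) & " tokens"
For n = 1 To UBound(Tokens)
    WScript.Echo n & ": " & Tokens(n - 1)
Next

' Output:
' Found 3 tokens
' 1: red
' 2: green
' 3: blue
```

Each token is trimmed, so the spaces around ";" and "|" do not end up in the results. Keep in mind that the separators are matched anywhere in the string, so a word-like separator such as "and" would also split a token that merely contains those letters.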