
Tokenizing strings in VBScript

This is an extremely simple function for tokenizing strings. You supply the function with the string you wish to tokenize and an array of separators that delimit the tokens.

For example, suppose you have the string "Tom, Dick and Harry" and you'd like to break it up into "Tom", "Dick" and "Harry". The string to tokenize is "Tom, Dick and Harry", and the array holds the two separators "," and "and":

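' Seps(2) allocates indices 0..2; Tokenize treats UBound as the separator count,
' so only Seps(0) and Seps(1) are read
Dim Str, Seps(2)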
Str     = "Tom, Dick and Harry"
Seps(0) = ","
Seps(1) = "and"

Dim i, a
a = Tokenize(Str, Seps)

Response.Write "<p>Found " & UBound(a) & " tokens</p>"
Response.Write "<ol>"
For i=1 to UBound(a)
	Response.Write "<li>Keyword " & i & " = " & a(i-1) & "</li>"
next
Response.Write "</ol>"

The results will be

Found 3 tokens
  1. Keyword 1 = Tom
  2. Keyword 2 = Dick
  3. Keyword 3 = Harry
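
The loop above runs from 1 to UBound(a) and reads a(i-1) because of how Tokenize sizes its result: it calls ReDim Preserve a(NumWords+1) before storing each token, so the returned array carries one unused slot at the end and UBound(a) equals the number of tokens. A minimal sketch of that sizing pattern on its own, assuming it is run under Windows Script Host (WScript.Echo is an assumption here; the original example writes to an ASP Response):

```vbscript
Option Explicit

Dim a(), NumWords, Word
NumWords = 0

' Mimic the way Tokenize grows its result array, one token at a time
For Each Word In Array("Tom", "Dick", "Harry")
    ReDim Preserve a(NumWords + 1)   ' valid indices are now 0..NumWords+1
    a(NumWords) = Word
    NumWords = NumWords + 1
Next

WScript.Echo "UBound(a) = " & UBound(a)              ' UBound(a) = 3
WScript.Echo "Last slot unused: " & IsEmpty(a(3))
```

Code that walks the result with the more usual For i = 0 To UBound(a) would therefore see one extra Empty element after the last token.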

The function is as follows:

Function Tokenize(byVal TokenString, byRef TokenSeparators())

	Dim NumWords, a(), i   ' i is declared locally so the caller's loop counter is left untouched
	NumWords = 0
	
	Dim NumSeps
	NumSeps = UBound(TokenSeparators)
	
	Do 
		Dim SepIndex, SepPosition
		SepPosition = 0
		SepIndex    = -1
		
		for i = 0 to NumSeps-1
		
			' Find location of separator in the string
			Dim pos
			pos = InStr(TokenString, TokenSeparators(i))
			
			' Is the separator present, and is it closest to the beginning of the string?
			If pos > 0 and ( (SepPosition = 0) or (pos < SepPosition) ) Then
				SepPosition = pos
				SepIndex    = i
			End If
			
		Next

		' Did we find any separators?	
		If SepIndex < 0 Then

			' None found - so the token is the remaining string
			redim preserve a(NumWords+1)
			a(NumWords) = TokenString
			
		Else

			' Found a token - pull out the substring		
			Dim substr
			substr = Trim(Left(TokenString, SepPosition-1))
	
			' Add the token to the list
			redim preserve a(NumWords+1)
			a(NumWords) = substr
		
			' Cut off the token we just found
			Dim TrimPosition
			TrimPosition = SepPosition+Len(TokenSeparators(SepIndex))
			TokenString = Trim(Mid(TokenString, TrimPosition))
						
		End If	
		
		NumWords = NumWords + 1
	loop while (SepIndex >= 0)
	
	Tokenize = a
	
End Function
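
For comparison, the same example can be handled with the built-in Replace and Split functions. The sketch below is not the article's method: TokenizeWithSplit is a hypothetical name, Chr(1) is assumed never to occur in the input, and WScript.Echo assumes Windows Script Host rather than ASP. It rewrites every separator to a single marker character and then splits on that marker.

```vbscript
Option Explicit

' Hypothetical alternative to Tokenize: normalise all separators
' to one marker character, then let Split cut the string.
Function TokenizeWithSplit(ByVal TokenString, ByVal Separators)
    Dim Marker, Sep, Parts, i
    Marker = Chr(1)   ' assumption: this character never appears in the input

    ' Replace each separator with the marker, in array order
    For Each Sep In Separators
        TokenString = Replace(TokenString, Sep, Marker)
    Next

    ' Split on the marker and strip surrounding spaces from each token
    Parts = Split(TokenString, Marker)
    For i = 0 To UBound(Parts)
        Parts(i) = Trim(Parts(i))
    Next

    TokenizeWithSplit = Parts
End Function

Dim Tokens, Token
Tokens = TokenizeWithSplit("Tom, Dick and Harry", Array(",", "and"))
For Each Token In Tokens
    WScript.Echo Token
Next
```

Unlike Tokenize, this version returns an array with no spare slot, so UBound(Tokens) + 1 is the token count. It also applies the separators in array order instead of picking whichever occurs first in the string, so results can differ when one separator contains another. Both versions match separators as plain substrings, so "and" would also split a word such as "Sandy".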

Copyright © CodeProject, 1999-2016
All Rights Reserved.