65.9K
CodeProject is changing. Read more.
Home

Tokenizing strings in VBScript

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.17/5 (6 votes)

May 17, 2000

CPOL
viewsIcon

167313

A simple function that allows you to tokenize a string sing multiple token separators

This is an extremely simple function for tokenizing strings. You supply the function with a string that you wish to tokenize, and an array of tokens the delimit the tokens.

For example, suppose you have the string "Tom, Dick and Harry" and you'd like to break it up into "Tom", "Dick", "Harry". Your string is thus "Tom, Dick and Harry" and your array contains the "," and "and" separators:

Dim Str, Seps(2)
Str     = "Tom, Dick and Harry"
Seps(0) = ","
Seps(1) = "and"

Dim i, a
a = Tokenize(Str, Seps)

Response.Write "<p>Found " & UBound(a) & " tokens</p>"
Response.Write "<ol>"
For i=1 to UBound(a)
	Response.Write "<li>Keyword " & i & " = " & a(i-1) & "</li>"
next
Response.Write "</ol>"

The results will be

Found 3 tokens
  1. Keyword 1 = Tom
  2. Keyword 2 = Dick
  3. Keyword 3 = Harry

The function is as follows:

Function Tokenize(byVal TokenString, byRef TokenSeparators())

	Dim NumWords, a()
	NumWords = 0
	
	Dim NumSeps
	NumSeps = UBound(TokenSeparators)
	
	Do 
		Dim SepIndex, SepPosition
		SepPosition = 0
		SepIndex    = -1
		
		for i = 0 to NumSeps-1
		
			' Find location of separator in the string
			Dim pos
			pos = InStr(TokenString, TokenSeparators(i))
			
			' Is the separator present, and is it closest to the beginning of the string?
			If pos > 0 and ( (SepPosition = 0) or (pos < SepPosition) ) Then
				SepPosition = pos
				SepIndex    = i
			End If
			
		Next

		' Did we find any separators?	
		If SepIndex < 0 Then

			' None found - so the token is the remaining string
			redim preserve a(NumWords+1)
			a(NumWords) = TokenString
			
		Else

			' Found a token - pull out the substring		
			Dim substr
			substr = Trim(Left(TokenString, SepPosition-1))
	
			' Add the token to the list
			redim preserve a(NumWords+1)
			a(NumWords) = substr
		
			' Cutoff the token we just found
			Dim TrimPosition
			TrimPosition = SepPosition+Len(TokenSeparators(SepIndex))
			TokenString = Trim(Mid(TokenString, TrimPosition))
						
		End If	
		
		NumWords = NumWords + 1
	loop while (SepIndex >= 0)
	
	Tokenize = a
	
End Function