Gibberish
Generates a bunch of gibberish from real text.
Introduction
When I checked out one of Microsoft's Web Site Starter Kits, I noticed that it contained a lot of gibberish sample text and wondered what poor soul had to write all of that out. I then thought, "They must have a tool to generate that gibberish text," and so, as a fun project, I decided to write my own.
Gibberish can take a block of text and swap every alpha-numeric character in it for a random character. Alternatively, it can split the text into an array of words and replace the text word by word; the result is still gibberish, but it is somewhat more legible because it uses real words.
I took the list of words from the 12Dicts collection on the http://wordlist.sourceforge.net/ site. However, you can also upload your own list of words from the form if you'd like.
It requires only .NET 2.0; no installation is needed.
Background
Getting Gibberish to generate random characters in place of existing ones was easy. I also had to leave special characters untouched to preserve the formatting of the text, which wasn't too bad.
Getting Gibberish to swap out a word with one from a predefined list of words was more difficult, because each word might contain several "special" characters. Each of these characters is used to split the word, and each section of the word is replaced with a different word. Finally, the sections are joined back together with their original "special" characters, formatting, and letter casing.
Example: Hello,Worldly.World's -> Wheel,Mallets.Coons'l
This requires splitting the text into 4 words ("Hello", "Worldly", "World", "s") using 3 different split characters: , . '
The real trick is to use a recursive method which loops back into itself every time it finds a special character in the word or sub-word being processed.
Using the Code
Buttons
The Make Gibberish button splits the entire string into a CharArray and loops through each character in the array. For each loop iteration, it calls the SwapCharacters function and appends the new character to the gibberishString. Finally, it outputs gibberishString to the text box.
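The character-by-character pass can be sketched as a small console program. This is only an illustration, not the form's actual event handler: the MakeGibberish helper, the module name, and the use of a StringBuilder are assumptions of mine, and SwapCharacters here is a simplified stand-in for the function listed under Methods.

```vbnet
Imports System
Imports System.Text

Module MakeGibberishSketch
    ' Assumed module-level state; the form presumably keeps equivalents of these.
    Private ReadOnly rand As New Random()
    Private ReadOnly letterCharArray() As Char = "abcdefghijklmnopqrstuvwxyz".ToCharArray()
    Private ReadOnly numberCharArray() As Char = "0123456789".ToCharArray()

    ' Simplified stand-in for the SwapCharacters function.
    Private Function SwapCharacters(ByVal character As Char) As Char
        If Char.IsUpper(character) Then
            Return Char.ToUpper(letterCharArray(rand.Next(0, letterCharArray.Length)))
        ElseIf Char.IsLower(character) Then
            Return letterCharArray(rand.Next(0, letterCharArray.Length))
        ElseIf Array.IndexOf(numberCharArray, character) >= 0 Then
            Return numberCharArray(rand.Next(0, numberCharArray.Length))
        End If
        ' Anything else (spaces, punctuation) is passed through unchanged.
        Return character
    End Function

    ' Hypothetical helper: the work the Make Gibberish button is described as doing.
    Public Function MakeGibberish(ByVal text As String) As String
        Dim gibberishString As New StringBuilder(text.Length)
        For Each character As Char In text.ToCharArray()
            gibberishString.Append(SwapCharacters(character))
        Next
        Return gibberishString.ToString()
    End Function

    Sub Main()
        ' The output is random, but length, spacing, and punctuation are preserved.
        Console.WriteLine(MakeGibberish("Hello, World 2009!"))
    End Sub
End Module
```

Because every character is replaced one for one, the result always has the same length as the input, with commas, spaces, and other special characters in their original positions.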
The Change Words button splits the text up into an array of words, not characters, and loops through each word in the array. For each loop iteration, it checks to see if the word has any special characters in it; if not, it calls the SwapWord function, which swaps the word with one from a predefined list of words read from a text file during the form_load event.
If the word contains a special character, it calls the ProcessSpecialWord function, which splits the word into an array of sub-words to be replaced.
Finally, it appends the new word to the gibberishString and eventually outputs gibberishString to the text box.
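The dispatch between the two functions can be sketched as follows. Everything here is an assumption made for illustration: the ChangeWords helper, splitting on single spaces, and the contents of specialCharArray are my guesses, and both SwapWord and ProcessSpecialWord are replaced with placeholders that only make the control flow visible.

```vbnet
Imports System
Imports System.Text

Module ChangeWordsSketch
    ' Assumed set of "special" characters; the real list may differ.
    Private ReadOnly specialCharArray() As Char = ",.'!?".ToCharArray()

    ' Placeholder for SwapWord: masks the word so the flow is easy to see.
    Private Function SwapWord(ByVal word As String) As String
        Return New String("w"c, word.Length)
    End Function

    ' Placeholder for ProcessSpecialWord: only tags the word.
    Private Function ProcessSpecialWord(ByVal word As String) As String
        Return "<" & word & ">"
    End Function

    ' Hypothetical helper: the work the Change Words button is described as doing.
    Public Function ChangeWords(ByVal text As String) As String
        Dim words() As String = text.Split(" "c)
        Dim gibberishString As New StringBuilder()
        For i As Integer = 0 To words.Length - 1
            If i > 0 Then gibberishString.Append(" "c)
            If words(i).IndexOfAny(specialCharArray) >= 0 Then
                ' At least one special character: hand off to the recursive splitter.
                gibberishString.Append(ProcessSpecialWord(words(i)))
            Else
                ' Plain word: swap it directly.
                gibberishString.Append(SwapWord(words(i)))
            End If
        Next
        Return gibberishString.ToString()
    End Function

    Sub Main()
        Console.WriteLine(ChangeWords("Hello, big World's end"))
        ' prints <Hello,> www <World's> www
    End Sub
End Module
```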
Methods
The SwapCharacters function gets passed a Char and determines if it is upper case, lower case, or a number. If it's upper or lower case, it grabs a random character from the alphabet, converts it to upper or lower case as needed and returns the resulting Char. If it's a number, it returns a random number from 0-9.
' take a character, swap it with a random character, and return the result
Private Function SwapCharacters(ByVal character As Char) As Char
    If Char.IsUpper(character) Then
        ' get a random letter from the letters array, convert to upper,
        ' and return new character; the upper bound of rand.Next is exclusive,
        ' so pass the array length to make every letter reachable
        Return Char.ToUpper(letterCharArray(rand.Next(0, letterCharArray.Length)))
    ElseIf Char.IsLower(character) Then
        ' get a random letter from the letters array, which is already lowercase,
        ' and return it
        Return letterCharArray(rand.Next(0, letterCharArray.Length))
    ElseIf Array.IndexOf(numberCharArray, character) >= 0 Then
        ' get a random number from 0-9 and return it as the new character
        Return numberCharArray(rand.Next(0, numberCharArray.Length))
    Else ' not an alpha-numeric character, so return it unchanged
        Return character
    End If
End Function
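One detail worth keeping in mind when picking a random array index: Random.Next(minValue, maxValue) never returns maxValue itself. With a 26-letter array, an upper bound of 25 only ever yields indexes 0 through 24, so the last letter is never drawn. The short experiment below illustrates this; the module and variable names are made up for the example.

```vbnet
Imports System

Module RandomBoundsSketch
    Sub Main()
        Dim rand As New Random()
        Dim letters() As Char = "abcdefghijklmnopqrstuvwxyz".ToCharArray()
        Dim highestWith25 As Integer = -1
        Dim highestWithLength As Integer = -1

        For i As Integer = 1 To 100000
            ' Upper bound 25 is exclusive, so only indexes 0..24 can come back.
            highestWith25 = Math.Max(highestWith25, rand.Next(0, 25))
            ' Passing the array length (26) makes index 25, the letter z, reachable.
            highestWithLength = Math.Max(highestWithLength, rand.Next(0, letters.Length))
        Next

        Console.WriteLine("Highest index with Next(0, 25): " & highestWith25.ToString())
        Console.WriteLine("Highest index with Next(0, letters.Length): " & highestWithLength.ToString())
    End Sub
End Module
```

The first value can never exceed 24, while the second should reach 25 after this many draws.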
The SwapWord function gets passed a word, origWord, and checks to see if it contains any special characters; if it does, it just returns back the origWord passed.
If the word passed is only 1 character in length, it calls the SwapCharacters function and returns the result.
If the word passed has no special characters and has more than one character, it creates a new ArrayList, loops through every word in the loaded word list, and adds any words with the same length as the origWord to the ArrayList. If the ArrayList contains any words with the same length as the origWord, it picks out one of the words in the ArrayList at random and returns the new word after passing it through the CopyCase function.
' take the word passed and return a new word of the same length
Private Function SwapWord(ByVal origWord As String) As String
    ' just in case the word dictionary was empty
    If wordListArray Is Nothing Then
        Return origWord
    End If
    ' if the word passed has any special characters in it,
    ' or nothing at all, toss it back
    If String.IsNullOrEmpty(origWord) OrElse origWord.IndexOfAny(specialCharArray) >= 0 Then
        Return origWord
    ElseIf origWord.Length = 1 Then
        ' if word is only one character, just replace the character
        Return SwapCharacters(origWord.Chars(0)).ToString()
    Else ' no special characters and length > 1,
        ' so swap out the word and return it back
        ' build an ArrayList with all words from the array with the same
        ' length as the word being processed
        Dim wordsWithLength As ArrayList = New ArrayList
        Dim tempWord As String = String.Empty
        ' loop through every word in the wordListArray and store words of
        ' the same length as the original word
        For i As Integer = 0 To wordListArray.Length - 1
            tempWord = wordListArray(i)
            If tempWord.Length = origWord.Length Then
                wordsWithLength.Add(tempWord)
            End If
        Next i
        ' we have all the words with the same length as the one we're processing,
        ' now pick one out at random
        If wordsWithLength.Count = 0 Then
            Return origWord
        Else
            ' the upper bound of rand.Next is exclusive, so passing Count
            ' lets the last word in the list be chosen too
            Dim newWord As String = wordsWithLength(rand.Next(0, wordsWithLength.Count)).ToString()
            ' return the new word after copying the letter casing from the original word
            Return CopyCase(origWord, newWord)
        End If
    End If
End Function
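SwapWord scans the whole word list every time it is called. If that ever becomes a bottleneck on long texts, one possible tweak is to group the words by length once, right after the list is loaded, and then pick from the matching group. The sketch below is only a suggestion under that assumption; wordsByLength, IndexWords, and PickWord are hypothetical names that do not exist in the project.

```vbnet
Imports System
Imports System.Collections.Generic

Module WordsByLengthSketch
    Private ReadOnly rand As New Random()
    ' Hypothetical lookup: word length -> all words of that length.
    Private ReadOnly wordsByLength As New Dictionary(Of Integer, List(Of String))

    ' Build the lookup once, for example right after the word list is loaded.
    Public Sub IndexWords(ByVal wordListArray() As String)
        wordsByLength.Clear()
        For Each word As String In wordListArray
            Dim bucket As List(Of String) = Nothing
            If Not wordsByLength.TryGetValue(word.Length, bucket) Then
                bucket = New List(Of String)
                wordsByLength.Add(word.Length, bucket)
            End If
            bucket.Add(word)
        Next
    End Sub

    ' Returns a random word of the requested length, or Nothing if none exists.
    Public Function PickWord(ByVal length As Integer) As String
        Dim bucket As List(Of String) = Nothing
        If wordsByLength.TryGetValue(length, bucket) Then
            ' The upper bound is exclusive, so Count covers every entry.
            Return bucket(rand.Next(0, bucket.Count))
        End If
        Return Nothing
    End Function

    Sub Main()
        IndexWords(New String() {"wheel", "coons", "mallets", "a"})
        Console.WriteLine(PickWord(5))              ' wheel or coons
        Console.WriteLine(PickWord(7))              ' prints mallets
        Console.WriteLine(PickWord(9) Is Nothing)   ' prints True
    End Sub
End Module
```

The trade-off is a little extra memory and start-up work in exchange for a constant-time lookup per word; for short inputs the original loop is perfectly adequate.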
The CopyCase function gets passed oldWord and newWord, splits both words up into their own CharArrays and loops through each position in the oldWordCharArray. For each iteration, the loop checks to see if the Char is upper or lower case and converts the corresponding newWord array character to that case.
It then uses CStr to convert the CharArray back into a string and returns the now correctly cased newWord.
' loops through every letter in the old and new words and matches character cases
Private Function CopyCase(ByVal oldWord As String, ByVal newWord As String) As String
    ' first ensure the two words are the same length
    If oldWord.Length = newWord.Length Then
        ' convert each word into a CharArray so we can loop through each character
        Dim oldWordCharArray() As Char = oldWord.ToCharArray()
        Dim newWordCharArray() As Char = newWord.ToCharArray()
        ' loop through each character of the old word
        For i As Integer = 0 To UBound(oldWordCharArray)
            If Char.IsUpper(oldWordCharArray(i)) Then
                ' original word character was upper case,
                ' so convert new word character to upper case
                newWordCharArray(i) = Char.ToUpper(newWordCharArray(i))
            ElseIf Char.IsLower(oldWordCharArray(i)) Then
                ' original word character was lower case,
                ' so convert new word character to lower case
                newWordCharArray(i) = Char.ToLower(newWordCharArray(i))
            End If
        Next
        ' convert the new CharArray back into a word and return to sender
        Return CStr(newWordCharArray)
    Else
        ' the old and new words should be the same length, if not, just return
        ' the unprocessed newWord
        Return newWord
    End If
End Function
The ProcessSpecialWord function is where all the head scratching came in. How in the world do I recursively split a word with multiple, different, special characters into an array of separate words, replace them, and then put it all back together, remembering which special characters go where?
After writing and then deleting about 1,000 lines of code, I came up with the ProcessSpecialWord method, which keeps passing words back to itself until there are no more special characters. The trick is to split the word with only one special character at a time and, if there are more, just pass it back into the ProcessSpecialWord method for another go.
Once all sections of the word have been split up and replaced, it joins them all back together and returns the string.
' if the word has a non alpha-numeric character, it gets passed here
' to be split up and each section of the word is changed to a new word
Private Function ProcessSpecialWord(ByVal word As String) As String
    ' loop through each special character and see if the word contains it
    For Each specialChar As Char In specialCharArray
        If word.Contains(specialChar.ToString()) Then
            ' split the word into an array with the found special character
            Dim subword() As String = SplitWord(word, specialChar)
            ' process each new split word
            For i As Integer = 0 To subword.Length - 1
                If subword(i).IndexOfAny(specialCharArray) >= 0 Then
                    ' the new word still contains special characters.
                    ' pass the word back in to this same method again to get it
                    ' split up again and keep passing it in until it's entirely split up
                    ' and all sections of the word have been changed
                    subword(i) = ProcessSpecialWord(subword(i))
                Else
                    ' the subword has no special characters, so just swap it
                    subword(i) = SwapWord(subword(i))
                End If
            Next
            ' all done swapping words, so join all subwords back together,
            ' exit the loop and return the new joined word back
            word = String.Join(specialChar.ToString(), subword)
            ' we exit the loop because we only want to process one special
            ' character at a time. when there's more than one special character
            ' in the word, it gets thrown back into the function and split up again
            ' using that special character
            Exit For
        End If
    Next specialChar
    Return word
End Function
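ProcessSpecialWord relies on a SplitWord helper that is not listed here. A minimal sketch, assuming it does nothing more than cut the word at every occurrence of a single character, could wrap String.Split as shown below. The trace in Main also assumes that the comma comes before the period in specialCharArray, so the comma is handled on the first pass.

```vbnet
Imports System

Module SplitWordSketch
    ' Assumption: SplitWord only cuts the word at each occurrence of one
    ' special character. The project's real implementation may differ.
    Private Function SplitWord(ByVal word As String, ByVal specialChar As Char) As String()
        Return word.Split(New Char() {specialChar})
    End Function

    Sub Main()
        Dim word As String = "Hello,Worldly.World's"

        ' First pass: split on the comma only.
        Dim parts() As String = SplitWord(word, ","c)
        Console.WriteLine(parts(0))   ' prints Hello
        Console.WriteLine(parts(1))   ' prints Worldly.World's

        ' parts(1) still holds special characters, so the recursive call
        ' splits it again, this time on the period.
        Dim inner() As String = SplitWord(parts(1), "."c)
        Console.WriteLine(inner(0))   ' prints Worldly
        Console.WriteLine(inner(1))   ' prints World's

        ' Joining with the same character puts the punctuation back in place.
        Console.WriteLine(String.Join(".", inner))   ' prints Worldly.World's
    End Sub
End Module
```

String.Split keeps empty entries by default, so a word such as "World." splits into "World" and an empty string, and joining the pieces restores the trailing period.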
Points of Interest
It's fun to see what combinations of words you can come up with, and it may be useful to someone out there. I know there are other script-based gibberish generators, but the couple that I looked at couldn't parse multiple special characters and preserve word punctuation.
Even if it is not practically useful, it should be academically useful, and someone might make use of some of the beginner-to-intermediate concepts in the code logic.
Coming up with the recursive ProcessSpecialWord function was a small achievement and it was fun to think up.
History
- 5-7-09: First post
Please feel free to leave any comments, feedback, code quality, or performance tuning thoughts you might have.