This app uses a phonetic fingerprint of each word to isolate similarities that go beyond just the final syllable.

Introduction
I've been writing a novel using my Creative Writer's Word Processor app and, in the process, have continuously improved it. However, the Rhymer.com dictionary I scraped off their website is inadequate. I was able to download 79,635 files from their server and then create a search engine for them, which is incorporated into the Words app. Although they provide thousands of words that rhyme with common suffix endings, these are all clustered together with no regard for rhymes that span more than the final syllable.
E.g., 'uncomfortable' rhymes with 'bowel' and 10,000 other entries that share only a similar 'el' sound at the very end, all of which need to be picked through to find something reasonable.
Just because you add an '-ed' suffix to a word doesn't mean its list of rhymes needs to include every verb in the dictionary that takes the same '-ed' suffix.
It's still a useful dictionary, but you have to do some work to filter out all the examples that don't really fit what you're looking for. And sometimes there are thousands of them. It's like looking for a specific snowflake in the middle of a blizzard.
This app, however, uses a phonetic fingerprint of each word to isolate similarities that go beyond just the final syllable.
Using the App
Note to Newbie: If you have never written any software and you're just looking for a rhyming dictionary, you can still run this app on Windows 10. Just download the software above and extract it onto your hard drive. Then you'll need to find the executable file.
C:\-wherever-you-extracted-the-file\Rhymes\Rhymes\bin\Debug\Rhymes.exe
You'll likely want to create a short-cut and keep it on your desktop.
Because CodeProject limits the downloadable file-size to 10MB, the app needs to build its database when you launch it for the first time. It will take about 10-15 minutes before it's ready and you'll see a list of words flash on the top left of the form while it's working. Once that is done, it will be ready to rhyme all you want.
Just type the word you want to rhyme into the text-box and press Enter. Since the app uses a phonetic algorithm, your spelling isn't as important as it would be if you were looking up a word in a pre-defined list. Even if you misspell the word you're trying to rhyme (as I often do), the search results will give you a list of correctly spelled words, and that might help you in the future. You don't need to worry about American or British spelling or laboriously argue with the app while you're just being creative.

Options
MaxEntries - You can limit the number of entries that are presented with this option. If the list of words found at a given search 'level' (number of matching ending sounds) exceeds this limit, none of the words in that level will be presented. Searching a short word like 'this' will give you oodles of answers that spill out all over the place, so I question your poetry if you need help rhyming 'this' or 'that', but if you must... just increase the MaxEntries and you'll find something.

By increasing the MaxEntries value, you may get the results you're looking for.

UseClipBoard - When this box is checked, Microsoft's 'Clipboard' will be tested once every second while the app is running. If you 'copy' or 'cut' text into the clipboard, the app will use the text you copied as a search parameter. This is convenient while you're working in a separate Word-Processor and don't want to switch apps.
TopMost - Checking this box will force the app to stay in front of other apps you're running. You can still use your word processor and do your writing, but with this option and the UseClipBoard option checked, you'll quickly see rhymes appear and only have to 'copy' the word you want to rhyme without switching to the app itself.
The Code
The data-tree is a ternary tree with linked-lists of words attached to every leaf. All the data is appended to the end of the file as it is created, and each data-item (tree-leaf or linked-list item) is referenced by its address in that same file, so there is no need for an encumbering index system and the varying word sizes do not affect storage and retrieval. The Insert and Search methods both start at the root of the tree and progress down one level for each search-key in the signature of the word being searched.
The tree's search keys are sound tags sequenced in reverse order, from the end of the word to the front. Similar sounds share the same tag, and each level of the ternary-tree search corresponds to the number of 'sounds' counted from the end of a word. Each key comparison compares arbitrarily assigned unique numeric IDs that stand for collections of letter combinations, so these comparisons are strictly numeric rather than alphabetical.

The entire Rhyming Dictionary and its Ternary-Tree algorithm can be incorporated into any app with a single file, classRD_TernaryTree.cs. The Linked-List elements and Tree-Leaves each have their own class, which handles the Write and Read methods needed to access the data on file, using an Addr long integer to position the FileStream. These are the same addresses that are recorded as pointers for all the tree's component leaves and lists. The tree insertion and search methods both receive the Word to be inserted/searched as a string parameter. The Search() method returns not just one list of words that rhyme with the input parameter but a list of incrementally similar lists: the first list holds words whose final syllable rhymes with the search word, the next holds words whose final two syllables are similar, and so on.
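The address-based storage described above can be illustrated with a minimal sketch. It is not the code from classRD_TernaryTree.cs: the class name WordListFile, the methods Prepend and ReadAll, the NullAddr sentinel and the record layout (an 8-byte pointer followed by a length-prefixed string) are all assumptions made for illustration. It only shows how items appended to the end of a stream can be chained together by their addresses.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical sketch: not the classes from classRD_TernaryTree.cs.
public static class WordListFile
{
    public const long NullAddr = -1; // assumed sentinel meaning "no next item"

    // Appends one list item at the end of the stream and returns its address.
    // The new item points at the previous head (front-end insertion).
    public static long Prepend(Stream stream, string word, long headAddr)
    {
        long addr = stream.Seek(0, SeekOrigin.End);
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(headAddr); // 8-byte pointer to the next item
            writer.Write(word);     // length-prefixed string, so word size can vary
        }
        return addr;
    }

    // Walks the list from headAddr, seeking to each stored address in turn.
    public static List<string> ReadAll(Stream stream, long headAddr)
    {
        var words = new List<string>();
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            long addr = headAddr;
            while (addr != NullAddr)
            {
                stream.Seek(addr, SeekOrigin.Begin);
                addr = reader.ReadInt64();
                words.Add(reader.ReadString());
            }
        }
        return words;
    }
}
```

Passing a FileStream instead of a MemoryStream would give the on-disk behaviour; because every item is found by its address, no separate index is needed.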
The Word is first 'dissected' into its component phonetics, and the resulting list of sounds is used as the search keys of the tree. During the build phase, each word 'falls' down through the tree and is added, by front-end insertion, to the linked-list of every leaf it passes. Each linked-list therefore includes all the words in the tree that have identical phonetic signatures down to that level of the search (whatever level the leaf you're looking at happens to be).
E.g., the words proposal and disposal will share the leaves at the two successive levels 'al'(1) and 'pos'(2) but will diverge at the next level into two separate leaves, 'pro'(3) and 'dis'(3).
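The proposal/disposal example can be traced with a small in-memory sketch of a ternary tree keyed by sound IDs. This is an illustration only: the article's tree lives in a file, and the names SoundNode and SoundTree, as well as the ID numbers used in the usage note, are made up here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory sketch; the article's tree is stored in a file instead.
public class SoundNode
{
    public int Key;                        // sound ID compared at this level
    public SoundNode Lower, Higher;        // siblings at the same level
    public SoundNode Next;                 // one sound closer to the front of the word
    public List<string> Words = new List<string>();
}

public class SoundTree
{
    SoundNode root;

    // keys: sound IDs ordered from the END of the word toward the front.
    public void Insert(IList<int> keys, string word)
    {
        if (keys.Count == 0) return;
        root = Insert(root, keys, 0, word);
    }

    static SoundNode Insert(SoundNode node, IList<int> keys, int level, string word)
    {
        int key = keys[level];
        if (node == null) node = new SoundNode { Key = key };
        if (key < node.Key) node.Lower = Insert(node.Lower, keys, level, word);
        else if (key > node.Key) node.Higher = Insert(node.Higher, keys, level, word);
        else
        {
            node.Words.Insert(0, word); // front-end insertion at every matched level
            if (level + 1 < keys.Count)
                node.Next = Insert(node.Next, keys, level + 1, word);
        }
        return node;
    }

    // Returns one list per matched level: result[0] shares the last sound,
    // result[1] shares the last two sounds, and so on.
    public List<List<string>> Search(IList<int> keys)
    {
        var result = new List<List<string>>();
        SoundNode node = root;
        int level = 0;
        while (node != null && level < keys.Count)
        {
            int key = keys[level];
            if (key < node.Key) node = node.Lower;
            else if (key > node.Key) node = node.Higher;
            else
            {
                result.Add(new List<string>(node.Words));
                node = node.Next;
                level++;
            }
        }
        return result;
    }
}
```

With the made-up IDs 'al' = 1, 'pos' = 2, 'pro' = 3 and 'dis' = 4, inserting proposal as {1, 2, 3} and disposal as {1, 2, 4} and then searching {1, 2, 4} yields three lists: both words at levels 1 and 2, and only disposal at level 3.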
The different sounds used to phonetically fingerprint each word are stored in classSounds, which is shown below in its entirety.
public class classSounds
{
    // Letter combinations that all map to this one sound.
    public List<string> lstText = new List<string>();

    // Each new instance takes the next value of the shared counter as its ID.
    static int intIDCounter = 0;
    int intID = intIDCounter++;
    public int ID { get { return intID; } }

    public classSounds() { }

    public classSounds(string strSound)
    {
        lstText.Add(strSound);
    }

    public classSounds(string[] strSounds)
    {
        lstText.AddRange(strSounds);
    }
}
The actual 'key' used in the search is the classSounds instance's ID. This value is a unique integer assigned when the instance is created, through the use of a static counter integer variable. Each instance of this class holds one or more letter combinations in its lstText variable. The instances of this class are sorted into three categories: Prefix, Cluster & Suffix.
Each word is converted into a phonetic signature by the DissectWord() method. The prefix and suffix groups are processed first and taken off the head and tail of the word, in the sequence in which they were created. The 'cluster' groups are then taken from any part of the word and are not limited to its head or tail as the prefix and suffix lists are.
public static List<int> DissectWord(string strWord)
In order to create the word's phonetic signature, each discovered series of characters defined in a classSounds instance's lstText is replaced by a square-bracketed tag.
E.g., an instance of classSounds having ID = 34 and the string values 'ea' & 'ee' in its lstText will be used to replace every occurrence of the letters 'ea' and 'ee' in the word with the corresponding 'key-string' [34]. When the dissection is complete, the string strWord that was first received by the method has been converted into its equivalent series of 'key-strings' (square-bracketed ID numbers in the order in which they appeared in the word) and no longer holds any letters, only square brackets and numerals. This string is then Split at the square brackets into an array of strings holding the classSounds ID numbers, which are converted into integer values and returned to the calling method in the reverse of the order in which they were found, ready to be used to traverse the Ternary Tree. Both the Ternary Tree's Search and Insert methods make use of DissectWord().
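The tag-and-split step can be tried in isolation with the following sketch. It is a simplified stand-in, not DissectWord() itself: it skips the prefix, suffix and de-accenting passes, and the helper name ToReversedIds and every ID other than 34 are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the tagging step of DissectWord().
public static class SignatureDemo
{
    // Replaces each listed letter group with its bracketed ID, then returns
    // the IDs in reverse order. Letters that were never tagged are dropped.
    public static List<int> ToReversedIds(string word, IList<KeyValuePair<string, int>> groups)
    {
        foreach (var g in groups)
            word = word.Replace(g.Key, "[" + g.Value + "]");

        string[] parts = word.Split(new[] { '[', ']' }, StringSplitOptions.RemoveEmptyEntries);
        var ids = new List<int>();
        for (int i = parts.Length - 1; i >= 0; i--)
        {
            int id;
            if (int.TryParse(parts[i], out id)) // skip leftover letters
                ids.Add(id);
        }
        return ids;
    }
}
```

Assuming 'ee' = 34 (from the example above) plus the invented IDs 'gr' = 7 and 'n' = 12, the word green becomes [7][34][12] and the method returns 12, 34, 7: the sounds read from the end of the word to the front.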
Here's the method below:
public static List<int> DissectWord(string strWord)
{
    // Build the sound tables on first use.
    if (lstSounds.Count == 0)
        SoundsInit();

    List<int> lstRetVal = new List<int>();
    strWord = Deaccent(strWord).ToLower();

    // Replace every non-letter character with the NULL sound's tag.
    classSounds cNULL = lstSounds[0];
    string strNULL = EnumReplacement(ref cNULL);
    for (int intLetterCounter = strWord.Length - 1; intLetterCounter >= 0; intLetterCounter--)
    {
        char chrTest = strWord[intLetterCounter];
        if (!char.IsLetter(chrTest))
        {
            string strLeft = strWord.Substring(0, intLetterCounter);
            string strRight = intLetterCounter < strWord.Length - 1
                                ? strWord.Substring(intLetterCounter + 1)
                                : "";
            strWord = strLeft + strNULL + strRight;
        }
    }

    // Prefixes: tag the first prefix that matches the head of the word, then stop.
    for (int intPrefixCounter = Prefixes_Start;
         intPrefixCounter <= Prefixes_End; intPrefixCounter++)
    {
        classSounds cPrefix = lstSounds[intPrefixCounter];
        string strEnumReplacement = EnumReplacement(ref cPrefix);
        for (int intTextCounter = 0;
             intTextCounter < cPrefix.lstText.Count; intTextCounter++)
        {
            string strPrefix = cPrefix.lstText[intTextCounter];
            if (strWord.Length > strPrefix.Length)
            {
                if (string.Compare(strWord.Substring(0, strPrefix.Length), strPrefix) == 0)
                {
                    strWord = strEnumReplacement + strWord.Substring(strPrefix.Length);
                    goto exitPrefix;
                }
            }
        }
    }
exitPrefix:

    // Suffixes: tag the first suffix that matches the tail of the word, then stop.
    for (int intSuffixCounter = Suffixes_Start;
         intSuffixCounter <= Suffixes_End; intSuffixCounter++)
    {
        classSounds cSuffix = lstSounds[intSuffixCounter];
        string strEnumReplacement = EnumReplacement(ref cSuffix);
        for (int intTextCounter = 0; intTextCounter < cSuffix.lstText.Count; intTextCounter++)
        {
            string strSuffix = cSuffix.lstText[intTextCounter];
            if (strWord.Length > strSuffix.Length)
            {
                string strWordEnd = strWord.Substring(strWord.Length - strSuffix.Length);
                if (string.Compare(strWordEnd, strSuffix) == 0)
                {
                    strWord = strWord.Substring
                              (0, strWord.Length - strSuffix.Length) + strEnumReplacement;
                    goto exitSuffixes;
                }
            }
        }
    }
exitSuffixes:

    // Clusters: tag matching letter groups wherever they occur in the word.
    for (int intClusterCounter = Clusters_Start;
         intClusterCounter <= Clusters_End; intClusterCounter++)
    {
        classSounds cSound = lstSounds[intClusterCounter];
        string strEnumReplacement = EnumReplacement(ref cSound);
        for (int intTextCounter = 0; intTextCounter < cSound.lstText.Count; intTextCounter++)
        {
            string strCluster = cSound.lstText[intTextCounter];
            if (strWord.Length >= strCluster.Length)
            {
                if (strWord.Contains(strCluster))
                {
                    DissectWord_ReplaceEnum(ref strWord, ref cSound);
                }
            }
        }
    }

    // Split the tagged string at the brackets and collect the IDs in reverse order.
    char[] chrSplit = { ']', '[' };
    string[] strEnumList = strWord.Split(chrSplit, StringSplitOptions.RemoveEmptyEntries);
    for (int intCounter = strEnumList.Length - 1; intCounter >= 0; intCounter--)
    {
        string strEnum = strEnumList[intCounter];
        try
        {
            lstRetVal.Add(Convert.ToInt32(strEnum));
        }
        catch (Exception)
        {
            // Anything that is not a numeric ID (e.g. leftover letters) is skipped.
        }
    }
    return lstRetVal;
}
Creating the Phonetic Lists
The phonetic lists themselves will likely require fine-tuning. As I use this writing tool, I will correct whatever issues I discover and gradually improve the already formidable performance. You can easily do this yourself with your own copy of this open-source program by editing the existing examples in the SoundsInit() method.
static void SoundsInit()
{
lstSounds.Add(new classSounds("NULL"));
_intPrefixes_Start = lstSounds.Count;
lstSounds.Add(new classSounds("extra"));
lstSounds.Add(new classSounds("hyper"));
lstSounds.Add(new classSounds("inter"));
lstSounds.Add(new classSounds("trans"));
lstSounds.Add(new classSounds("ultra"));
lstSounds.Add(new classSounds("under"));
lstSounds.Add(new classSounds("super"));
lstSounds.Add(new classSounds("anti"));
lstSounds.Add(new classSounds("auto"));
lstSounds.Add(new classSounds("down"));
lstSounds.Add(new classSounds("mega"));
lstSounds.Add(new classSounds("over"));
lstSounds.Add(new classSounds("post"));
lstSounds.Add(new classSounds("semi"));
lstSounds.Add(new classSounds("tele"));
lstSounds.Add(new classSounds("con"));
lstSounds.Add(new classSounds("dis"));
lstSounds.Add(new classSounds("mid"));
lstSounds.Add(new classSounds("mis"));
lstSounds.Add(new classSounds("non"));
lstSounds.Add(new classSounds("out"));
lstSounds.Add(new classSounds("pre"));
lstSounds.Add(new classSounds("pro"));
lstSounds.Add(new classSounds("sub"));
lstSounds.Add(new classSounds("de"));
lstSounds.Add(new classSounds("il"));
lstSounds.Add(new classSounds("im"));
lstSounds.Add(new classSounds("ir"));
lstSounds.Add(new classSounds("in"));
lstSounds.Add(new classSounds("re"));
lstSounds.Add(new classSounds("un"));
lstSounds.Add(new classSounds("up"));
_intPrefixes_End = lstSounds.Count - 1;
_intClusters_Start = lstSounds.Count;
classSounds cSound = new classSounds();
cSound.lstText.Add("ayer");
cSound.lstText.Add("ower");
cSound.lstText.Add("oyer");
cSound.lstText.Add("our");
cSound.lstText.Add("ure");
lstSounds.Add(cSound);
cSound = new classSounds();
cSound.lstText.Add("ord");
cSound.lstText.Add("ard");
cSound.lstText.Add("urd");
lstSounds.Add(cSound);
As these sound groupings are processed in the order in which they are inserted into the list, the longer string values should be tested first; otherwise a similarly spelled shorter string may pre-empt a longer one and cause unexpected results. Since the Suffix, Prefix and Cluster sounds are all in the same list and are only told apart by the integer variables that delimit them in the DissectWord() method, you want to be certain those delimiting integer values reflect the sequence in which the groups appear.
static int _intPrefixes_Start = -1;
static public int Prefixes_Start
{
    get { return _intPrefixes_Start; }
}

static int _intPrefixes_End = -1;
static public int Prefixes_End
{
    get { return _intPrefixes_End; }
}

static int _intClusters_Start = -1;
static public int Clusters_Start
{
    get { return _intClusters_Start; }
}

static int _intClusters_End = -1;
static public int Clusters_End
{
    get { return _intClusters_End; }
}

static int _intSuffixes_Start = -1;
static public int Suffixes_Start
{
    get { return _intSuffixes_Start; }
}

static int _intSuffixes_End = -1;
static public int Suffixes_End
{
    get { return _intSuffixes_End; }
}
You can see these integer values being assigned in the SoundsInit() method using the lstSounds.Count property, and any changes you make to that method should take them into consideration. Including sounds intended for Cluster (not the head/tail of a word but anywhere in the middle) in either the prefix or suffix lists will not give you the results you want.
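If you edit the sound lists, a quick check of the 'longer strings first' ordering may save some head-scratching. The helper below is not part of the app; OrderCheck and FindShadowed are hypothetical names, and the check is deliberately conservative: it reports any shorter entry that appears earlier in the list and is contained in a later, longer entry.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the app's source.
public static class OrderCheck
{
    // Returns a message for every earlier, shorter entry that is contained
    // in a later, longer entry and could therefore be matched in its place.
    public static List<string> FindShadowed(IList<string> ordered)
    {
        var problems = new List<string>();
        for (int later = 1; later < ordered.Count; later++)
            for (int earlier = 0; earlier < later; earlier++)
                if (ordered[later].Length > ordered[earlier].Length &&
                    ordered[later].Contains(ordered[earlier]))
                    problems.Add(ordered[earlier] + " shadows " + ordered[later]);
        return problems;
    }
}
```

For example, the order { "inter", "in" } produces no messages, while { "in", "inter" } reports that 'in' shadows 'inter'.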
N.B.: To make changes to your data-tree, you'll need to delete the CK_RhymingDictionary.tree file that the app builds and relies on after you've made any changes to SoundsInit(). Then just launch the app again and it will rebuild the data-tree as it did the first time you launched it.
The words I included in this rhyming dictionary were the file names from the Rhyming.Com website that I scraped for a different project. As it would have been impossible to provide you with all these files (cumbersome and unnecessary), I collated their file names into 26 separate text files located in the bin\Debug subdirectory of the source code you need to download. It's a simple thing to add/remove words from those lists and alter your own personal rhyming dictionary.
If you want to make a rhyming dictionary in a different language such as Polish, French or German, you'll have to change the SoundsInit() method so that it reflects the language you intend to rhyme. This will likely require a bit of tinkering on your part but it's really not that painful.
Points of Interest
This is not the first Ternary-Tree I've built. They tend to use up far too much memory to be worth keeping in RAM, so I would normally opt for a Binary-Tree for a lot of my search methods. But this project relies on the Ternary Tree's unique properties to track through a phonetic signature of repeated sounds (not unique tree leaves such as you'd find in a binary tree), and I don't know of any better alternative, though there may very well be one. I originally had only one ID for each letter combination, so 'ph' was unique and different from 'ff' or even 'f', which was kind of useless. Since that version only took me a few hours to write, I worked on it a little longer and made the necessary changes to build the current version.
History
- 30th January, 2022 - First published