How to Unscramble Any Word

12 Apr 2021 | CPOL | 1 min read
This is an unscramble class that can be used to decipher any scrambled word.

Introduction

I could never find a good way to unscramble a word on the interwebs. Every algorithm was either brute-force or permutations.

The problem with brute-force is that it's a guessing game... very slow, or if you're lucky, very fast.

The problem with permutations is that once you go over 7 characters, the number of candidate orderings (and the memory needed to hold them) explodes.
For example, a 12-character scrambled word has 12! = 479,001,600 possible orderings!
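
To see how quickly that count grows, here is a small sketch that computes n! for a given word length. The `PermutationCount` class and its `Factorial` method are illustrative names, not part of the article's code, and the figure assumes every letter is distinct (repeated letters reduce the number of distinct orderings).

```csharp
using System;

public static class PermutationCount
{
    // n! = number of orderings of n distinct letters.
    public static long Factorial(int n)
    {
        // 20! is the largest factorial that fits in a 64-bit signed integer.
        if (n < 0 || n > 20)
            throw new ArgumentOutOfRangeException(nameof(n), "n must be between 0 and 20.");

        long result = 1;
        for (int i = 2; i <= n; i++)
            result *= i;
        return result;
    }
}
```

`PermutationCount.Factorial(7)` gives 5,040, while `PermutationCount.Factorial(12)` gives 479,001,600.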

It finally dawned on me that if you sort the letters of the scrambled word and sort the letters of each dictionary entry, then any dictionary entry whose sorted letters equal the sorted scramble must be a match!
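
A minimal sketch of that idea on its own: each word gets a "signature" made of its letters in sorted order, and words that share a signature are anagrams of each other. The `AnagramIndex` class and its `Signature`, `Add`, and `Lookup` members are hypothetical names chosen for illustration. Unlike the class presented later, this sketch groups words by signature so that a lookup is a single dictionary probe rather than a scan; treat it as one possible variant, not the article's implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: groups words by their sorted-letter signature.
public class AnagramIndex
{
    private readonly Dictionary<string, List<string>> _bySignature =
        new Dictionary<string, List<string>>();

    // Upper-cases the word and sorts its letters,
    // so "listen" and "silent" both become "EILNST".
    public static string Signature(string word)
    {
        char[] chars = word.ToUpperInvariant().ToCharArray();
        Array.Sort(chars);
        return new string(chars);
    }

    // Stores the word in the bucket for its signature.
    public void Add(string word)
    {
        string key = Signature(word);
        if (!_bySignature.TryGetValue(key, out List<string> bucket))
        {
            bucket = new List<string>();
            _bySignature[key] = bucket;
        }
        bucket.Add(word);
    }

    // Returns every stored word that uses exactly the same letters,
    // or an empty list when nothing matches.
    public IReadOnlyList<string> Lookup(string scrambled)
    {
        return _bySignature.TryGetValue(Signature(scrambled), out List<string> bucket)
            ? bucket
            : (IReadOnlyList<string>)Array.Empty<string>();
    }
}
```

After adding "listen", "silent", and "google", calling `Lookup("tinsel")` returns the first two words, because all three share the signature "EILNST".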

There is probably some fancy machine learning algorithm that could do this, but my method works perfectly and instantly.

Using the Code

You'll want to embed your favorite dictionary into your project (for speed and portability).

There are a lot of free dictionary files out there; here's the one I used: https://github.com/dwyl/english-words

Direct link: https://raw.githubusercontent.com/dwyl/english-words/master/words.txt
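
If you would rather keep the word list as a file on disk than embed it, a loader might look roughly like the sketch below. `WordListLoader`, its `Load` method, and the idea of passing a file path are assumptions for illustration; the class in this article reads the list from an embedded resource instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class WordListLoader
{
    // Reads one word per line, trims surrounding whitespace,
    // skips blank lines, upper-cases each entry, and drops duplicates.
    public static List<string> Load(string path)
    {
        return File.ReadLines(path)
                   .Select(line => line.Trim())
                   .Where(line => line.Length > 0)
                   .Select(line => line.ToUpperInvariant())
                   .Distinct()
                   .ToList();
    }
}
```

Upper-casing on load keeps later comparisons case-insensitive, which is the same convention the class below uses for its dictionary.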

The work-horse is the UnscrambleWord method. It loads the dictionary, optionally filters it down to likely candidates, compares the sorted letters of each entry against the sorted scramble, and returns the matches in a List<string>.

C#
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Unscramble
{
     private static bool _dictionaryLoaded = false;
     private static string _wordToUnscramble = "";
     private static int _totalEntries = 0;
     private static Dictionary<string, string> _sortedDictionary =
                                       new Dictionary<string, string>();
     private static List<string> _results = new List<string>();
     private static Stopwatch _stopwatch;

     //====================================================================================
     /** We don't really need a constructor **/
     //public Unscramble(string wordToUnscramble)
     //{
     //    _WordToUnscramble = wordToUnscramble;
     //}

     //====================================================================================
     public List<string> UnscrambleWord(string wordToUnscramble, bool useFiltering = true)
     {
         _stopwatch = Stopwatch.StartNew();

         //Results are rebuilt on every call so they never stack
         _results.Clear();

         //A filtered dictionary only holds candidates for one specific word,
         //so it must be reloaded whenever the word changes.
         //(Switching useFiltering between calls is not tracked here.)
         if (useFiltering && !_wordToUnscramble.Equals
              (wordToUnscramble, StringComparison.OrdinalIgnoreCase))
             _dictionaryLoaded = false;
         _wordToUnscramble = wordToUnscramble;

         if (!_dictionaryLoaded) //the first call will be slightly slower
             LoadEmbeddedDictionary(wordToUnscramble.ToUpper(), useFiltering);

         //The dictionary is upper-cased on load, so the scramble must be
         //upper-cased too or the sorted strings will never compare equal
         string scrambleSorted = SortWord(wordToUnscramble.ToUpper());

         //var kvp = SortedDictionary.FirstOrDefault
         //(p => SortedDictionary.Comparer.Equals(p.Value, scrambledSort));
         var matchList = _sortedDictionary.Where
             (kvp => kvp.Value == scrambleSorted).Select(kvp => kvp.Key).ToList();

         if (matchList.Count > 0)
         {
             foreach (string result in matchList)
             {
                 System.Diagnostics.Debug.WriteLine($"> Match: {result}");
                 _results.Add(result);
             }

             _stopwatch.Stop();
             System.Diagnostics.Debug.WriteLine($"> Elapsed time: {_stopwatch.Elapsed}");
             return _results;
         }
         else //no matches
         {
             _stopwatch.Stop();
             _results.Clear();
             System.Diagnostics.Debug.WriteLine($"> Elapsed time: {_stopwatch.Elapsed}");
             return _results;
         }
     }

     //==================================================================================
     private static void LoadEmbeddedDictionary(string wordText, bool filter = false)
     {
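         //Split on both CR and LF so Windows line endings don't leave a
         //trailing '\r' on each entry (which would break the length filter)
         char[] delims = new char[] { '\r', '\n' };
         int chunkCount = 0;
         //DictionaryNums is the embedded text resource holding the word list
         string[] chunks = global::Utility.Properties.Resources.
                                  DictionaryNums.ToUpper().Split
                                  (delims, StringSplitOptions.RemoveEmptyEntries);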

         System.Diagnostics.Debug.WriteLine($"> Length filter: {wordText.Length}");
         _sortedDictionary.Clear();
         foreach (string str in chunks)
         {
             chunkCount++;
             if (filter)
             {
                 //Checking only the first three letters trims our search considerably.
                 //Take(3) also keeps one- and two-letter words from throwing.
                 if ((str.Length == wordText.Length) &&
                      wordText.Take(3).All(c => str.Contains(c)))
                 {
                     try
                     {
                         _sortedDictionary.Add(str, SortWord(str));
                     }
                     catch
                     {
                         //probably a key collision, just ignore
                     }
                 }
             }
             else
             {
                 try
                 {
                     _sortedDictionary.Add(str, SortWord(str));
                 }
                 catch
                 {
                     //probably a key collision, just ignore
                 }
             }
         }
         System.Diagnostics.Debug.WriteLine(
             $"> Loaded {_sortedDictionary.Count} possible matches out of {chunkCount}");
         _totalEntries = chunkCount;
         _dictionaryLoaded = true;
     }

     //=================================================================================
     private static string SortWord(string str)
     {
         return String.Concat(str.OrderBy(c => c));

         /*** Character Array Method ***
         return String.Concat(str.OrderBy(c => c).ToArray());
         *******************************/

         /*** Traditional Method ***
         char[] chars = str.ToCharArray();
         Array.Sort(chars);
         return new string(chars);
         ***************************/
     }

     #region [Helper Methods]
     //=================================================================================
     public TimeSpan GetMatchTime()
     {
        return _stopwatch.Elapsed;
     }

     //=================================================================================
     public List<string> GetMatchResults()
     {
        return _results;
     }

     //=================================================================================
     public int GetMatchCount()
     {
        return _results.Count;
     }

     //=================================================================================
     public int GetFilterCount()
     {
        return _sortedDictionary.Count;
     }

     //=================================================================================
     public int GetDictionaryCount()
     {
        return _totalEntries;
     }
     #endregion
}

Testing/Implementation

To drive the code, you would do this...

C#
string scrambled = "mctmouicnaino";
Unscramble obj1 = new Unscramble();
List<string> results = obj1.UnscrambleWord(scrambled);
if (results.Count > 0)
{
    Console.WriteLine($"> Total matches: {obj1.GetMatchCount()}");
    foreach (string str in results)
    {
        Console.WriteLine($">> {str}");
    }
    Console.WriteLine($"> Total time: {obj1.GetMatchTime()}");
    Console.WriteLine($"> Filtered set: {obj1.GetFilterCount()} out of {obj1.GetDictionaryCount()}");
}
else
{
    Console.WriteLine("> No matches available: " +
        "Check your spelling, or the dictionary may be missing this word.");
}

In the class, we could add some more LINQ methods to change the order, take the top result, etc., but this should be a good foundation for any unscramble engine.
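
As a rough illustration of such LINQ additions, the sketch below orders a result list alphabetically and picks the first entry. `ResultHelpers`, `Sorted`, and `TopOrDefault` are hypothetical names, and alphabetical order is just one possible ranking.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ResultHelpers
{
    // Returns a new list ordered alphabetically, ignoring case.
    public static List<string> Sorted(IEnumerable<string> results)
    {
        return results.OrderBy(r => r, StringComparer.OrdinalIgnoreCase).ToList();
    }

    // Returns the first entry in alphabetical order, or null when there are no results.
    public static string TopOrDefault(IEnumerable<string> results)
    {
        return Sorted(results).FirstOrDefault();
    }
}
```

These could be called on the list returned by UnscrambleWord, for example `ResultHelpers.TopOrDefault(results)`.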

History

  • 11th April, 2021: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Comments and Discussions

 
Anagrams and suggestions in VB
HenkAlles, 18-Apr-21 23:54
There are two use cases for the code. One is solving anagrams and the other, as mentioned in the messages, is fixing misspelled search terms. For the latter, a ranked list of words is more helpful than an unsorted list. Using the Levenshtein distance to calculate a rank for each anagram returns a list of words with the most likely candidate (the one needing the fewest changes) first.

Imports System.IO
Imports System.Linq
Imports System.Math

''' <summary>
'''  Anagram and scrambled term alternatives generator. The anagrams and term alternatives are
'''  matched with the supplied dictionary. Typically the Unscramble method is used in search
'''  applications where alternative terms might yield better recall for the application.
'''  </summary>
''' <example>
'''   <code title="Example" description="" groupname="1" lang="VB.NET">
''' Dim unscrambler As New Anagram("words.txt")
'''
''' Console.WriteLine("Anagrams for 'asphalt'")
''' Dim newResults As List(Of String) = unscrambler.Anagrams("asphalt")
''' For Each str As String In newResults
'''     Console.WriteLine(str)
''' Next
'''
''' Console.WriteLine()
''' Console.WriteLine("Alternatives for 'saphalt'")
''' Dim newnewResults = unscrambler.Unscramble("saphalt")
''' For Each item In newnewResults
'''     Console.WriteLine(item.Term & vbTab & item.Rank)
''' Next
'''
''' Console.WriteLine()
''' Console.WriteLine("Alternatives for 'asphalt'")
''' Dim newnewnewResults = unscrambler.Unscramble("asphalt")
''' For Each item In newnewnewResults
'''     Console.WriteLine(item.Term & vbTab & item.Rank)
''' Next
'''
''' ' Produced output from an English list of words.
''' '
''' ' Anagrams for 'asphalt'
''' ' asphalt
''' ' spathal
''' ' taplash
'''
''' ' Alternatives for 'saphalt'
''' ' asphalt 0.714285714285714
''' ' spathal 0.571428571428571
''' ' taplash 0.428571428571429
'''
''' ' Alternatives for 'asphalt'
''' ' asphalt 1
''' ' spathal 0.428571428571429
''' ' taplash 0.285714285714286</code>
''' </example>
Public Class Anagram
    Private terms As Dictionary(Of String, List(Of String))
    Private termFileName As String
    Private termEnum As IEnumerable(Of String)

    ''' <summary>
    '''  Initializes a new instance of the <see cref="Anagram" /> class.
    '''  </summary>
    ''' <param name="FileName">
    '''  The CSV UTF-8 encoded term file where each term is on a separate line.
    '''  </param>
    ''' <exception caption="" cref="ArgumentNullException">File name cannot be null.</exception>
    ''' <exception caption="" cref="IOException">File doesn't exist.</exception>
    ''' <example>
    '''   <code title="Example" description="" lang="VB.NET">
    ''' Dim unscrambler As New Anagram("words.txt")
    '''
    ''' Console.WriteLine("Anagrams for 'asphalt'")
    ''' Dim newResults As List(Of String) = unscrambler.Anagrams("asphalt")
    ''' For Each str As String In newResults
    '''     Console.WriteLine(str)
    ''' Next
    '''
    ''' Console.WriteLine()
    ''' Console.WriteLine("Alternatives for 'saphalt'")
    ''' Dim newnewResults = unscrambler.Unscramble("saphalt")
    ''' For Each item In newnewResults
    '''     Console.WriteLine(item.Term & vbTab & item.Rank)
    ''' Next</code>
    ''' </example>
    Public Sub New(FileName As String)
        If String.IsNullOrEmpty(FileName) Then Throw New ArgumentNullException(NameOf(FileName), "File name cannot be null.")
        If Not File.Exists(FileName) Then Throw New IOException("File doesn't exist.")
        Me.termFileName = FileName
    End Sub

    ''' <summary>
    ''' Initializes a new instance of the <see cref="Anagram"/> class.
    ''' </summary>
    ''' <param name="TermList">List of terms.</param>
    ''' <exception cref="ArgumentNullException">TermList enumerator cannot be null.</exception>
    Public Sub New(TermList As IEnumerable(Of String))
        If TermList Is Nothing Then Throw New ArgumentNullException(NameOf(TermList), "TermList enumerator cannot be null.")
        termEnum = TermList
    End Sub

    ''' <summary>
    ''' Generates anagrams of the specified word. Only terms that occur in the supplied
    ''' dictionary are returned.
    ''' </summary>
    ''' <param name="Word">The word or term to generate anagram(s) for.</param>
    ''' <returns>IEnumerable(Of System.String).</returns>
    ''' <exception cref="ArgumentException">
    ''' Word parameter cannot be null, empty or just whitespace.
    ''' </exception>
    Public Function Anagrams(ByVal Word As String) As IEnumerable(Of String)
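        If String.IsNullOrWhiteSpace(Word) Then Throw New ArgumentException("Word parameter cannot be null, empty or just whitespace.")
        If terms Is Nothing Then LoadTermDictionary()
        ' Return an empty list instead of throwing when the dictionary holds no anagram
        Dim results As List(Of String) = Nothing
        If Not terms.TryGetValue(SortByLetter(Word.ToUpper()), results) Then results = New List(Of String)()
        Return results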
    End Function

    ''' <summary>
    ''' Unscrambles the specified word. Common typos in queries are based on the transposition
    ''' of 2 or more letters. For example, 'saphalt' might be entered instead of 'asphalt'. The
    ''' unscramble method returns a ranked list of alternatives based on the supplied
    ''' dictionary. The ranking is on a scale of [1.0 .. 0.0] where higher scores denote a
    ''' better (closer) alternative for the supplied term.
    ''' </summary>
    ''' <param name="Word">The word.</param>
    ''' <returns>IEnumerable(Of System.ValueTuple(Of System.String, System.Double)).</returns>
    ''' <exception cref="ArgumentException">
    ''' Word parameter cannot be null, empty or just whitespace.
    ''' </exception>
    Public Function Unscramble(Word As String) As IEnumerable(Of (Term As String, Rank As Double))
        If String.IsNullOrWhiteSpace(Word) Then Throw New ArgumentException("Word parameter cannot be null, empty or just whitespace.")
        If terms Is Nothing Then LoadTermDictionary()
        Word = Word.ToUpper()
        Dim results As New List(Of (Term As String, Rank As Double))
        Dim candidates As List(Of String) = Nothing
        ' No dictionary entry uses these letters: return the empty list
        If Not terms.TryGetValue(SortByLetter(Word), candidates) Then Return results
        For Each term In candidates
            results.Add((term, 1 - (LevenshteinDistance(term.ToUpper(), Word) / term.Length())))
        Next
        Return results.OrderByDescending(Function(x) x.Rank)
    End Function

    Private Sub LoadTermDictionary()
        terms = New Dictionary(Of String, List(Of String))()
        If termFileName IsNot Nothing Then
            ' File.ReadLines handles both LF and CRLF line endings
            For Each term As String In File.ReadLines(termFileName)
                If term.Trim().Length > 0 Then AddTerm(term.Trim())
            Next
        ElseIf termEnum IsNot Nothing Then
            For Each term As String In termEnum
                AddTerm(term)
            Next
        Else
            Throw New InvalidOperationException("Cannot load dictionary terms.")
        End If
    End Sub

    Private Sub AddTerm(Term As String)
        Dim key As String = SortByLetter(Term.ToUpper())
        If terms.ContainsKey(key) Then
            terms(key).Add(Term)
        Else
            Dim tl = New List(Of String)() From {Term}
            terms(key) = tl
        End If
    End Sub

    Private Shared Function SortByLetter(ByVal Term As String) As String
        Return String.Concat(Term.OrderBy(Function(c) c))
    End Function

    ''' <summary>
    ''' Calculates the Levenshtein edit distance between strings.
    ''' </summary>
    ''' <param name="Text1">First string</param>
    ''' <param name="Text2">Second string</param>
    ''' <returns>Number of edits necessary to get from Text1 to Text2</returns>
    Private Shared Function LevenshteinDistance(ByVal Text1 As String, ByVal Text2 As String) As Integer
        Const INSERT_PENALTY As Integer = 1
        Const DELETE_PENALTY As Integer = 1
        Const SUBSTITUTE_PENALTY As Integer = 1

        Dim diffs As Integer(,) = New Integer(Text1.Length + 1, Text2.Length + 1) {}
        Dim cost As Integer

        For i As Integer = 0 To Text1.Length
            diffs(i, 0) = i
        Next

        For i As Integer = 0 To Text2.Length
            diffs(0, i) = i
        Next

        For i As Integer = 1 To Text1.Length
            For j As Integer = 1 To Text2.Length
                If (Text1.Chars(i - 1) = Text2.Chars(j - 1)) Then
                    cost = 0
                Else
                    cost = SUBSTITUTE_PENALTY
                End If
                diffs(i, j) = Min(Min(diffs(i - 1, j) + INSERT_PENALTY, diffs(i, j - 1) + DELETE_PENALTY), diffs(i - 1, j - 1) + cost)
            Next
        Next
        Return diffs(Text1.Length, Text2.Length)
    End Function
End Class

