Posted 13 Dec 2018
Licenced CPOL

Mr. Crossworder - Create Crosswords in Seconds!

, 6 Jan 2019
Crossword creator - with a touch of Unicode Logic!

Version 2.0:

 

Version 1.0:

Introduction

This is a crossword puzzle creator coded in C#.NET with .NET Framework 4.5.2. It is also extended to support crossword creation with Unicode letters. Different human languages occupy different Unicode blocks and combine their letters differently, hence the code would differ from one Unicode language to another. However, this project gives an idea of how to extend the logic to accommodate different human languages.

Background

Necessity is the mother of invention. While I was trying to download crossword puzzles for my son, the idea struck me: why not write code for it? I already had a similar design in another project, which I could re-use for this slightly different requirement. That’s how it started.

How It Works

1) At the very beginning it loads the regular (English) words and clues automatically.

2) If the user is not satisfied with the arrangement of the words, s/he can click the 'Reshuffle Board' menu item. This can be done as many times as needed. A better arrangement is one with more successfully placed words; the count is displayed in the bottom-right status label (e.g.: 6 failed case(s), 6 isolated case(s); remaining 38 words will be on the crossword).

3) The user can select a word in the listview. The corresponding word will be highlighted in the grid.

4) If the user is not happy with a word and wants to pick another random word from the dictionary, s/he needs to select the word in the listview and press ENTER.

5) If the user wants to modify a word and meaning (clue), then s/he needs to double-click on the word. A small dialog will appear that will facilitate changing the word.

6) After the user is satisfied, s/he clicks on the menu item 'Create Crossword'. The actual crossword board will be displayed.

7) Click on File->Save Crossword on this board. The board (bmp image), clues (text file) and answers (text file) will be saved in the 'Crosswords' folder of the current executable path. These files will be suffixed with the current date-time stamp.

8) If the user wants to create Bangla unicode crosswords, then s/he clicks on the 'Load Bangla Unicode' menu item of the main board.

9) If the JSON dictionary has been tampered with and is not in the correct format, an error message is displayed.

10) The necessary configurations are in the app.config file.

Logic

The logic uses a JSON dictionary whose key-value pairs are word-clue pairs. For example, given the following JSON entry, the meaning is used as the clue and the word as the answer.

{
    "BUS": "A public transportation used to carry people from place to place"
}

The word “BUS” will be placed on the grid either ACROSS or DOWN, and its meaning is the clue for finding the word. After all the words are placed on the board and the user is satisfied with the arrangement, s/he proceeds with the crossword generation.

For word generation, an open-source JSON dictionary is obtained from here. To reduce bandwidth, only a small portion of the dictionary (about 600 words) is added to the project. It is advisable to download the whole dictionary and use it; the behaviour is the same, only with more words at hand.

High Level logic:

  1. Randomly select (X, Y) axis and direction.
  2. Try to place the word on the board.
    1. If there are not enough sparse words on the board, then find an isolated axis on the board and place it there.
    2. Otherwise, if there are enough sparse words on the board, make sure the current word crosses existing word(s) on the board. During this phase, if the number of placement attempts reaches the maximum count, abandon the word and proceed with the next word.

The explanation for (2a) is that the first few words are placed as disjoint words. This makes sure that the words are scattered all over the board.

The explanation for (2b) is that all the remaining words should cross other existing word(s) on the board. A word may fail to find a suitable place even after many attempts. In such cases, the word is marked as failed once the threshold is reached.

Improved High Level logic:

Rather than randomly selecting the starting (X, Y) of a word, a second logic is applied which is more efficient. The second logic checks for each letter of the word if there is another word on the board that contains the letter.

For example, if (CART) is to be placed, then it checks for any existing word on the board that contains ‘C’ or ‘A’ or ‘R’ or ‘T’. E.g., there might be such words like CAR, ATTEST, ASTEROID on the board.

Pseudocode for this logic is the following:

For each letter in the current word: (E.g.: ‘C’ in CAT)

  1. Take the letter and look for words on the board that contain that letter. (E.g.: COW, ARC, SCATTER).
  2. Check if the word can be placed there, crossing at that letter:
    1. If a placement is possible, then place the word (CAT) there and proceed with the next word.
    2. If a placement is not possible (the word failed to cross any existing word on the board), then move to the next letter (e.g. ‘A’ in CAT) and similarly look for words on the board that contain ‘A’ (e.g.: CAR, ASTEROID, PASCAL etc.); repeat from step 2.

However, the second logic is applied to the Unicode section only. It is left as an exercise to the reader to apply it to the regular English alphabet.
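The letter-matching search described above can be sketched for regular words as follows. This is only an illustrative sketch, not the project's code: the `Candidate` type and the `FindCandidates` method are hypothetical names, and the board is assumed to be a `char[,]` indexed as `[x, y]` with `'\0'` for blank cells. Every candidate returned still has to pass the legitimate-placement rules before the word is actually placed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type: a possible start cell and direction for a word.
public struct Candidate
{
    public int X, Y;
    public bool Across;   // true = ACROSS, false = DOWN
}

public static class CrossingSketch
{
    // Lists every start cell from which 'word' would cross a matching letter already on the board.
    public static List<Candidate> FindCandidates(char[,] matrix, string word)
    {
        var result = new List<Candidate>();
        int width = matrix.GetLength(0), height = matrix.GetLength(1);

        for (int i = 0; i < word.Length; i++)           // For each letter of the current word...
            for (int x = 0; x < width; x++)
                for (int y = 0; y < height; y++)
                {
                    if (matrix[x, y] != word[i]) continue;  // ...look for the same letter on the board.

                    // ACROSS: letter i lands on (x, y), so the word starts i cells to the left.
                    if (x - i >= 0 && x - i + word.Length <= width)
                        result.Add(new Candidate { X = x - i, Y = y, Across = true });

                    // DOWN: letter i lands on (x, y), so the word starts i cells above.
                    if (y - i >= 0 && y - i + word.Length <= height)
                        result.Add(new Candidate { X = x, Y = y - i, Across = false });
                }
        return result;
    }
}
```

The sketch only filters out candidates that fall outside the grid; a candidate that runs parallel to the word it touches is rejected later by the placement checks.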

Legitimate Placement:

The logic for a valid placement is as follows:

  1. First check if the word (e.g. CAT) can be placed on the board: if it crosses another word (e.g. HAT), then the letter at the crossing (e.g. ‘A’) must be the same as the letter already on the board.

          

  2. If a word is to be placed ACROSS, then:
    1. Under no circumstances can the word have any other letter immediately before or after it. E.g.: if CART is to be placed ACROSS, then the cells before and after it should be blank; as TRAIN and STOP are already on the board, CART cannot be placed here.

    2. If there is any letter in any cell above the row of the word, then the word that letter belongs to (which is already on the board) cannot stop at the row above, but can only cross through the current word. For example, if CART is the current word, then it cannot be placed below HAT, but can be placed along MART, ACTOR, TRIM, ALONG.

    3. Similarly, if there is any letter in any cell below the row of the word, then the word that letter belongs to (which is already on the board) cannot start at the row below, but can only cross through the current word. For example, if CAT is the current word, then it cannot be placed above HAT, but can be placed along MART, ACTOR, TRIM, ALONG.

            

 

  3. If a word is to be placed DOWN, then:
    1. Under no circumstances can the word have any other letter immediately above or below it. E.g.: if CAT is to be placed DOWN, then the cells at the top and bottom should be blank.

    2. If there is any letter in any cell to the left of the word, then the word that letter belongs to (which is already on the board) cannot stop at the column before, but can only cross through the current word. For example, if CAT is the current word, then it cannot be placed directly after HAT, but can be placed along MANGO, ARC, STAY, THREAD.

    3. Similarly, if there is any letter in any cell to the right of the word, then the word that letter belongs to (which is already on the board) cannot start at the column to the right, but can only cross through the current word. For example, if CAT is the current word, then it cannot be placed before HAT, but can be placed along MANGO, TRAIN, SCOOP, STAY, THREAD, SCANT.

             

Project Structure:

The project has two main forms, one auxiliary form, 6 class files. The purpose of each element is as follows:

  1. Form – MainBoard: This is the main form. Its activities are:
    1. Load JSON dictionary into a collection (e.g.: about 86,000 words).
    2. Randomly load a certain amount of words and meanings (e.g.: 50).
    3. Populate the listview so the user can see the words and meanings.
    4. Call the GameEngine class to utilize the placement logic and populate the word matrix.
    5. Draw grids (horizontal, vertical lines).
    6. Map the matrix to individual cells.
    7. Update legends (status labels).
    8. Update the listview with different colours to represent failed words, isolated words, and words with lengthy clues.
    9. Interact with different menu selections:
      1. Load English Words – load English dictionary of words.
      2. Load Bangla Unicode – load Bangla Unicode dictionary of words.
      3. Reshuffle board – try a different assembly of the words.
      4. Create Crossword – display the ‘FinalCrosswordBoard’.
      5. About – Display the ‘About’ box.
    10. Enable the user to highlight a word on the board when the word is selected in the listview.
    11. Enable the user to change an individual word by selecting it on the listview and pressing ENTER.
    12. Enable the user to tweak (change) an individual word by double-clicking on it. This displays the ‘EditWord’ form.
  2. Form – EditWord: Allows the user to change a word and meaning (clue).
  3. Form –  FinalCrosswordBoard: This is the crossword form. Its activities are:
    1. Arrange the clues in the ACROSS and DOWN textboxes. Apply logic for proper numbering.
    2. Draw grids (horizontal, vertical lines).
    3. Fill-in blank cells (cells in matrix with NULLs) with grey colour.
    4. Place indices accordingly in individual white boxes where the word would appear.
    5. Interact with different menu selections: Save the crossword.
  4. Interface – IDetails, ICompositeUnicode: The interfaces containing the basic signature of the word details info – word, meaning, axes, direction, failing flag, overlapping flag, isolation flag, output sequence. The 'ICompositeUnicode' has one extra list to hold the composite unicode characters.
  5. Class – DetailsAndAxes: contains two classes (structural bodies) – one for regular words, the other for Unicode. The Unicode one has an extra element ‘CompositeUnicodeLetters’ for individual composite elements.
  6. Class – Globals: For global and static variables.
  7. Class – BanglaUnicodeParser: For parsing Bangla Unicode characters. Input: Whole word (e.g.: ভণ্ডুল), output list of strings (e.g.: individualLetters[0] = ভ, individualLetters[1] = ণ্ডু, individualLetters[2] = ল).
  8. Class – GameEngine: The class with placement logic:
    1. Method – PlaceWordsOnTheBoard(): Loops through all the words in the list and tries to find a placement for them on the board.
      1. GetRandomAxis() – generate random axes for the word.
      2. PlaceTheWord() – try to place the word on the board. Follow the high-level logic specified in 'high level logic' section.
        1. If it is a right-directed (ACROSS) word:
          1. See if there is no mismatching overlap on the board.
          2. See if the left cell is free.
          3. See if the right cell is free.
          4. See if the top cells along all the letters of this word are free; if not, see if this is a legitimate crossing.
          5. See if the bottom cells along all the letters of this word are free; if not, see if this is a legitimate crossing.
          6. If all these are passed, then this is a valid axis for the word; place it there.
        2. If it is a down-directed (DOWN) word:
          1. See if there is no mismatching overlap on the board.
          2. See if the top cell is free.
          3. See if the bottom cell is free.
          4. See if the left cells along all the letters of this word are free; if not, see if this is a legitimate crossing.
          5. See if the right cells along all the letters of this word are free; if not, see if this is a legitimate crossing.
          6. If all these are passed, then this is a valid axis for the word; place it there.
  9. Class – BanglaUnicodeGameEngine: Similar to the previous class. However, instead of generating random initial axes, it uses the better logic described in the 'Improved High Level logic' section. The only addition is this: each cell holds a compound Unicode letter, so how can a compound letter be accommodated in a cell? By adding a third dimension to the 2D matrix, where the third dimension holds the individual codes of the compound letter.

After the words are placed, they would look something like the following:

Touch of Unicode:

Most writing systems have their own block in Unicode. In this project, Bangla Unicode is used. This section sheds some light on how to extend the logic to other Unicode languages.

Unicode represents the letters of other languages as well as the regular English alphabet. However, coding for such languages is a little different, as a letter is often represented by a combination of several codes. For example, the word ‘ভণ্ডুল’ is represented as:

Each letter has its own code, and a letter on the crossword can be a single code (e.g.: 2477 for 'ভ') or a combination of codes (e.g.: ণ্ডু = 2467 'ণ' + 2509 '্' + 2465 'ড' + 2497 'ু').

Following is a simple example of how to output the word (ভণ্ডুল). This shows a message box displaying the word (ভণ্ডুল).

MessageBox.Show(((char)2477).ToString() +
                ((char)2467).ToString() +
                ((char)2509).ToString() +
                ((char)2465).ToString() +
                ((char)2497).ToString() +
                ((char)2482).ToString());

For regular English words, each letter stands by itself, so wherever individual letters are needed, the characters of the string can be used directly. For Unicode words, however, a list of strings is needed, where each string in the list represents one composite Unicode letter.

public List<string> CompositeUnicodeLetters { get; set; }

In other words, the word (ভণ্ডুল) needs to be segregated into three individual composite letters and put in the list. So, the list would look like:

CompositeUnicodeLetters[0] = "ভ"
CompositeUnicodeLetters[1] = "ণ্ডু"
CompositeUnicodeLetters[2] = "ল"

This is needed wherever the code has to walk along the length of the word. For comparison, following is a snippet for regular words that walks along the length of the word to find whether it is not isolated.

if (wrd.Y > 0)
    for (int x = wrd.X, y = wrd.Y - 1, i = 0; i < wrd.Word.Length; x++, i++)
        if (matrix[x, y] != '\0')
        {
            wrd.Isolated = false;
            return;
        }

This Word.Length cannot be used as such for Unicode. For example, the length of the word (ভণ্ডুল) would be 6, as it consists of 6 Unicode codes (C# chars), although it occupies only three cells on the board.
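The mismatch is easy to reproduce. The following minimal sketch (the `LengthSketch` class is a hypothetical name used only for illustration) builds the word from the six codes listed earlier and exposes its length.

```csharp
using System;

public static class LengthSketch
{
    // Builds the word from its six individual Unicode codes.
    public static string BuildWord()
    {
        return new string(new[]
        {
            (char)2477,   // ভ
            (char)2467,   // ণ
            (char)2509,   // ্  (hasanta, joins the next consonant)
            (char)2465,   // ড
            (char)2497,   // ু  (vowel sign)
            (char)2482    // ল
        });
    }
}
```

`LengthSketch.BuildWord().Length` evaluates to 6, whereas a crossword needs the value 3, one per cell.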

That is why the split is necessary: it segregates the word into distinct composite letters, so the loop walks along the correct length as follows:

if (wrd.Y > 0)
    for (int x = wrd.X, y = wrd.Y - 1, i = 0; i < wrd.CompositeUnicodeLetters.Count; x++, i++)
        if (matrix[x, y, 0] != '\0')
        {
            wrd.Isolated = false;
            return;
        }

Now the problem is that individual compound letters are needed for the crossword, where each compound letter goes in one cell. A word in a Unicode language can be read as it is. The difficulty lies in separating the individual compound letters, as there is no delimiter between successive letters. In English, by comparison, each letter stands on its own and no delimiter is needed. E.g.: each letter in CAT can be placed directly in its own cell on the board.

To do the same for Bangla or other Unicode languages, logic is needed to parse individual compound letters. The parsing logic is different for different Unicode languages. Further, a compound letter has no fixed length. For example, the letter (ন্দ্রি) in the word (চন্দ্রিমা) alone requires six individual Unicode codes.
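To give a feel for what such a parser has to do, here is a deliberately simplified sketch. It is not the logic of the project's 'BanglaUnicodeParser.cs'; the `CompoundLetterSketch` class and its rule are assumptions made for illustration. The rule is: a code stays in the current compound letter if it is a combining mark (vowel sign, hasanta, etc.) or if the code before it is the hasanta (U+09CD); otherwise it starts a new compound letter. Special cases of real Bangla text (for example zero-width joiners) are ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class CompoundLetterSketch
{
    private const char Hasanta = '\u09CD';   // Bangla virama; it links consonants into a conjunct.

    // Splits a word into compound letters using the simplified rule described above.
    public static List<string> Split(string word)
    {
        var letters = new List<string>();
        var current = new StringBuilder();

        for (int i = 0; i < word.Length; i++)
        {
            char c = word[i];
            UnicodeCategory cat = CharUnicodeInfo.GetUnicodeCategory(c);
            bool isMark = cat == UnicodeCategory.NonSpacingMark ||
                          cat == UnicodeCategory.SpacingCombiningMark;
            bool joinedByHasanta = i > 0 && word[i - 1] == Hasanta;

            // A base letter that is not glued to the previous code starts a new compound letter.
            if (current.Length > 0 && !isMark && !joinedByHasanta)
            {
                letters.Add(current.ToString());
                current.Clear();
            }
            current.Append(c);
        }

        if (current.Length > 0) letters.Add(current.ToString());
        return letters;
    }
}
```

With this rule, ভণ্ডুল splits into ভ, ণ্ডু and ল, and চন্দ্রিমা splits into চ, ন্দ্রি and মা, where the middle letter keeps all six of its codes together.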

So, there is no hard and fast rule for parsing individual compound Unicode letters. The logic developed for parsing Bangla Unicode letters is available in the file ‘BanglaUnicodeParser.cs’ of the project. As mentioned, the segregation logic differs from language to language and requires language-specific expertise. Hence, each Unicode language needs its own parser, as the semantics and structure of the languages are completely different from each other. The Bangla Unicode crossword would look something like the following:

Program Flow:

Reading From File:

Newtonsoft.Json is used to parse the JSON file and put the words in a collection:

using (StreamReader reader = new StreamReader(fileName))
    jsonWords = reader.ReadToEnd();
JObject obj = (JObject)JsonConvert.DeserializeObject(jsonWords);
wordsAndMeaning = obj.ToObject<Dictionary<string, string>>();

Take A Snapshot in the Collection:

After that, a snapshot of some words is put in a list. This is the list of words that will be placed in the crossword. Spaces and hyphens are stripped from the words, and no duplicates are allowed.
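A snapshot step of this kind could look like the sketch below. The `SnapshotSketch` class, the `TakeSnapshot` method and its parameters are hypothetical and not taken from the project; converting the words to upper case is also an assumption, made so that duplicates differing only in case are caught.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotSketch
{
    // Picks up to 'count' random words, strips spaces and hyphens, and skips duplicates.
    public static List<KeyValuePair<string, string>> TakeSnapshot(
        IDictionary<string, string> wordsAndMeaning, int count, Random rng)
    {
        List<string> keys = wordsAndMeaning.Keys.ToList();

        // Fisher-Yates shuffle so that the words are visited in random order.
        for (int i = keys.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            string tmp = keys[i]; keys[i] = keys[j]; keys[j] = tmp;
        }

        var seen = new HashSet<string>();
        var snapshot = new List<KeyValuePair<string, string>>();
        foreach (string key in keys)
        {
            if (snapshot.Count >= count) break;

            // Assumption: words are stored in upper case on the board.
            string cleaned = key.Replace(" ", "").Replace("-", "").ToUpperInvariant();
            if (cleaned.Length == 0 || !seen.Add(cleaned)) continue;   // Skip empties and duplicates.

            snapshot.Add(new KeyValuePair<string, string>(cleaned, wordsAndMeaning[key]));
        }
        return snapshot;
    }
}
```

Passing the `Random` instance in as a parameter keeps the method repeatable when a fixed seed is used, which helps when testing.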

Populate the Listview with the Words in the Snapshot:

After obtaining a snapshot, the words are put in the listview for the user to look at. Column widths are maintained dynamically by a scale factor and the maximum word length in the listview. The user can change a word and its meaning by double-clicking on the word. Also, if the user wants to replace a word on the list with a new one, all s/he needs to do is select it and press ENTER, and another word is randomly selected from the collection.

Start the game engine:

Now it is time for the crucial logic to find proper placement of the words on the board. The logic is described in 'Logic' section of this article.

After the engine successfully runs, it exposes two public variables to be used by other forms:

  1. wordDetails: The list of word details that contain information about a word – the axes, direction, word, meaning, isolation flag, failure flag, and the sequence (that will be populated later in the crossword board).
  2. matrix: The character matrix that represents the letters on the board. In programming terms, this is a 2D char array.

Isolation of words is checked at the end of the engine’s primary activity. The name CROSSWORD implies that every WORD CROSSes another word. This project doesn’t conform to the orthodox view that all the words should be connected; that is left as an exercise to the reader. The board can therefore contain separate groups of connected words. However, a word is not allowed to be totally disjoint and standing on its own. Such words are flagged as isolated and are removed from the final crossword board.

Place the Words on the Board:

After returning from the game engine the main board starts painting the characters from the matrix to the game board. Now the user can select a word on the list and the main board will indicate where the word is on the board.

At this point the legends are updated with respective statuses. There are three status labels – one for failed words, one for isolated words, and one for long-meaning words. They are updated accordingly.

Generating the Crossword:

After the user is satisfied with the assembly, s/he opts for creating the crossword. The current word list, the letter matrix, and the word details are sent to the constructor of the form.

Maintaining correct sequence of words is a challenge here as the main board has a single list of words whereas now it is time to separate them into two groups – ACROSS and DOWN.

First, a clone of the original word details collection is taken. The words that share the same starting axes (one ACROSS word and one DOWN word starting in the same cell) are handled first and placed in both the ACROSS and DOWN strings. When these words are done, the rest of the words are placed in the ACROSS or DOWN string according to their direction. After all the words are taken care of, the clone is copied back to the original collection. The textboxes are also populated with the respective clues.
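For comparison, the conventional way of numbering a crossword can be sketched as below. This is not the project's sequencing code: the `PlacedWord` type and the `AssignNumbers` method are hypothetical, and the sketch simply numbers the start cells in row-major order, giving an ACROSS word and a DOWN word that start in the same cell the same number.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the word details kept by the engine.
public class PlacedWord
{
    public string Word;
    public int X, Y;       // Starting cell of the word.
    public bool Across;    // true = ACROSS, false = DOWN
    public int Number;     // Clue number, filled in by AssignNumbers.
}

public static class NumberingSketch
{
    // Numbers the words top to bottom, left to right; words sharing a start cell share a number.
    public static void AssignNumbers(List<PlacedWord> words)
    {
        int next = 0;
        int lastX = -1, lastY = -1;

        foreach (PlacedWord w in words.OrderBy(p => p.Y).ThenBy(p => p.X))
        {
            if (w.X != lastX || w.Y != lastY)   // A new start cell gets the next number.
            {
                next++;
                lastX = w.X;
                lastY = w.Y;
            }
            w.Number = next;
        }
    }
}
```

After numbering, the ACROSS and DOWN clue lists can be built by filtering on `Across` and sorting by `Number`.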

After the clues are parsed successfully, it is time to place the numbers on the board. The same line-drawing functionality is used, only this time numbers are placed in the cells instead of letters. After the numbers are placed, the only thing left is to fill the other cells with a block colour so that the cells of the CROSSWORD stand out.

Finally, when the user selects File->Save Crossword, the crossword is saved as an image in the 'Crosswords' folder under the executable path. Along with the image, the answers and the clues are written to separate text files. For simplicity, the user is not asked for a filename; the application simply appends a date-time stamp to distinguish the files from those of later crosswords.
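Building the time-stamped file names might look like the following sketch. The `SavePathSketch` class, the "Board_", "Clues_" and "Answers_" prefixes and the stamp format are all assumptions for illustration; the project may name its files differently.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class SavePathSketch
{
    // Returns the paths of the board image, the clues file and the answers file.
    public static string[] BuildPaths(string exeFolder, DateTime now)
    {
        string folder = Path.Combine(exeFolder, "Crosswords");

        // Invariant culture keeps the stamp identical regardless of regional settings.
        string stamp = now.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture);

        return new[]
        {
            Path.Combine(folder, "Board_" + stamp + ".bmp"),
            Path.Combine(folder, "Clues_" + stamp + ".txt"),
            Path.Combine(folder, "Answers_" + stamp + ".txt")
        };
    }
}
```

In the application, `exeFolder` would typically be the folder of the running executable and `now` would be `DateTime.Now`; the folder itself can be created with `Directory.CreateDirectory` before saving.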

A Glimpse of the Code:

Interface: IDetails:

This contains the basic signature of the details of the words – axes, direction, max attempts, fail flag and isolation flag.

The regular words class implements this interface. Basically, the regular words have exactly the same properties – no more or less.

Interface: ICompositeUnicode:

This contains the basic signature for an extra field required for holding split composite Unicode characters. The Unicode words class implements this as well as the IDetails interface.

Reading from file:

Words are read from file and parsed into a dictionary object as key-value pairs. This is done in the following code:

using (StreamReader reader = new StreamReader(fileName))
    jsonWords = reader.ReadToEnd();
JObject obj = (JObject)JsonConvert.DeserializeObject(jsonWords);
wordsAndMeaning = obj.ToObject<Dictionary<string, string>>();

Placement Logic:

There can be two orientations for the words - ACROSS (Direction.Right) and DOWN (Direction.Down). First it checks if the word can be placed on the board. For each letter of the word, it checks if the corresponding cell in the matrix (i.e., the corresponding cell on the board) is blank ('\0') or not. If it is not blank (not '\0'), then the current letter must be the same as the letter that is already on the board. This is done in the following code:

for (int i = 0, xx = x; i < word.Length; i++, xx++) // First we check if the word can be placed in the array. For this it needs blanks there or the same letter (of another word) in the cell.
{
    if (xx >= Globals.gridCellCount) return false;  // Falling outside the grid. Hence placement unavailable.
    if (matrix[xx, y] != '\0')
    {
        if (matrix[xx, y] != word[i])   // If there is an overlap, then we see if the characters match. If matches, then it can still go there.
        {
            placeAvailable = false;
            break;
        }
        else overlapped = true;
    }
}

A similar check is done for the DOWN words, except that the walk goes downwards (i.e., x remains constant, y changes).
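A DOWN version of that check could be sketched as follows. This is a re-statement for illustration rather than the project's code: the `DownCheckSketch` class and the `CanOverlayDown` method are hypothetical, and the grid height is read from the matrix instead of `Globals.gridCellCount` so that the sketch is self-contained.

```csharp
using System;

public static class DownCheckSketch
{
    // Returns true if 'word' fits DOWN from (x, y): every cell is blank or already holds the same letter.
    // 'overlapped' reports whether at least one existing letter was crossed.
    public static bool CanOverlayDown(char[,] matrix, string word, int x, int y, out bool overlapped)
    {
        overlapped = false;
        int rows = matrix.GetLength(1);

        for (int i = 0, yy = y; i < word.Length; i++, yy++)   // x stays constant, y changes.
        {
            if (yy >= rows) return false;                     // Falling outside the grid.
            if (matrix[x, yy] == '\0') continue;              // Blank cell: fine.
            if (matrix[x, yy] != word[i]) return false;       // Mismatching overlap.
            overlapped = true;                                // Same letter: a crossing.
        }
        return true;
    }
}
```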

For Unicode, one additional step is needed in this logic. This is because, for Unicode, a cell no longer holds a single character but several Unicode codes that combine into a composite letter. Also, for Unicode we have a 3D matrix. Hence the line:

if (matrix[xx, y] != '\0')

changes to:

if (matrix[xx, y, 0] != '\0')

And the same letter check for a non-blank cell changes from:

if (matrix[xx, y] != word[i])
{
    placeAvailable = false;
    break;
}

to:

string compositeUnicodeLetter = Globals.GetCompositeLetterFromTheMatrix(xx, y, matrix);
if (compositeUnicodeLetter != unicodeLetters[i])
{
    placeAvailable = false;
    break;
}
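The body of 'Globals.GetCompositeLetterFromTheMatrix' is not shown in this article. A helper of that kind could work roughly as in the sketch below, which assumes that the third dimension of the matrix holds the codes of one composite letter, padded with '\0'. The `CompositeCellSketch` class and both method names are hypothetical.

```csharp
using System;
using System.Text;

public static class CompositeCellSketch
{
    // Reads the codes stored along the third dimension of cell (x, y) until a blank is met.
    public static string ReadCompositeLetter(char[,,] matrix, int x, int y)
    {
        var sb = new StringBuilder();
        int depth = matrix.GetLength(2);
        for (int z = 0; z < depth && matrix[x, y, z] != '\0'; z++)
            sb.Append(matrix[x, y, z]);
        return sb.ToString();   // Empty string means the cell is blank.
    }

    // Writes a composite letter into cell (x, y), clearing any leftover codes.
    // Assumption: the letter is no longer than the depth of the matrix.
    public static void WriteCompositeLetter(char[,,] matrix, int x, int y, string letter)
    {
        for (int z = 0; z < matrix.GetLength(2); z++)
            matrix[x, y, z] = z < letter.Length ? letter[z] : '\0';
    }
}
```

Because the first code of a composite letter is never '\0', testing `matrix[xx, y, 0]` alone is enough to tell whether a cell is blank.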

After the initial blank-cell check and same-letter check are satisfied, the 'overlapped' flag is used along with the maximum non-overlapping word count threshold to determine whether the word should stand alone or should overlap. As a reminder, the first few words should not overlap, so that the words are spread sparsely across the board, whereas the rest of the words must overlap with existing word(s) on the board. These are checked in the following part:

if (currentWordCount < Globals.MAX_NON_OVERLAPPING_WORDS_THRESHOLD && overlapped)
    return false;

else if (currentWordCount >= Globals.MAX_NON_OVERLAPPING_WORDS_THRESHOLD && !overlapped)
    return false;

After these conditions are satisfied, it is time to check whether the word is really placeable on the current axes in the given direction.

This part discusses the logic for ACROSS words, which uses four flags: leftFree, topFree, bottomFree and rightMostFree.

There are two types of checks. The first is that there cannot be any letter immediately before the beginning or after the end of an ACROSS word. The leftFree and rightMostFree flags confirm this through the methods they call. For example, the leftFree flag is determined by the method 'LeftCellFreeForRightDirectedWord' which has the following code:

if (x == 0) return true;
if (x - 1 >= 0)
    return matrix[x - 1, y] == '\0';
return false;

Here, (x, y) are the axes where the word is to be placed ACROSS. Now if it is the leftmost column (x = 0), then there is no need to check if the left cell is blank or not, as there is no left cell. Otherwise, it checks if the left cell of x is blank or not.

Similarly, the check for the freeness of the rightmost cell of this ACROSS word is determined by the following code in the method 'RightMostCellFreeForRightDirectedWord':

if (x + word.Length == Globals.gridCellCount) return true;
if (x + word.Length < Globals.gridCellCount)
    return matrix[x + word.Length, y] == '\0';
return false;

First it checks if the last letter of the word reaches the rightmost column of the matrix. If it reaches the right-most cell, then there is no need to further check the rightmost letter, as there is no cell further right. Otherwise it checks if the next rightmost cell of the word is blank or not.

For an ACROSS word, the check for top and bottom cell freeness is much more complex. Let us see what is happening at the 'TopCellFreeForRightDirectedWord' method.

if (y == 0) return true;
bool isValid = true;
if (y - 1 >= 0)
{
    for (int i = 0; i < word.Length; x++, i++)
    {
        if (matrix[x, y - 1] != '\0')
            isValid = LegitimateOverlapOfAnExistingWord(x, y, word, Direction.Up);
        if (!isValid) break;
    }
}
return isValid;

First it checks if the word is to be placed ACROSS on the topmost cell of the matrix (y = 0). If that is the case, then there is no further top cell to check. Otherwise, for each letter of the word check if the top cell is blank or not (matrix[x, y - 1] != '\0'). If it is not blank, then check if the letter above is part of another word that must satisfy three conditions:

1) The letter belongs to an existing word on the board.

2) That other word on the board is not also ACROSS.

3) That letter above is not the last letter of the existing word on the board.

Now let's examine the Up case of the 'LegitimateOverlapOfAnExistingWord' method:

while (--y >= 0)
    if (matrix[x, y] == '\0') break;                                        // First walk upwards until you reach the beginning of the word that is already on the board.
++y;

for (int i = 0; y < Globals.gridCellCount && i < Globals.MAX_WORD_LENGTH; y++, i++) // Now walk downwards until you reach the end of the word that is already on the board.
{
    if (matrix[x, y] == '\0') break;
    chars[i] = matrix[x, y];
}

str = new string(chars);
str = str.Trim('\0');
wordOnBoard = (RegularWordDetails)wordDetails.Find(a => a.Word == str);     // See if the characters form a valid word that is already on the board.
if (wordOnBoard == null) return false;                                      // If this is not a word on the board, then this must be some random characters, hence not a legitimate word, hence this is a wrong placement.
if (wordOnBoard.WordDirection == Direction.Right) return false;             // If the word on the board is parallel to the word to be placed, then this is also a wrong placement, as two words cannot be placed side by side in the same direction.
if (wordOnBoard.Y + wordOnBoard.Word.Length == originalY) return false;     // The word on the board ends right above the y-coordinate of the current word to place. Hence illegitimate.
return true;                                                                // Else, passed all validation checks for a legitimate overlap, hence return true.

The first WHILE loop travels upwards to find the beginning of the existing word on the board.

The FOR loop then traverses downwards from that starting point and coins a word in chars.

Then a string str is formulated from the chars array. It also truncates blanks ('\0').

Then it checks if the word is a legitimate existing word on the board (number 1 in the above-mentioned 3 conditions). If not, it returns false.

It checks if the word is also an ACROSS word or not. If it is ACROSS, then also the current word cannot be placed there (number 2 in the above-mentioned 3 conditions).

It checks if the existing word on the board ends just above the top cell of the current placement index y (number 3 in the above-mentioned 3 conditions).

If all the three conditions are satisfied, then this is a legitimate crossing overlap of the current word with an existing word.

Similar check is done to make sure if there are letters at the bottom cells of the ACROSS word, then together they formulate a valid crossing. This is accomplished in the 'BottomCellFreeForRightDirectedWord' method.

After the four flags are satisfied, the current word can be placed at the given axes (x, y) in the given direction. So it is placed in the word matrix, and its details are saved in the 'RegularWordDetails' object via the method 'SaveWordDetailsInCollection'. This is done in the following portion of the 'PlaceTheWord' method in the 'GameEngine' class.

for (int i = 0, j = x; i < word.Length; i++, j++)
    matrix[j, y] = word[i];
SaveWordDetailsInCollection(word, wordMeaning, x, y, direction, attempts, false);

Remember, for Unicode, we have one more dimension in the character matrix. For regular words we have a single letter to place in a cell, whereas for Unicode we need to place the composite letter (which consists of several Unicode codes). This is done in the following portion of the 'PlaceTheWord' method in the 'BanglaUnicodeGameEngine' class.

SaveWordDetailsInCollection(word, wordMeaning, x, y, direction, attempts, false);                        
for (int i = 0; i < unicodeLetters.Count; i++, x++)
{
    char[] atomElements = unicodeLetters[i].ToArray();
    int z = 0;
    foreach (char c in atomElements)
        matrix[x, y, z++] = c;
}               

Similar logic follows for the DOWN words, so it is not discussed here, to keep the article short.

Marking Isolated Words:

As a minimal requirement, no word should be isolated in the matrix, as every word should CROSS at least one other WORD. So at the end of placement, another check is done to set the Isolated flag of the 'RegularWordDetails' object. This is done in the 'CheckIfTheWordIsIsolatedAndFlagAccordingly' method. For an ACROSS word, it simply walks along the top and bottom cells of the word; if there is at least one letter in any top/bottom cell along the word, then the flag is false (as it would mean the word is not isolated).

The blank check for TOP cells is done in the following portion. First it checks if the Y axis of the current word is not the first row (if it is the first row, then there is no point checking the row above as there is no row above). Then it walks along the word from left to right (incrementing x), and checks for each top cell if it is blank or not. If at any point it finds a letter in the top cell, then it sets the flag to false and returns immediately.

if (wrd.Y > 0)                                                                  // If there is a row of cells above the right-directed word.
    for (int x = wrd.X, y = wrd.Y - 1, i = 0; i < wrd.Word.Length; x++, i++)    // Walk rightwards along the row above the word.
        if (matrix[x, y] != '\0')                                               // And see if any cell of that row holds a character.
        {                                                                       // Which would mean another word passes through; hence this word is not isolated.
            wrd.Isolated = false;
            return;
        }

Similarly, the blank check for the BOTTOM cells is done in the following portion. First it checks that the Y coordinate of the current word is not the last row (if it is the last row, there is no row below to check). Then it walks along the word from left to right (incrementing x) and checks whether each bottom cell is blank. If at any point it finds a letter in a bottom cell, it sets the flag to false and returns immediately.

if (wrd.Y < Globals.gridCellCount - 1)                                          // If there is a row of cells below the right-directed word.
    for (int x = wrd.X, y = wrd.Y + 1, i = 0; i < wrd.Word.Length; x++, i++)    // Walk rightwards along the row below the word.
        if (matrix[x, y] != '\0')                                               // And see if any cell of that row holds a character.
        {                                                                       // Which would mean another word passes through; hence this word is not isolated.
            wrd.Isolated = false;
            return;
        }

If both sweeps complete without returning, there was no letter in the cells above or below the word, so it is definitely isolated. It is flagged accordingly in the 'RegularWordDetails' object, and the word is erased (set to '\0') in the word matrix so that it is not rendered. This is done in the following portion:

if (!wrd.FailedMaxAttempts)
    wrd.Isolated = true;

if (wrd.WordDirection == Direction.Right)
    for (int i = 0, x = wrd.X, y = wrd.Y; i < wrd.Word.Length && i < Globals.gridCellCount; i++, x++)
        matrix[x, y] = '\0';

For Unicode, the logic is the same, with one more thing to keep in mind: the third dimension. This part is omitted to keep the article short and should be easy for the reader to work out.
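
As a rough illustration of what the third dimension changes, the sketch below shows a blank check and an erase for a three-dimensional matrix. The names are hypothetical, and it assumes that placement always fills slot 0 along z first, so a cell is blank exactly when that slot is '\0'. The project's actual code may test cells differently.

```csharp
public static class UnicodeCellSketch
{
    // A cell counts as blank when its first slot along z is '\0'
    // (assumption: placement always fills slot 0 first).
    public static bool IsBlank(char[,,] matrix, int x, int y)
    {
        return matrix[x, y, 0] == '\0';
    }

    // Erases an ACROSS word of the given length (counted in composite letters)
    // by clearing every z slot of each cell it occupies.
    public static void EraseAcross(char[,,] matrix, int x, int y, int letterCount)
    {
        int depth = matrix.GetLength(2);
        for (int i = 0; i < letterCount && x < matrix.GetLength(0); i++, x++)
            for (int z = 0; z < depth; z++)
                matrix[x, y, z] = '\0';
    }
}
```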

Some LINQs:

LINQ is used extensively in the project, for example to search for a key-value pair in a dictionary collection or to find an element in a list. The following LINQ query obtains the groups of words that start at the same coordinates:

var wordsStartingAtSameAxes = from j in detailsCopy
                              group j by new { j.X, j.Y } into d
                              where d.Count() > 1
                              select (d).ToList();
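
To try the grouping query in isolation, a minimal sketch follows. 'PlacedWord' is a hypothetical stand-in for the project's details type, keeping only the members the query needs. Grouping by an anonymous type works here because anonymous types compare by the values of their properties.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the project's word details type.
public class PlacedWord
{
    public string Word { get; set; }
    public int X { get; set; }
    public int Y { get; set; }
}

public static class GroupingSketch
{
    // Returns one list per starting coordinate that is shared by two or more words.
    public static List<List<PlacedWord>> WordsSharingStart(IEnumerable<PlacedWord> details)
    {
        var query = from j in details
                    group j by new { j.X, j.Y } into d
                    where d.Count() > 1
                    select d.ToList();
        return query.ToList();
    }
}
```

For example, given an ACROSS word and a DOWN word that both start at (1, 1) plus a third word elsewhere, the method returns a single group containing the first two.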

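LINQ is also used to clone an existing list. Note that this is a shallow copy: the new list is independent, but it holds references to the same detail objects: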

detailsCopy = new List<IDetails>(wordDetails.Select(x => x).ToList());
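
The effect of such a copy can be checked with the small sketch below. 'CloneSketch' and 'ShallowCopy' are hypothetical names; the body uses only the standard 'ToList' extension, which by itself already produces a new list.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CloneSketch
{
    // Creates a new list containing the same element references as the source.
    // Adding to or removing from the copy leaves the source untouched,
    // but changes made to an element itself are visible through both lists.
    public static List<T> ShallowCopy<T>(IEnumerable<T> source)
    {
        return source.ToList();
    }
}
```

This is enough when the goal is to remove entries from the copy while keeping the original list intact.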

Automatic Window Scaling and Resizing:

Automatic window resizing can be accomplished either in the load event or in the resize event. Different forms in the project use one or the other to demonstrate that either works.

Automatic window scaling is applied, which makes the application resolution-independent. The design-time resolution was 1680x1050. However, the higher the resolution, the better the print quality. The trick for automatic window scaling is beyond the scope of this article; please refer to here.

Checking Mix of Regular and Unicode:

Version 2.0 lets users enter and save their own words. It obviously doesn't make sense to mix regular and Unicode words. Normally the user won't do that, but the application still verifies it. This is checked in the 'GetEncoding' method of the 'CreateAndSaveOwnWords' class.

First it classifies each character code of the word as either regular or Unicode. For regular letters, the code must be between 65 and 255 inclusive. Hence, if the first code is regular, then all the codes of the remaining letters (and of all the other words) should be regular. Similarly, if the first code is Bangla Unicode (between 0x0980 and 0x09fe inclusive), then all the subsequent codes (again, across all words) should lie in that range. Note that for other Unicode scripts the range will be different, and coders need to change it according to the respective Unicode code charts.

WordTypes type = WordTypes.Unknown;
WordTypes prevType = WordTypes.Unknown;
foreach (KeyValuePair<string, string> kvp in wordAndClue)
{
    char[] ch = kvp.Key.ToCharArray();
    if (ch[0] >= 65 && ch[0] <= 255)
        prevType = WordTypes.Regular;
    else if (ch[0] >= 0x0980 && ch[0] <= 0x09fe)        // Refer to Bangla Unicode chart: http://www.unicode.org/charts/PDF/U0980.pdf, modify the code range for other unicode letters.
        prevType = WordTypes.Unicode;

    for (int i = 1; i < ch.Length; i++)
    {
        if (ch[i] >= 65 && ch[i] <= 255)
            type = WordTypes.Regular;
        else if (ch[i] >= 0x0980 && ch[i] <= 0x09fe)    // Refer to Bangla Unicode chart: http://www.unicode.org/charts/PDF/U0980.pdf, modify the code range for other unicode letters.
            type = WordTypes.Unicode;                   // Set the current letter's type (not prevType) so that the comparison below is meaningful.

        if (type != prevType) return WordTypes.Mix;
        prevType = type;
    }
}
return type;
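
The same check can also be written more compactly, as in the sketch below. This is an alternative illustration rather than the project's 'GetEncoding' method: the enum and method names are hypothetical, and it makes one extra assumption, namely that characters outside both ranges (digits or spaces, for instance) are simply ignored.

```csharp
using System.Collections.Generic;

// Hypothetical enum mirroring the categories described in the article.
public enum ScriptKind { Unknown, Regular, Unicode, Mix }

public static class EncodingSketch
{
    // Classifies a single char using the ranges given in the article.
    private static ScriptKind Classify(char c)
    {
        if (c >= 65 && c <= 255) return ScriptKind.Regular;
        if (c >= 0x0980 && c <= 0x09FE) return ScriptKind.Unicode;   // Bangla block; change for other scripts.
        return ScriptKind.Unknown;
    }

    // Returns Regular or Unicode if all classified chars agree, Mix on the first disagreement,
    // and Unknown if no char fell into either range.
    public static ScriptKind Detect(IEnumerable<string> words)
    {
        ScriptKind seen = ScriptKind.Unknown;
        foreach (string word in words)
            foreach (char c in word)
            {
                ScriptKind kind = Classify(c);
                if (kind == ScriptKind.Unknown) continue;   // Assumption: skip chars outside both ranges.
                if (seen == ScriptKind.Unknown) seen = kind;
                else if (seen != kind) return ScriptKind.Mix;
            }
        return seen;
    }
}
```

Because the remembered type is carried across words rather than reset for each one, a regular word followed by a Bangla word is reported as a mix.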

Points of Interest

Looking at the workflow, the sequence is as follows:

  1. The code loads a JSON word dictionary with around 86,000 words.
  2. Parses them in a collection.
  3. Picks random words from them.
  4. Places them in the matrix.
  5. Some of the words fail to find a place after 200,000 attempts; they are flagged as fails.
  6. Another sweep is performed to flag isolated words.
  7. Finally, the graphics renderer renders the matrix on the display.
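
Step 5 above, giving up after a maximum number of attempts, can be sketched as a simple bounded retry loop. The names below are hypothetical, and the sketch assumes that each attempt tries a random cell; the project's actual strategy for choosing positions may differ.

```csharp
using System;

public static class RetrySketch
{
    // Tries random cells until tryAt succeeds or maxAttempts is exhausted.
    // Returns the number of attempts used, or -1 if the word could not be placed.
    public static int TryPlace(Random rng, int gridSize, int maxAttempts, Func<int, int, bool> tryAt)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            int x = rng.Next(gridSize);   // Random column in [0, gridSize).
            int y = rng.Next(gridSize);   // Random row in [0, gridSize).
            if (tryAt(x, y)) return attempt;
        }
        return -1;                        // The caller would flag the word as failed.
    }
}
```

The 'tryAt' callback is where the four placement checks and the actual write into the matrix would go.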

All these activities are accomplished in the twinkling of an eye, thanks to the processors, compilers and, after all, technology.

As expected, the Unicode logic takes a little more time than the regular logic, as it deals with one more dimension.

Glitches

Please report any you find in the comments.

Limitations

There are some strict crossword rules like all the words on the board should be connected to each other; there should not be any group of words in isolation. Mr. Crossworder doesn’t conform to this rule, hence there might be isolated groups of words on the board.

Disclaimer

I am not a sexist; ladies should not loathe me for the title, LOL. It is just that I was listening to Steve Perry’s (Journey) ‘Trial by Fire’ and hit upon the line:

“Hello Mr. Moon,
Can I have some time with you?”

Just to mimic:

“Hello Mr. Crossworder,
Can I have some time with you?”

Future Works

Software is never at its peak; there is always room to improve. Further, this is just a prototype. A lot more can be done.

  1. The logic itself can be revised and optimized. In fact, university teachers could pose it to students as an optimization problem. There are scattered groups at the moment, and a better algorithm might bring them closer. Especially for Unicode languages, the words are observed to be a little more sparse than expected.
  2. The application can be extended as a web app to consume an online web dictionary. There are some online web dictionaries that expose the words and meanings through APIs.
  3. There can be a separate GUI so that the user can create his/her own preset of words and save it on the disk. The GUI should also facilitate loading those presets. (This is accomplished in the second release.)
  4. For Bangla Unicode, the indices of the clues, and the numbers on the board are still in English; I would leave that to the user as a practice to output them in Bangla.
  5. This is not coded according to the best design concepts. I focused more on the logic and on getting it going as an initial prototype. A lot of coding standards and best practices are out there which can be and should be implemented.
  6. The project is coded in a denormalized form: there is code that can be compacted. The purpose of such denormalization is to make it easy to understand what is going on. After that purpose is served, the codebase can be further compacted. For example, the checks for free cells to the left and to the right of a DOWN word are mostly similar and can be merged into one method with minor tweaks and parameters. But such compaction would deprive the reader of an understanding of the purpose. So it is left as it is, and the compaction is left as an exercise for the reader.
  7. It might sound too optimistic, but how about applying machine learning or AI algorithms to be more effective?
  8. The project worked up to 3rd dimension. How about adding a 4th dimension? (never mind, joking!)

Summary

This is a crossword creator based on a pre-defined set of dictionary words. It also experiments with a different human language (Bangla), which has its own Unicode range. Different languages have their own Unicode code charts, and each language differs from the others with regard to semantics and structure. However, this project gives an idea of how to extend the segregation logic to different human languages.

History

14 Dec 2018: First release.

7 Jan 2019: Second release.

Added menu for creating own word-clues, and loading previously saved word-clues JSON file.

There was a bug when the final crossword board was being created, as it removed the isolated and failed words from the list. This was fixed by working on a clone of the list. The change is in the method 'createCrosswordToolStripMenuItem_Click()' of the MainBoard.cs file.

Added 'How It Works' section in the article.

'A Glimpse of the Code' section comes with more explanations of the code.

Added more references.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Mehedi Shams
Software Developer
Bangladesh
A software developer mainly in .NET technologies and SQL Server. Love to code and learn.

Last Updated 6 Jan 2019
Article Copyright 2018 by Mehedi Shams