15,901,205 members
Articles / General Programming / Algorithms

# How to Solve Word Ladder Problem?

Rate me:
22 Nov 2020CPOL5 min read 17.4K   2   4
A look into graph based algorithm
In this post, we will take a look at what are graphs in the context of algorithms and how we apply them to solve such problems?

Sometime back, a colleague of mine asked me about the word ladder problem. She was looking for a change. I believe she stumbled across this while preparing for data structures and algorithms.

### Problem Statement

Typically, the puzzle shared is a flavor of what's given below:

Quote:

Find the smallest number of transformations needed to change an initial word to a target word of same length. In every transformation, change only one character and make sure word exists in the given dictionary.

#### Explanation

Assuming all these 4 letter words are there in the dictionary provided, it takes a minimum of four transitions to convert word from SAIL to RUIN, i.e.
SAIL -> MAIL -> MAIN -> RAIN -> RUIN

The intent here is to know about `Graph algorithm`. So, what are graphs in the context of algorithms and how do we apply them to solve such problems?

### Graph Data Structure

Graphs are flow structure that represent entities connection with each other. Visually, they are represented with the help of a `Node` (Vertex) & an `Edge` (Connector).

Quote:

A `tree` is an undirected graph in which any two nodes are connected by only one path. In it, each node (except the root node) comprises exactly one parent node.

The most common way to represent a graph is by using an `Adjacency matrix`. In it, Element `A[i][j]` is `1` if there is an edge from node `i` to node `j` or else, it is `0`. For example, adjacency matrix of the above unidirected graph is:

```  | 1 2 3 4
------------
1 | 0 1 0 1
2 | 1 0 1 0
3 | 0 1 0 1
4 | 1 0 1 0```

Another common way is via `Adjacency list` (List format of the data instead of a matrix).

### Related Algorithms

Graphs are applied in `search algorithms`. Traversing the nodes and edges in a defined order helps in optimizing search. There are two specific approaches to traverse graph:

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G for which there is a path from s. With this approach, it finds all the nodes that are at a distance k from s before it finds any nodes that are at a distance k+1.

Quote:

For easy visualization, think of it as, in a tree, finding all the child nodes for a parent node as first step. Post it, find all the grandchildren and hence forth.

#### Depth First Search (DFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to finding all the nodes in G traversed from s through its edges. With this approach, we go deep in graph connecting as many nodes in the graph as possible and branch where necessary.

Quote:

For easy visualization, think of it as, in a tree, finding all the family nodes for a parent node. With this, for a given node, we connect its children, grand children, grand grand children and so on before moving to next node of same level.

Thus, with `DFS` approach, we can have multiple deduced trees.

Quote:

Knight’s tour is a classic example that leverages Depth First Search algorithm.

#### Shortest Path First OR Dijkstra’s Algorithm (SPF)

Given a graph G and a starting node s, search the shortest path to reach node d. It uses a concept of weights. It’s an iterative algorithm similar to results of BFS.

Quote:

Many real world example fits in here, e.g. what would be shortest path from home to office.

With `BFS` (a simple queue), we visit one node at a time whereas in `SPF` (a priority queue), we visit a node at any level with lowest cost. In a sense, BFS follows Dijkstra's algorithm, a step at a time with all edge weights equal to 1. The process for exploring the graph is structurally the same in both cases. at times, BFS is preferred with equal weight graphs. This is because, operations on a priority queue are `O(log n)` compared to operations on a regular queue which is `O(1)`.

### Code

I will be using a breadth first graph algorithm here based on the problem need:

Python
```import collections
from collections import deque

class Solution(object):
# method that will help find the path
endWord, wordList):
"""
:type beginWord: str
:type endWord: str
:type wordList: Set[str]
:returntype: int
"""

# Queue for BFS
queue = deque()

# start by adding begin word
queue.append((beginWord, [beginWord]))

while queue:
# let's keep a watch at active queue
print('Current queue:',queue)

# get the current node and
# path how it came
node, path = queue.popleft()

# let's keep track of path length
# traversed so far
print('Current transformation count:',
len(path))

# find possible next set of
# child nodes, 1 diff
for next in self.next_nodes(node,
wordList) - set(path):
# traversing through all child nodes
# if any of the child matches,
# we are good
if next == endWord:
print('found endword at path:',
path)
return len(path)
else:
# keep record of next
# possible paths
queue.append((next,
path + [next]))
return 0

def next_nodes(self, word, word_list):
possiblenodes = set()

# all the words are of fixed length
wl_word_length = len(word)

# loop through all the words in
# the word list
for wl_word in word_list:
mismatch_count = 0

# find all the words that are
# only a letter different from
# current word those are the
# possible next child nodes
for i in range(wl_word_length):
if wl_word[i] != word[i]:
mismatch_count += 1
if mismatch_count == 1:
# only one alphabet different-yes

# lets see the set of next possible nodes
print('possible next nodes:',possiblenodes)
return possiblenodes

# Setup
beginWord = "SAIL"
endWord = "RUIN"
wordList = ["SAIL","RAIN","REST","BAIL","MAIL",
"MAIN","RUIN"]

# Call
print('Transformations needed: ',
endWord, wordList))

# Transformation expected == 4
# One possible shortes path with 4 transformation:
# SAIL -> MAIL -> MAIN -> RAIN -> RUIN```
Quote:

Used `deque` (doubly ended queue) of Python

`deque` helps with quicker append and pop operations from both the ends. It has `O(1)` time complexity for append and pop operations. In comparison, list provides it in `O(n)` time complexity.

A quick look at the code workflow to validate if all nodes at a particular distance were traversed first and then moved to the next level:

```Current queue: deque([('SAIL', ['SAIL'])])

Current transformation count: 1
possible next nodes: {'BAIL', 'MAIL'}
Current queue: deque([('BAIL', ['SAIL', 'BAIL']),
('MAIL', ['SAIL', 'MAIL'])])

Current transformation count: 2
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIL', ['SAIL', 'MAIL']),
('MAIL', ['SAIL', 'BAIL',
'MAIL'])])

Current transformation count: 2
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('MAIL', ['SAIL', 'BAIL',
'MAIL']),
('BAIL', ['SAIL', 'MAIL',
'BAIL']),
('MAIN', ['SAIL', 'MAIL',
'MAIN'])])

Current transformation count: 3
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('BAIL', ['SAIL', 'MAIL',
'BAIL']),
('MAIN', ['SAIL', 'MAIL',
'MAIN']),
('MAIN', ['SAIL', 'BAIL',
'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'MAIL',
'MAIN']),
('MAIN', ['SAIL', 'BAIL',
'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'BAIL',
'MAIL', 'MAIN']),
('RAIN', ['SAIL', 'MAIL',
'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('RAIN', ['SAIL', 'MAIL',
'MAIN', 'RAIN']),
('RAIN', ['SAIL', 'BAIL',
'MAIL', 'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'MAIN', 'RUIN'}
found endword at path: ['SAIL', 'MAIL', 'MAIN',
'RAIN']

Transformations needed:  4
Overall path: ['SAIL', 'MAIL', 'MAIN',
'RAIN', 'RUIN']```

### Complexity

For the above code that I used to find the shortest path for transformation:

#### Time

In `next_nodes`, for each word in the word list, we iterated over its length to find all the intermediate words corresponding to it. Thus we did `M×N` iterations, where `M` is the length of each word and `N` is the total number of words in the input word list. Further, to form an intermediate word, it takes `O(M)` time. This adds up to O(M2×N).

In `ladderLength`, BFS can go to each of the `N` words and for each word, we need to examine `M` possible intermediate words. This adds up to O(M2×N).

Overall, it adds up to O2(M2×N) which would be called O(M2×N).

#### Space

In `next_nodes`, each word in the word list would have M intermediate combinations. For every word, we need a space of M2 to save all the transformations corresponding to it. Thus, it would need a total space of O(M2×N).

In `ladderLength`, BFS queue would need a space of `O(M×N)`.

Overall, it adds up to O(M2×N)` `+ O(M×N) which would be called O(M2×N)

### Wrap Up

It could be little tricky and thus would need some practice to visualize the graph as well to write code for it.

Great, so now we know how to solve problems like word ladder problem. It also touch based other related common graph algorithms that we can refer to.

I had a read of the following reference and it has much more details if needed.

Keep problem solving!.

Written By
Architect Intuit India
India

A software professional for more than 17+ years building solutions for Web and Desktop applications.

Currently working at Intuit India.

Website: Learn By Insight
Github: Sandeep Mewara

Strongly believe in learning and sharing knowledge.