## Introduction

The main objective of this article is to propose a way to use Dijkstra's algorithm without the memory limitations imposed by a huge static allocated bi-dimensional array, allowing it to handle a large set of nodes.

As "large set" I mean something from 100 to 1000 nodes. With lesser than 100 nodes, you can use any solution that uses arrays found all around the web.

The code we propose is not limited to 1000 nodes (actually I've not yet found a upper limit), but after 1000 nodes the time to find a solution is not acceptable.

The second part of this article will propose a solution that will be able to handle millions of node with a very low processing.

## Background

Djikstra is a very generic greedy algorithm and it's solution includes the worst case scenario: A graph where every node can be connected to all other nodes.

But after studding many problems, I found that most of them are based on "2D map like" sets of nodes. In this scenario the connectivity between the nodes are very low. Nodes are normally connected only to nodes immediately around it creating a very sparse link matrix.

The basic Djikstra's algorithm that exists in the web does not take that in consideration and allocates resources to map all possible connections between nodes, even to itself.

Let's analyze this sample grid:

It represents a 15x12 "2D map like" grid. This grid is the one used in the sample source code available to download above.

Now... if we consider it as a full connected matrix we would need 180 (15x12) nodes to represent it. That would lead to a link matrix with 32400 (180^{2}) decimal (or integer) values to feed Djikstra. All the codes I've download from the web crashed ("memory overflow") when I tried to produce a link matrix with more than 100 nodes.

If the program does not crash the allocated memory will make the code (and all the machine) slow.

The matrix size is always the square of it's number of nodes. That means that the problem grows exponentially. So even if you get a very huge and powerful machine soon you will get to a point where you will not be able to even process it.

So... let's solve this problem first...

## Preparing the Data

First thing to do is to notice that not every point in the grid is used. There are voids. So instead of starting from the grid dimensions we will build a list of nodes. This reduces the problem a little bit: from 180 to (in this sample, of course) 120 nodes.

That didn't help much, but it's a start...

We also notice that the nodes are not completely connected and not every connection is bidirectional. We can reduce the problem to the links we really use. That means reducing the entire problem to only 246 links (again, in this sample).

Now we are talking…

In my studies I found that the total number of links normally stays between twice to three times the number of nodes. Surprise! The problem became linear!

At this point we became aware that the main problem is not Djikstra’s algorithm itself but actually preparing the input data to use with it. But we still can improve the code to avoid too much memory consumption. We will talk about it later.

With that in mind, let’s see an extract from the input file I used on the source code sample:

120
0 4
1 4
2 1
2 2
2 3
2 4
2 5
2 6
2 7
2 8
3 0
3 1
...
246
0 1 0.0625 4.5
1 0 0.0625 4.5
1 5 0.0625 4.5
2 11 0.0625 4.5
3 2 0.0625 4.5
4 3 0.0625 4.5
4 13 0.0625 3.0
5 1 0.0625 4.5
5 4 0.0625 4.5
5 14 0.0625 3.0
6 5 0.0625 4.5
7 6 0.0625 4.5
8 7 0.0625 4.5
8 17 0.0625 2.0
8 9 0.0625 4.5
9 18 0.0625 4.5
...
5
7 65
18 98
...

The first part is the list of nodes. It describes the point coordinate in the grid. It's only necessary if you want to plot the grid. Otherwise you will need only the second part. This one is a "table like" list that shows the first node, the second node and any amount of data that connects the two. In this sample, I’m giving a distance in miles and a time in minutes. So, the line...

0 1 0.0625 4.5

... means that the node 0 is connected to the node 1 and their distance is 0.0625 miles and it takes 4.5 minutes to go from 0 to 1.

The next line...

1 0 0.0625 4.5

... says that there is also a connection from 1 to 0 making it bidirectional. The values are the same but it's not necessarily true. You could have different values if you are coming or going.

The last part of the input just gives a list of cases we want to use to get some results over the code. So... The section that really matters in the input is the second one. That data can come from anywhere, a database, a GPS batch file, GoogleMaps.

## Let's see the code...

The whole solution can be found in only two lines of the code:

...
_distanceSolver = new Dijkstra(_numberOfNodes, GetDistance);
...
var solution = _distanceSolver.Solve(_cases[i].StartId, _cases[i].EndId);
...

The constructor of the Dijkstra class is very simple and just initializes the internal fields...

public Dijkstra(int numberOfNodes, Func<int, int, decimal> values)
{
if (numberOfNodes < 0) throw new ArgumentException("The number of nodes must be positive.");
if (values == null) throw new ArgumentNullException("values");
NumberOfNodes = numberOfNodes;
_values = values;
}

But it holds the main feature of this solution. By passing the values as a delegated function to the class, no static matrix is required at all.

The values are fetched "on demand". And you have total control over the input and therefore over what is served to Djikstra.

You just need to keep in mind that Djikstra is a cumulative algorithm so all valid values must be non negative. A value of -1 means that those nodes are not connected, but in our way of representing the nodes the better thing to do would be just take them out of the list.

That control gives some important advantages:

1 - All validation can be made inside the function. No exceptions need to be generated during the solution interactions, just return -1.

2 - The values really do not need to be static. You can build functions that alter the output of the function depending of some extra information.

For instance, if you want that the GetTime output changes along the the day (due traffic for exemple), maybe the GetTime function could have this signature: Func<int, int, DateTime, decimal>.

Imagine that you would normally take 2 hours to go from one place to another, but in the middle of the way (after 1 hour) the rush hour starts. The values for time you started with are not valid anymore and have to be adjusted. That 2 hour trip (in normal traffic) became a 2,5 hour trip (with part of it during rush hour). That's a more precise output.

The possibilities are endless.

The most important part of this code is not exposed. Is the Interaction function.

private void Iteraction()
{
var source = (from r in _nodes
where !r.Done
&& r.Path.Any()
orderby r.Path.Sum(p => p.Value)
select r).First();
var connectedNodes = (from r in _nodes
where r.Node != source.Node
&& r.Node != _startNode
&& _values(source.Node, r.Node) != -1
&& (!r.Path.Any() || r.Path.Sum(v => v.Value) > (source.Path.Sum(p => p.Value) + _values(source.Node, r.Node)))
select new { node = r, newValue = _values(source.Node, r.Node) }).ToList();
connectedNodes.ForEach(i => {
i.node.SetPath(source.Path);
i.node.AddStep(source.Node, i.newValue);
});
source.Done = true;
}

This is where Djikstra magic occurs and its logic was almost not changed, just improved with LINq and without any explicit static arrays.

The only list that is always allocated during the life span of the solution is the internal `_nodes`

list. That list has the maximum dimension of `NumberOfNodes.`

So it's really tiny.

Parallelism may be an alternative here but I will talk about that in the next part of this article.

One final thing I'd like to mention here is the return value of the Solve funtion.

public IEnumerable<int> Solve(int startNode, int endNode)
{
if (startNode < 0 || startNode >= NumberOfNodes) throw new ArgumentException("The start node must be between zero and the number of nodes");
if (endNode < 0 || endNode >= NumberOfNodes) throw new ArgumentException("The end node must be between zero and the number of nodes");
_startNode = startNode;
Reset();
for (var i = 1; i < NumberOfNodes; i++) Iteraction();
GetNode(endNode).AddStep(endNode, 0);
return GetNode(endNode).Path.Select(p => p.Node);
}

I've tried many outputs but I found out that the best one (for me) was the list of nodes holding the calculated path.

Since I have the function(s) that gives me the values between two nodes, if I receive the chain of nodes that connects the start to the end, I can zip then and fetch the values for each step individually. An even apply a different function to compare results.

That's what is done in the solution available to download.

Well... Here is the complete code of the Dijkstra class:

public class Dijkstra
{
private int _startNode = 0;
private List<destination> _nodes;
private Func<int, int, decimal> _values;
public int NumberOfNodes { get; private set; }
public bool IsSolved { get { return !_nodes.Any(n => !n.Done); } }
public Dijkstra(int numberOfNodes, Func<int, int, decimal> values)
{
if (numberOfNodes < 0) throw new ArgumentException("The number of nodes must be positive.");
if (values == null) throw new ArgumentNullException("values");
NumberOfNodes = numberOfNodes;
_values = values;
}
public void Reset()
{
if (_nodes == null) _nodes = new List<destination>();
_nodes.Clear();
for (var node = 0; node < NumberOfNodes; node++)
{
_nodes.Add(new Destination(node, node == _startNode));
var value = _values(_startNode, node);
if (value != -1) GetNode(node).AddStep(_startNode, value);
}
}
private Destination GetNode(int n)
{
return _nodes.SingleOrDefault(i => i.Node == n);
}
private void Iteraction()
{
var source = (from r in _nodes
where !r.Done
&& r.Path.Any()
orderby r.Path.Sum(p => p.Value)
select r).First();
var connectedNodes = (from r in _nodes
where r.Node != source.Node
&& r.Node != _startNode
&& _values(source.Node, r.Node) != -1
&& (!r.Path.Any() || r.Path.Sum(v => v.Value) > (source.Path.Sum(p => p.Value) + _values(source.Node, r.Node)))
select new { node = r, newValue = _values(source.Node, r.Node) }).ToList();
connectedNodes.ForEach(i => {
i.node.SetPath(source.Path);
i.node.AddStep(source.Node, i.newValue);
});
source.Done = true;
}
public IEnumerable<int> Solve(int startNode, int endNode)
{
if (startNode < 0 || startNode >= NumberOfNodes) throw new ArgumentException("The start node must be between zero and the number of nodes");
if (endNode < 0 || endNode >= NumberOfNodes) throw new ArgumentException("The end node must be between zero and the number of nodes");
_startNode = startNode;
Reset();
for (var i = 1; i < NumberOfNodes; i++) Iteraction();
GetNode(endNode).AddStep(endNode, 0);
return GetNode(endNode).Path.Select(p => p.Node);
}
private class Destination
{
public Destination(int n, bool d)
{
Node = n;
Path = new List<origin>();
Done = d;
}
public int Node { get; private set; }
public List<origin> Path { get; private set; }
public bool Done { get; internal set; }
public void AddStep(int d, decimal v)
{
Path.Add(new Origin(d, v));
}
public void SetPath(List<origin> p)
{
Path = new List<origin>(p);
}
}
private class Origin
{
public Origin(int n, decimal v)
{
Node = n;
Value = v;
}
public int Node { get; private set; }
public decimal Value { get; internal set; }
}
}

That code can handle a really large set of nodes.

## The Output

In the solution available to download it's shown how to use two different functions: One to fetch the shortest and the other to fetch the fastest path. You will see they (almost always) are not the same.

**Before running the code in DEBUG mode, change your console output to 160 characters per line for a better view.** It will show a representation of the link matrix. That is just to demonstrate how sparse it is. Each character is one connection between two nodes. It represents a 120x120 character matrix with a blank where there is no connection and a dot representing an existing connection. Notice how the connections go side by side the main diagonal. It means that all nodes almost always connect to other nearby node. Also the main diagonal itself has no dots meaning that there is no connection between the node and itself.

That clearly shows how much memory is wasted if we allocate a full static bi-dimensional array.

If run it in the RELEASE mode the output will be send only to a file. That will show how fast this code can be.

Here is a sample of the DEBUG mode output:

Link Matrix:
.
. .
.
.
. .
. . .
.
.
. . .
.
.
. . .
. .
. . .
. .
. .
. .
. . .
.
. .
. .
.
. .
. . .
. . . .
. . .
. . .
. . .
. . . .
. . .
. . .
.
.
. .
. .
. . .
.
. .
. .
. . .
. .
. .
. .
. .
. .
.
. .
. . .
. .
. .
.
.
. .
. .
. . .
.
. .
. . .
. .
. .
. .
.
. .
. .
. . .
. .
. .
. .
. . .
. .
.
.
. .
. .
. . .
. .
. .
. .
. . .
. .
. .
. .
. .
. . .
. .
. .
. .
. . .
. .
.
.
. . .
. . .
. . . .
. . .
. . .
. . .
. . . .
. . .
. .
.
. . .
.
. .
. .
. .
. .
.
. .
.
. .
. .
.
. .
.
. .
.
.
.
.
Case 1:
Shortest: 7->6->5->4->13->24->35->44->54->64->65: 0,625 miles, 29,5 minutes; Solution Time:0:00:00:00,0900000;
Fastest : 7->6->5->14->25->24->35->44->54->64->65: 0,625 miles, 28,0 minutes; Solution Time:0:00:00:00,0810000;
Case 2:
Shortest: 18->29->40->48->58->69->79->88->98: 0,500 miles, 25,5 minutes; Solution Time:0:00:00:00,0800000;
Fastest : 18->29->28->39->47->57->68->78->87->97->98: 0,625 miles, 21,0 minutes; Solution Time:0:00:00:00,0810000;
Case 3:
Shortest: 10->11->12->13->14->15->16->17->18->29->40->48->49->50->60->61: 0,938 miles, 52,5 minutes; Solution Time:0:00:00:00,0810000;
Fastest : 10->11->12->13->24->25->26->27->28->39->47->48->49->50->60->61: 0,938 miles, 44,0 minutes; Solution Time:0:00:00:00,0810000;
Case 4:
Shortest: 2->11->12->13->14->15->16->17->18->29->40->48->58->69->79->88->98->107->108->115->119: 1,250 miles, 67,5 minutes; Solution Time:0:00:00:00,0810000;
Fastest : 2->11->12->13->24->25->26->27->28->39->47->57->68->78->87->97->98->107->108->115->119: 1,250 miles, 53,0 minutes; Solution Time:0:00:00:00,0810000;
Case 5:
Shortest: 81->82->73->63->53->43->34->23->24->25->26->27->38: 0,750 miles, 32,0 minutes; Solution Time:0:00:00:00,0810000;
Fastest : 81->82->83->74->64->54->44->35->24->25->26->27->38: 0,750 miles, 27,0 minutes; Solution Time:0:00:00:00,0810000;

## A Chalenge

I haven't found yet the memory limitations of that code. The time limitation for a single fetch starts to appear if you go above 1000 nodes (not including the time used to load the data).

Here is a table with the approximated times I obtained for a NxN 2D map with all connections set as bidirectional:

# of Links | # of Iteractions | Total Time (milliseconds) | Average Iteraction Time (milliseconds) |

360 | 100 | 80 | 0.8 |

840 | 225 | 660 | 2.9 |

1520 | 400 | 3640 | 9.1 |

2400 | 625 | 12940 | 20.7 |

3480 | 900 | 37800 | 42.0 |

4760 | 1224 | 96800 | 79.1 |

**So now is the processing time that is increasing exponentially.**

If someone want to try to find memory limit please don't forget to post here your benchmark result also.

## What is Next?

This article deals with a imaginary grid with only 120 nodes that gives us around 240 links. In the real world even a small neighborhood in a small city can have around 10000 nodes. Any simple game map or maze will have much more than that.

And what happens when we need to go to a even greater area. For instance how to connect a place in New York with one in Los Angeles. Imagine the number of nodes. After a very raw and superficial calculation I’ve got that this number is in the order of 10^{10} (!!) nodes.

Not even the best machine could do that in a reasonable time with the code above. We know that Google does that. So... Where is the trick?

In the next article I will talk about one possible approach to solve that kind of problem.

Well... That's all for now folks...