12,241,814 members (53,013 online)
Add your own
alternative version

40.3K views
146 downloads
22 bookmarked
Posted

# Reuse of iteration algorithms

, 5 Sep 2007
 Rate this:
Please Sign up or sign in to vote.
Examples of decoupling iteration algorithm from actions on collection items

## Introduction

Using the `foreach` loop is easy. It is so straightforward and natural that I rarely use any other loop construct. In fact, when it's time to use the `for` loop, I have to concentrate and recall the syntax.

I have to deal with a lot of tree traversals. The obvious choice is to have a `foreach` loop to go over all elements of a root, making a recursive call for each node that has children. This works out great until one notices that there are about twenty functions that have a `foreach` loop at their core together with a bunch of statements that operate on the same data type.

Using flags in order to exploit common pieces of code did not work out well. The functions simply became unreadable and much more error-prone. Something radical had to be done about this problem.

This article does not contain directly reusable code. Rather, it presents an idea on how to separate an algorithm that iterates over a custom collection from actions done on each item of that collection. The example code works with `DirectoryInfo` and `FileInfo` objects, but at no point does it modify files. The file system is used simply because it is a tree structure that is available on every computer. For simplicity reasons, there is no error handling and no security checks.

## Separating algorithm from data

Staring at a bunch of functions that differ ever so slightly did have a good side effect after all. There are two things to note: first, the algorithm is always a `foreach` loop with a recursive call. Second, we are working with objects of the same data type, or a very limited number of data types.

The iteration algorithm does not care what happens to each object that it visits. Iteration's responsibility is simply to visit each and every item of a collection. Then who will do the useful work on visited items? A callback function, of course! We know the data type we are working with, that's the one in the `foreach` statement.

Let's use the file system as an example of a tree structure that we want to iterate over. For example, a function that displays all file system objects under a certain `root` directory could look as follows:

```private void DisplayAllElements(DirectoryInfo Root)
{
foreach (DirectoryInfo dirInfo in Root.GetDirectories())
{
DisplayAllElements(dirInfo);
System.Diagnostics.Debug.WriteLine(string.Format("Dir\t{0}",
dirInfo.FullName));
}
foreach (FileInfo fileInfo in Root.GetFiles())
{
System.Diagnostics.Debug.WriteLine(string.Format("File\t{0}",
fileInfo.FullName));
}
}
```

Let's modify it such that iteration code is completely separate from the output code. We are using file system as an example and will be dealing with `DirectoryInfo` and `FileInfo` instances. Declare callback types that take `DirectoryInfo` and `FileInfo` objects as parameters:

```private delegate void FileInfoHandler(FileInfo File);
private delegate void DirectoryInfoHandler(DirectoryInfo Directory);
```

Write the functions that perform actual work on items visited by iteration:

```private static void ShowDirectoryDetails(DirectoryInfo Info)
{
System.Diagnostics.Debug.WriteLine(string.Format("Dir\t{0}",
Info.FullName));
}
private static void ShowFileDetails(FileInfo Info)
{
System.Diagnostics.Debug.WriteLine(string.Format("File\t{0}",
Info.FullName));
}
```

Instantiate delegates. That is, create variables of delegate types that point to the functions above:

```private FileInfoHandler ProcessFile = new FileInfoHandler(ShowFileDetails);
private DirectoryInfoHandler ProcessDirectory = new DirectoryInfoHandler
(ShowDirectoryDetails);
```

Finally, change the iteration algorithm to use the instantiated delegates instead of function calls:

```private void DisplayAllElements(DirectoryInfo Root)
{
foreach (DirectoryInfo dirInfo in Root.GetDirectories())
{
DisplayAllElements(dirInfo);
ProcessDirectory(dirInfo);
}
foreach (FileInfo fileInfo in Root.GetFiles())
{
ProcessFile(fileInfo);
}
}
```

Now the iteration code is completely separate from the output code.

## Improving design of iteration function

Though we are not displaying directory and file information inside the `foreach` loop anymore, there is no real difference between the new and the original `DisplayAllElements` functions. Calling a function callback was a step in the right direction. Let's improve it even more.

In my opinion, the iteration algorithm should be stateless. To maintain state it would need some knowledge about contents of nodes it traverses. This complicates the algorithm and makes it more difficult to verify its correctness. My approach is to keep it very simple. A `static` function will do.

```public static class Iterators
{
public delegate void FileInfoHandler(FileInfo File);
public delegate void DirectoryInfoHandler(DirectoryInfo Directory);

public static void IterateFileSystemInfo(
DirectoryInfo Root,
DirectoryInfoHandler DirectoryProcessor,
FileInfoHandler FileProcessor)
{
foreach (DirectoryInfo dirInfo in Root.GetDirectories())
{
IterateFileSystemInfo(dirInfo, DirectoryProcessor, FileProcessor);
if (DirectoryProcessor != null)
DirectoryProcessor(dirInfo);
}
foreach (FileInfo fileInfo in Root.GetFiles())
{
if (FileProcessor != null)
FileProcessor(fileInfo);
}
}
}
```

`IterateFileSystemInfo` is almost identical to the modified `DisplayAllElements`. The only difference, except for the callbacks' names, is the fact that callbacks are provided as parameters and checked for `null` before being called. This allows certain flexibility. For example, we may not wish to display directory names. In this case we can pass a `null` parameter for `DirectoryInfoHandler`. All subdirectories and files will still be visited, though directory entries will not be "processed".

`Static` `Iterators` class could have several functions that traverse a collection in a certain way. The algorithm above is depth-first. One could include functions that do a breadth-first search.

## Wrapping callback functions

The real work on collection members is done by the callbacks, which are provided to the iteration algorithm. We could simply have several instantiated delegates and call `IterateFileSystemInfo` using those delegates. However, design can be improved a lot if function callbacks are wrapped into their separate classes. It will let all object oriented design techniques to be applied to classes containing the callbacks. For example, Decorator pattern could be chosen for the processor class design. Passing parameters to callbacks can be done the same way as in "Passing parameters to predicates".

The goal is to make a wrapper class for the following functions:

```private static void ShowDirectoryDetails(DirectoryInfo Info)
{
System.Diagnostics.Debug.WriteLine(string.Format("Dir\t{0}",
Info.FullName));
}

private static void ShowFileDetails(FileInfo Info)
{
System.Diagnostics.Debug.WriteLine(string.Format("File\t{0}",
Info.FullName));
}
```

Here is an example of such a class:

```public class DisplayAllProcessor
{
#region Delegate properties

public Iterators.DirectoryInfoHandler DisplayDirInfo
{
get { return MyDisplayDirInfo; }
}

public Iterators.FileInfoHandler DisplayFileInfo
{
get { return MyDisplayFileInfo; }
}

#endregion Delegate properties

#region Functions returned by delegate properties

private void MyDisplayDirInfo(DirectoryInfo Info)
{
System.Diagnostics.Debug.WriteLine(string.Format
("Dir\t{0}", Info.FullName));
}

private void MyDisplayFileInfo(FileInfo Info)
{
System.Diagnostics.Debug.WriteLine(string.Format
("File\t{0}", Info.FullName));
}

#endregion Functions returned by delegate properties
}
```

How can this new class be used with the `IterateFileSystemInfo` tree traversal algorithm? We need a new instance of the `DisplayAllProcessor` class. Delegate properties of that instance are used as input parameters for the tree traversal algorithm. As the tree traversal algorithm executes, it will fire `MyDisplayDirInfo` and `MyDisplayFileInfo` callbacks of `DisplayAllProcessor` instance. The way to use it is as follows:

```private void DisplayAllElements(DirectoryInfo Root)
{
// Create a processor that will display all FileSystemInfo objects.
DisplayAllProcessor dap = new DisplayAllProcessor();
// Execute the iteration algorithm using the processor.
Iterators.IterateFileSystemInfo(
Root,
dap.DisplayDirInfo,
dap.DisplayFileInfo);
}
```

## Inheritance of wrapper classes

My testing program for this article is a WinForms application. It displays output of all callbacks in two columns of a `ListView`. One can immediately demonstrate benefits of wrapping callbacks in a class: storage of results that UI needs is built into a base class:

```public struct DisplayItem
{
public string Info;
public string Message;
}

public class ProcessorBase
{
// Messages from processors to show on UI.
protected List<DisplayItem> m_DisplayItems = new List<DisplayItem>();

// Gets the messages to show on UI.
public virtual List<DisplayItem> DisplayItems
{
get { return m_DisplayItems; }
}
}
```

Instead of `Debug.Writeline()` calls, all processors will add display items to the list in the base class. When iteration runs to completion, property `DisplayItems` will contain all the data that UI needs. The new `DisplayAllElements` looks as follows, where `ShowOutput` simply fills the output `ListView`.

```private void DisplayAllElements(DirectoryInfo Root)
{
// Create a processor that will display all FileSystemInfo objects.
DisplayAllProcessor dap = new DisplayAllProcessor();
// Execute the iteration algorithm using the processor.
Iterators.IterateFileSystemInfo(
Root,
dap.DisplayDirInfo,
dap.DisplayFileInfo);
// Output is ready.
ShowOutput(dap.DisplayItems);
}
```

The infrastructure has been set up. Let's have a couple of examples how it can be used.

## Maintaining state

State will be maintained by "processors", or instances of wrapper classes. Let's find the maximum and the average file sizes. To do so, we need to keep track of three things: the maximum file size seen so far, the combined size of all files seen, and the number of files seen. These will be member variables of our class.

```public class FindAvgAndMaxProcessor : ProcessorBase
{
private long m_MaxFileSize = 0;
private long m_TotalFileSizes = 0;
private long m_TotalFiles = 0;
. . . .
. . . .
}
```

The counters are updated in the callback function.

```protected virtual void MyCountFileSizes(FileInfo Info)
{
// Keep track of the biggest file.
if (Info.Length > m_MaxFileSize)
{
m_MaxFileSize = Info.Length;
m_MaxFileInfo = Info;
}

// Keep track of file count and the sum of all file sizes.
++m_TotalFiles;
m_TotalFileSizes += Info.Length;
}
```

Results are needed once iteration has run to completion. We know that `DisplayItems` property is normally accessed when iteration algorithm has finished its job. Therefore, statistics can be calculated in the `DisplayItems` property to take some calculations out of the iteration.

```public override List<DisplayItem> DisplayItems
{
get
{
m_DisplayItems.Clear();

DisplayItem diAvg;
diAvg.Info = "Avg file size";
long lAverage = m_TotalFiles == 0 ? 0 :
m_TotalFileSizes / m_TotalFiles;
diAvg.Message = string.Format("{0} bytes", lAverage.ToString());
DisplayItem diMax;
diMax.Info = "Max file size";
string strFileName = m_MaxFileInfo == null ? "None" :
m_MaxFileInfo.FullName;
diMax.Message = string.Format("File name: {0}, size:
{1} bytes", strFileName, m_MaxFileSize);

m_DisplayItems.Add(diAvg);
m_DisplayItems.Add(diMax);

return base.DisplayItems;
}
}
```

The `FindAvgAndMaxProcessor` class is used the same way as previous processor classes. The only difference is that instead of a list of visited files and directories it displays two rows of statistics.

## Providing parameters for callbacks

Suppose, we want to find the maximum and average sizes of .EXE and .XML files for a given directory. A processor that takes file extension as parameter is in order. Since callback functions are wrapped in a class, it is simply a matter of having a `public` property. The new processor class is so tiny that I will include it here in its entirety.

```public class FindAvgAndMaxByExtProcessor : FindAvgAndMaxProcessor
{
private string m_Extension;

public FindAvgAndMaxByExtProcessor(string Extension)
{
m_Extension = Extension;
}

public string Extension
{
get { return m_Extension; }
set
{
m_Extension = value;
ClearStats();
}
}

#region Functions returned by delegate properties

protected override void MyCountFileSizes(FileInfo Info)
{
if (0 == string.Compare(Info.Extension, m_Extension,
StringComparison.CurrentCultureIgnoreCase))
{
base.MyCountFileSizes(Info);
}
}

#endregion Functions returned by delegate properties
}
```

We let the base class keep the count. Child class controls which files to process by comparing their extension to a member variable.

The `FindAvgAndMaxByExtProcessor` class allows selecting different sets of items to be processed. If a class allows that, the state data has to be taken care of when a different subset of items is selected. When setting a new file extension to match in the example above, we have to reset all counters to `0`. Otherwise the example would produce combined statistics for both .EXE and .XML files, which is not what was intended.

`FindAvgAndMaxByExt` function demonstrates how the `FindAvgAndMaxByExtProcessor` class can be used.

## Using the code

To run an example from the source code, select a directory, select the test case, and then click the Run Test button.

## Conclusion

The collection iteration algorithm was decoupled from actions performed on each collection item. The algorithm function is a very simple `static` function, whose correctness is easy to verify. Once that was done, not a single looping instruction or recursive call was needed to execute test cases. All "processor" classes are also very simple. Some are literally a couple of lines long.

The iteration algorithm is stateless. It's simple and it does its job. A good reason to have this function in its own class would be to implement the `IEnumerable` interface, for example. I may do this just for the fun of it.

## License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

## About the Author

 Software Developer (Senior) Canada
No Biography provided

## Comments and Discussions

 View All Threads First Prev Next
 What about STL? Michal Blazejczyk12-Sep-07 6:15 Michal Blazejczyk 12-Sep-07 6:15
 Re: What about STL? Alex Perepletov13-Sep-07 16:05 Alex Perepletov 13-Sep-07 16:05
 Last Visit: 31-Dec-99 19:00     Last Update: 29-Apr-16 11:35 Refresh 1

General    News    Suggestion    Question    Bug    Answer    Joke    Praise    Rant    Admin

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.160426.1 | Last Updated 5 Sep 2007
Article Copyright 2007 by Alex Perepletov
Everything else Copyright © CodeProject, 1999-2016
Layout: fixed | fluid