13,256,716 members (50,281 online)
alternative version

Stats

44.8K views
54 bookmarked
Posted 31 Jan 2009

An Object-oriented Approach to Finite State Automata

, 3 Feb 2009
 Rate this:
A brief introduction to FSA and a ready-to-use class library

Introduction

From Wikipedia:

A finite state machine (FSM) or finite state automaton (plural: automata) or simply a state machine, is a model of behavior composed of a finite number of states, transitions between those states, and actions. A finite state machine is an abstract model of a machine with a primitive internal memory.

Using the Code

In this article, I will focus on Moore machines. The easiest way to understand Moore FSA is to analyze its graph.

Example 1

• Symbols in the circles are states and they make up a set `{q0, q1}`. Every state must have a unique symbol.
• Labels written by every state `{y0, y1}` describe data which is generated when the machine enters the state.
• The arrows represent transitions between states. The `{z0, z1}` set can also be called an "input alphabet".

Concluding, a Moore FSA is composed of states with assigned output symbols and transitions between those states. A Moore FSA is a specific case of definition from Wikipedia.

Let's find out how it works. If we were in the `q0` state, then sending `z0` signal would not change anything -- we would stay in the same place and repeatedly produce `y0` (check it on the graph!). However, sending `z1` would move us to the `q1` state and produce `y1`. Now the situation is similar: `z0` = stay here; `z1` = go back to the `q0` state.

This machine can be coded as follows:

```MooreAutomaton moore = new MooreAutomaton(new MooreState[] {
// state q0
new MooreState {
Table = new int[] {
0, // z0: stay in q0
1, // z1: move to q1
},
Output = 0
},
// state q1
new MooreState {
Table = new int[] {
1, // z0: stay here
},
Output = 1
}
});```

Example 2

The automaton above detects words in a string assuming that they can be separated by white spaces only. If any other character was encountered, the machine is supposed to enter a sink state. State Q is a sink state if and only if there are no transitions leading from Q to another state. Again, we assume that the initial state is `q0`.

Sending a `string `"`abc de #fg`" would give an output `1110110222`. Note that the machine gives an output when entering a state. This machine can be easily implemented with an `AsciiTableGenerator` class:

```const int sNothing = 0; // avoid "magic numbers" in your code!
const int sWord = 1;
const int sError = 2;

MooreAutomaton moore = new MooreAutomaton(new MooreState[] {
// q0 = sNothing
new MooreState {
Table = new MooreAsciiTableGenerator {
EmptyTransition = sError, // "any other"
Expressions = { // put Regex here
{ @"\s", sNothing },
{ @"\w", sWord }
}
}.GenerateTable(),
Output = 0    // between words
},
// q1 = sWord
new MooreState {
Table = new MooreAsciiTableGenerator {
EmptyTransition = sError,
Expressions = {
{ @"\s", sNothing },
{ @"\w", sWord }
}
}.GenerateTable(),
Output = 1    // inside a word
},
// q2 = sError
new MooreState {
Table = new MooreAsciiTableGenerator {
EmptyTransition = sError
}.GenerateTable(),
Output = 2    // error
}
});```

The `AsciiTableGenerator` generates a table of 128 integers (ASCII numbers for printable characters). The regular expressions will be evaluated just once when a table is created, so the index of an character is equal to its ASCII value. This solution gives us an `O(1)` complexity of finding the next state.

If you wish to support all the UNICODE characters, you may want to use a `ParsingAutomaton`:

```ParsingAutomaton parser = new ParsingAutomaton(new ParsingState[] {
new ParsingState {
EmptyTransition = sError,
Expressions = {
{ @"\s", sNothing },
{ @"\w", sWord }
},
Output = 0
},
new ParsingState {
EmptyTransition = sError,
Expressions = {
{ @"\s", sNothing },
{ @"\w", sWord }
},
Output = 1
},
new ParsingState {
EmptyTransition = sError,
Output = 2
}
});```

The declaration is similar to the previous one but it evaluates Regex at run-time. Both machines can be executed using the following code:

```for (int i = 0; i < inputString.Length; i++)
Console.Write(automaton.Send(inputString[i]));```

BTW using a `fixed` statement instead of `inputString[i]` is generally recommended.

Example 3 : The INI File Parser

If you are looking for a good INI file parser, then navigate to my article "INI files". Now I'm going to discuss a simple INI file parser which is built of a Moore machine.

First of all, there is another useful visual representation of a Moore machine. Let's examine a structure of an INI file, shall we?

```[SectionName]
Key=Value```

I can see seven possible states here:

```<sub>0</sub>[<sub>1</sub>SectionName<sub>2</sub>]<sub>3</sub>
<sub>0</sub>Key<sub>4</sub>=<sub>5</sub>Value<sub>6</sub>```

Now we have to analyze the expression above and decide when a parser should move from one state to another. The important stage of creating an FSA is to test it. A complete solution is included in a test project.

Generating a Super-fast Moore Code

Every Moore FSA can be converted to a program similar to this:

```fixed (int* _q = Q) { // Q - table of states
int* q = _q;
int y_i = 0; // Y - output buffer
int[] Y_old;
fixed (int* _r = R) { // R - state-output table
fixed (int* _z = Z) { // Z - input alphabet
for (int* z = _z; *q >= 0 && z - _z < Z_len; z++, y_i++) {
int nextq = *(q + *z);
Y[y_i] = *(_r + nextq);
q = _q + stateLen * nextq;
if (y_i >= buferLen - 1) {
Y_old = Y; // reallocate if needed
Y = new int[buferLen *= 2];
Array.Copy(Y_old, Y, Y_old.Length);
}
}
}
}
Y_old = Y;
Y = new int[y_i];
Array.Copy(Y_old, Y, Y.Length); // update the size of the buffer
}
return Y;```

The method `FastRun(int[] input)` does a quick execution and returns the output as an `int[]` table. Internally, it generates these `Q` and `R` tables and runs the above algorithm. It is quite fast and makes Moore automata programming a considerable solution for performance-critical problems. More documentation can be found around the `MooreTables` structure.

Useful Wrappers for FSA

Each `XxxAutomaton` class derives from the `FiniteStateAutomaton` abstract. It provides a set of basic features and can be used as a bridge between the machine core (e.g. a `MooreAutomaton` or a `MealyAutomaton`) and a wrapper class.

Mapped & Match Automata

As you probably noticed, both input and output of a `MooreAutomaton` are integers. In some cases it is sufficient, but usually we need to send an input of type `TInput` and receive an output of a type `TOutput`.

This functionality is provided by the `MappedAutomaton` and `MatchAutomaton` classes. For example, we can map the output of an FSA from example 2.

```OutputMappedAutomaton<string> mapped = new OutputMappedAutomaton<string>(moore) {
OutputMap = {
{0, " SPACE "},
{1, " WORD "},
{2, " ERROR "},
}
};
mapped.Reset(); 	// remember to reset the machine if it is a wrapper
// to a machine which might be used before!
for (int i = 0; i < length; i++)
Console.Write(mapped.SendAndMap(s[i]));```

The output:

```WORD  WORD  WORD  SPACE  WORD  WORD  SPACE  ERROR  ERROR  ERROR
```

A matching (any idea for a better name?) automaton uses a `Converter<TInput, TOutput>` delegate to map the external input to internal input and the internal output to external output.

```MatchAutomaton<byte, string> mapped = new MatchAutomaton<byte, string>(moore) {
OutputMatch = output => string.Format(" q{0} ", output),
InputMatch = input => (int)input
};```

The EventedAutomaton

The evented automaton is extremely useful since it memorizes its recent history and fires events which notify about any change.

```EventedAutomaton evented = new EventedAutomaton(moore);
evented.OutputChanged += delegate (object sender, FsaEventArgs e) {
// of course an enumeration should be used here to avoid "magic numbers"
if (e.PreviousOutput == 1) {
StringBuilder build = new StringBuilder();
foreach (int c in e.Automaton.OutputInput) {
build.Append((char) c);
}
Console.WriteLine("Word: {0}", build);
}
};
evented.StateChanged += delegate(object sender, FsaEventArgs e) {
Console.WriteLine("Transition: q{0} -> q{1}", e.State, e.CurrentState);
};```

The output:

```Transition: q0 -> q1
Word: abc
Transition: q1 -> q0
Transition: q0 -> q1
Word: de
Transition: q1 -> q0
Transition: q0 -> q2```

Asynchronous Execution

The `FiniteStateAutomaton abstract` class provides a few nice features. For example, it supports an asynchronous execution of an FSA.

A sample from the IniFile.cs file:

```void OpenAsync(Stream stream)
{
Pipe outputPipe = new Pipe();
IAsyncResult asyncResult = mooreAutomaton.RunAsync(new RunParameters {
Input = stream,
Output = outputPipe
});```

However we cannot blame the `RunAsync` method itself (it is a very simple thing), the code in the IniFile.cs is very slow. A better solution for the thread synchronization problem must be found. Feel free to leave comments and suggestions in the message board below.

Points of Interest

The difference between Moore and Mealy FSA is a place where the output symbols are. In Moore machines, they were stored as part of a state definition. On the other hand, the output in Mealy machines is assigned to transitions. This approach might greatly reduce a number of states and improve performance but each transition consumes twice as much memory as before. Example 2 can be defined as a Mealy machine below:

```MealyAutomaton mealy = new MealyAutomaton(new MealyState[] {
new MealyState {
Table = new MealyAsciiTableGenerator {
EmptyTransition = new MealySymbol { Output = 2, State = 1 },
Expressions = {
{ @"\s", new MealySymbol { Output = 0, State = 0 } },
{ @"\w", new MealySymbol { Output = 1, State = 0 } }
}
}.GenerateTable()
},
new MealyState {
Table = new MealyAsciiTableGenerator {
EmptyTransition = new MealySymbol { Output = 2, State = 1 }
}.GenerateTable()
}
}
);```

We can also use a `ToMealy` conversion method which does all the optimization in O(n3):

`MealyAutomaton mealy = moore.ToMealy();`

Note about the `MealyAsciiTableGenerator` class: No special implementation is needed here. The definition of the `MealyAsciiTableGenerator` class is as below:

```// to make life easier and don't need to put everywhere those nasty < > brackets
public class MealyAsciiTableGenerator : AsciiTableGenerator<MealySymbol>

{
}```

The output is obviously the same: `1110110222`.

Performance Comparison

```Generate: 00:00:03.0150000 // 1000 keys and 1000 values
Save to a file: 00:00:00.6060000

Parsing

Classic: 00:00:01.9270000		(100%)  // the IniFileLight from my another article
Mealy automaton: 00:00:00.8690000	(45,1%) // fewer transitions
Moore automaton: 00:00:02.3850000	(123,77%)// fairly comparable with classic parsing
Parsing automaton: 00:00:06.6780000	(346,55%)// regex at run-time
Asynchronous Moore automaton: 00:00:09.2800000 (481,58%)    // need to be improved
Fast Moore: 00:00:00.8860000	(45,98%)// ASCII-only, but hey not only for parsing```

A Bonus Content

I give you some bonus classes.

1. The `Pipe` is a multithreading-oriented memory stream. It is supposed to be read and written from two or more threads. Both `Write` and `Read` methods block the current thread until it is possible to obtain a requested amount of data or until there is enough space to write. It uses a fixed-sized buffer which can be defined in a constructor. Closing the stream disallows writing to it but data can still be read from the buffer. This prevents from a lock-forever state when a source of data finishes its job and the reader has its own buffer (for example a `StreamReader`). There is also a time-out support.
2. The `RestictedList<T>` is a class which derives from the `List<T>`. It gives a possibility to set criteria which items must meet to be added to a list. If a user tried to add an item which didn't meet the criteria, an exception would be thrown. The criteria are defined by a `Predicate<T>` delegate.
3. `RestictedDictionary<TKey, TValue>` - analogous to the `RestrictedList<T>` class.

Extending FSA Functionality

You need to implement 3 methods to make an extension to a standard FSA machine.

```public class MyAutomaton : FiniteStateAutomaton
{

public MyAutomaton(FiniteStateAutomaton fsa)
{
this.fsa = fsa;
}

public override void Reset()
{
// Reset the current state of your machine (not a whole structure!)
fsa.Reset();
}
public override int Send(int input)
{
int output = fsa.Send(input);
// and probably here as well
return output;
}
public override FiniteStateAutomaton ShallowCopy()
{
MyAutomaton copy = new MyAutomaton(fsa.ShallowCopy());
// do a shallow copy of all new fields.
return copy;
}
}```

Summary

In this article, I tried to prove that FSA systems are not as obsolete technology as they are considered. I am sure that there are cases where FSA could be used even as a design pattern. With a well-designed automaton, a fast and robust system can be created. Since every automaton may be presented in a graph form, a myth about obscurity of FSA is false. Even if a graph was lost, with a little effort we could develop a software which generates such graphs or just take a sheet of paper and make a damn artwork. Because of their generic nature, bugs like not investigated cases can be discovered quickly, even without a human contribution.

I hope that I managed to refresh the FSA technique well enough to bring it back to developers. Happy graphing!

TODOs

1. Improve performance of asynchronous operations
2. Find a better name for the `MatchAutomaton`
3. As usual, track down bugs (so far so good)

History

• 31st January, 2009: First version posted
• 3rd February, 2009: Article updated

Share

 Software Developer (Junior) Poland
My name is Jacek Gajek. I have graduated in computer science with a master's degree from Polibuda in Wrocław. I like C# and Monthy Python's sense of humour.

You may also be interested in...

 Pro

 First Prev Next
 More advanced alternatives B Todd Stout3-Feb-09 2:24 B Todd Stout 3-Feb-09 2:24
 Re: More advanced alternatives gajatko3-Feb-09 8:31 gajatko 3-Feb-09 8:31
 Thanks for reading and useful info. I agree that my library is very simple and probably not applicable in a big development. Maybe somebody will find it useful or informative after all. Anyway, thanks for your post and the links provided. Greetings - Gajatko Portable.NET is part of DotGNU, a project to build a complete Free Software replacement for .NET - a system that truly belongs to the developers.
 Pipe.cs and FSA.ITableGenerator not in zip Karijn2-Feb-09 22:57 Karijn 2-Feb-09 22:57
 Re: Pipe.cs and FSA.ITableGenerator not in zip gajatko3-Feb-09 8:24 gajatko 3-Feb-09 8:24
 Boost: statechart Jon Summers2-Feb-09 22:36 Jon Summers 2-Feb-09 22:36
 Re: Boost: statechart gajatko3-Feb-09 8:22 gajatko 3-Feb-09 8:22
 Last Visit: 31-Dec-99 19:00     Last Update: 23-Nov-17 0:39 Refresh 1