Update October 3, 2014: Math Editor (the full version) has been made Free and Open Source. You can get the full source code from https://github.com/kashifimran/math-editor. I am still using the mini version because it's simpler and is easier to understand due to smaller code size. I encourage the interested users to join the project for further development.
Update April 21, 2013: Added a new section Tips for extending the sample application.
An equation or formula editor is a piece of software that helps us typeset mathematical content. In this article I will try to provide readers with a real-world application of Object Oriented design and programming techniques as we build our equation editor. So this article is a kind of double treat!
For simplicity, I am going to use a simplified version of the actual application called Math Editor. Let's call the application we are building Math Editor Mini. As I mentioned above, you can get the full source code from the project's GitHub page.
The language used is C# and we are going to use WPF as our GUI framework. However, the techniques provided are NOT dependent on the programming language or the GUI platform and can be applied in any other OO language like Java.
Before I go any further, I would like to express my gratitude to the STIX Fonts Project for creating such a great free font for mathematical & scientific typesetting.
Here is a screenshot of Math Editor Mini:
I was trying to create an equation editor from scratch in C#. As my work progressed, I realized that the application was a great example of where Object Oriented concepts like polymorphism, abstraction and code reuse helped a great deal and, best of all, fitted into the overall design of the software very naturally. Never before had I seen a better real-world model (as opposed to a fabricated example) in which to see all these concepts in action so vividly.
So I decided I must share my work with the developer community so that other people who wanted to see the OO concepts in a real world application in a most abstract and pure form could benefit from my experience.
There are numerous resources available on the Internet and in the form of text books that teach the basic principles like objects, inheritance, polymorphism, encapsulation and so on. My focus will not be on the jargon as I assume that readers of this article are already familiar with the fundamentals of the subject and have some experience programming in an object oriented language like C# or Java.
In this article I primarily intend to address the following two categories of audience:
- Programmers who know the OOP basics but are not able to apply the techniques in real world problems
- Experienced programmers who already know all the tricks but want to build an equation editor and want a starting point.
About my Approach in this Article
My approach in this article is learning through doing. This is not a tutorial on OOP. This article and the accompanying code are intended for intermediate to advanced users. My assumption is that readers are capable of understanding, using or extending the code on their own after I have presented the main model. I will, however, try to share a few ideas I find useful when creating an OO model for my projects.
Analysis of Object Oriented Design Process
The best way to learn OOP is to apply the techniques in real programs. Merely finishing the few examples given in tutorials and books is not sufficient, as those examples are usually very superficial. The biggest challenge in OO design and programming is the identification of objects and inheritance relationships, and the delegation of responsibilities to different classes. In a typical textbook example the objects are almost plainly visible and hardly any work is needed to define their roles and responsibilities. The following figure is a typical example of such a case:
As we can see, the relationship chain given in the above figure is quite natural and hardly any effort is required on our part to define it. However, we are not always fortunate enough to have that kind of natural inheritance hierarchy in real-world problems. For example, think of GDI+ or WPF. Do the inheritance chains in those platforms really represent a natural underlying system? Did they just pop up from a real existing hierarchy, or do you think some effort was needed to build them the way they are? I am sure the answer is obvious to you!
Sometimes the boundaries are so blurry that we have a hard time figuring out how to devise a convenient OO model. Even the best of the best can find themselves quite perplexed at times (ask the MFC people!). The only good strategy in such cases is to give yourself a bit of time and keep looking for the invisible objects and the links between them. You may even need to experiment with a few different models before you decide which one to pick!
Designing the OO Model for the Equation Editor
It is time to turn to the specific case we have decided to tackle and see if we can find our objects and their responsibilities and relationships. Let's first take a really close look at a couple of equations typeset using Math Editor (the official version):
Now try to answer the following questions:
- Do you see any objects that we can use in our OO model for the equation editor?
- Are the lines, letters and the brackets we see going to be our objects as they stand or will we need some other higher or lower level objects distinct from yet capable of representing these figures and letters?
- Is there any specific relationship among these objects? If yes, is that relationship suitable to be represented in an OO Model?
- Are there any common features that could be put in well defined related classes?
- Will it be possible to create a common framework to handle input and output for different kinds of equations? Or will we need a different strategy for different components?
Our goal is not to identify objects, relationships and roles for only the currently visible equation entities. We want to be forward-looking. We would like to create a robust model that is natural, flexible, extensible, maintainable, easy to understand and not too complex to implement.
Our task list consists of the following:
- Identification of a set of classes that are capable of representing all the currently supported as well as yet-to-be-supported equations.
- Building of a common model that allows us to typeset, serialize and represent equations in a uniform manner.
- Creation of a unified framework for processing input using mouse and keyboard etc.
The task list looks very short. However, if you give it some thought, you will see that it is both demanding and far-reaching. The first task asks us not only to cater for the few equations we can see in the given examples, but also to create basic support for kinds of equations we are not yet considering. The second task requires us to represent and save our model in such a way that it will be relatively easy to convert the equations to some other representation, e.g. MathML or TeX, when need be. The third task requires us to create an input mechanism that meets the basic needs of every kind of equation we will ever support.
Now that we have identified some of the most important tasks and goals, let's once again have a look at the couple of equations we saw above and try to find the answers to the questions we asked. Do you see a pattern? Can you answer some or any of the questions? If yes, you have done a great job. But the chances are you will not discover much in just a few minutes!
Please remember that when the concept is more abstract than concrete, creating a good OO model becomes relatively difficult and is almost always open to debate. There is never a mathematical proof that a particular OO model perfectly fits the given situation. It is more of an art than a science. You may need to start over again and again, and sometimes even just wait for divine inspiration! My only advice is to keep looking for what fits best and make your decisions after as much thinking as needed.
I will now try to help you find the answers we are looking for. Here is a figure in which I represent a few equations as I see them at a lower level, ready to be implemented in an OO manner:
From the above figure, we can see a pattern emerging. The following are a few interesting facts we can notice:
- Some of the entities are cyclic in nature i.e. they can host their own relatively smaller form as a container. For example, we can see a division appear inside another division and a bracket inside another bracket with no strict limit to the level of nesting.
- Equations are arranged both vertically and horizontally in a repetitive manner.
- The Unicode text either appears inside some other container or inside the top level container.
Understanding the Code
The entire core functionality resides in just 6 classes. You only need a basic understanding of these 6 core classes to be able to understand the whole picture. After understanding these classes, you should be able to modify and extend the code according to your needs and wishes. Before I give a brief description of the core classes, let's have a look at the main class hierarchy (the class names in italics represent abstract classes):
EquationBase is an abstract class. It contains the basic characteristics possessed by every equation we will ever typeset using our equation editor. This class is the ultimate base class of all the equations we are going to create. Every other class in the equation model must inherit from this class or one of its descendants.
EquationContainer is also abstract. As every equation must, it derives from EquationBase. It is the base class of every equation that wants to host other equations inside it (hence the name!). This class is the cornerstone of the equation nesting functionality we want to implement.
TextEquation allows us to add Unicode text to the document. It is the responsibility of this class to store and draw all textual content. This class is not a container itself, so it derives directly from EquationBase.
EquationRow is the primary container of all the equations that need to be arranged horizontally. It supports nesting both text equations and other container equations inside it. All the container equations inside an EquationRow must sit between TextEquation instances. This way the user can enter text and further container equations anywhere they need.
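The rule that containers must sit between text equations can be sketched as follows. This is only an illustration under my own assumptions, not the actual Math Editor Mini code: the members `Children` and `InsertContainer`, the `PlaceholderContainer` class and the caret handling are all hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-ins for the classes described in the article.
public abstract class EquationBase { }

public class TextEquation : EquationBase
{
    public string Text { get; set; } = "";
}

// Placeholder for any container equation (division, bracket, ...).
public class PlaceholderContainer : EquationBase { }

public class EquationRow : EquationBase
{
    // Assumed invariant: children alternate Text, Container, Text, ..., Text.
    // A new row therefore starts with a single empty text equation.
    public List<EquationBase> Children { get; } =
        new List<EquationBase> { new TextEquation() };

    // Splits the text equation at childIndex at the caret position and
    // places the new container between the two halves.
    public void InsertContainer(int childIndex, int caret, EquationBase container)
    {
        if (!(Children[childIndex] is TextEquation text))
            throw new ArgumentException("A container can only be inserted into a text equation.");
        if (caret < 0 || caret > text.Text.Length)
            throw new ArgumentOutOfRangeException(nameof(caret));

        var tail = new TextEquation { Text = text.Text.Substring(caret) };
        text.Text = text.Text.Substring(0, caret);
        Children.Insert(childIndex + 1, container);
        Children.Insert(childIndex + 2, tail);
    }
}
```

Because every insertion splits a text equation in two, the row always begins and ends with a text equation, so the caret has somewhere to go on either side of any container.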
Just like the lines of a paragraph or the paragraphs of a section in a word processor, equations need to be typeset in horizontal lines. RowContainer is the class that supports this kind of functionality. The only equations it ever contains are instances of the class EquationRow. All other equations are then nested inside those EquationRow instances.
As the name suggests, this class is the first object to be created by the higher level GUI container. This class then creates a single RowContainer to get the ball rolling. Every subsequent equation is then created inside that single RowContainer.
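To make the relationships between the six core classes concrete, here is a minimal skeleton. It is a sketch under assumptions rather than the real code: the root class is called `EquationRoot` here only for illustration, the members `Parent`, `ActiveChild`, `AddChild`, `ConsumeText`, `Render`, `AddRow` and `Rows` are hypothetical, and plain strings stand in for WPF drawing so the example stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public abstract class EquationBase
{
    public EquationContainer Parent { get; set; }
    // Every equation can accept typed text; what it does with it varies.
    public abstract void ConsumeText(string text);
    // Plain-text rendering, used here in place of real drawing code.
    public abstract string Render();
}

public abstract class EquationContainer : EquationBase
{
    protected readonly List<EquationBase> children = new List<EquationBase>();
    public EquationBase ActiveChild { get; protected set; }

    protected void AddChild(EquationBase child)
    {
        child.Parent = this;
        children.Add(child);
        ActiveChild = child;
    }

    // Input is forwarded down the tree until it reaches a text equation.
    public override void ConsumeText(string text) => ActiveChild.ConsumeText(text);
}

public class TextEquation : EquationBase
{
    private string content = "";
    public override void ConsumeText(string text) => content += text;
    public override string Render() => content;
}

public class EquationRow : EquationContainer
{
    public EquationRow() { AddChild(new TextEquation()); }
    // Children are laid out from left to right.
    public override string Render() => string.Concat(children.Select(c => c.Render()));
}

public class RowContainer : EquationContainer
{
    public RowContainer() { AddChild(new EquationRow()); }
    public void AddRow() => AddChild(new EquationRow());
    // Rows are stacked from top to bottom.
    public override string Render() => string.Join("\n", children.Select(c => c.Render()));
}

// Assumed name for the top-level class created by the GUI.
public class EquationRoot : EquationContainer
{
    public EquationRoot() { AddChild(new RowContainer()); }
    public RowContainer Rows => (RowContainer)ActiveChild;
    public override string Render() => ActiveChild.Render();
}
```

The GUI only ever talks to the root. A call such as `root.ConsumeText("x+1")` travels through the single RowContainer and the active EquationRow until a TextEquation finally stores the characters, which is the kind of uniform input handling the third task in our list asked for.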
Some Points of Interest
The 6 core classes discussed above construct the entire backbone of the model. From here we are ready to start implementing more useful equations like brackets, divisions, integrals and so on. However, building those kinds of equations from this point on is almost trivial. All the other container classes derive from EquationContainer and create one or more instances of
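As an illustration of how little a derived container needs to add, here is a sketch of a fraction that computes its own size from its two children. It is not the division class from the project: the name `Division`, the `Width`, `Height` and `RenderLines` members, and the character-cell measurements are assumptions made for this example, and the children are held directly instead of through the core container classes.

```csharp
using System;
using System.Collections.Generic;

public abstract class EquationBase
{
    public abstract int Width { get; }   // measured in character cells
    public abstract int Height { get; }  // measured in text lines
    public abstract string[] RenderLines();
}

public abstract class EquationContainer : EquationBase { }

public class TextEquation : EquationBase
{
    private readonly string text;
    public TextEquation(string text) { this.text = text; }
    public override int Width => text.Length;
    public override int Height => 1;
    public override string[] RenderLines() => new[] { text };
}

// A fraction: one child above the bar and one below it.
public class Division : EquationContainer
{
    private readonly EquationBase top;
    private readonly EquationBase bottom;

    public Division(EquationBase top, EquationBase bottom)
    {
        this.top = top;
        this.bottom = bottom;
    }

    // The bar overhangs the wider child by one cell on each side.
    public override int Width => Math.Max(top.Width, bottom.Width) + 2;
    // Numerator lines, one line for the bar, denominator lines.
    public override int Height => top.Height + 1 + bottom.Height;

    public override string[] RenderLines()
    {
        var lines = new List<string>();
        foreach (var line in top.RenderLines()) lines.Add(Center(line));
        lines.Add(new string('-', Width));
        foreach (var line in bottom.RenderLines()) lines.Add(Center(line));
        return lines.ToArray();
    }

    private string Center(string line)
    {
        int left = (Width - line.Length) / 2;
        return line.PadLeft(left + line.Length).PadRight(Width);
    }
}
```

Since a `Division` only relies on the `EquationBase` contract of its children, a fraction nested inside another fraction is measured and rendered with no extra code, which is the polymorphism the model is built around.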
For ease of development, I have placed all the equation classes, including the core classes, inside a folder named equations in the project directory. Apart from the core classes, all the other equations reside in their respective sub-folders.
Tips for Extending Math Editor Mini
We have seen the main model. If you are interested in a more complete equation editor using this model, you can get the source code of the actual product on its GitHub page. You may find these additional tips helpful if you want to try to create a more useful version yourself:
- Fully understand the cyclic/recursive nature of the model. Without that, you will be lost.
- If you want to persist data in disk files, consider creating a recursive serializer in which each equation calls its inner equations to serialize themselves. Get help from the current keyboard and mouse handling routines, which use this approach.
- If you want to create new kinds of equations, try to first sub-class them from the existing classes. Most of the functionality should be found inside one or the other of the existing classes. Apart from sub-classing, you can also just copy the existing code to new class hierarchies if appropriate.
- Should you want an undo/redo stack for the editor, try to only involve the core classes. This part will not be that easy due to the cyclic nature of the model. However, restricting yourself to the 6 core classes will make the job relatively easier.
- The text output mechanism used in the sample is not that sophisticated. You will probably need to create a more robust model for a bigger piece of software.
- Have a complete understanding of the fonts used for rendering the math notation (STIX fonts in this case).
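The recursive serializer suggested in the tips above could look roughly like this. It is a minimal sketch, assuming an XML format of my own invention: the `Serialize` method, the element names `Row` and `Text`, and the `Children` list are hypothetical and do not come from the sample application.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public abstract class EquationBase
{
    // Each equation knows how to turn itself into an XML element.
    public abstract XElement Serialize();
}

public class TextEquation : EquationBase
{
    public string Text { get; set; } = "";
    public override XElement Serialize() => new XElement("Text", Text);
}

public class EquationRow : EquationBase
{
    public List<EquationBase> Children { get; } = new List<EquationBase>();

    // A container serializes itself by asking each child to do the same,
    // so arbitrarily deep nesting needs no special handling.
    public override XElement Serialize() =>
        new XElement("Row", Children.Select(c => c.Serialize()));
}
```

Loading a file would follow the same recursive shape in reverse, with each container reading its own element and creating its children from the nested elements.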
Let's summarize what we tried to convey:
- Object Oriented Design process is not a strictly defined set of principles or practices. It is more of an art than science.
- The ability to find common characteristics and behavior in the entities forming a model is the key to good Object Oriented design.
- Abstraction is the cornerstone of a flexible and extensible OO model.
- Defining objects and relationships when the actual model to be represented is rather abstract is more challenging.
- OOP rocks!
Please let me know if you find anything missing or unclear. Your feedback will definitely help me make my work better and more useful.