The work of the software engineer is a never-ending effort to solve complex logical problems with ever-changing tools and technologies. We spend a great deal of time learning trending technologies and keeping up with new frameworks and methodologies. Yet we frequently neglect to develop the core skill of our profession: the ability to think critically and creatively about problems and their solutions.
In 1945, the Hungarian mathematician George Pólya published “How to Solve It”, a unique and insightful book on heuristic, the art of reasoning about a problem. While mostly focused on mathematics, many of the guidelines provided in this small volume are generic in nature and can be applied to any logical problem. What follows is my attempt to ‘flavor’ these key problem-solving concepts based on my experience as a developer.
There are four main phases that are clearly identifiable in the process of solving problems:
Understanding the problem, Devising the plan, Carrying out the plan and Retrospection.
Understanding The Problem
It may seem obvious that in order to be able to solve a problem, we have to first understand it. Nothing is farther from reality in the IT business. It is not uncommon in my profession to see entire applications and architectures flawed and crippled by initial misunderstandings of a problem or requirement. While spending time to deeply understand what we have to build may not sound like the most ‘agile’ thing to do, the price to pay for a faulty start could be quite high.
We usually start learning about the problem when analyzing software requirements that explain how things should work from the user’s perspective.
a. Understand the statement of the problem
This is a sanity check to make sure that the software specifications are correctly stated:
- Are the specifications precise enough to be coded and tested?
- Is there a clear relationship between input and output?
- Are all the use-cases covered?
- Can we see potential contradictions/collisions of constraints or goals?
- Are there arbitrary concepts or sentences subject to interpretation? Are the terms of the requirements measurable? Can we see fuzzy words like ‘probably’, ‘usually’, ‘fast’, ‘many’, ‘almost’, etc.?
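The last check in the list above can be partly automated. The following is only a minimal sketch: the word list is a hypothetical starter set taken from the examples above, and `find_fuzzy_words` is an illustrative name, not an established tool.

```python
import re

# Hypothetical starter list, taken from the examples above; extend it per project.
FUZZY_WORDS = {"probably", "usually", "fast", "many", "almost"}


def find_fuzzy_words(requirement: str) -> list[str]:
    """Return the fuzzy words found in a requirement, in order of appearance."""
    words = re.findall(r"[a-z]+", requirement.lower())
    return [word for word in words if word in FUZZY_WORDS]


print(find_fuzzy_words("The search should usually return results fast."))
# → ['usually', 'fast']
```

A hit does not prove that a requirement is wrong; it only flags a sentence that deserves a follow-up question, such as "how fast, measured how?".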
b. Understand the goals
Here we question the goals and outputs of the project to ensure that they are sound, but also to reveal and make explicit all the problems that we may have to face while undertaking the project.
- Which goals are mandatory and which ones are desirable?
- Which ones are mission-critical and which ones are ancillary from the business standpoint?
- What are the most challenging goals and what makes them difficult to achieve?
- How can we measure if the goals have been successfully reached?
- What are the risks that can jeopardize the goals? What would be the impact of a defect or a downtime?
- How are the goals of the software expected to expand or change over time?
- Which external factors or dependencies can prevent the achievement of the goals?
c. Understand the data
Here, we question the inputs of the software and all the data provided upfront or expected to be available for the software to be properly functional.
- Is there any unused input/data provided? If so, why is it provided? Can some data be derived, calculated or inferred?
- What assumptions can be safely made about the correctness and integrity of the input?
- What is the level of concurrency in accessing or changing the data? What are the boundaries and volume of the transactions?
- Are the relationships between data elements easy to use? Is the format of the data convenient for processing?
- What is the value of the data? What are the related security threats to be expected?
- Will the data be available during development? Can it be simulated in a realistic way?
- How is the size of the data expected to grow over time? Can the old data be purged?
- What happens if the data is unavailable for a time, due to maintenance or emergency situations?
- Can the data be distributed or centralized? If not, why not?
d. Understand the conditions
Here we question all the assumptions, constraints and conditions specified for the software, such as validations, business rules, quality, usability, security and performance requirements, etc.
- Are the constraints realistic? Why are they all necessary? Are they hiding other problems?
- Can additional constraints be derived from the existing ones (dependencies between functionalities, external dependencies, unavoidable sequentiality of steps, etc.)?
- Are there conditions based on wrong or unverified assumptions (e.g., the customer may think that a certain feature would be easier or cheaper to build if some limitations were added)?
- Can more constraints be added, even if not necessary, to simplify some scenarios?
- Can some constraints be removed by modifying processes or workflows?
- Could there be unnecessary, self-imposed conditions used implicitly or explicitly and perhaps based on a fallacious mental model of a process or functionality?
e. Build a Model
Geometric problems are easier to understand and reason about if we can visualize them in a drawing or a 3-D model. Software functionalities are also easier to understand and reason about if we build simplified models, wireframes or prototypes that help us visualize the relevant aspects. This can give us confidence in the most challenging or critical tasks and an advantageous intimacy with the problems that we will have to solve with much more hard work in the real solution. Models are a great help when reasoning about problems and solutions within the technical team or with all the stakeholders.
Devising the Plan
Once the problem has been properly understood, we enter the core phase of problem solving: planning. This is the phase where we evaluate and devise the different solution strategies; here comes the time to brainstorm and breed the ideas that will allow us to produce quality software and achieve the project goals.
In this phase, Pólya reminds us that there is no infallible methodology to solve a problem, by stating the following Rules of Discovery:
- Have brains and good luck
- Sit tight and wait until you get a bright idea
While there are no mechanical rules to solve problems, Pólya also observes that there are heuristic procedures, mental operations, stereotyped questions and suggestions that can hint at solutions to intelligent people. In this article, I will examine a non-exhaustive list of four strategies: Analogy, Decomposing and Recombining, Variation of the Problem, and Working Backwards. While these strategies will be presented individually for the sake of clarity, in real scenarios they are likely to be combined to derive a solution.
Analogy
Analogy is a strategy of using the knowledge from previously solved problems that are closely related to the one at hand or at least share some commonalities. In order to be able to use the known solution to a related problem, it is frequently necessary to introduce auxiliary elements that can adapt such a solution to our goals and needs.
Do you know a related problem?
Can you think of a familiar problem having the same or similar solution? How can you use it?
In Software Engineering…
Before tackling a complex problem, a good software engineer should spend some time researching well-known solutions to well-known problems that fall under the same category. We would likely find books, blogs and articles discussing different ideas and approaches, code snippets, open source projects, commercial components, etc. Even if our problem is such that we cannot entirely use any of the solutions that we find, we may still be able to adapt some algorithms or pieces of code to serve our needs. If nothing else, we would at least acquire more knowledge of the problem and have a term of comparison for our design choices.
Decomposing and Recombining
To solve a complex problem, we may try to decompose it into other problems that are easier to solve and that can be used as stepping stones to reach our original goals.
Could you solve a part of the problem? Can you separate the various parts of the conditions?
In Software Engineering…
Software engineers are grand masters of this mental operation and they perform it all the time. We break complex applications into small, focused components that we then aggregate and wire up to form an organic complex solution. Object Oriented Analysis, Functional Decomposition and design patterns (e.g., MVC, MVVM, etc.) are all examples of decomposing problems and recombining solutions.
Most of our design principles and procedures can be seen as practices to decompose the complexity of problems: separation of concerns, separation of state and behavior (functional programming), dependency inversion, Law of Demeter, etc.
Decomposing and recombining is useful not only to tame complexity but also to facilitate reusability, by taking components used in a previous solution and aggregating them differently to solve different problems, much like Lego pieces. Other notable examples are the map-reduce big data pattern, which is a decomposition for parallel computation; the decomposition for testability, which is encouraged by TDD/BDD; and also the decomposition of the SDLC itself into small iterations, to reduce project risks.
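To make the map-reduce idea concrete, here is a minimal single-process sketch of the pattern (not a distributed implementation). The word-count task and the names `count_words`, `merge` and `word_count` are illustrative assumptions.

```python
from collections import Counter
from functools import reduce


def count_words(chunk: str) -> Counter:
    """Map step: solve the small problem for one independent chunk of text."""
    return Counter(chunk.lower().split())


def merge(left: Counter, right: Counter) -> Counter:
    """Reduce step: recombine two partial solutions into one."""
    return left + right


def word_count(chunks: list[str]) -> Counter:
    # Each chunk is processed independently, so this map could run in parallel.
    partials = map(count_words, chunks)
    return reduce(merge, partials, Counter())


total = word_count(["to be or not", "to be"])
print(total["to"], total["or"])  # → 2 1
```

Because the map step never looks at more than one chunk, the same two functions could be handed to a pool of workers without changing their logic, which is the point of the decomposition.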
Variation of the Problem
When a problem seems too complex to be solved, we instead solve an auxiliary problem derived from the original through different types of alterations. The purpose of the alterations is to obtain a different problem, simpler or more familiar, in the hope that its solution may help us with the original problem, or at least give some useful insights.
Can you think of more accessible goals? Can you make the goals more accessible by altering the input or the constraints?
Here are some typical types of problem variation:
- Auxiliary Elements
When we do not know how to solve a certain problem, we can always try to solve a related one. Adding auxiliary elements can help bridge the solution of the related problem to the solution of the more difficult problem.
Can we introduce auxiliary elements to make use of a known solution or a more accessible problem?
In Software Engineering…
Inheritance is a common example of how we can extend a basic functionality to perform more specific or complex tasks. Applying design patterns such as proxy or decorator is another example of achieving a complex solution by adding elements on top of a simpler one. For instance, we can obtain a secure functionality by implementing the plain functionality (simpler problem) and then wrapping it with a data encryption decorator (auxiliary element). Adapters can also be seen as auxiliary elements that let us fit other components into the design of our solution.
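A minimal sketch of the decorator idea follows. All class and function names are hypothetical, and Base64 is used only as a visible placeholder for the transformation: it is an encoding, not encryption, and a real cipher would take its place in practice.

```python
import base64
from typing import Callable, Protocol


class MessageStore(Protocol):
    def save(self, text: str) -> None: ...
    def load(self) -> str: ...


class PlainStore:
    """The simpler problem: store and return text as-is."""

    def __init__(self) -> None:
        self._data = ""

    def save(self, text: str) -> None:
        self._data = text

    def load(self) -> str:
        return self._data


class TransformingStore:
    """Decorator (auxiliary element): transforms data on the way in and out."""

    def __init__(self, inner: MessageStore,
                 encode: Callable[[str], str],
                 decode: Callable[[str], str]) -> None:
        self._inner = inner
        self._encode = encode
        self._decode = decode

    def save(self, text: str) -> None:
        self._inner.save(self._encode(text))

    def load(self) -> str:
        return self._decode(self._inner.load())


# Placeholder codec: Base64 is NOT encryption; swap in a real cipher in practice.
def b64_encode(text: str) -> str:
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def b64_decode(text: str) -> str:
    return base64.b64decode(text.encode("ascii")).decode("utf-8")


plain = PlainStore()
secure = TransformingStore(plain, b64_encode, b64_decode)
secure.save("hello")
print(plain.load())   # → aGVsbG8=
print(secure.load())  # → hello
```

The plain store knows nothing about the transformation, so it can be written and tested first; the wrapper is then added without touching it.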
- Removing the Obstacle
When we are stuck on a difficult obstacle that is slowing down our progress, we may pretend to make the obstacle disappear with an imaginary magic wand. Great problem solvers use this mental operation to throw their mind over the obstacle, to explore what would happen next if the impediments were suddenly resolved. Mentally removing an obstacle may force our brain to step back and look at the impediments in the light of a broader context.
What would you do if the obstacle was not there?
In Software Engineering…
Developers easily become obsessed with impediments, such as reaching a high level of optimization, without considering what they are really gaining (or losing) as a result of their effort. If we are asked to build easy-to-carry luggage, we may struggle and spend all our time and energy trying to obtain light and durable construction materials. But if we pretend for a moment that we already have the lightest and most durable material, then we stop being obsessed with this aspect and we can perhaps see that the weight of the luggage will always be at least the weight of its content. We can finally move on and consider alternatives, such as putting wheels underneath the luggage.
- Generalization
The more ambitious plan may have more chances of success. This idea is also known as the Inventor’s Paradox. With generalization, instead of solving a specific problem, we solve a broader and more generic one whose solution includes the solution of the specific problem that we need to solve. Mathematical induction is a popular example of solving a problem by generalizing it.
Is the complexity of the problem at hand caused by its excessive specificity?
In Software Engineering…
Generic solutions can be, in some cases, much simpler. Such is the case when specificity leads to nested, complicated conditional logic and flaky code that needs to be frequently and significantly altered to accommodate new business rules, with an elevated risk of breaking something.
For example, let us consider a validation engine whose implementation is based upon hundreds of configuration parameters. Each parameter may have different possible values expressing what kind of validation rule should be applied or skipped. Moreover, some dependencies between rules are identified: if a validation rule A is skipped, then also another rule B should be skipped, etc. If we try to build a prototype of a specific solution, implementing only a small subset of configuration rules, we would likely be horrified by the jungle of switches and if statements that we will have to write. We may then use an analogy with software firewalls that can validate multiple complex conditions by decomposing the problem into prioritized cascading rules that can be dynamically defined (instead of having all the possible variations hardcoded with static configuration values). The complexity is reduced by implementing a generic validation engine.
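The firewall analogy can be sketched as a small first-match rule engine. This is only an illustration under assumptions: the `Rule` structure, the "accept"/"reject" verdicts and the three sample rules are hypothetical, not taken from a real system.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict


@dataclass(frozen=True)
class Rule:
    priority: int                      # lower numbers are evaluated first
    name: str
    matches: Callable[[Record], bool]
    verdict: str                       # "accept" or "reject"


def evaluate(rules: list[Rule], record: Record,
             default: str = "accept") -> tuple[str, str]:
    """Return (verdict, rule name) of the first matching rule, firewall-style."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(record):
            return rule.verdict, rule.name
    return default, "default"


# Hypothetical rules; in a real engine they could be loaded from configuration.
rules = [
    Rule(10, "trusted-partner", lambda r: r.get("partner") is True, "accept"),
    Rule(20, "missing-email", lambda r: not r.get("email"), "reject"),
    Rule(30, "negative-amount", lambda r: r.get("amount", 0) < 0, "reject"),
]

print(evaluate(rules, {"partner": True}))
# → ('accept', 'trusted-partner')
print(evaluate(rules, {"email": "a@b.c", "amount": -5}))
# → ('reject', 'negative-amount')
```

In this sketch, a dependency such as "if rule A is skipped, skip rule B" is expressed through ordering: an early accepting rule short-circuits every rule after it, so no nested conditionals are needed.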
- Specialization
Specialization is the opposite of generalization, and it can be very useful for exploratory purposes. When the complexity of a problem is caused by its generality and the high number of variables, we may decide to try to solve special, extreme cases. We do this to explore the boundaries of the problem in the hope of obtaining more knowledge or some hints towards a solution. If lucky, we may reuse the solution of the special problem, or at least find out whether we are going in the wrong direction.
Can you imagine a more accessible related problem? A special problem?
Can you use it? Could you use its result, perhaps by introducing auxiliary elements?
In Software Engineering…
QA and testing engineers habitually consider special inputs that can break code or cause incorrect results. Stress testing is also an example of building extreme cases that can reveal performance bottlenecks and weaknesses of an application. When designing software, it is always useful to think about malicious special cases, like security vulnerabilities or potential issues caused by race conditions while accessing shared resources. Generally speaking, every good software engineer should probe the code by considering all the things that can potentially go wrong, checking for unwanted side effects, causes of invalid states, etc.
Working Backwards
When we have no clue about how to reach a solution from the given data and conditions, we can try examining the last point that we have reached in the analysis and retrace our steps backward until we discover a path between the data and the goal. Finding a solution in reverse is not intuitive and presents some psychological difficulties, since we devise steps that take us away from our goals (the starting point), instead of moving us towards them.
What characteristics can you see in the goals? How can they be derived?
Can you think of a related solution that can achieve the same goals using different data and conditions?
In Software Engineering…
There are many complex problems that have a crystal clear input and a crystal clear goal but do not have any obvious deterministic solution. Expert systems solve problems by emulating the judgment of a subject-matter expert (a human being) in different situations. Similarly, the design of genetic algorithms starts from the end result to determine which fitness function to use. For this broad category of engineering problems, thinking backwards is a regular practice and sometimes the only option. Consider this problem: given any home page (HTML file) of a company website, find the company logo image. Without assumptions about the input, we may not have any straightforward solution strategy, so we start thinking from the end. By examining many company logos identified manually (by a live person), we can collect many useful measurable properties: geometrical attributes (position, size and proportions), markup properties (names and attributes of the image), graphic properties (format, file size, number of colors), etc. Using quantitative analysis, we can build the steps backward to identify the set of rules that are effective in deriving the known result. Finally, we plan the first steps to extract the needed properties from the HTML.
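The end of that backward path might look like the sketch below: a scoring function over the measurable properties. Everything here is an assumption made for illustration. The `ImageCandidate` fields, the thresholds and the weights are hypothetical; in practice they would come out of the quantitative analysis of manually identified logos, and the candidates would first have to be extracted from the HTML.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageCandidate:
    src: str
    alt: str
    top: int      # vertical position, in pixels from the top of the page
    width: int
    height: int


def logo_score(img: ImageCandidate) -> float:
    """Score a candidate with hand-picked, purely illustrative weights."""
    score = 0.0
    if "logo" in img.src.lower() or "logo" in img.alt.lower():
        score += 3.0   # markup property
    if img.top < 200:
        score += 2.0   # geometrical property: close to the top of the page
    if img.height > 0 and 1.0 <= img.width / img.height <= 5.0:
        score += 1.0   # geometrical property: plausible proportions
    return score


def find_logo(candidates: list[ImageCandidate]) -> Optional[ImageCandidate]:
    """Return the best-scoring candidate, or None if there are no images."""
    return max(candidates, key=logo_score, default=None)


candidates = [
    ImageCandidate("/img/banner.jpg", "Summer sale", 600, 1200, 300),
    ImageCandidate("/img/acme-logo.png", "Acme", 20, 180, 60),
]
best = find_logo(candidates)
print(best.src if best else None)  # → /img/acme-logo.png
```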
Some Additional Thoughts on Perspective
The brightest ideas to solve problems usually come by looking at them from the right perspective. The Ptolemaic model describes the orbits of the planets through complex equations and artificial constructions (epicycles). The Copernican model describes the very same orbits with outstanding simplicity by shifting the point of view from Earth to the Sun. The best perspective is frequently the most natural (closest to reality) and it is also the one that greatly simplifies the way we think of a problem.
Carrying out the Plan
Devising a plan requires analytic skills, good ideas and heuristic reasoning. The plan is what Pólya calls the “scaffolding of the bridge” that we need to build to solve complex problems. Scaffolding is essential but temporary in nature, and all the intuitions, assumptions and plausible arguments that we used in our plan now need to be slowly replaced by solid working software.
Carrying out the plan is a work of synthesis, rigorous and scrupulous execution. We need to painstakingly verify and prove each step without losing sight of the connections and relationships between all the steps.
a. Top-Down Execution
Nobody knows better than a software engineer that the devil is in the details.
A top-down order is very relevant when digging into the details of the solution. Before tackling the minor aspects, we first need to work out the major ones to make sure that they are sound. Starting to code immediately is tempting but also risky when the big picture is fuzzy; therefore, we should first resolve all the important doubts and verify the major assumptions that can significantly affect the outcome of our work. Coding the details can be extremely time-consuming; it would be costly to find out later that our magnificent code implements a wrong or unwanted functionality.
b. Importance and Challenge
Risk is another essential factor in determining the execution order. Some tasks are more important than others in the big picture. Some tasks also present a higher challenge than others. If a step is both important and challenging at the same time, then we should make an effort to prioritize it due to the great impact that it may have on the overall plan.
c. Breaking Dependencies
Steps may naturally depend upon each other. Each software component usually relies on others to achieve its goals. In software engineering, it is sometimes possible and convenient to cheat what seems to be a natural execution order by creating mock or fake dependencies. These allow us to skip the less relevant details (which can be addressed later) and focus on the high-priority tasks that give us the highest level of confidence in the whole solution. This strategy is also useful for organizational purposes (e.g., coding in parallel modules that depend upon each other).
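As a minimal sketch of a fake dependency, assume a hypothetical invoicing task that needs exchange rates from a service that has not been built yet. `RateProvider`, `FakeRateProvider` and `invoice_total` are illustrative names, not part of any real library.

```python
from typing import Protocol


class RateProvider(Protocol):
    """The contract that the real (not yet written) service will fulfil."""

    def rate(self, currency: str) -> float: ...


class FakeRateProvider:
    """Stand-in for the missing service: returns fixed, hand-picked rates."""

    def __init__(self, rates: dict[str, float]) -> None:
        self._rates = rates

    def rate(self, currency: str) -> float:
        return self._rates[currency]


def invoice_total(amounts: list[float], currency: str,
                  provider: RateProvider) -> float:
    """High-priority logic, developed before the real provider exists."""
    return round(sum(amounts) * provider.rate(currency), 2)


fake = FakeRateProvider({"EUR": 2.0})
print(invoice_total([10.0, 5.5], "EUR", fake))  # → 31.0
```

Since `invoice_total` depends only on the contract, a second developer can implement the real provider in parallel and plug it in later.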
d. Consolidating the Efforts
Complex software solutions can be carried out by many developers and many teams, possibly spread out across different geographical locations. This scenario poses the risk of several people independently solving similar problems, thereby multiplying the effort.
To consolidate the execution efforts, the following steps can be taken:
- Architects/Technical leads should make each developer aware of the context of his/her work within the high level solution. Typically, this is done through kick off presentations and providing solution background documents.
- The main structure of the source code should reserve specific physical locations to store components that can have a broader utilization other than the specific task for which they were created.
- Common components should be documented and submitted to code review sessions. When ready, they should be advertised to all the developers to make them aware of their existence.
e. Bottom-Up Testing
The mathematician demonstrates a theorem by formally proving each and every step, from the hypotheses to the thesis. Likewise, the engineer proves that a software solution works by writing formal tests. Testing is usually a bottom-up process that starts with unit tests and then moves up to functional tests for modules and integration tests, all the way up to the whole solution. The advantage of bottom-up testing is that if a low-level test fails, we can immediately pinpoint the defect; on the other hand, if a high-level test fails, we can concentrate on finding defects in the wiring and interactions between major components. Low-level tests should in fact be created as soon as possible (by the developers) to avoid the time-consuming and expensive bug-fixing that is typically associated with logical defects. Written tests are usually automated to ensure correctness against future changes (regression testing).
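The bottom-up order can be illustrated with a small sketch. The pricing functions are hypothetical, and the tests are plain assertion functions run in a loop, so the example does not assume any particular test framework.

```python
def parse_price(text: str) -> float:
    """Low-level component: turn a string such as ' $20.00 ' into a number."""
    return float(text.strip().lstrip("$"))


def apply_discount(price: float, percent: float) -> float:
    """Low-level component: apply a percentage discount."""
    return round(price * (1 - percent / 100), 2)


def discounted_price(text: str, percent: float) -> float:
    """Higher-level component: wires the two low-level ones together."""
    return apply_discount(parse_price(text), percent)


# Unit tests: a failure here pinpoints a single component.
def test_parse_price() -> None:
    assert parse_price(" $20.00 ") == 20.0


def test_apply_discount() -> None:
    assert apply_discount(20.0, 25) == 15.0


# Integration test: if only this one fails, the defect is in the wiring.
def test_discounted_price() -> None:
    assert discounted_price("$20.00", 25) == 15.0


# Run from the bottom up: low-level tests first, then the integration test.
for test in (test_parse_price, test_apply_discount, test_discounted_price):
    test()
```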
f. Pedantry and Mastery
Pólya described two opposite attitudes towards rules that apply quite well in the context of modern software design and development methodologies.
The pedantic software engineer relies conscientiously and indiscriminately on a limited, well-known set of tools, patterns and practices that are proven to be successful in most cases. This type of engineer applies standards strictly and follows a methodology verbatim.
The master engineer instead focuses on the purposes of patterns and methodologies, seizing the opportunities and judging case by case the tools that best fit each situation.
Retrospection
Pólya teaches us that complex problems are never completely exhausted. The final phase of problem solving is looking back at our completed solution to expand its potential and consolidate our knowledge.
In software engineering, this process usually starts with code reviews, agile retrospectives and postmortem meetings. Following are some of the major goals that can be achieved with retrospection:
a. Clean Up
To sharpen our solution, we need to remove duplication, redundancy, and code verbosity. Simplicity and clarity are also important goals of the clean-up phase: removing dead code and unnecessary steps, replacing convoluted algorithms with equivalent but more straightforward ones, selecting more meaningful names for classes, modules, etc. This helps to make the solution more intuitive and easy to grasp at a glance.
b. Maintenance and Scalability
Good software solutions need to be able to cope with future changes. Therefore, we need to verify that the impact of reasonably expected maintenance conforms to the initial expectations. Retrospection is also the right phase to uncover performance bottlenecks that may affect our scalability plans.
c. Reusability
The main objective of modular software is to be able to reuse its components as much as possible in different contexts. Retrospection can reveal opportunities to generalize or adapt pieces of the solution so that they can be employed in other projects that perform similar tasks.
d. Compare and Improve
Authoring a solution gives us a point of view on the problem that we solved.
Our point of view may not necessarily be the best; nevertheless, it is always a valuable term of comparison with other solutions. Surveying the solution can consolidate our knowledge of a business domain and identify which areas of our solution can be further improved.
Writing an informal document can be an exceptionally useful way to record a high-level description of the strategies adopted, the strong points, the identified limitations, and any interesting idea or suggestion that emerges during the retrospection.