The quality of software is assessed by a number of variables. These variables can be divided into external and internal quality criteria. External quality is what a user experiences when running the software in its operational mode. Internal quality refers to aspects that are code-dependent, and that are not visible to the end-user. External quality is critical to the user, while internal quality is meaningful to the developer only.
Some quality criteria are objective, and can be measured accordingly. Some quality criteria are subjective, and are therefore captured with more arbitrary measurements.
The table below lists the most obvious software quality criteria, as well as some lesser known.
By definition, the internal quality (code characteristics) is a concern to the developer only, while all the external quality aspects (coming from using the software) are critical to the end user. However the developer also has interest in performance (speed, space, network usage) and determinism, because they make testing the software easier. Developers treat ease-of-use, back-compatibility, security, and power consumption as requirements.
It is important to consider how difficult it is to measure each of these criteria. It can be difficult because there is no simple variable to look at, or because the measurement process is costly, or because it requires a complex infrastructure. For instance, speed has an objective measurement that is easy to measure. Power consumption has a simple measurement (how many µW the application consumes), but it is complex to measure. Security is difficult and costly to estimate.
Features. This is the very reason for the software to be written: to provide a service. By features, we really mean the output produced by the software – e.g., a numerical result, a string, a screen shot, a web page, an audio, etc., regardless of the performances (speed, memory).
Speed. How quickly does the application provide the service? The user experiences the actual time elapsed between the moment she/he requests the service, and the moment the service is delivered. The real elapsed time, or wall time, is the sum of the CPU time, system time, and network latency. Thus the developer should not only focus on the CPU time (how much time the CPU actually spends on executing the program). The CPU time can easily be overshadowed by disk access (a write on the disk is very costly), swapping (due to an excessive virtual memory size), or time spent by the network (latency issue, or too many round trips).
Space. How much RAM and disk space is taken by the application? The aggregate numbers are important –peak memory, virtual memory size, etc. But even more so, how often do we move data that triggers a CPU cache miss or a disk write, has a dominant impact on the speed of the application. A mediocre data design can lead to very poor performances.
Network usage. It is a matter of bandwidth and latency. Mismanaging sockets and channels can lead to unnecessary extra time spent in opening and closing sockets, handshakes, and round trips. As for memory, caching techniques can be used to reduce consuming network resources.
Stability. How often does one need to patch the software to fix problems? For the user, this is an inconvenience. For the developer, it means that the code is fragile and might benefit from better testing or partial rewrite.
Robustness. How often does the application stale, freeze, or crash? How tolerant is it to extreme conditions –limited CPU and memory/disk/network resources, corner cases, system failure or unresponsive 3rd-party resources? This aspect is strongly related to testability and coverage.
Ease-of-use. It can be a very subjective factor, hard to quantify. It includes user documentation, clarity of the error message, management of exceptions, and recovery after failure.
Determinism. Also known as repeatability: does the program produce the very same result given the same input? There are many reasons for which a program can exhibit a non-repeatable behavior. A non-repeatable behavior is confusing and frustrating for the user. This also makes the program very difficult to test and debug. Repeatability is strongly dependent on a good data model design.
Back-compatibility. Can a new version of the application be used with an older version’s data? It is essential to the user, because a new version should not require a costly migration of the existing data.
Security. Who is authorized to access the data? Can the data processed by the application be compromised? This is a crucial aspect of many applications, and it is getting more and more difficult to assess with the dissemination of mobile and web-based software.
Power consumption. It is increasingly important with mobile applications, as a program may have to consider how it manages the device’s power producers and consumers (battery, cores, wireless, screen, audio), and not to rely entirely on the operating system.
Test coverage. What is the proportion of code that is executed by some unit or regression test? This is measured by the number of lines, number of functions, and number of control branches that are exercised by the tests. Usually, one expects coverage of at least 85% for any moderately complex application. In practice, reaching high coverage can be achieved only if testability is high, which has deep implication on the architecture and development methodology.
Testability. An often overlooked or simply ignored aspect of code development, testability is the ability to trigger any specific line of code or branching condition. Highly testable code requires a discipline of architecture and development that is difficult to find. It is very costly to fix poorly testable software, as this requires major redesign. This justifies major investment in software architecture, design, and development methodologies.
Portability. Can the application run on 32 and 64 bits machines? Should it run on a mobile phone? Does it run on multiple OS (e.g., Windows, Linux, Mac OS-X, Solaris, iOS, Android, RIM)? Does it run smoothly on all web browsers (Internet Explorer, Firefox, Chrome, Safari, Opera)?
Thread-safeness. Is a specific component thread-safe? Can two threads collide on non-atomic operations? Can the application get into a deadlock? As concurrency is still mostly the result of a manual process (there no compiler that automatically parallelizes the code), these questions are critical to ensure the good functioning of a program, as well as its performance –it is not rare to see the a program running slower when two many threads are available, as the cost of synchronization can become dominant.
Conciseness. Also known as compactness. Is there any dead code, or duplicated code? Is the code shared and factorized properly? A compact code usually means faster compilation and smaller binary size. Also compactness naturally leads to fewer bugs, because the number of bugs is historically constant w.r.t. code size.
Maintainability. How easy it is to debug the code? How fast is it to provide a fix? How quickly can a new developer understand the code? Maintainability is a very important aspect, quite difficult to quantify. Maintainability is increased with good testability and flexible (abstract) design.
Documentation. This is a pretty subjective topic. Some people claim that a separate documentation written in plain English is necessary. Some others state that at least 30% of the code should be comments. Some finally argue that the code itself is the best documentation –the names of the types, classes, functions and arguments, together with plenty of assertions.
Legibility. Also known as readability. This is another subjective topic. It is about how easy it is to read the code. Guidelines are established to unify the style of the code, so that a developer can easily read code written by another developer. Code guidelines abound, and they go from a small set of directives, to a full set of rules that specify every syntactical aspect of the language. For example, see Industrial Strength C++, High Integrity C++, Google C++ Style Guide, and many more.
Scalability. How easy it is to extend a feature? Or to add a new one? Or to add extra cores, or increase the size of the cluster the application runs on? Again, this is all about software architecture and anticipating future needs.
Software quality is the result of the user experience. But software quality should not and cannot be a reactive action to external defects. Software quality is built from the ground up, with design and development methodologies, and with a special focus to testability, coverage, and flexibility.
- Test-driven design, a methodology for low-defect software
- Software outsourcing, a necessary evil