Complexity is the quality of consisting of many interrelated parts. When software consists of many interrelated parts, it becomes more difficult to reason about. Software that is difficult to reason about is a more fertile breeding ground for bugs than software that is simple.
Every problem space contains some level of inherent complexity, which is shared by all possible solutions. However, as programmers, we can reduce the complexity of our chosen solutions by limiting the interrelatedness of their constituent components. This is commonly referred to as favouring cohesion over coupling, and forms the bedrock on which axioms such as the single responsibility principle are built.
In codebases that are large and/or unfamiliar, it can be difficult to know whether regions of complexity exist and where they might be. By defining metrics of complexity, the search for offending components can be automated and brought into the existing build process alongside other forms of static analysis and unit tests. Although the metrics themselves are far from perfect, they can be useful in helping to identify areas of code that warrant closer inspection. They can also be tracked over time, as an indicator of the direction that overall code quality may be moving in.
The metrics reported by this site are generated by a command-line tool, available from npm under the package name complexity-report, which can be used in this way on JavaScript projects. It currently reports four types of complexity metric: lines of code, cyclomatic complexity, Halstead complexity measures and the maintainability index.
Lines of code can be counted either physically (a count of the actual lines in the file) or logically (a count of the imperative statements). The physical count is widely considered to be the less useful metric because it is easily subverted by collecting multiple statements on a single line of code. However, the logical count can be similarly flawed, since the tersest expression of a solution is not necessarily the optimal one.
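The difference between the two counts can be illustrated with a rough sketch. The function names `physicalLoc` and `roughLogicalLoc` are hypothetical, blank lines are assumed not to count towards the physical total, and counting semicolons is only a crude stand-in for counting statements; a real tool would parse the source into a syntax tree instead.

```javascript
// Physical LOC: the number of non-blank lines in the source text.
// (Ignoring blank lines is an assumption made for this sketch.)
function physicalLoc(source) {
  return source.split('\n').filter((line) => line.trim() !== '').length;
}

// Crude approximation of logical LOC: count semicolons.
// This assumes every statement ends in a semicolon and that no
// semicolons appear in strings, comments or for-loop headers.
function roughLogicalLoc(source) {
  return (source.match(/;/g) || []).length;
}

const spread = 'let a = 1;\nlet b = 2;\nlet c = a + b;\n';
const packed = 'let a = 1; let b = 2; let c = a + b;\n';

console.log(physicalLoc(spread), roughLogicalLoc(spread)); // 3 3
console.log(physicalLoc(packed), roughLogicalLoc(packed)); // 1 3
```

Packing the statements onto one line reduces the physical count from three to one, while the approximate logical count is unchanged.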
Cyclomatic complexity, created by Thomas J. McCabe in 1976, counts the number of linearly independent paths through a block of code. It takes its name from counting the number of cycles in the program flow control graph. Lower values are better; McCabe suggested using ten as a threshold value, beyond which modules should be split into smaller units.
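As a minimal sketch, McCabe's measure can be computed from a control flow graph as M = E - N + 2P, where E is the number of edges, N the number of nodes and P the number of connected components. The function name `cyclomaticComplexity` and the hand-built graphs are hypothetical; a real tool would derive the graph, or an equivalent count of decision points, from the parsed source.

```javascript
// Cyclomatic complexity of a control flow graph: M = E - N + 2P.
// `components` is P, which is 1 when analysing a single function.
function cyclomaticComplexity(nodes, edges, components = 1) {
  return edges.length - nodes.length + 2 * components;
}

// Hand-built graph for: if (x) { a(); } else { b(); }
const branchNodes = ['condition', 'then', 'else', 'exit'];
const branchEdges = [
  ['condition', 'then'],
  ['condition', 'else'],
  ['then', 'exit'],
  ['else', 'exit'],
];

// Hand-built graph for straight-line code with no branching.
const linearNodes = ['entry', 'exit'];
const linearEdges = [['entry', 'exit']];

console.log(cyclomaticComplexity(branchNodes, branchEdges)); // 2
console.log(cyclomaticComplexity(linearNodes, linearEdges)); // 1
```

Each additional decision point adds one more edge than it adds nodes, which is why the value rises by one per branch.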
In 1977, Maurice Halstead developed a set of metrics which are calculated based on the number of distinct operators, the number of distinct operands, the total number of operators and the total number of operands in each function. This site picks out three Halstead measures in particular: difficulty, volume and effort.
Difficulty =
(# distinct operators / 2) *
(# total operands / # distinct operands)
Volume =
(# total operators + # total operands) *
log2(# distinct operators + # distinct operands)
Effort =
difficulty * volume
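The three calculations can be sketched as a single function over the four basic counts. The function name `halstead`, its parameter names and the sample counts are hypothetical and chosen only to keep the arithmetic easy to follow; obtaining the counts from real source code would require a parser.

```javascript
// Halstead difficulty, volume and effort from the four basic counts.
function halstead({ distinctOperators, distinctOperands, totalOperators, totalOperands }) {
  const difficulty = (distinctOperators / 2) * (totalOperands / distinctOperands);
  const volume =
    (totalOperators + totalOperands) *
    Math.log2(distinctOperators + distinctOperands);
  return { difficulty, volume, effort: difficulty * volume };
}

// Made-up counts for illustration only.
const measures = halstead({
  distinctOperators: 4,
  distinctOperands: 4,
  totalOperators: 8,
  totalOperands: 8,
});

console.log(measures.difficulty); // 4
console.log(measures.volume); // 48
console.log(measures.effort); // 192
```

Here difficulty is (4 / 2) * (8 / 4) = 4, volume is 16 * log2(8) = 48, and effort is their product.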
Designed in 1991 by Paul Oman and Jack Hagemeister at the University of Idaho, the maintainability index is calculated at the whole program or module level from averages of the other three metrics, using the following formula:
171 -
(3.42 * ln(mean effort)) -
(0.23 * ln(mean cyclomatic complexity)) -
(16.2 * ln(mean logical LOC))
Values are on a logarithmic scale ranging from negative infinity up to 171, with greater numbers indicating a higher level of maintainability. In their original paper, Oman and Hagemeister identified 65 as the threshold value below which a program should be considered difficult to maintain.
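A minimal sketch of the formula as it is written above follows. The function name `maintainabilityIndex` and the sample averages are hypothetical, and the code is not taken from any particular tool.

```javascript
// Maintainability index from the mean Halstead effort, mean cyclomatic
// complexity and mean logical LOC, following the formula in the text.
// Math.log is the natural logarithm (ln).
function maintainabilityIndex(meanEffort, meanCyclomatic, meanLogicalLoc) {
  return (
    171 -
    3.42 * Math.log(meanEffort) -
    0.23 * Math.log(meanCyclomatic) -
    16.2 * Math.log(meanLogicalLoc)
  );
}

// When every average is 1, each logarithm is 0 and the maximum is reached.
console.log(maintainabilityIndex(1, 1, 1)); // 171

// Made-up averages for illustration only.
const index = maintainabilityIndex(1000, 5, 20);
console.log(index.toFixed(2)); // 98.47
console.log(index < 65 ? 'difficult to maintain' : 'above the threshold');
```

Because each term is a logarithm, large increases in effort or size lower the index only gradually, and the value falls without bound as the averages grow.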
The key point with all of these metrics is that the prescribed threshold values should not be considered definitive indicators of whether a particular piece of code is "too complex", whatever that might mean. Software development is a broad, varied practice and every project is subject to a unique set of countless environmental factors, rendering such general absolutes essentially arbitrary. Further, complexity itself is such an amorphous, multi-dimensional continuum that attempting to pigeon-hole chunks of code at discrete points along a single axis is an intrinsically crude model.
It is better to use the metrics as a somewhat fuzzy, high-level mechanism that can identify regions of interest or concern, from which your own programming and domain expertise can take over for a more comprehensive analysis.
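One way to apply this in practice, sketched under stated assumptions, is to rank functions by a metric and review the highest-ranked few by hand, rather than failing a build on a fixed threshold. The report shape, the function names in it and the helper `regionsOfInterest` are all hypothetical and do not reflect the output format of any particular tool.

```javascript
// Made-up per-function results; the object shape is an assumption.
const report = [
  { name: 'parse', cyclomatic: 14, effort: 5200 },
  { name: 'render', cyclomatic: 3, effort: 800 },
  { name: 'validate', cyclomatic: 9, effort: 2100 },
];

// Sort a copy in descending order of the chosen metric and keep the
// top `count` entries as candidates for closer manual inspection.
function regionsOfInterest(entries, metric, count) {
  return [...entries].sort((a, b) => b[metric] - a[metric]).slice(0, count);
}

const candidates = regionsOfInterest(report, 'cyclomatic', 2);
console.log(candidates.map((entry) => entry.name)); // [ 'parse', 'validate' ]
```

The ranking only suggests where to look first; whether a flagged function is actually a problem remains a judgement call.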