Is Benford’s Law Mantissa Test Easier to Comprehend?
At Forestpin we’ve been analysing financial data from various companies, and Benford’s Law analysis was one of the techniques used by our software to find irregularities and fraud in financial data.
If you are new to Benford’s Law, this article gives a nice introduction. The book Benford’s Law: Applications for Forensic Accounting, Auditing, and Fraud Detection by Dr. Mark Nigrini is also a great resource for learning about Benford’s Law and its applications in financial data analysis.
However, one aspect of Benford’s Law that is rarely discussed or used in practice is sum invariance. This characteristic was first observed by Nigrini in 1992.
[First Two Digit Distribution](//vpj.github.io/images/posts/ftd-distribution.png)
The basic use of Benford’s Law is to take the first-digit or first-two-digit distribution and compare it with the expected logarithmic curve. The chart above shows an example: the first-two-digit distribution of invoices at a company in Sri Lanka.
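As a rough illustration of that comparison, here is a minimal Python sketch. The function names are hypothetical and not taken from Forestpin’s software; it assumes the amounts are plain non-zero numbers and uses the standard Benford expectation log10(1 + 1/d) for a first-two-digit group d.

```python
import math
from collections import Counter


def first_two_digits(value):
    """Return the first two significant digits of a non-zero number (10..99)."""
    # Scientific notation puts the leading digits at fixed positions,
    # e.g. 437000 -> '4.370000000000000e+05'.
    text = f"{abs(value):.15e}"
    return int(text[0] + text[2])


def expected_frequency(d):
    """Benford's expected proportion of values whose leading digits are d."""
    return math.log10(1 + 1 / d)


def observed_frequencies(values):
    """Observed proportion of values in each first-two-digit group."""
    counts = Counter(first_two_digits(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(10, 100)}
```

Plotting `observed_frequencies(...)` as bars against `expected_frequency(d)` as a curve would give a chart of the kind shown above.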
Sum Invariance
The sum of all entries with leading digit d is constant for various d.
- Mark Nigrini
This states that, if the dataset conforms to Benford’s Law, the sum of the mantissae of all numbers starting with digit d is the same for every first digit d. Therefore, irregularities can be identified by plotting the sum of mantissae against the first digit. This chart is simpler and easier to understand, since the human eye can compare heights against a constant level effortlessly, whereas in a normal first-digit frequency chart you have to compare the heights of bars against a logarithmic curve.
The mantissa is the significand of a number in scientific notation. For example, the mantissa of 437,000 = 4.37 × 10^5 is 4.37. Likewise, the mantissae of 5123, 51.23 and 5,123,000 are all equal to 5.123.
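The definition above can be sketched in a few lines of Python. The helper name `mantissa` is hypothetical; the sketch goes through `decimal.Decimal` rather than `math.log10` on the assumption that avoiding floating-point rounding near powers of ten is worth the extra step.

```python
from decimal import Decimal


def mantissa(value):
    """Return the significand of a non-zero number, in the range [1, 10)."""
    if value == 0:
        raise ValueError("zero has no leading digit")
    d = Decimal(str(abs(value)))
    # adjusted() is the exponent of the most significant digit;
    # scaling by its negative moves the decimal point just after that digit.
    return float(d.scaleb(-d.adjusted()))


# The examples from the text: these all share the same mantissa.
examples = [mantissa(5123), mantissa(51.23), mantissa(5123000)]
```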
[First Two Digit Distribution and Sum of Mantissa](//vpj.github.io/images/posts/ftd-mantissa2.png)
The above chart shows the first two digit distribution (bars on top) and first two digit summation of mantissae (bars on bottom), and you can observe that the summation chart gives a visualization that is easier to comprehend.
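A summation like the one in the bottom chart could be computed along the following lines. This is only a sketch with hypothetical names, not the code behind the chart. It reports each group’s share of the total mantissa sum; under sum invariance every one of the 90 first-two-digit groups would be expected to hold roughly the same share, about 1/90.

```python
from collections import defaultdict
from decimal import Decimal


def mantissa(value):
    """Significand of a non-zero number as a Decimal in [1, 10)."""
    d = Decimal(str(abs(value)))
    return d.scaleb(-d.adjusted())


def mantissa_sum_shares(values):
    """Share of the total mantissa sum held by each first-two-digit group."""
    sums = defaultdict(Decimal)  # missing groups start at Decimal('0')
    for v in values:
        if v == 0:
            continue  # zero has no leading digits
        m = mantissa(v)
        sums[int(m * 10)] += m  # int(4.37 * 10) -> 43
    total = sum(sums.values())
    return {d: float(sums.get(d, Decimal(0)) / total) for d in range(10, 100)}
```

Bars that rise well above or fall well below the flat 1/90 level are the groups worth a closer look.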
Sometimes the summation is done on the raw values, without calculating the mantissa. In that case the chart can be largely influenced by a few large values, so this approach is not recommended. The chart below shows the summation by first two digits instead of the summation of mantissae, for the same data set.
[First Two Digit Distribution and Summation](//vpj.github.io/images/posts/ftd-sum.png)
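The effect of a few large values on a raw summation can be seen in a small Python sketch. The amounts below are made up purely for illustration and the helper names are hypothetical.

```python
from decimal import Decimal


def mantissa(value):
    """Significand of a non-zero number, in the range [1, 10)."""
    d = Decimal(str(abs(value)))
    return float(d.scaleb(-d.adjusted()))


def share_of_largest(values, transform):
    """Fraction of the total contributed by the single largest term."""
    contributions = [transform(v) for v in values]
    return max(contributions) / sum(contributions)


# Illustrative amounts only: four small invoices and one very large one.
amounts = [120.0, 340.0, 560.0, 780.0, 9_500_000.0]

raw_share = share_of_largest(amounts, abs)            # sum of raw values
mantissa_share = share_of_largest(amounts, mantissa)  # sum of mantissae
```

With these numbers the large invoice accounts for over 99% of the raw total but only about a third of the mantissa total, which is why the mantissa-based chart is less distorted by outliers.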