Understanding Benford's Law: A Forensic Accountant's Tool for Detecting Fraud
Explore how forensic accountants use Benford's Law to detect anomalies in large datasets, ensuring data integrity in financial and numerical analysis.
File
How to Detect Fraud Using Benford's Law
Added on 09/29/2024

Speaker 1: [♪ music plays throughout video ♪] Hi, welcome to another New Jersey Forensic Accountant discussion. Today's discussion is going to be very interesting because it's one of the techniques we use in the vast majority of our forensic accounting analysis. It's called Benford's Law, and I've been getting a lot of people asking me to go into some of the techniques we actually use, so I'm going to discuss Benford's Law. Now, as a forensic accountant, I have many ways and techniques to spot fraud. One of the ways we detect fraud is especially useful when analyzing tax returns, general ledgers, and other items that contain a large amount of numerical data. Now, remember, when we get into a case, a lot of times people will give us a huge amount of information, millions or hundreds of millions of pieces of data, and we have to find out if it's random or if it's fraud, if it's manipulated in some way. And the first thing we always do is apply Benford's Law. And what the law basically states is that in a naturally occurring set of numbers, each digit from 1 through 9 will show up as the first digit at a specific, predictable rate. And the way it does this is through what's called a base 10 logarithm, and it's very, very accurate. And when you apply this to a large amount of numbers, you should get something that looks like this bar chart right here. Okay? You can see here that this is 30%, 18%, 12%, and you can see how the bar chart kind of slants down. And if you run this analysis on data and you don't get something that looks like this, there's a high probability that it is manipulated in some way. It's not a natural occurrence.
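The "30, 18, 12" bars the speaker points to can be reproduced from the base 10 logarithm he mentions. The standard statement of Benford's Law gives the expected share of leading digit d as log10(1 + 1/d). The short sketch below just evaluates that formula; the function name `benford_expected` is an illustrative choice, not something from the video.

```python
import math


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    # P(d) = log10(1 + 1/d); the nine shares add up to 1.
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


if __name__ == "__main__":
    # Print the reference distribution behind the downward-slanting bar chart.
    for digit, share in benford_expected().items():
        print(f"{digit}: {share:.1%}")
```

Rounded to whole percentages, the first three values come out to 30, 18, and 12, matching the bars described in the video.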
For example, when you take Benford's Law, okay, and you apply it to the distance of the planets from the sun, okay, to see if it's manipulated, if someone put the planets there, or if it's random, you'll get a histogram that will look just like this. Okay? Or if you plot the distance of the stars from the Earth, you'll get something like this. Or if you take all the phone numbers in the phone book, you'll get the same histogram. So the reason for this is that what it simply does is take the first digit of each number in a data set, and we analyze this data set, and we spot anomalies that tip us off that there's a high probability that the numbers have been manipulated. From there, we can perform a forensic accounting. Once I know that there are problems with the data we have, we can then dig down and find out what happened. For example, here's what it should look like based on Benford's Law. But now you look at these here, okay, revenue per PSE firms, population, motor vehicle theft cases. They're not really in line. So what that's telling me is this data is probably manipulated in some way. Okay? Something's wrong with the population count. Something's wrong with the number of motor vehicle theft cases. Okay, maybe some of these aren't thefts. So, I mean, the one that's pretty close is population. Right? Almost. You know, so maybe there's a problem up here, something going on. But anyway, we apply this, and you can look at the data, and it's pretty easy to see that there's an issue there. Now, the steps in using Benford's Law. Okay? Let me just say, it's very difficult to understand Benford's Law. It's very complex. You know, explaining the logarithm and all that can take days, if not weeks, to understand. But let me just give you an example. I'll go through an example because it's the easy way to understand it. When I'm making this video, it's when COVID-19 is pretty prevalent.
It's the end of the summer, and a lot of cases have been reported to the CDC. Now, some people are saying that the cases are overreported, that hospitals are overreporting cases, that really the deaths are overreported. And so what I'm going to do is go to the CDC website and apply Benford's Law to the data. And I'm going to go through and show you how we actually do it in real life. Now, here is the website for, okay, the Centers for Disease Control and Prevention. Okay, and here they're talking about the deaths, USA. I mean, this disease is horrible. But total cases, new cases, the USA deaths over 200,000. So what I'm going to do is download cases in the last seven days by territory. I'm going to download this data here. Let's see what this looks like. Okay, here's what I get when I download this from the CDC website. Okay, I'm going to fix up this data so that we can utilize it. Now, you can see here the total cases, confirmed, probable cases, et cetera. Okay, it goes through all this good data here. And let's say someone hired my firm to do a fraud analysis. The first thing I would do is utilize Benford's Law. Okay, what I would do is basically get rid of all this data here and just focus on the total cases in each state. Now, we've already done this, so we would take this data and put it in an Excel sheet. Excel has some decent formula capabilities. And once we took the data, it would look like this. Okay, you can see here that we have the states, the number of reported cases, and then what we do is utilize this formula, which is LEFT(B2). What it does is it goes to here and takes the first digit and puts it in this column, okay, for all the states and some of the territories of the United States.
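The first-digit step that the speaker does with Excel's LEFT on cell B2 can be sketched in Python as well. This is only an illustration: the function name `first_digit` and the sample case counts are made up for the example and are not the CDC figures shown in the video.

```python
def first_digit(value):
    """Return the first non-zero digit (1-9) of a number, or None for zero."""
    # Scan the text form of the number, skipping any sign, leading zeros,
    # and decimal point, much like reading the first character of a cell.
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None


if __name__ == "__main__":
    # Hypothetical totals, NOT the real CDC data from the video.
    sample_cases = {"Area A": 203000, "Area B": 18450, "Area C": 972, "Area D": 0}
    for area, cases in sample_cases.items():
        print(area, cases, first_digit(cases))
```

Returning None for a zero entry keeps rows with no leading digit out of the later counts, in line with the speaker's remark that some entries are zeros.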
And then what we do is we want numbers 1 through 9, which are the digits, and Benford's Law states that a certain number of these numbers here should start with a 1, a certain number should start with 2, a certain number should start with 3. Then I go to this other formula, it's COUNTIFS. What it does is it counts all the numbers that start with 1, and there's 23. All the numbers that start with 2, there's 8. And the numbers that start with 9, there's 2. Okay, so it went through all these columns. There's a total of 56 here, okay, and I verified that because some of these have zeros and, you know, there's basically 56 states and territories in this database that we downloaded. And then we do a calculation, and this says 41% of the numbers here start with 1, 14% start with 2, and 3. So now we have the percentages, and then we just do this histogram here, and you can see this does not look like, you know, what a typical Benford's Law would predict, okay? The logarithms are way out of whack. So this is telling me here, just looking at this data, that this is not legitimate data, okay? At this point, if we were doing this case, I would tell, you know, my client, listen, the data we're looking at is definitely manipulated, okay? Now, what we need to do is then go in and look at the various hospitals, see how they're reporting these cases, where they're coming from, how reliable the data is. We'd test it, and we'd, you know, actually back it up. But this is telling me, if I had to go to court, this is the first thing I would show, that Benford's Law is saying that the data is manipulated, okay, because we know what it should look like, right? What should it look like? It should look like this. It doesn't, okay? It looks like this. So anyway, we have some situations here. Benford's Law is great. I recommend it if you get into large databases. We have to quickly find out if it's something you want to look into.
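The counting and percentage steps described above (COUNTIFS per digit, then each count divided by the total) can be sketched as follows. The counts 23, 8, and 2 and the total of 56 come from the video; the list of values in the demo is hypothetical, and the helper names `leading_digit` and `digit_shares` are illustrative. The sketch assumes non-negative whole-number counts and only lines the observed shares up against the Benford shares. It does not run a formal statistical test.

```python
import math
from collections import Counter


def leading_digit(n):
    """First digit of a positive whole number, like LEFT on the cell."""
    return int(str(n)[0])


def digit_shares(values):
    """Return {digit: (count, observed share, Benford share)} for digits 1-9."""
    usable = [v for v in values if v > 0]  # zeros have no leading digit
    if not usable:
        raise ValueError("need at least one positive value")
    counts = Counter(leading_digit(v) for v in usable)
    total = len(usable)
    return {
        d: (counts[d], counts[d] / total, math.log10(1 + 1 / d))
        for d in range(1, 10)
    }


if __name__ == "__main__":
    # Arithmetic from the video: 23 of 56 values start with 1, 8 start with 2.
    print(f"{23 / 56:.0%} start with 1, {8 / 56:.0%} start with 2")

    # Hypothetical values, NOT the CDC data, just to show the table layout.
    sample = [120, 1500, 19, 230, 2700, 31, 45, 58, 9, 0]
    for d, (count, seen, expected) in digit_shares(sample).items():
        print(f"{d}: count={count} observed={seen:.0%} benford={expected:.0%}")
```

Putting the observed and expected columns side by side is the numeric version of holding the two histograms next to each other, which is the comparison the speaker makes by eye.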
So listen, guys, we went through this quick. If you have any questions, just leave them below. And if you like this video, please join my YouTube channel. It helps a lot with, you know, getting the recognition and name out there for this kind of stuff. We hope you enjoyed it. Thanks a lot. Bye.
