Using Excel for Predictive Analysis: A Step-by-Step Guide to Linear Regression
Learn how to set up a predictive model in Excel using linear regression. Follow along as we analyze factory costs based on product quantities.
File
Using Multiple Regression in Excel for Predictive Analysis
Added on 09/29/2024
Speakers
add Add new speaker

Speaker 1: We're going to look at using Excel to do some predictive analysis. We're going to set up a predictive model for our factory, and in our factory we make three types of products, A, B, and C, and depending on the quantity of each of those products is dependent upon how much it costs to run the factory. So, we've collected this data over the course of 19 months, and we're going to use linear regression to do the predictive analysis. Now, in linear regression, we have multiple independent variables, and the independent variables that are in this model are the three products we create in our factory, A, B, and C. Those are the independent variables. What's the dependent variable? What is dependent upon those three variables is the cost of running the factory. So, the dependent variable, in this case, is the factory cost. So, when you look at the regression equation, the multiple regression equation, you will see that what we have is an equation for a straight line. And, we can assume in this example, and all the examples in this course, MIS 204, that there is a linear relationship. So, that's why we're using an equation for a line. So, where do we begin in building our predictive analysis model? Well, first of all, we need to make sure that the regression tool is loaded. And, we go click on data, in the tab, and when we click on data, we see data analysis is loaded. Well, how do we get that loaded? We go over to file, click options, click add-ins, choose analysis tool pack, click go. Make sure analysis tool pack is checked. We're not using solver in this particular instance, so we can uncheck that. Click OK, and the tool pack is loaded. So, let's click on data analysis, and when we click on data analysis, we have all of these different statistical functions, but the one that we want to use for predictive analysis is regression. So, click on regression, OK. Now, this dialog box comes up, and it's very simple to do multiple linear regression in Excel. The first thing we want to do is we want to select the Y range, and the Y range is our dependent variable. So, we'll select our dependent variable. Cost is dependent upon the products that are made in our factory. The X range, we have multiple independent variables. In this case, we have X1, X2, and X3, denoted by products A, B, and C. Perfect. And, if you noticed, I've included the column labels. And, we're going to have the output range. Let's put the output range on the same sheet, so we can compare side-by-side the different outcomes. We'll put the output range right there. Well, let's move it over one so we don't interfere with our formula. And, that's it. Click OK. And, Excel does most of the work for us. We see down at the bottom, we have the intercept. And, the intercept will be the constant in our formula. And, we have the coefficients for each of our independent variables, which are listed there. Remember, we always need to inspect the p-values, or the predictive values, of each of our independent variables. And, any predictive value that is 0.15 or greater, we're going to exclude. 0.15 or greater says that the predictive value of that particular independent variable is such that it really doesn't matter in predicting our outcomes. It is not of significance. So, when we look at the p-values, we see for product A, the p-value is 0.23, which obviously is greater than 0.15. So, we will exclude using the values for the independent variable for product A. Product B and product C both have p-values well below 0.15. So, their predictive value is much, much greater. So, at this point, we need to rerun the regression, excluding independent variable A. So, let's go ahead and do that. I'm going to go up here, run the regression one more time. And, instead of selecting A, B, and C, this time we're only going to select B and C. Why? Because we evaluated the p-values, and found that A's p-value, the predictive value, was just simply too low. So, there we go. We're going to run it again. We're going to put it on the same worksheet. Let's put it down here, right below it. And, let's click OK. And, we'll scroll down and look. Look at the p-values. Now that we've gotten rid of independent variable A, the p-values, the predictive values of the other independent variables, have fallen significantly. And, they have very, very strong predictive values. So, at this point, all we need to do is take the different coefficients, plug it into the formula, and do our prediction. So, how do we do that? Well, first of all, let's look at the line formula. The constant is the intercept. The B values is what we're looking for. So, in this case, we are predicting the monthly cost for our factory, if we make 1,200 model A's, 800 model B's, and 1,000 model C's. Well, remember, we excluded A, so we don't have to include that in our formula. So, let's go write our formula, plugging in the variables, and predict our cost. So, again, beginning each Excel formula with the equals sign. Equals the constant, and I'm reading left to right here, the constant, which is the line intercept, plus, I'm excluding A because the p-value was too low, 800 times the coefficient, plus 1,000 times the coefficient for C, and that's it. So, let's press enter and make our prediction. And there it is. We can predict that if we may, given our historical data from 19 months, that if we made 1,200 model A's, 800 model B's, and 1,000 model C's, it will cost $45,149.21 to run our factory. There it is. Using multiple linear regression to do predictive analysis.

ai AI Insights
Summary

Generate a brief summary highlighting the main points of the transcript.

Generate
Title

Generate a concise and relevant title for the transcript based on the main themes and content discussed.

Generate
Keywords

Identify and highlight the key words or phrases most relevant to the content of the transcript.

Generate
Enter your query
Sentiments

Analyze the emotional tone of the transcript to determine whether the sentiment is positive, negative, or neutral.

Generate
Quizzes

Create interactive quizzes based on the content of the transcript to test comprehension or engage users.

Generate
{{ secondsToHumanTime(time) }}
Back
Forward
{{ Math.round(speed * 100) / 100 }}x
{{ secondsToHumanTime(duration) }}
close
New speaker
Add speaker
close
Edit speaker
Save changes
close
Share Transcript