Speaker 1: Hello and welcome. In this video I will explain what a regression analysis is and give you an overview of the different types of regression. Let's start right now.

A regression analysis allows you to infer or predict a variable based on one or more other variables. Let's say you want to find out what influences a person's salary. For example, you could take the highest level of education, the weekly working hours and the age of a person. You could then investigate whether these three variables have an influence on the person's salary. If they do, you can predict a person's salary from their highest level of education, weekly working hours and age. The variable you want to infer, the one you want to predict, is called the dependent variable or criterion. The variables you use for prediction are called independent variables or predictors.

Regression analysis can be used to achieve two goals. You can measure the influence of one or several variables on another variable, or you can predict a variable based on other variables. To give you a feeling for this, let's go through some examples.

Let's start with measuring the influence of one or more variables on another. In the context of your research you might be interested in what influences children's ability to concentrate. You want to know whether you can show that there are factors that positively or negatively influence children's ability to concentrate. Another example would be investigating whether the educational level of parents and the place of residence have an influence on the future educational level of children. This area is strongly research-based and has many applications in the social and economic sciences.

The second area, using regression for predictions, is more application-oriented. To get the most out of hospital occupancy, you might be interested in how long a patient will stay in the hospital.
So based on the characteristics of the prospective patient, such as age, reason for stay and pre-existing conditions, you want to know how long that person is likely to stay in the hospital. Based on this prediction, bed planning can then be optimized.

Let's look at one more example. As the owner of an online store you are very interested in which product a person is most likely to buy. You want to suggest this product to the visitor in order to increase the sales of the online store. This is where regression comes into play.

What you also need to know is that there are different types of regression analysis, but they are not difficult to understand, so let's look at them now. When talking about regression analysis, a distinction is made between simple linear, multiple linear and logistic regression.

In a simple linear regression you use only one independent variable to infer the dependent variable. In the example where we want to predict the salary of a person, we use only one variable: for example, whether a person has studied or not, the weekly working hours or the age of the person. In multiple linear regression several independent variables are used to predict the dependent variable. So you use the highest educational level, the weekly working hours and the age of a person together to predict their salary. The difference between a simple and a multiple regression is therefore that in one case only one independent variable is used, and in the other case several are used.

Both types of regression have in common that the dependent variable is metric. Metric variables are, for example, the salary of a person, body size, shoe size or electricity consumption. In contrast, logistic regression is used when you have a categorical dependent variable, for example when you want to infer whether a person is at risk of burnout or not. Whenever you have yes and no answers you use logistic regression.
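The difference between simple and multiple linear regression can be made concrete with a small sketch. This is only an illustration under stated assumptions: the data are invented (salaries are generated from a known formula so the result can be checked), the function names `fit_linear_regression`, `solve_linear_system` and `predict` are hypothetical helpers, and ordinary least squares via the normal equations is one common way to fit such a model, not necessarily what any particular tool does internally.

```python
# Hypothetical illustration data (invented for this sketch, not from the video).
# Each row: (weekly working hours, age in years)
people = [(20, 25), (30, 40), (40, 30), (35, 50), (45, 45), (25, 35)]
# Salaries are generated exactly from 1000 + 50*hours + 20*age, so the
# fitted coefficients can be compared with known values.
salaries = [1000 + 50 * h + 20 * a for h, a in people]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_linear_regression(rows, y):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    design = [[1.0] + [float(v) for v in row] for row in rows]  # intercept column
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * target for r, target in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, row):
    """Predicted value of the dependent variable for one observation."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))


# Simple linear regression: one independent variable (weekly hours only).
simple_model = fit_linear_regression([(h,) for h, _ in people], salaries)
# Multiple linear regression: two independent variables (hours and age).
multiple_model = fit_linear_regression(people, salaries)

print("simple:  ", simple_model)
print("multiple:", multiple_model)
print("predicted salary for 40 hours, age 30:", predict(multiple_model, (40, 30)))
```

In this made-up data the simple model has to attribute part of the age effect to working hours, because age is left out; the multiple model can separate the two.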
So in linear regression the dependent variable is metric, in logistic regression it is categorical. Whenever the dependent variable is yes or no you will use a logistic regression. For example: does a person buy a product, yes or no; is a person healthy or sick; or does a person vote for a certain party or not.

In all cases it does not matter what scale level the independent variables have; they can be nominal, ordinal or metric. As I already explained, the scale level of the independent variables can be metric, ordinal or nominal in all three cases: the simple linear, the multiple linear and the logistic regression. The dependent variable is metric in the linear case and nominal or ordinal in the case of a logistic regression.

It is important to note that nominal or ordinal independent variables may classically have only two characteristics, such as gender with male and female. If your variables have more than two characteristics, you must form so-called dummy variables. I explain dummy variables in a separate video, which comes later in the playlist.

Before I show you how to easily calculate a regression online, let's do a quick recap. First there is the simple linear regression. A question could be: does the weekly working time have an influence on the hourly wage of employees? In this case we have only one independent variable. For the multiple linear regression the question could be: do the weekly working time and the age of employees have an influence on the hourly wage? Here we have at least two independent variables, in this case weekly working hours and age. The last case is the logistic regression. Here we could ask: do the weekly working time and the age of employees have an influence on the probability of being at risk of burnout?

And now I will show you how you can easily calculate a regression online.
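As a side note on the dummy variables mentioned above, here is a minimal sketch of how a categorical variable with more than two characteristics could be turned into 0/1 columns. The function name `make_dummy_variables` and the example categories are hypothetical, and the sketch assumes the common convention of creating one dummy fewer than there are categories, with the omitted category serving as the reference.

```python
def make_dummy_variables(values, reference=None):
    """Turn one categorical variable into 0/1 dummy variables.

    A variable with k categories becomes k - 1 dummies; the omitted
    category is the reference the others are compared against.
    """
    categories = sorted(set(values))
    if reference is None:
        reference = categories[0]
    dummy_names = [c for c in categories if c != reference]
    rows = [[1 if v == name else 0 for name in dummy_names] for v in values]
    return dummy_names, rows


# Hypothetical example: highest level of education with three characteristics.
education = ["school", "bachelor", "master", "bachelor", "school"]
names, dummies = make_dummy_variables(education, reference="school")

print(names)
for value, row in zip(education, dummies):
    print(value, row)
```

Each person is then described by two 0/1 variables instead of one three-valued variable, and a person in the reference category has zeros in both.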
To do this, just visit datatab.net and click on the statistics calculator. If you want to use your own data you can click on clear table. I will use the example data now. If you want to perform a regression analysis, choose the tab regression; there you can select the dependent variable and your independent variables. Depending on the scale level of your dependent variable, datatab will calculate a linear or a logistic regression. For example, if you choose salary as your dependent variable, which is a metric variable, and age and weight as the independent variables, datatab will calculate a linear regression. Here you can already see the results: the model summary, the ANOVA and the coefficients. If you select place of residence as your dependent variable, which is a nominal variable, and choose some independent variables, datatab will automatically calculate a logistic regression for you. Below you can see the results.

It is very important to note that we have not proven causality just because we have calculated a regression model. If this is not clear to you yet, just watch my next video on the relationship between correlation, regression and causality. Otherwise continue with the video on simple and multiple linear regression or the video on logistic regression. I look forward to seeing you.
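For readers who want to see what the logistic case looks like outside of an online calculator, the following is a rough sketch of a logistic regression with a yes/no dependent variable. Everything here is an assumption made for illustration: the data are invented, the names `fit_logistic_regression` and `predict_probability` are hypothetical, and plain gradient descent on the log-loss is used only because it is short to write down. It is not a description of how datatab computes its results.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(rows, labels, learning_rate=0.1, steps=20000):
    """Fit [intercept, coef_1, ...] by gradient descent on the mean log-loss."""
    k = len(rows[0]) + 1
    n = len(rows)
    weights = [0.0] * k
    for _ in range(steps):
        gradient = [0.0] * k
        for row, label in zip(rows, labels):
            features = [1.0] + list(row)  # leading 1.0 is the intercept term
            predicted = sigmoid(sum(w * f for w, f in zip(weights, features)))
            for i, f in enumerate(features):
                gradient[i] += (predicted - label) * f / n
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


def predict_probability(weights, row):
    """Estimated probability that the dependent variable is 'yes' (1)."""
    features = [1.0] + list(row)
    return sigmoid(sum(w * f for w, f in zip(weights, features)))


# Invented data: weekly working time in units of 10 hours, and whether the
# person is at risk of burnout (1 = yes, 0 = no).
weekly_time = [(2.0,), (2.5,), (3.0,), (3.5,), (4.0,), (4.5,), (5.0,), (5.5,)]
at_risk = [0, 0, 0, 1, 0, 1, 1, 1]

model = fit_logistic_regression(weekly_time, at_risk)
print("coefficients:", model)
print("P(at risk | 20 h):", predict_probability(model, (2.0,)))
print("P(at risk | 55 h):", predict_probability(model, (5.5,)))
```

The output of the model is a probability rather than a metric value, which is why this type of regression fits yes/no questions. As with the linear case, a fitted model on its own says nothing about causality.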