Speaker 1: Hello friends. In this video we are going to learn another important concept: the tools and techniques of multivariate analysis. Multivariate analysis includes a variety of tools used to understand the data and reduce its dimensions by analyzing the covariance structure of the data. Before learning the details of each of these multivariate tools, we will go through some basic concepts and an introduction to the various tools used in multivariate analysis. So let's begin.

Variance and Standard Deviation

Variance measures how far the observations in a dataset are spread around the mean of the distribution; it is built from the squared distances of the observations from the mean. The population variance is commonly denoted sigma squared, $\sigma^2$, while the sample variance is denoted $s^2$ and is calculated as $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1)$. Here $s^2$ is the sample variance, $n$ is the total number of data points, $x_i$ is an individual data value, and $\bar{x}$ is the average of all the data points. The reason for dividing by $n - 1$ instead of $n$ is that there are only $n - 1$ independent deviations $x_i - \bar{x}$, because the deviations always sum to zero. To bring the variance back into the same units as the data under observation, we take its square root. This is called the standard deviation.

Covariance

The covariance is used to determine the direction of the linear relationship between two continuous variables. The sample covariance is commonly denoted $s_{xy}$ and is calculated as $s_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)$. Here $s_{xy}$ is the covariance, $n$ is the total number of data points, $x_i$ is an individual value of variable 1, $\bar{x}$ is the average of the data points for variable 1, $y_i$ is an individual value of variable 2, and $\bar{y}$ is the average of the data points for variable 2.
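As a small illustration of the variance, standard deviation, and covariance formulas just described, here is a minimal sketch in plain Python. The data values and the function names (`mean`, `sample_variance`, `sample_covariance`) are hypothetical and chosen only for this example; they do not come from the video.

```python
from math import sqrt


def mean(values):
    """Arithmetic average of a list of numbers."""
    return sum(values) / len(values)


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


def sample_covariance(xs, ys):
    """Sum of (x_i - x_bar) * (y_i - y_bar), divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical measurements of two variables on the same five subjects.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]

var_x = sample_variance(x)        # 10.0
std_x = sqrt(var_x)               # square root of the variance
cov_xy = sample_covariance(x, y)  # 4.0, positive: x and y tend to rise together

print(var_x, std_x, cov_xy)
```

A positive covariance indicates that the two variables tend to increase together, and a negative one that one tends to decrease as the other increases. Passing the same list twice, `sample_covariance(x, x)`, gives the same number as `sample_variance(x)`.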
If we calculate the covariance between one variable and itself, we get the variance.

Eigenvectors and Eigenvalues

Eigenvectors are defined only for square matrices, that is, matrices with an equal number of rows and columns. But not every square matrix has real eigenvectors. When we multiply a matrix by one of its eigenvectors, the resulting vector is a scalar multiple of that eigenvector. Let's understand this concept with examples. In the first example, the resulting vector is not a multiple of the original vector. Let's see another example. In this example, the resulting vector is exactly 4 times the original vector. Here the vector (3, 2) is called an eigenvector and the value 4 is called the eigenvalue. It is usual to represent eigenvectors as unit vectors. We divide an eigenvector by its length to get the unit vector. In the above example, the length of the vector is $\sqrt{3^2 + 2^2} = \sqrt{13}$. Thus the unit vector for the eigenvector is $(3/\sqrt{13},\ 2/\sqrt{13})$.

Principal Components

Principal components, also denoted PC, are variables that usefully explain variation in the data set. Each principal component is a linear combination of your original variables. To interpret each principal component, examine the magnitude and direction of the coefficients of the original variables. The larger the absolute value of a coefficient, the more important the corresponding variable is in calculating the component. How large the absolute value of a coefficient has to be in order to be considered important is subjective. Use your specialized knowledge to determine at what level the correlation value is important.

Multivariate Analysis

Multivariate analysis is used to analyze your data when you have made multiple measurements on items or subjects.
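Going back to the eigenvector example from this part of the video, the check "is the result a scalar multiple of the original vector?" can be sketched in plain Python. The transcript does not state the matrix or the first example's vector, so the matrix `A = [[2, 3], [2, 1]]` and the vector `(1, 3)` below are assumptions, chosen only because they reproduce the stated outcome (eigenvector (3, 2) with eigenvalue 4). The helper names are hypothetical too.

```python
from math import isclose, sqrt


def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def scalar_multiple(result, vector):
    """Return k if result == k * vector, else None.

    Assumes every component of `vector` is non-zero.
    """
    ratios = [r / v for r, v in zip(result, vector)]
    if all(isclose(ratio, ratios[0]) for ratio in ratios):
        return ratios[0]
    return None


# Assumed matrix: not given in the transcript.
A = [[2, 3], [2, 1]]

# (1, 3) maps to (11, 5), which is not a multiple of (1, 3).
not_eigen = scalar_multiple(mat_vec(A, [1, 3]), [1, 3])

# (3, 2) maps to (12, 8), which is exactly 4 times (3, 2).
eigenvalue = scalar_multiple(mat_vec(A, [3, 2]), [3, 2])

# Divide the eigenvector by its length, sqrt(13), to get a unit vector.
length = sqrt(3 ** 2 + 2 ** 2)
unit = [3 / length, 2 / length]

print(not_eigen, eigenvalue, unit)
```

The unit vector points in the same direction as (3, 2), so it is still an eigenvector with the same eigenvalue; only its length has changed to 1.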
You can use multivariate analysis to analyze the covariance structure of the data in order to understand it or to reduce the data dimensions, as well as to assign observations to groups and to explore the relationships between categorical variables.

Multivariate Tools

The following common tools are used in multivariate analysis.

Principal Component Analysis

This analysis is used to form a smaller number of uncorrelated variables from a large set of data. The goal of principal component analysis is to explain the maximum amount of variance with the fewest principal components. Principal component analysis has wide applicability in the social sciences and market research.

Factor Analysis

Factor analysis is used to determine the underlying factors responsible for correlation in the data. Factor analysis summarizes data into a few dimensions by compressing a large number of variables into a smaller set of hidden factors that you do not directly measure or observe, but which may be easier to interpret.

Item Analysis

Item analysis is used to assess how well the multiple items in a survey or test measure the same characteristic. Using this analysis, you can assess the strength and direction of the relationship between pairs of items, evaluate the overall internal consistency of the test or survey, and determine whether omitting items improves internal consistency.

Cluster Observations

Cluster observations is used to join observations that share common characteristics into groups. This analysis is appropriate when you do not have any initial information about how to form the groups.

Cluster Variables

Cluster variables is used to group variables into clusters that share common characteristics. Clustering variables allows you to reduce the number of variables for analysis. This analysis, again, is appropriate when you do not have any initial information about how to form the groups.
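To make the idea of joining observations into groups without any starting information more concrete, here is a minimal sketch in plain Python. The video does not name a specific algorithm, so the single-linkage agglomerative approach used here is an assumption, and the function name `single_linkage` and the data points are hypothetical.

```python
from itertools import combinations
from math import dist


def single_linkage(points, n_groups):
    """Start with one cluster per observation and repeatedly merge the two
    closest clusters until only n_groups clusters remain.

    The distance between two clusters is taken as the distance between
    their closest members (single linkage).
    """
    clusters = [[p] for p in points]
    while len(clusters) > n_groups:
        # Pick the pair of clusters (i < j) with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: min(
                dist(a, b)
                for a in clusters[pair[0]]
                for b in clusters[pair[1]]
            ),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


# Hypothetical observations, each measured on two variables.
observations = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0)]

groups = single_linkage(observations, 3)
for group in groups:
    print(group)
```

No starting groups are supplied here; the groups emerge only from the distances between observations. The number of groups to stop at is still a choice the analyst has to make.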
Cluster K-Means

Cluster k-means is used to group observations into clusters that share common characteristics. This method is appropriate when you have sufficient information to make good starting cluster designations.

Discriminant Analysis

This analysis is used to classify observations into two or more groups when you have a sample with known groups. Using this analysis, you can determine how accurately the observations are classified into the known groups, evaluate how the predictor variables differentiate the groups, and predict the group of observations whose group is unknown.

Simple Correspondence Analysis

This analysis is used to explore the relationships in a two-way classification. The procedure decomposes a contingency table in a manner similar to how principal component analysis decomposes multivariate continuous data. Using this analysis, you can create graphs to visually represent the row and column points and examine the overall structure of the relationships among the variable categories.

Multiple Correspondence Analysis

Use multiple correspondence analysis to explore the relationships among three or more categorical variables. Multiple correspondence analysis performs a simple correspondence analysis on a matrix of indicator variables, where each column of the matrix corresponds to a level of a categorical variable.

This is all about some of the concepts and an introduction to the various tools used in multivariate analysis. We will see each of these tools in detail, with practical examples, from the next video onward. For references, I have taken some part of this very detailed content from Minitab, ASQ and IOQR. Now to end, please like this video if you have found it useful, add your valuable comments, and share this video with your friends and colleagues to improve and refresh their knowledge.
If you want to get updates on such videos from my channel, please do not forget to subscribe to it, click on the bell icon, and select all notifications. And finally, thank you for watching. See you in the next video. Bye.