Speaker 1: Hello guys, welcome once again to this year-long, end-to-end AI/ML program. You are going to learn everything from scratch in order to become a successful data scientist. Before coming to a session, please make sure you have fully understood the sessions we have done before. If you have not watched the earlier recordings, please watch them; I think they are in the playlist. That way you will have a better understanding of what we are covering. In the previous session we started statistical inference, and today we are going to complete it and go into more depth.
The first topic is inferential statistics, which covers hypothesis testing and confidence intervals. Hypothesis testing is a formal procedure for evaluating a claim about a population parameter based on sample data. Let me put it simply. We make a statement, and that statement gives us a null hypothesis and an alternative hypothesis. Say we have a heart disease prediction dataset. The null hypothesis could be that a person with a given glucose level is no more likely to have heart disease. The alternative hypothesis is then that a person with that glucose level is more likely to have heart disease. That is what the null and alternative hypotheses are. It is a very important concept, and we need to be very clear about it. Whenever a problem statement is given, we use a hypothesis test on the sample data to decide whether to reject the null hypothesis in favor of the alternative.
Next is the confidence interval. It is a range of values that is likely to contain the true population parameter with a certain level of confidence. Instead of giving a single estimate, it gives a range, for example a 95% confidence interval for a mean. Hypothesis testing and confidence intervals are both very important for making inferences from data, so that you have a better understanding of what the data is telling you. So yeah, that is it from me.
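As a rough illustration of the heart disease example above, the sketch below compares mean glucose between two groups with a large-sample two-sample z-test. The glucose readings are randomly generated and purely hypothetical, the function name `two_sample_z_test` is made up for this example, and the normal approximation is an assumption that is only reasonable for fairly large samples.

```python
import math
import random
from statistics import NormalDist, mean, variance


def two_sample_z_test(a, b):
    """Return (z, two-sided p-value) for H0: both groups have the same mean."""
    # Standard error of the difference between the two sample means.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


random.seed(0)
# Hypothetical glucose readings (mg/dL); not real patient data.
no_disease = [random.gauss(100, 15) for _ in range(200)]
disease = [random.gauss(110, 15) for _ in range(200)]

z, p = two_sample_z_test(disease, no_disease)
alpha = 0.05  # significance level chosen before looking at the data
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p = {p:.4f}: {decision}")
```

A small p-value means the sample would be surprising if the null hypothesis were true, so the null is rejected; a large p-value means the data do not give enough evidence to reject it.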
Speaker 2: Yes. So we have to understand inferential statistics: we infer from the data using statistics, and that is how we get insights. Hypothesis testing is one part of it, and confidence intervals are the other. Hypothesis testing is a formal procedure to evaluate a claim about a population parameter based on sample data, so first you have to understand what a population is. The population is the whole group you want to draw conclusions about. Suppose there are 10,000 people you are interested in; those 10,000 people are the population. A parameter is a value that describes that population, such as its mean. Usually you cannot measure the entire population, so you take a certain part of it as sample data, and you do the hypothesis testing on that sample to evaluate the claim about the population parameter. So you have to understand what the population is, what a parameter is, and what sample data is.
Then we have confidence intervals: a range of values that likely contains the true population parameter, with a certain level of confidence. Again, we start from the sample, compute a range around our estimate, and state how confident we are that the true value of the parameter lies in that range. Let us go to the next slide.
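To make the terms population, parameter, sample, and confidence interval concrete, here is a minimal sketch. The population of 10,000 values is simulated, the sample size of 400 is an arbitrary choice, and `mean_confidence_interval` is a hypothetical helper that uses the normal approximation rather than a t-distribution.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(1)
# Hypothetical population of 10,000 glucose values (mg/dL).
population = [random.gauss(105, 20) for _ in range(10_000)]
population_mean = mean(population)  # the parameter, usually unknown in practice

# The sample: the part of the population that is actually measured.
sample = random.sample(population, 400)


def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `data`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * stdev(data) / math.sqrt(len(data))
    centre = mean(data)
    return centre - margin, centre + margin


low, high = mean_confidence_interval(sample)
print(f"sample mean: {mean(sample):.1f}")
print(f"95% confidence interval: ({low:.1f}, {high:.1f})")
print(f"population mean (the parameter): {population_mean:.1f}")
```

Asking for more confidence, say 99% instead of 95%, gives a wider interval from the same sample.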
Speaker 1: Yes, sure. Next we have regression analysis. This is a very important topic. It comes under model building, and there are two kinds: linear regression and logistic regression. Linear regression predicts a continuous target value based on independent variables. Whenever linear regression comes to mind, think of predicting the price of a house based on different features, like the square footage, the number of bedrooms, and the number of floors. Those features determine the price of the house, and we predict that price using linear regression. Logistic regression, on the other hand, is used to predict a categorical target value based on one or more independent variables, for example whether an email is spam or not, or whether a person has heart disease or not.
Guys, just understand this: linear regression predicts a continuous value, so it belongs to regression problems. Even though logistic regression has "regression" in its name, it predicts a discrete value, so it belongs to classification problems. Regression analysis is a major part of statistical inference, so you need to understand what is happening here. We will go deeper when we get to model building and look at the formulas behind it. For linear regression, as I think you know, we have the formula of a line, y = mx + c. For logistic regression, we have 1 / (1 + e^(-x)), which gives the S-curve. So a line for linear regression and an S-curve for logistic regression. I hope you have understood this. Sir, anything that you want to add?
Speaker 2: Yes, of course. See, guys, linear regression predicts a continuous target variable based on one or more independent variables. The application on the slide is predicting house prices based on size and location, but you can apply it to anything. If you have a restaurant or a coffee shop, you can use linear regression to work out pricing for different products. What you need to remember is that the target variable should be continuous. The word "linear" comes from line: a line is continuous. The name "logistic" is associated with the log, because the model works with the logarithm of the odds. Logistic regression predicts a categorical target variable based on one or more independent variables. Here the target is categorical, so you cannot use logistic regression in place of linear, or linear in place of logistic. You need to understand the characteristics of your data and your problem, and then select which model to use. A typical use of logistic regression is classifying emails as spam or not spam. So this is how you choose between logistic and linear regression according to the characteristics of the data and the problem you are solving.
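The two formulas mentioned in this part, the line y = mx + c and the S-curve 1 / (1 + e^(-x)), can be sketched in a few lines. The house sizes and prices, the spam score, and the helper names `fit_line` and `sigmoid` are all hypothetical and chosen only for illustration; a real model would be fitted on real data with more than one feature.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    m = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    c = y_bar - m * x_bar
    return m, c


def sigmoid(x):
    """The S-curve 1 / (1 + e^(-x)); maps any score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Linear regression: continuous target (price).
# Hypothetical data, chosen to lie exactly on a line so the fit is easy to check.
sizes = [800, 1000, 1200, 1500, 1800]   # square feet
prices = [160, 200, 240, 300, 360]      # price in thousands
m, c = fit_line(sizes, prices)
predicted_price = m * 1300 + c
print(f"m = {m:.3f}, c = {c:.3f}, predicted price for 1300 sq ft = {predicted_price:.1f}")

# Logistic regression: categorical target (spam or not spam).
# The score stands in for a fitted linear combination of email features.
spam_score = 1.2
probability = sigmoid(spam_score)
label = "spam" if probability >= 0.5 else "not spam"
print(f"P(spam) = {probability:.3f} -> {label}")
```

The linear model outputs a number on a continuous scale, while the logistic model outputs a probability that is turned into a class with a threshold (0.5 here, which is itself a choice).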
Speaker 1: Yeah, that is the thing. So, guys, that is regression analysis. Next we have time series analysis and forecasting. This is a very important concept, so please focus on it. First is data collection: we collect time series data over a specific period, such as the sales figures for a year, recorded month by month. Next is trend analysis: identifying the long-term direction or pattern in the data, such as an increasing trend. For example, in the stock market, the value of a stock may keep increasing over a period of time; that is a trend. Next is seasonal patterns: recognizing any repeating pattern or cycle within the data, such as higher sales during the holiday season. A seasonal pattern depends purely on the season. In summer we wear one kind of clothing and in winter a different kind, so clothing sales follow the seasons: one kind sells in autumn, another in summer, another in winter. A trend is also a pattern in the data, but it is the long-term direction, whereas a seasonal pattern repeats every season.
Then forecasting: using statistical models to predict future values based on the historical data and the identified patterns. Once we have data for the past year, which is 12 months, it becomes much easier to predict the next year. We have a proper understanding of what is likely to happen next month, so we can make an informed decision based on the historical data. You are not just guessing; you have proper data behind what you say. So that is time series analysis and forecasting: data collection, trend analysis, seasonal patterns, and then forecasting. That is it from my side.
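The four steps described here (data collection, trend analysis, seasonal patterns, forecasting) can be sketched on a toy series. The monthly sales below are made up: steady growth of 2 units per month plus a bump of 30 every December. The moving-average approach and the helper names are one simple choice among many, not the specific method used in the course.

```python
from statistics import mean

# 1. Data collection: two years of hypothetical monthly sales.
sales = [100 + 2 * t + (30 if t % 12 == 11 else 0) for t in range(24)]


def moving_average(series, window=12):
    """Average over each run of `window` consecutive points."""
    return [mean(series[i:i + window]) for i in range(len(series) - window + 1)]


# 2. Trend analysis: a 12-month moving average smooths out the seasonal bump,
#    and its average step gives the growth per month.
trend = moving_average(sales)
slope = mean(b - a for a, b in zip(trend, trend[1:]))

# 3. Seasonal pattern: remove the trend, then average each calendar month.
detrended = [y - slope * t for t, y in enumerate(sales)]
overall = mean(detrended)
seasonal = [mean(detrended[m::12]) - overall for m in range(12)]
peak_month = max(range(12), key=lambda m: seasonal[m])  # 0 = January


# 4. Forecasting: level + trend + seasonal effect for a future month index t.
def forecast(t):
    return overall + slope * t + seasonal[t % 12]


print(f"growth per month: {slope:.2f}")
print(f"peak month index: {peak_month}")
print(f"forecast for next January (t=24): {forecast(24):.1f}")
print(f"forecast for next December (t=35): {forecast(35):.1f}")
```

Real sales data is noisy, so the recovered trend and seasonal effects would only be estimates, and a longer history generally makes them more reliable.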
Speaker 2: Yes. Time series analysis and forecasting we always do through a graph, as you can see on the slide. This is how it looks. Time series means data through time: we analyze where it has gone up, where it has gone down, and how it has behaved. We usually use it in the stock market, where you see all these trends. If you want to invest through an SIP, you will look at how, say, an HDFC Bank SIP or mutual fund has behaved over the last 10 years. Based on that you find the average, what we call the mean, of the returns of that fund. Suppose it was 12%. You say, okay, no problem, I will invest in this. You are making a future prediction based on the past. When you understand the time series of the past, you forecast from it: you invest in that fund for the next 5 years and expect 12 to 15% returns, because you have done the analysis on the time series.
So, data collection: you collect time series data over a specific period, such as the sales figures for a year. I have given the example of an SIP mutual fund here. Then you do trend analysis: you identify the long-term direction or pattern, such as an increasing or decreasing trend in sales, and you find the mean. The average is a very powerful thing if you understand it. Sometimes the fund returned 20%, sometimes 15%, sometimes 10%. So what was the average? Based on the average, you make your decision.
Then seasonal patterns. Suppose you have a clothing brand and you want to research how to bring in more sales. You look for repeating patterns or cycles within the data, such as higher sales during the holiday season or the rainy season. There is an example: in one country, before a cyclone arrives, sales of a particular sweet start increasing. As a data scientist, you sometimes need to use different methods to find the reason why sales are going up or down. There is also a well-known case study on diapers and beer sales in the US: stores kept diapers and beer in the same section, so that people who came to buy diapers would buy beer as well. That is a different kind of correlation. But as a data scientist, you always have to observe the data, find new insights and hidden patterns in it, and make changes that bring more sales or better performance.
Then forecasting: using statistical models to predict future values based on the time series behavior and historical data. As I said earlier, you invest in an SIP based on its performance over 10 years. You find the patterns and insights in the last 10 years, and then you invest for the next 5 years expecting an average return of 10 to 12 percent. This is how you use the last 10 years to make a decision about the coming years, and that track record gives credibility to that fund or to anything else. So that is how you can use forecasting. Let us move to the next slide. Thanks.
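The investment example, taking the mean of past returns and projecting it forward, can be written out as a small calculation. The ten annual returns and the starting amount are hypothetical, `project_value` is an illustrative helper, and the sketch treats the investment as a single lump sum rather than monthly SIP instalments. It also assumes the future resembles the past, which real markets do not guarantee.

```python
from statistics import geometric_mean, mean

# Hypothetical annual returns (%) of a fund over the last 10 years.
annual_returns = [20, 15, 10, 8, 12, 14, 9, 11, 13, 8]

# The simple average (mean) the speaker refers to.
average_return = mean(annual_returns)


def project_value(amount, rate_percent, years):
    """Value of `amount` after compounding at `rate_percent` per year."""
    return amount * (1 + rate_percent / 100) ** years


projected = project_value(100_000, average_return, 5)
print(f"average annual return: {average_return}%")
print(f"projected value after 5 years: {projected:,.2f}")

# The compound (geometric) average growth rate is never higher than the
# simple average, so it gives a more cautious projection.
compound_rate = (geometric_mean(1 + r / 100 for r in annual_returns) - 1) * 100
print(f"compound average return: {compound_rate:.2f}%")
```

Because returns compound, the geometric average is usually the safer number to plan with when the yearly returns vary a lot.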
Speaker 1: Yeah, sure. Now we are going to see the final thing: the practical applications and use cases we have for statistical analysis. First is healthcare, where we analyze patient data to improve diagnosis, treatment, and outcomes. If we can find out in advance what is likely to happen, it is very useful for doctors, who can treat the patient before anything becomes major. Next is finance, where we predict market trends, manage risk, and optimize investment strategy. Because we already know the pattern and what is likely to happen, we can safeguard the investor's money so that it does not go to waste. Next is marketing: understanding customer behavior, targeting audiences, and measuring campaign effectiveness. Let me put it simply. A person who needs data science will watch this entire session, so that person is our audience. We pick the audience who needs this. There is no point in selling this course to a person who does not like or need anything from a data science role. That is marketing. Then e-commerce: personalizing recommendations, optimizing product prices, and forecasting demand, which we have explained before. For example, we are all a group of people learning data science, so there is a high possibility that you will all get recommendations for data science books. So these are the practical applications and use cases that we have for statistical analysis. Sir, over to you.
Speaker 2: Yes. This is very important: the practical applications and use cases. You can use it in healthcare, finance, marketing, and e-commerce. In healthcare, that means analyzing patient data, improving diagnosis, treatment, and outcomes, and doing real-time analysis. You can build an application that monitors patients on behalf of the doctor, because there are very few doctors in India. Suppose you have 100 patients under treatment in a hospital and only a few doctors. How can a doctor decide which patient needs attention first? AI can monitor all the patients in real time and send the diagnosis or a signal to the doctor through an application. You can do any number of innovations using data science and AI models to optimize healthcare.
Then we have finance: predicting trends, managing risk, and optimizing investment strategies. You can build systems and applications for risk profiling, where you ask certain questions that help the algorithm learn whether a person is high risk, medium risk, or low risk. You can predict those trends, and if somebody is investing, work out how he can get the best returns on the investment. In marketing, you can get better conversion rates and sales by running effective campaigns and understanding customer behavior. In e-commerce, you know Amazon: all the personalized recommendations, product price optimization, and demand forecasting happen there. You can develop applications, case studies, and use cases based on these and any number of other fields. Wherever you think data can make sense of and optimize a particular field or industry, you can apply data science and develop an application for it. So, let us conclude this session. Thanks.
Speaker 1: Yeah. So that is everything, guys. This is the statistical analysis that we have covered. Thank you so much for joining, and we will meet in the next session. I hope you have enjoyed this session. If you have any doubts, or there is anything we need to adjust or improve, just put it in the comment section. Everything is welcome. Just make sure you have everything clear; if not, put it in the comments and we are here to take it up. And please subscribe to our channel, because this is going to be a year-long program and a very interesting one. That is it from our side, guys. Thank you so much. Thank you for your time, and we will meet in the next session. Bye, guys.
Speaker 2: Thank you so much. Go through all the sessions 5 to 10 times. Absorb the information, digest it, build something great, and solve the world's big problems through data science and AI research. Develop innovative startups, become an entrepreneur, or build your career through these sessions. Give us feedback on how you are learning, how you are improving, how we can improve, what you have learned, and where you have applied it. If you have any projects or assignments, send them to us at connect at the Noble transmission hub.com. Write in the comment section, subscribe to the channel, and give us feedback. Thank you so much.