Speaker 1: Hello, good evening. So today I'm going to be having a data analysis session with you guys using the SAS software. This video is basically for SAS beginners. So let me go ahead and share my screen. So here's my screen, and I'm really working with actual data here, because I'm trying to analyze data for my dissertation. So let's get a new page to start this data analysis session. You go to New on the left-hand side of your SAS software, you see, and you click New. Then, if you already have your data as an Excel spreadsheet and you're importing it, you click on Import Data. But if you don't, and you want to put it into the software manually, you just click on SAS Program. You click this, and it creates a new page; you see the blank page. So this is it. So let's start. For beginners, you start by saying data, and you call your data any name. So let me say I call my data, my data; you can call it any name, that's it. Whatever data you're working on, you give it a name, and that's the name you're going to use. Then you put your semicolon. Do not forget to put a semicolon; if you don't, when you run the data, it's going to show an error. So we go down, and you type in input, because you want to put in your variables. So input comes with the variables. So let me say I'm using three participants, with their names. Let me use names; normally we're not supposed to have names in data, we're supposed to create an ID for them. But let me just use name to make this easier for you to understand. So name: since name is categorical data, we're going to put a dollar sign, because it's not numeric.
Then we have age, where I'm not going to put a dollar sign, because age is a numeric variable. Then we'll look at education level, let's say edu level; I don't need to spell it out, let's say edu level, which is education level. Then the other variable will be, okay, we're looking at high blood pressure among males. So we're using education level, age. So let's use high blood pressure. We can write high blood pressure as, let's say, HBP, so high blood pressure numbers. Since high blood pressure is numeric, the numbers are numeric, we don't need to put a dollar sign on it. Then we put our semicolon. So we're working with how many variables? One, two, three, four. So four variables. So when we run this data, we should see four variables in our log. Then you type in cards, and you put your semicolon. Cards means you want to enter your data now. So the names: let's say we have Peter, which is our first participant, then our second participant, let's say Adrian, then our third participant should be who? Let's say John; John is our third participant. So for Peter, let's do Peter's age. Make sure you use your space bar, do not use tab, or you won't get your rows. So we use a space. His age is what? Let's give Peter 40 years. Then for Adrian, a space; Adrian is going to be 50. Oh, you see, the age column for Peter and Adrian is not lined up. So we have to shift the 40; age needs to be in the same column. Okay, so let's move to John. For John, we space, space to the same column, and let's give John 55 years of age. Then we go ahead. After age, what do we have? Education level. So space, education level: let's say Peter is high school, so we put in high school. And Adrian, high school as well; we put in high school. Then John, let's say college. So college degree; let me just say college. Then after that, we go to the BP numbers. So for Peter, let's say Peter has an HBP number of 160.
Then for Adrian, let's say he has 155, which is high, because we know normal blood pressure is 120 and below. So let's say John has a normal blood pressure number, 119. Then you press enter, and we run. Remember to put a semicolon before you run your data. So let's run this data. Good job. So when we run this data, this is our input. Now let's come to the log. The log should tell you that we have what? Three observations. Right here it says three observations and four variables. Three observations means three participants, which is what we have here: one, two, three. And the four variables are name, age, education level, and high blood pressure. So if this log is correct, there is no error, and you're good to go. So this is our data. We got that, and our observations are good. And this is our output, and this is what it looks like. Now we want to run a frequency, to know how many participants have high blood pressure and how many do not have high blood pressure. From the data, you can already tell that two participants have high blood pressure. So let's run it. You do Proc Freq; Proc Freq, that is frequency. Proc Freq, and you put in the data, data equals to Freq, you put a semicolon, and you
Speaker 2: run, okay?
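The DATA step described up to this point can be sketched roughly as follows. This is a minimal sketch, not the exact program from the recording: the dataset name `mydata` and the variable names `edulevel` and `hbp` are assumptions, `HighSchool` is written as one word because list input treats a space as the end of a value, and the `:$12.` informat is an addition so the text is not cut off at the default length of 8 characters.

```sas
/* Hypothetical names: mydata, edulevel, hbp (not taken from the recording). */
data mydata;
    /* A $ after a variable name marks it as character (categorical).  */
    /* age and hbp have no $, so they are read as numeric.             */
    input name $ age edulevel :$12. hbp;
    cards;
Peter  40 HighSchool 160
Adrian 50 HighSchool 155
John   55 College    119
;
run;
```

With these three data lines and four variables in the `input` statement, the log should report 3 observations and 4 variables, which matches what the speaker checks for.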
Speaker 3: So Proc Freq, data, equals to Freq.
Speaker 1: So Proc Freq, which means frequency, then you say data equals Freq, which is this, right here; you bring it down here. That is what it looks like. If you put a different data name, you're going to get an error. Then we run this. So we got our data. So like I said, in the frequency table we'll see our frequencies. So, high blood pressure: this is what our frequency table looks like.
Speaker 3: So you run your data, and this is the result.
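The frequency step the speakers describe might look like the sketch below. The dataset name `mydata` and the variable names are assumptions made for illustration; the small DATA step is repeated only so the example runs on its own.

```sas
/* Hypothetical dataset and variable names, for illustration only. */
data mydata;
    input name $ age edulevel :$12. hbp;
    cards;
Peter  40 HighSchool 160
Adrian 50 HighSchool 155
John   55 College    119
;
run;

/* With no TABLES statement, PROC FREQ produces a one-way frequency  */
/* table for every variable in the dataset.                          */
proc freq data=mydata;
run;

/* To limit the output to one variable, name it in a TABLES statement. */
proc freq data=mydata;
    tables hbp;
run;
```

The name after `data=` has to match the name given on the `data` statement; pointing it at a dataset that does not exist produces an error in the log, which is the mistake the speaker warns about.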
Speaker 1: So this is the result you have for your data. You can see that for high blood pressure numbers, 119 comes up once, because I put 119 only once; it's the same thing for the others. So let's go back to the code and change the high blood pressure numbers. We have 160; I'm going to put 160 here as well. So high blood pressure numbers, 160. And education level: high school, high school, and we have college. So I'm going to run this. So you see, for high blood pressure, we now have two people who had 160. So the frequency for 160 is two, and 119 is one person. That's our frequency for high blood pressure. Then our frequency for age: they all have different ages, so 40 was only one person, 50 was only one person, and 55 was just one person. That is for age. And for the frequency for name, we don't have any name that comes up twice, so it's one person each. And then the frequency for education level. We're going to go back and see why it's showing the frequency for education level as missing. So now we come back here. You see why it didn't come up? Because education level is categorical data, and I didn't put a dollar sign. Your values are going to be missing if you don't put a dollar sign on a categorical variable. Just like the name: I put a dollar sign because it is categorical data. Your age, no dollar sign, because it's numeric data. Your education level should be what? Categorical data, and it should have a dollar sign. So now let's run it, and we're going to have our data. Go back and run. First, let's run the data step, run this. So I have it. Then let's go and run our Proc Freq. Proc Freq, run, yay. So we've got our education level, you see?
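The missing-values problem and its fix can be reproduced with a sketch like the one below. The dataset names `broken` and `fixed` and the variable names are hypothetical, and `:$12.` is an added informat so longer text values are not truncated. In SAS syntax the dollar sign is written after the variable name in the `input` statement.

```sas
/* Without $, edulevel is read as numeric. The text values are not   */
/* valid numbers, so SAS stores missing values and notes the invalid */
/* data in the log.                                                   */
data broken;
    input name $ age edulevel hbp;
    cards;
Peter  40 HighSchool 160
Adrian 50 HighSchool 160
John   55 College    119
;
run;

/* With $ after the name, edulevel is read as character and kept. */
data fixed;
    input name $ age edulevel :$12. hbp;
    cards;
Peter  40 HighSchool 160
Adrian 50 HighSchool 160
John   55 College    119
;
run;

/* Compare the two education-level tables. */
proc freq data=broken;
    tables edulevel;
run;

proc freq data=fixed;
    tables edulevel;
run;
```

The first table should report the education-level values as missing, while the second should count the high school and college entries, in line with what the speaker sees before and after adding the dollar sign.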
So whenever you're using a categorical variable, make sure you put a dollar sign after the variable name, because it's not numeric. So for our education level, college has a frequency of one person, and for high school, we have two people. So right now, say you want to run your data and see whether education level has an impact on high blood pressure. Because in this data, we can see Peter has a high school education, and Adrian has a high school education, and their blood pressure levels are high. So if I want to check that, okay, we've run Proc Freq, so now, after the frequency, we can go ahead and run Proc Means.
Speaker 2: Proc Means data equals to Preg.
Speaker 1: Okay, data equals to Preg, then we run. You can put it in lowercase, that's fine. Okay, so I made an error there, so that it would not work; see, I purposely made this error. Why? Because there is no semicolon after Preg. So these are the little mistakes you don't want to make when you're running your data. So let's go ahead and run it again. Good. So this is our Means procedure. In this procedure, you see age: the N, we have three people. Then high blood pressure: we have three people who reported blood pressure numbers. Then our mean, which goes along with our age, and you see the standard deviation, the minimum, and the maximum. So this is what you do. Go back to code. So this is for beginners. If you want to analyze your data, you use Proc Freq and you get your frequency table, and you do Proc Means and you get your means table. In my next video, we're going to be showing how to control for age and control for education level. Basically, we'll be running Chi-square. That's if you want to see the relationship between age and high blood pressure, or you want to see whether having a low education level is related to having high blood pressure. So this is what we're going to do in my next video. So stay tuned for my next post on data analysis using SAS.
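The Proc Means step can be sketched as below. As before, the dataset name `mydata` and the variable names are assumptions for illustration, and the DATA step is repeated only to keep the example runnable on its own.

```sas
/* Hypothetical dataset and variable names, for illustration only. */
data mydata;
    input name $ age edulevel :$12. hbp;
    cards;
Peter  40 HighSchool 160
Adrian 50 HighSchool 160
John   55 College    119
;
run;

/* By default PROC MEANS reports N, mean, standard deviation, minimum */
/* and maximum for every numeric variable (here, age and hbp).        */
/* The semicolon after the dataset name is required; leaving it out   */
/* is the error demonstrated in the recording.                        */
proc means data=mydata;
run;
```

Character variables such as `name` and `edulevel` are left out of the means table, which is why only age and the blood pressure variable appear in the output the speaker describes.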
Speaker 2: Stop sharing.