Transform Raw Data to Clean Excel File in 10 Easy Steps
Learn to convert a raw data set into a clean Excel file in 10 steps. Download the sample file and follow along to enhance your Excel skills efficiently.
File
Master Data Cleaning Essentials on Excel in Just 10 Minutes
Added on 09/27/2024
Speakers
add Add new speaker

Speaker 1: In this video we'll go from having a raw data set like this to converting it into a clean excel file in just 10 steps. So let's get into it and you can download this exact same excel file in the video description for free. So over here you can see that we have the data set this is the raw one basically but before we make any changes to it we should probably save a copy so you can either save a new excel file or just save this sheet by clicking the ctrl key and just dragging that sheet so you have two separate ones. All right now we can get started with step one out of ten and first you can see here that these columns are just too short if I stretch them out the number looks okay but if I close it down you get this sign over here. Same thing with the rows some rows are just too long like this one right over here. So let's go ahead and refit this by just going to ctrl a that's going to select the entire table for us and then we'll go over to the side where it says format and you can auto fit the row width and the column width. Now the shortcut for these is just Alt H O I that's going to auto fit the columns and then Alt H O A to auto fit the rows. Now you can see that we can read everything a lot better. Now that we can start to see the data set a bit better you'll notice that the client names are very very long so let's suppose that we would like to shorten it such that we don't have anything within these parentheses. So for this what we can do is select the whole column you can just click up over here and then we're gonna go to the replace tool so that's gonna be over to the side under find and select and just clicking on replace there the shortcut for it is simply ctrl h and so within it what we want to find is anything within the parentheses but what makes it hard here is that there's sometimes of different lengths. So what we can do is just put the parenthesis sign, use this asterisk and close the parenthesis. That basically means that anything that's within a parenthesis is going to remove. So replace with nothing, then we're gonna hit on replace all, hit ok there and just close and now you can see that's looking a lot better. We can double click here to resize. Continuing on with the client names and let's suppose that we want these to be in lowercase so they're a bit easier to read. What we can do is first add a new column so we'll go ahead and select this column D and just hit ctrl shift plus and now that's a new column for us we'll call this a client again and then the function that we're going to use is called lower lower so equals lower and it converts all the letters into lowercase so we're gonna go tab there and we want to convert all of these letters over here so we'll just hit enter there on the first one and then we can double click here on the side to just drag it down all the way to the bottom there. Now because we have the same column twice we could just delete this one so hitting ctrl minus but the problem there is that it all breaks and the reason for that is because it's referencing this column over here so first we're gonna need to paste these as values instead of as a formula as they are now so we'll go ctrl shift down to select all of them ctrl c and then we're gonna paste these as values so down over here you can see that we have a lot of different pasting features and we want to paste it as a value which is this one right here the shortcut there is alt h vv. Now we can go ahead and remove this column so ctrl minus. Next up in number four you can see that we have the different contacts but the problem is here they have some very odd spacing and sometimes they're capitalized sometimes they're not. So let's go ahead and make that change. So we'll go ahead and insert a new column here so ctrl space ctrl shift plus. We'll call this one contact again and so first let's say we want to remove the spacing we can simply use the trim function hit the tab key there and we'll just select this and you can see that it's removed all that spacing problem and similarly we have what's known as equals proper hit the tab key there and just select that and what this one does is that it only puts a capital letter at the first letter of each word so we can now just merge these two together so we've got the proper there but at the very front we're gonna add the trim, hit the tab key there and then we need to close the parentheses for both and just hit enter. Now you can see that it looks cleaned up and we'll just double click there to drag it down and then again we need to paste it as a value so we'll select the column by hitting ctrl space, ctrl c and then alt h vv that's gonna paste it as a value there. Now we can remove this column over here by hitting ctrl minus. And if you're liking this video and want to level up your excel skills you can consider checking out our excel for business and finance course. And what makes this course different is that it's all applied to the real world. While we still cover theoretical lessons like formatting, formulas and charts, we also offer case studies that simulate the type of work you might be assigned in your day-to-day ranging from financial modeling to cleaning a data set and presenting some visual insights. And if you get stuck along the way, you can always ask us, the course instructors, any questions on the discussions forum. We also offer several other courses including Power BI, VBA and Macros, and more. So if you're interested in checking it out, go to the link in the description below. Alright, back to the video. Moving up to step 5 and you can see over here that the department is split into the department name and the region so we would like to separate this into two different columns. So let's go ahead and add a new one just by hitting ctrl shift plus and let's call this one the region. From here what we can use is this trick called text columns. So first let's select the whole area there so ctrl shift down to select all of these then we're gonna head over to data, text to columns so this one right here and then we're just gonna hit on next and here you can see basically this is what it's gonna separate each of these columns by and in our case it's that underscore so we want to head over to other and just type an underscore in there and you'll see under preview what that currently looks like. We'll hit on ok there and then the destination where we want the output to be, we want it to be right here in E3. Once we're all okay with it we're just gonna hit on finish there and you can see that it's split it into two separate columns. Now in step six before we move any further we should try to check if there's any duplicate values. So for this we can select the whole table by hitting ctrl a and then we're gonna go over to data again and it's gonna be this icon right here that says remove duplicates so just click on that we're okay with these our data does have headers and just hit on okay and it says that it's found three duplicates and let's remove them so that looks all cool. All right now in number seven let's move on to the next column which is gonna be the payments one in this case and you can see here that we have some issues in that there's some blank cells now for this it would be nice to not have it as blank and maybe type something like an na but suppose this is a very long data set and so it's going to take a lot of time to go one by one instead what we can do is select the whole table so ctrl a then under home we're going to go over to find and select go to special and within this we want want to select all of the blank cells so we're gonna go over to blanks and just hit on OK and you'll notice there that it selected the blank ones for us from here we want to add an NA sign so we'll go up over here under the formula bar just type NA and then it's very important that we hit the ctrl enter and not just enter there and you can see there that it's changed all of these into NA for us. Great, now moving on to step 8 and over here under column J you'll notice that we have this formula which is simply the profit divided by the revenue. The problem is down over here because we've got text on one side it doesn't quite work and it gives us this error sign. So to work around that we're gonna use the IFERROR formula. So right here at the very front we're gonna type if error hit the top key there and the idea here is that if there is no error it's gonna do the normal calculation we'll hit the comma there now if there is an error we need to give it an alternate result which in our case is the na which we need to put in quotations because it's a formula so we're gonna put it in there close the parentheses and hit enter and then we're going to drag that down. Awesome, now we can see how those signs have changed to an A for us. Moving on to step 9 and now that we're happy with the values let's go ahead and format the header row. So we'll select it by hitting ctrl shift right, let's say we want it boldened so ctrl B and we can also change the highlight colors let's say we go for a dark blue and we change this over here to a white so we can see the header there. And finally in step 10 we can get rid of these grid lines as it can sometimes be a bit distracting. We'll head over to view there and just click on grid lines for that. The shortcut is Alt W V G that's going to activate it again for me so let me just remove them there. Awesome. Now that we've cleaned the data the next step would be to create some visuals out of it and you can learn how to do that with this video over here or by taking our course over here. Hit that like and that subscribe and I'll catch you in the next one.

ai AI Insights
Summary

Generate a brief summary highlighting the main points of the transcript.

Generate
Title

Generate a concise and relevant title for the transcript based on the main themes and content discussed.

Generate
Keywords

Identify and highlight the key words or phrases most relevant to the content of the transcript.

Generate
Enter your query
Sentiments

Analyze the emotional tone of the transcript to determine whether the sentiment is positive, negative, or neutral.

Generate
Quizzes

Create interactive quizzes based on the content of the transcript to test comprehension or engage users.

Generate
{{ secondsToHumanTime(time) }}
Back
Forward
{{ Math.round(speed * 100) / 100 }}x
{{ secondsToHumanTime(duration) }}
close
New speaker
Add speaker
close
Edit speaker
Save changes
close
Share Transcript