Building a Healthcare Data Analyst Portfolio: Top Datasets and Tips
Learn how to create a standout portfolio for healthcare data analysis. Discover key datasets, visualization tips, and resources to enhance your skills.
Portfolio Ideas for Healthcare Data Analysts that will LAND YOU A JOB
Added on 09/07/2024

Speaker 1: If you're trying to get a job as a data analyst in healthcare, it can often be really hard to know where to even start if you're trying to build a relevant portfolio. That's because you cannot share data that could breach patient confidentiality, and when sharing data from a hospital or clinic, there are also proprietary considerations to keep in mind. So what does that leave us with? Well, despite the many protections on healthcare data, there's still an abundance of data out there that you can use to build some awesome portfolios. I'm going to be covering some of my favorites today.
Now before we get into the weeds, I just want to first cover the difference between healthcare data and public health data. Healthcare data is largely concerned with monitoring and managing health problems at the micro level. This would be things like monitoring the day-to-day workings of a hospital and its patients, like how effective was our treatment of cancer patients, or how effectively did we move patients through the emergency department. Healthcare data is more scarce because it tends to deal with more sensitive information. Public health data, on the other hand, is more concerned with the study and surveillance of health problems and solutions at the macro level. Here, we are dealing with large regions of people, like in cities, states, or countries. Because you're dealing with large geographic regions instead of individual patients or clinics, public health data is much easier to find. But whether you're dealing with public health data or healthcare data, both are going to be solid choices for you, so I'm going to talk about each of them today.
Now let's talk about datasets. My first recommendation is Causes of Death, Our World in Data. You can find this dataset on kaggle.com. This dataset explores the number of deaths by cause in each country.
In this dataset, you will find that people in developed countries tend to die of conditions related to old age and natural causes, like heart disease or Alzheimer's, whereas the leading causes of death in developing countries tend to be things like HIV or malaria. This is an eye-opening dataset that really lends itself well to building some advanced visualizations called Pareto charts. I actually have a video already where I explain what Pareto charts are and how to build them using this dataset, so check out that video if you haven't already.
Next, we have data.cms.gov. CMS is the Centers for Medicare and Medicaid Services, and they have tons of free data on their website. One of my favorites is the Hospital Compare data. Hospital Compare is a way of comparing one hospital to another to see what things they're really good at and what things they have opportunities to improve on, things like patient safety and quality of care. They have an overall star rating for each hospital, which looks at things like how many patients got infections while they were in the hospital, or how long patients had to wait before they got treated in the emergency department. Now, the downside to this data is that it uses a lot of jargon that you might not be familiar with at first, things like SIR or PSI-90 or fee-for-service. I'll be releasing more videos in the future about this. Just know that you might have to do some research ahead of time to really understand what these things are getting at and how to best visualize that data in your portfolio.
If you're looking for something a little bit more user-friendly, check out the CMS patient survey rating. This measures things like what percentage of patients said that the hospital was always quiet at night, or what percentage of patients said that the doctors always treated them with respect. I'm going to have all these links in the description down below, so do check that out.
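Going back to the Pareto charts mentioned for the Causes of Death dataset: the calculation behind one is just a sort plus a running percentage. The snippet below is a minimal Python sketch of that calculation, not the Tableau workflow from the video. The function name `pareto_table` and the example numbers are made up for illustration and are not taken from the dataset.

```python
def pareto_table(counts):
    """Sort categories by count (largest first) and add a cumulative percentage.

    `counts` maps a category name (for example, a cause of death) to a count.
    Returns a list of (category, count, cumulative_percent) tuples.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")

    rows = []
    running = 0
    for cause, deaths in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += deaths
        rows.append((cause, deaths, round(100 * running / total, 1)))
    return rows


# Made-up numbers for illustration only; they are not from the real dataset.
example_counts = {"Cause A": 50, "Cause B": 30, "Cause C": 15, "Cause D": 5}

for cause, deaths, cumulative_pct in pareto_table(example_counts):
    print(f"{cause:<8} {deaths:>4} {cumulative_pct:>6}%")
```

In a chart, the sorted counts would become the bars and the cumulative percentage would become the line drawn over them.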
If you'd rather browse the CMS datasets on your own, you can go to this URL, scroll down to hospitals, and then you can look up datasets like the Hospital Readmissions Reduction Program, hospital-acquired conditions, and more.
Now I'm going to talk about one of my all-time favorite sources of data, and that's #ProjectHealthViz. When you're done watching this video, I fully encourage you to go straight to Tableau Public and type #ProjectHealthViz into the search bar. You're going to be overwhelmed by how much cool stuff there is on there. This is one of the coolest sources of data I think you're going to find, and it's not just one dataset, it's a collection of datasets. Project Health Viz was developed by Lindsay Betzendahl, who was working in the healthcare field and realized that there was a real lack of healthcare data being used in the Tableau community for dashboard development. So she decided to do something about that.
Here's how it works. Almost every month, Lindsay publishes a dataset. That dataset matches one of the following six themes: national monthly awareness, healthcare systems, diseases, quantified self, health equity, and public health. Quantified self, by the way, is where people provide their own data. For example, Lindsay made her own sleep data available to the public for analysis and visualization. To get to these datasets, go to her website at vizzendata.com. Click on health and healthcare datasets, and there you will find all of the data that she has collected over the past few years. You can also follow her on LinkedIn or Twitter to see all of the new data that she's publishing each month.
Now, if you're building a portfolio using data from Project Health Viz, I want you to pick one that corresponds to equity. What is equity, you might ask? Well, equity is ensuring that each person, regardless of their background, can reach their full health potential.
For example, as of this recording, there is a brand new dataset that's been published called the Lown Hospitals Index for Equity 2022. This looks at how fairly hospitals pay their employees, how much charitable care the hospital provides to the community, and how inclusive that hospital is of patients from diverse backgrounds. Now, you might be asking, why am I focusing on equity? Well, because many healthcare organizations around the country are really starting to recognize how extremely important this subject is. So if you have a portfolio entry about equity, and you can speak to it in some way, you are going to look awesome during your interview.
But in general, I like to find a dataset that's interesting to me in some way. Then I see what other people have done on Tableau Public. I borrow from some of the best concepts that I see, and then I recreate them with my own personal touch. And I think that's one of the quickest ways that you can learn data visualization. Project Health Viz has inspired some of the coolest data visualizations I have ever seen. So I highly recommend that you check that out when you get the chance.
Last, we have Real World Fake Data by Sons of Hierarchies. Wait a second, did Josh just say fake data? Yes, I did. If you want to build a public-facing dashboard that emulates what a hospital dashboard is really going to look like, you are going to have to use fake data, because otherwise that data could potentially identify individual patients, and that would breach patient confidentiality laws, known as HIPAA. Real World Fake Data is a good solution for this. They have a dataset labeled Healthcare-Emergency Room. You can access this on data.world. The data has things like patient age, name, satisfaction score with the hospital, and how long they waited in the emergency room to get treated, among other things. Now, one limitation to this dataset is that it's not very real-looking.
In reality, your data in the emergency room is going to look way more variable than this. Here's one example. In this dataset, you'll find that there weren't any patients over the age of 79. In reality, you're going to have some 80-year-olds, some 90-year-olds, and even some 100-year-olds. There are other things about this data that are unrealistic, but for the most part, it's a pretty good mock-up of what you might expect to see on a hospital dashboard. There are some really awesome-looking dashboards on Tableau Public that use this data, and they will inspire you to make your own awesome dashboards. Speaking of which, here's my own dashboard that I have built off of this data. If you want to build this, check out this video next.
I hope this inspired you to build an awesome portfolio. If you feel like your data visualization skills aren't that great yet, keep at it. I promise you that you are going to get better with practice. Thank you so much for watching. I'll see you in another video.
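The age limitation described for the fake emergency room data (no patients over 79) is the kind of thing a quick range check can surface before you build a dashboard. The snippet below is a minimal Python sketch of such a check. The function name `age_summary`, the cutoff of 80, and the sample ages are assumptions made up for illustration; they are not the dataset's actual values or column names.

```python
def age_summary(ages, cutoff=80):
    """Return the min age, max age, and share of patients at or above `cutoff`."""
    ages = list(ages)
    if not ages:
        raise ValueError("ages is empty")
    older = sum(1 for age in ages if age >= cutoff)
    return {
        "min": min(ages),
        "max": max(ages),
        "share_at_or_above_cutoff": older / len(ages),
    }


# Made-up ages for illustration only; they are not rows from the real file.
sample_ages = [3, 27, 45, 61, 79]

summary = age_summary(sample_ages)
print(summary)
if summary["share_at_or_above_cutoff"] == 0:
    # A real emergency department would normally see some patients aged 80+.
    print("No patients aged 80 or older: the age range looks truncated.")
```

Calling out a finding like this in a portfolio write-up shows that you checked the data before visualizing it.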
