Survey Scale Choices: Likert, Points, and Prioritization (Full Transcript)

Guidance for new researchers on choosing rating scales, midpoints, 5 vs 7 points, and better prioritization methods like semantic differential, rank order, and constant-sum.

[00:00:00] Speaker 1: Hi, friends. I'm Katherine Korostoff, and I'd like to welcome you to another episode of Conversations for Research Rockstars. And today, I'd like to talk about something that I've been hearing from some of our students who are involved in survey research. And survey research is a topic that's near and dear to my heart. That's how I got my start in market research years ago. And those of you who are Research Rockstar students may be familiar with some of the survey research classes that we have at Research Rockstar on questionnaire design and project management, report writing, and so forth. There are a lot of topics in the world of survey research. But one of the questions that I've been getting lately is coming from a very specific context. And that's the context of people who are being asked to design their first professional level survey. They are being asked to do a questionnaire design from scratch, and they aren't necessarily being given a lot of direction. So they may know what the research objectives are. They may have some ideas about how they're going to approach. But they haven't really been given a lot of direction. They might not have even had training yet on how to design professional quality questionnaires. Maybe what they've been given are just a couple of examples. It's a common scenario. Somebody says, I need you to design a questionnaire. I know you haven't done it before, but here's some examples of ones we've done in the past. Now, that's great that they're giving you examples, right? But it still puts you in this uncomfortable situation because, well, do I know for sure that those examples are great questionnaires? Could there have been opportunities to improve there? And I see things in that questionnaire that maybe I wouldn't have thought would be appropriate if I'm trying to collect high quality data. But I don't know what the theory is behind it. I don't know why the question was asked this way. So I know it can be really stressful if you're being asked to design a professional quality questionnaire, but you haven't had time yet to have any training. You're just being given some examples, and maybe you're not even sure if those examples are awesome. Let me talk about one of the biggest issues that I see new researchers getting into that can unfortunately derail their questionnaire design process. When we're designing questionnaires, of course, one of the things we have to do is write questions. And we know that the questions have to support our stated project objectives. What I find is that a lot of times, even newer researchers do a great job of writing the question. It's the answer options where they stumble. Now, in survey research, obviously, we are collecting data that's fairly structured. So our questions are typically closed-ended questions. We're not doing a lot of open-ended questions. In an open-ended question, we're just asking a question and then adding a text box. I don't have to give answer options. Now, in typical surveys, we usually won't have more than two or three open-ended questions like that because they take more cognitive effort. And frankly, compliance with those questions isn't great unless you're doing telephone data collection. Usually, compliance on telephone data collection is good. But if you're doing online surveys, a lot of people really don't answer open-ended questions very thoughtfully. And that creates the issue of having to clean the data or just having to accept low compliance. 
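To make the open-ended versus closed-ended distinction above concrete, here is a minimal sketch of how the two question types might be represented. The dictionary structure, field names, and wording are illustrative assumptions, not any particular survey platform's format.

```python
# Minimal, illustrative representations of the two question types discussed above.
# The field names and wording are assumptions made for this example only.

open_ended_question = {
    "text": "What do you like most about your automatic opening trash can?",
    "type": "open_ended",      # respondent types free text into a box
    "answer_options": None,    # no predefined options to analyze
}

closed_ended_question = {
    "text": "Which of the following features do you use? (Check all that apply)",
    "type": "closed_ended",    # respondent picks from fixed options
    "answer_options": ["Automatic lid", "Odor filter", "Foot pedal backup"],
}

# Closed-ended answers arrive as structured data and need little cleaning;
# open-ended answers arrive as free text, which is one reason the transcript
# suggests keeping them to only two or three per survey.
```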
So in survey research, most of our questions are closed-ended questions. Now, sometimes they're just very simple, check all that apply questions or check one questions. Sometimes the questions are very simple. But usually, we're doing something a little bit more. Usually, we're using this questionnaire to measure attitudes, perceptions, reactions to a concept, et cetera. So it's not just binary. It's not just yes or no. It's not just checking one of multiple boxes. It's something that's more nuanced. And that's when we start talking about rating scales. Now, when we talk about rating scales and questionnaire design, it's a really big topic. We have two whole courses. We have Questionnaire Design 101 and 201. They get into rating scales. So it's a really big topic. But I want to point out some of the things that I think do trip people up, especially if it is your first time having to do a professional quality questionnaire. So the first thing I want to talk about is a little bit about scale jargon. And when you look at a questionnaire, and you might see a question like, please indicate your level of satisfaction with your automatic opening trash can. So you work for a company that makes automatic opening trash cans, and you're doing a satisfaction survey. OK, well, that seems like a pretty straightforward question. Please indicate your level of satisfaction. Great. But then how do you provide the answer options for that? Do you do it as a rating scale, where it is rating from very dissatisfied to very satisfied? Or maybe do a rating scale that is what we would call unipolar. So it goes from not at all satisfied to extremely satisfied. So there are different ways of structuring that rating scale. It seems very simple, but there are two different options. So in the case of the bipolar scale, I might be going from the lowest score of very dissatisfied, somewhat dissatisfied, neutral, somewhat satisfied, very satisfied. So that's my rating scale. But again, I could have done it that way, or I could have done it the unipolar way. What would your preference be? What would your gut be? Now, there are different opinions. There is a lot of research on the efficacy of both approaches. But a lot of researchers do prefer the bipolar approach as a way to avoid any potential acquiescence bias. For example, when you've got the unipolar approach, you can see that the word satisfied is in every single answer option. Whereas with the bipolar approach, we're more encouraging or inviting extreme opinions. And so that's why some researchers do prefer bipolar. But again, it's a choice, and it's a choice that you might not have ever learned. The other thing about scales that is important is that there are considerations. We have to think about whether or not it's unipolar or bipolar, but we also wanna think about how many points do we want on that scale. Now, in a lot of online surveys, you see that the default is five points. Is five points always the right number of points? No, and there's a lot of research on research on this. And for those of you who are taking questionnaire design 201, you've seen a lot of the links to some of those different experiments cited in the course. There is a lot of data that suggests that in different situations, either longer or shorter scales are going to be appropriate. We also have to make decisions about scale direction. Are we going low to high or high to low? 
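To make the bipolar versus unipolar choice described above concrete, here is a minimal sketch of the two label sets for the trash-can satisfaction question. The intermediate unipolar labels are assumptions for illustration, since only the endpoints are spelled out above.

```python
# Two ways to label a 5-point satisfaction scale for the same question.
# The unipolar middle labels are illustrative assumptions.

bipolar_satisfaction = {
    "question": "Please indicate your level of satisfaction with your automatic opening trash can.",
    "options": [
        "Very dissatisfied",
        "Somewhat dissatisfied",
        "Neutral",
        "Somewhat satisfied",
        "Very satisfied",
    ],
}

unipolar_satisfaction = {
    "question": "Please indicate your level of satisfaction with your automatic opening trash can.",
    "options": [
        "Not at all satisfied",
        "Slightly satisfied",
        "Moderately satisfied",
        "Very satisfied",
        "Extremely satisfied",
    ],
}

# Note how every unipolar label repeats the word "satisfied", which is one
# reason some researchers prefer the bipolar version to limit acquiescence bias.
```

Scale direction, whether the options run low to high or high to low, is a separate choice layered on top of either labeling approach.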
Now, that is a big debate I don't have time to get into today, but there again is a lot of research on what is going to be best when measuring customer attitudes and opinions. And then of course, there's debates about how to label scales. So the whole scales topic is really, really big, but I wanna just focus for the next little bit on what the scale options are. Now, when we talk about rating scales, one of the phrases that you'll often hear is Likert scale. And Likert scales are a type of rating scale. And a lot of people sort of assume that all rating scales are Likert scales, but they're not. The Likert scale is actually named for a very specific gentleman who was doing very specific research about measuring attitudes. And so the common five point scale that you see described as being a Likert scale typically is an agreement scale. So from strongly disagree to strongly agree. So you're showing people a number of statements. Tell us your level of agreement with these statements about your automatic trash can. And people for each statement are gonna tell us whether they strongly disagree, disagree, neither agree nor disagree, agree, or strongly agree. And that would be a rating scale. Now, I do wanna share one little piece of research trivia with you. It is commonly pronounced "LIE-kert," but actually if you ever hear somebody call it a "LICK-ert" scale, they are factually correct. The gentleman for whom the scale was named, his name actually was Rensis Likert, and he was a very famous social psychologist. And he's the one who was the first to create the scale and document its use. And it really became very popular in both psychology and in market research. But whatever you like to call it, "LIE-kert" or "LICK-ert," it is a very common type of scale. And there are a lot of decisions that we have to make even once we decide that that's the kind of scale that we're going to use. One of the big decisions we have to make is about how many points. Am I going to be doing a four point scale, five point, six, seven, 10 or 11 point scales? So if you've done net promoter score research, you've probably seen 11 point scales. Outside of net promoter score, we typically don't use such long scales. So for a lot of researchers, the real debate is, is it gonna be four points, five points, six or seven? And that, of course, gets into the dilemma of, do I want an even number of points or an odd number of points? This, again, is almost a religious debate amongst people who do survey research. But there is a lot of existing research on research that I always encourage people to read. And based on my experience and my takeaways from the research, I feel very confident that in most cases, an odd scale is gonna give me the most accurate data. Because if I take away that midpoint and I force people to have an opinion, I might be forcing an opinion on something they truly don't have an opinion about. I mean, you might be measuring satisfaction with automatic trash cans, and you know what? There might be some satisfaction aspects that people really don't care about. Their honest answer is neutral. And so forcing them to have either a positive or a negative response is basically forcing them to report something that's not true. I'm in the camp that says in most cases, not all cases, but in most cases, I do prefer an odd scale. So I would choose a five or a seven point scale because I believe that giving people a midpoint is important to data quality.
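As a small illustration of the agreement-scale labels and the midpoint question above, here is a sketch. The 7-point label wording is an assumption for illustration, since only the classic 5-point agreement labels are given above.

```python
# Likert-type agreement labels and a quick check for whether a scale length
# leaves a true neutral midpoint. The 7-point wording is an assumption.

AGREEMENT_LABELS = {
    5: ["Strongly disagree", "Disagree",
        "Neither agree nor disagree",
        "Agree", "Strongly agree"],
    7: ["Strongly disagree", "Disagree", "Somewhat disagree",
        "Neither agree nor disagree",
        "Somewhat agree", "Agree", "Strongly agree"],
}

def has_midpoint(n_points: int) -> bool:
    """Odd-length scales keep a neutral midpoint; even-length scales force a lean."""
    return n_points % 2 == 1

for n in (4, 5, 6, 7):
    verdict = "has a neutral midpoint" if has_midpoint(n) else "forces respondents to lean one way"
    print(f"{n}-point scale: {verdict}")
```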
Now, some skeptics will say, oh, then everybody's just gonna always pick the midpoint and you're not gonna get any data. They're just gonna click that middle point all through an online survey. Or if they're doing a telephone survey, they'll always pick three instead of anything else. Actually, that's not been my experience. You will get some people like that, but mostly that's not the case. I've never actually had a project where I found that there was an inordinate number of people who were taking the easy way out by just doing the midpoint. Now, I will say that many years ago, I used to do more paper surveys, and I did see more of it then. But with online surveys and phone surveys, it really hasn't been an issue for me. So you do have to make that decision though. And when you are looking at the samples, if your colleagues have given you samples of questionnaires from the past that they've used, note: did they use even or odd scales? If they used even scales, ask them why. Maybe there was a particular reason, and since we're always happy to learn, we should listen. But if they don't have a strong reason for why they took out the midpoint, it might be a good idea to consider putting it back in. So it is an important decision we have to make with rating scales. Are we doing even or odd? And how many points are we gonna have on that scale? Now, on to the question of five-point versus seven-point scales. When you look at a lot of online questionnaires, especially, you do tend to see a lot of five-point scales. And five-point rating scales, again, are super common. However, depending on what your topic is, you could be better off with a seven-point scale. There have been a number of research studies that actually showed that in cases where you have a topic where people do have a lot of variability in their opinions, they will actually use all seven points. And that's important. Now I'm collecting more granular data. Let's say I'm asking people about their experience with their automatic trash cans. I might be asking them for their overall satisfaction, satisfaction with the weight, satisfaction with certain aesthetic aspects of it, satisfaction with the reliability with which it does automatically open. I could be asking about their satisfaction with a number of different things. And if those are things people really care about, and I give them a seven-point scale, I'll see it because people will use all seven points. But if I'm collecting my data with seven points and I can really see that the answers are clustered, people are just staying neutral or people are staying at both ends, maybe there's not that much variability and I could have done five points instead. But if you're doing a topic where you think people are gonna have a lot of variability in their attitudes and perceptions, why not give them seven points? It's not like it's a difficult task. Cognitively, in an online or telephone survey, the difference between completing a five-point scale question and a seven-point scale question really isn't that big a deal. But for us as the researchers, it can be a big deal because now I'm getting more granular data. I might get more interesting patterns out of that data. And so that's great. And then if I think there's gonna be a lot of variability and there's not, well, then when I get to data analysis, I can always consolidate down to five for the purpose of reporting.
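One way to act on the "are people actually using all seven points?" question above is to look at the response distribution after fielding. The sketch below uses made-up ratings purely for illustration.

```python
# Check how much of a 7-point scale's granularity respondents actually used.
# The ratings below are fabricated for illustration only.

from collections import Counter

ratings = [7, 6, 6, 5, 7, 4, 3, 6, 5, 7, 2, 6, 5, 4, 7, 6]  # hypothetical 1-7 answers

counts = Counter(ratings)
print(f"Distinct points used: {len(counts)} of 7")
for point in range(1, 8):
    n = counts.get(point, 0)
    print(f"{point}: {'#' * n:<8} {n / len(ratings):.0%}")

# If answers cluster on the midpoint or the endpoints, a 5-point scale would
# likely have captured the same information; if all seven points get used,
# the extra granularity is earning its keep. For reporting, the 7-point data
# can always be collapsed later (e.g., 1-2, 3-5, 6-7).
```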
But at least I've got the seven-point data in case there is a lot of variability. Which is a long-winded way of saying: if you are doing odd scales, typically you're gonna be choosing between five-point scales and seven-point scales. A lot of research exists that suggests that seven-point scales are actually more accurate, especially in product categories or topics where you are likely to get a lot of variability in opinions. So it is definitely something to consider. But sometimes we're doing research where we need priorities. So we wanna measure attitudes or perceptions, but we also want to prioritize them. So is the Likert scale, however you pronounce it, going to be appropriate for those types of tasks? Well, let's look at an example. In a case like this, where you're showing people a number of attributes and asking them to indicate whether or not they're important, or how satisfied they are with them, there is a risk of being told basically that everything is very important or that they're very satisfied with everything. Believe it or not, people are nice, and sometimes they will default to giving you a positive answer. And if we really need prioritization, we can't let that happen. Well, as you can see in this visual, it is gonna happen. I could have a case where everything is very important. And that to me as a researcher is not awesome because now I have no prioritization. So it's really not going to be an appropriate choice. So what else can I do? And what else can you do that you may not have seen in the sample questionnaires that you were given? One option is the semantic differential scale. The semantic differential scale is another kind of rating scale. It's always bipolar. And here I'm showing you an example of what one might look like. So here we're asking, which of the following words most closely fit product X? And they're being shown a number of pairs: low quality to high quality, typical to unique, et cetera. The answer is going to be how they select a point on the scale between those pairs. Given a choice between low quality and high quality for this product, am I going to mark my opinion at the lower end or at the higher end or someplace in between? But notice that I'm not asking people to agree. That is, I'm not saying, do you agree that this description of the product is accurate? Do you agree that this is a high quality product? It's really making it as objective as possible by presenting it as a bipolar pair. So it's still a rating question. It's just not a question that's being rated in terms of agreement. We're actually putting the attributes into the answer options, not into the questions. With rank order, we get forced priorities. Not everything can be very important, and not everything can be highly satisfying. So with rank order, people do have to put things into a sequence, and each item in that sequence can only exist once. So what I'm showing here now is an example where one is the highest value and each number can only be used once. And in this case: how would you rank your criteria for selecting a laptop computer? And we're gonna rank those criteria from most important to least important. So now I get away from that other issue where everything's very important. I can clearly see what their priorities are. Now, the challenge, of course, with rank order questions is that they're difficult, especially if you have more than five items.
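To illustrate the rank-order rule described above, where each rank can be used only once, here is a minimal validation sketch. The laptop criteria other than weight and webcam quality are invented for the example.

```python
# Validate a rank-order response: ranks must be a permutation of 1..N
# (no ties, no gaps, nothing skipped). Criteria names are illustrative.

def is_valid_ranking(ranks: dict[str, int]) -> bool:
    """True if every rank from 1 to N is used exactly once."""
    return sorted(ranks.values()) == list(range(1, len(ranks) + 1))

laptop_criteria = {
    "Price": 1,            # 1 = most important
    "Battery life": 2,
    "Weight": 3,
    "Screen quality": 4,
    "Webcam quality": 5,   # 5 = least important
}

print(is_valid_ranking(laptop_criteria))            # True
print(is_valid_ranking({"Price": 1, "Weight": 1}))  # False: tied ranks, rank 2 never used
```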
I usually try to keep my rank order questions to no more than five items because it really does take people more time to think through and we don't wanna cause that much fatigue, right? We want people to answer the question without finding it tedious. So when I do rank order questions, I do keep them to no more than five items. Now, I know what some of you are thinking. Okay, so if I do a rank order question, I know feature A is more important than feature B, but I don't know by how much. Are they virtually the same in importance? Or is feature A really important and feature B is not so important? Well, you're right. With a rank order question, we don't know, which is why we sometimes will use a constant sum question instead. And that's what I'm gonna show you now. So you can see in this constant sum question that we're taking a different approach. Using 100 points, please assign a number of points to each of the following five items based on their importance to you when selecting which laptop computer to purchase. If an item has no importance to you, give it a zero. The total across all five factors must equal 100. And of course, if you're doing an online survey, it's self-correcting for that, and they'll get an error if they put in points that add up to 200, right? It'll let them know. Now, we're not only getting the order of importance, but we know how much of a gap there is. And we know if some items are really just not important at all. And the other thing is there can be ties, which could be true. In terms of buying a laptop computer, maybe for me, it's a tie. The number of points I would give the quality of the built-in webcam might be the same number of points that I would give to the weight of the laptop. And that could be valid data. So unlike a rank order question with a constant sum question, I'm not only getting an order, I'm allowing ties, which could be true. I'm allowing zeros, which could be true. And I know the magnitude of the difference between the different items. So hopefully for those of you who have been tasked with doing questionnaires, again, maybe it's your first professional level questionnaire and you haven't had time yet for real training on the topic, hopefully some of these points will make sense to you and will be valuable also as you talk to your colleagues about what is going to be best for collecting the best data for this particular survey project. If you have any questions or comments, please do add them below. And for those of you who are looking to learn more about questionnaire design, please do check out the quantitative courses at researchrockstar.com.
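Finally, here is a minimal sketch of the self-correcting check described above for the constant-sum question, where the allocated points must total exactly 100. The specific items and point values are invented for illustration.

```python
# Constant-sum check: points must be non-negative and total exactly 100.
# Ties and zeros are allowed; only the overall total is constrained.
# Items and values below are made up for illustration.

def is_valid_constant_sum(points: dict[str, int], total: int = 100) -> bool:
    return all(p >= 0 for p in points.values()) and sum(points.values()) == total

allocation = {
    "Price": 40,
    "Battery life": 20,
    "Webcam quality": 15,
    "Weight": 15,          # a tie with webcam quality is perfectly valid data
    "Screen quality": 10,  # a zero here would mean "no importance at all"
}

print(is_valid_constant_sum(allocation))  # True: sums to 100
# Unlike rank order, the gaps carry meaning: price (40 points) matters far
# more to this respondent than screen quality (10 points).
```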

AI Insights
Summary
Katherine Korostoff discusses common challenges faced by new researchers tasked with designing their first professional survey questionnaire with limited guidance. She emphasizes that while novices often write adequate questions, they frequently struggle with constructing high-quality answer options, especially rating scales. She explains key scale decisions (unipolar vs. bipolar satisfaction scales, number of points and even vs. odd, scale direction, and labeling) and clarifies that Likert scales are specifically agreement-based rating scales, not all rating scales. Korostoff argues that odd-numbered scales often improve data quality by allowing true neutrality, and notes research suggesting seven-point scales can yield more accurate, granular data when opinions vary. For situations requiring prioritization (avoiding "everything is important"), she recommends alternatives such as semantic differential scales, rank-order questions (kept to ~5 items to reduce fatigue), and constant-sum questions (allocate 100 points) to capture both order and magnitude of importance, allow ties, and permit zeros.
Title
Designing Better Survey Answer Scales for New Researchers
Keywords
survey research, questionnaire design, rating scales, answer options, Likert scale, bipolar vs unipolar, odd vs even scales, five-point vs seven-point, acquiescence bias, semantic differential, rank order, constant sum, prioritization, data quality
Key Takeaways
  • New survey designers often struggle more with answer options than with question wording; scale construction is critical to data quality.
  • Choose between bipolar and unipolar scales intentionally; bipolar can reduce acquiescence bias by balancing positive/negative wording.
  • Decide scale length and whether to include a midpoint; odd-numbered scales often better capture true neutrality.
  • Seven-point scales can provide more granular and potentially more accurate measurement when attitudes vary; you can later collapse to fewer categories in reporting.
  • Not all rating scales are Likert; Likert specifically refers to agreement scales (strongly disagree–strongly agree).
  • If you need prioritization, avoid simple importance ratings that can produce uniformly high scores; consider semantic differential, rank order, or constant-sum formats.
  • Limit rank-order questions to about five items to minimize respondent fatigue; use constant-sum to capture magnitude differences and allow ties/zeros.
Sentiments
Neutral: Instructional and supportive tone focused on practical guidance for improving questionnaire answer options; no strong positive or negative emotional content beyond acknowledging stress for beginners.