Thinking differently about Likert scales and multiple-choice questions.
Tips, tricks, and thought leadership to improve impact measurement.
Good morning, afternoon, or evening! It’s Saturday, February 15, and I hope everyone enjoyed a lovely Valentine’s Day. Instead of chocolate and roses, we treated ourselves to steak and dove into season 2 of Breaking Bad :)
Coming up next week on our free measurement events calendar ...
February 19: If you work in the world of government and nonprofits, this one is for you!
What you missed this week: A workshop by Rob Brinkerhoff in which he demonstrated a new way to calculate learning impact!
And now, on to our resources… why we should think differently about Likert scales and multiple-choice questions.
Why our current approach to survey design is failing us.
This week’s resources are inspired by a debate I recently had with a client. A debate I know some of you have likely had with your boss or supervisor too. When should we use common Likert scale answer options (like strongly agree, agree, neutral, disagree, and strongly disagree), and when should we use different answer options altogether? More importantly, if our colleagues, clients, or superiors insist on a certain approach to survey design, how can we convince them to think differently?
The answer to this debate lies in the history of Likert scales and standardized testing!
What you probably don’t know about our modern Likert scale methodology is that it is a simplification of, and a significant departure from, a much more complex process for developing survey questions. Prior to 1932, the accepted approach to developing scale questions was to create a large set of potential questions and then present them to a group of expert judges. These judges would then evaluate which questions were most appropriate for the concept being measured. You can imagine the burden of time and labor involved in creating just one survey.
Rensis Likert said, “Wait a minute - there is another way!” He believed that people’s attitudes could generally be “clustered or linked together.” He conducted several studies comparing the original expert-judge methodology (called the Thurstone Scoring Method) with what we now know as the Likert scale methodology, and he found that the simplified Likert method was an equally, if not more, reliable way of evaluating attitudes.
What I’d like to call to your attention is that Likert scales were originally developed to evaluate people’s attitudes and opinions, and they were regularly used in marketing research. However, the way we commonly use Likert scales today goes far beyond exploring attitudes and has expanded beyond marketing research into many other domains, including learning and development.
Another piece of history that influences our modern-day measurement and evaluation practice is standardized testing. Those of us who are DIYing our own assessment and survey questions to explore the outcomes of learning rely heavily on Likert and multiple-choice questions. While these questions may be the easiest to craft, they are often the least useful question format for evaluating growth or change and for helping us make wise decisions around future actions, initiatives, and investments.
Likert scales are designed to help us understand people’s attitudes and opinions within generalized groups. Standardized tests are designed to categorize people into groups based upon test performance. Historically, standardized tests were used to identify which students had special needs, which citizens were eligible for military service, and, most commonly, which applicants should be admitted to higher education institutions.
One limitation of standardized tests is that they are not great predictors of future performance. How a student performs on a standardized college entrance exam is a weak predictor of how successful that same student will be in their college courses or how likely they are to get a job out of college. Similarly, how an employee performs on a multiple-choice exam that evaluates knowledge is not a great predictor of how well that same employee will perform on the job.
Yet, because Likert scale and multiple-choice questions are the easiest to create, we continue to use them to make decisions about performance and capabilities that are well beyond the scope of what those assessments are designed to measure.
Investing time and money into measurement and evaluation is already a great pain point for most of us. Worse, the current practices we rely on to measure outcomes and results may be wasting the time and precious budget we’ve put into them! Why? Because the data we receive from these surveys isn’t an accurate predictor of performance, and it doesn’t offer useful information to inform our future decisions!
If not traditional Likert scales and multiple-choice questions, then what?
Like Rensis Likert, who said, “There is another way,” I suggest we reevaluate the approach we’re taking to survey design. The common way, the easy way, is not always the approach that will get you the best results.
Here’s the change I invite us to make …
How many of you know what you want to do with the data before you create your survey questions?
My guess is that most of us don’t know IF the data will ever be used, or for what purpose. This is the first opportunity for change! Our data is not likely to be fully useful unless we have a clear vision for how we want it to be used!
It’s worth noting that sometimes we don’t know how we’ll use the data, but we know the data is important to collect. I still invite you to think about one or two ways you can imagine the data being used in the future. Then use that reason to guide your creation of survey questions.
An example to bring this to life!
Every survey or data collection effort we are charged with is situated within a specific context. That context includes the reason the data is being collected and the dynamics surrounding its collection. Context is critical to determining the best question type for our survey questions! So pay close attention to the context as you consider which question type is best in the example below!
Context
I am an expert speechwriting coach who is helping a group of 10 people prepare a TED Talk.
Within our group coaching program, the opening activity asks all participants to evaluate the current version of the opening paragraph of their talk. They are going to evaluate the opening paragraph by reflecting on two criteria: credibility and compellingness.
My goal is to identify which of the 10 people need extra support refining their opening paragraph via a 1-1 coaching call. I only have 5 hours to provide coaching, so I cannot offer everyone extra 1-1 support. I also suspect that not all 10 people will need it!
I need to use the responses to the evaluation questions to categorize participants into three groups:
1. Those who really need support.
2. Those who do not need support.
3. Those who could benefit from support, but it’s not necessary right now.
Which of the two question formats below (option 1 or option 2) gives me better data to help me accomplish my goals?
Option 1
Compelling: The opening paragraph is interesting and notable to someone who doesn’t work in my company or industry.
Strongly agree
Agree
Neutral
Disagree
Strongly Disagree
Credible: The opening paragraph leaves the audience feeling like the speaker has a good amount of experience and is a trustworthy source of information on the topic.
Strongly agree
Agree
Neutral
Disagree
Strongly Disagree
Option 2
Compelling: The opening paragraph is interesting and notable to someone who doesn’t work in my company or industry.
Yes
No
I don’t know.
Credible: The opening paragraph leaves the audience feeling like the speaker has a good amount of experience and is a trustworthy source of information on the topic.
Yes
No
I don’t know.
Which question format is better, option 1 or option 2?
The debate on the best question format!
The best answer options are generally debatable; there is rarely a black-and-white answer! Looking at both question options above, I could absolutely argue for and against each one. The context is what helps us lean into the best option.
So what do you think? Option 1? Or option 2?
I believe option 2 is better. And here’s why:
My goal is to easily sort participants into the three groups above and then allocate my resources wisely to support those who need it most.
If someone answers yes to both the compelling and credible questions, then they feel very confident they’ve met the criteria and are not in need of support with their opening paragraph at the moment.
If a participant answers no, this is an indicator they need help!
If a participant answers “I don’t know,” I could send them an email, or better yet, I could use branching logic in my survey to invite them to share more about why they answered that way. Based upon the response I get via email or via the additional details within the survey, I can determine whether the participant needs coaching or simply resources they can review on their own.
With the traditional Likert scale option, there aren’t clear degrees of difference between strongly agree and agree, or between disagree and strongly disagree. For the context and purpose of the data being collected, it’s better for the user (and for me) to simplify the answer options.
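For those who like to see decision rules spelled out, here’s a minimal sketch of this triage logic in Python. The function name, answer labels, and follow-up actions are my own illustrative assumptions, not features of any particular survey tool:

```python
# A minimal sketch of the triage logic described above.
# Names, labels, and actions are illustrative assumptions,
# not features of any particular survey platform.

def triage(compelling: str, credible: str) -> str:
    """Map yes / no / I don't know answers to a coaching decision."""
    answers = {compelling.strip().lower(), credible.strip().lower()}
    if "no" in answers:
        return "needs support: schedule a 1-1 coaching call"
    if "i don't know" in answers:
        return "could benefit: follow up by email or survey branching"
    return "no support needed right now"

# Hypothetical responses from three participants
responses = {
    "Participant A": ("yes", "yes"),
    "Participant B": ("no", "yes"),
    "Participant C": ("i don't know", "yes"),
}
for name, (compelling, credible) in responses.items():
    print(f"{name}: {triage(compelling, credible)}")
```

Notice that each answer maps directly to an action; no scoring step or judgment call stands between the response and the decision.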
What can you do to apply this thinking to your own evaluation work?
Take a few of your recent evaluation questions - be they Likert, multiple choice, or another format - and evaluate the utility of the data you receive. Of course, the utility of the questions and the answers you receive is dependent upon the goals and purpose of your survey. Consider these questions as you reflect on your survey questions and their answer options:
Is it easy for you to take action with the data obtained from the survey question?
Do you have to do extra work to analyze the data before it can be useful (like in the Option 1 example above, where I have to determine whether there is a meaningful difference between strongly agree and agree before I can decide who gets the extra support)? The sketch after this list shows what that extra work looks like.
Does the user have to do extra work in answering the questions, where a simpler list of answer options would be easier for the user?
Do your answer options give enough detail for the user to respond accurately? Sometimes it’s better to exchange traditional Likert scale options like strongly agree/strongly disagree or never/always for something very specific. Dr. Will Thalheimer wrote a great book called Performance-Focused Learner Surveys that helps us think differently about Likert scale answer options! It’s worth the read!
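To make that “extra work” concrete, here’s a hedged sketch of the analysis Option 1 would force on me before I could act. The numeric mapping and the cutoff are assumptions I’d have to invent and defend; the yes/no/I-don’t-know format skips this step entirely:

```python
# A sketch of the extra analysis Option 1 demands before I can act.
# The 1-5 mapping and the cutoff are assumptions I would have to
# invent and defend; Option 2 avoids this step entirely.

LIKERT_SCORES = {
    "strongly agree": 5,
    "agree": 4,
    "neutral": 3,
    "disagree": 2,
    "strongly disagree": 1,
}

def needs_support(compelling: str, credible: str, cutoff: int = 4) -> bool:
    """Flag a participant if either criterion falls below the cutoff."""
    return (
        LIKERT_SCORES[compelling.lower()] < cutoff
        or LIKERT_SCORES[credible.lower()] < cutoff
    )

# Where exactly is the line between "agree" and "neutral"? The code
# has to pick one, even though the scale never told respondents
# where that line sits.
print(needs_support("agree", "neutral"))          # True
print(needs_support("strongly agree", "agree"))   # False
```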
If you’re struggling to get a colleague, client, or superior to back your choice to move away from traditional Likert scale answer options, show them the difference between Option 1 and Option 2. Invite them to reflect on the utility of the data received. So long as you’re clear on how the data will be utilized, your stakeholders will come around to choosing a more purposeful set of answer options over the traditional Likert approach!
Thank you for reading all the way down! Let us know what you think of this week’s resources! Inspiration and improvement are our goals here at The Weekly Measure :)
See you in your inbox next weekend!
~ Dr. Alaina
Whenever you’re ready, there are two ways I can help you!
Gain greater influence and impact at work here.
Promote your product or business to 1.5K highly engaged learning professionals by sponsoring my newsletter here.