Current opinion research practice for rank ordering the importance of lists of feature or benefit stimuli generally uses one of four approaches:
1. Direct rank ordering
2. Likert scale ratings
3. Q-Sort
4. Conjoint analysis
1. Direct Rank Ordering: Most Internet-based survey platforms offer a direct ranking question format similar to the format shown in FIG. 16. The respondent is presented with a list and asked to order its items from most to least important on the basis of some specified criteria 1602. A variation on this approach is to ask respondents to select their top choices from a list of choices presented on one screen 1612.
Among established off-line research studies, the Hartman Value Profile, FIG. 17, is one example of a standard test that utilizes a direct rank order exercise. Respondents are presented 18 choices 1702 on one page and are asked to rank order the list. To guide respondents through the exercise, the survey form offers a work area to cross out rankings as they are assigned 1712 and the survey form presents the exercise as a “practice” followed by a “final” ranking 1707, all on one sheet.
Limitations of Direct Rank Ordering: Direct rank ordering provides the most precise rank ordering for a short list of choices; however, it is limited in the number of stimuli a respondent can effectively consider in one ranking exercise. In an online survey, a further practical limit is the number of alternatives that can be effectively presented on one screen.
2. Likert Scale Ratings: In 1932, Rensis Likert invented a unidimensional scaling measurement method, called the Likert scale, for use in attitude surveys. It allowed answers that ranged from “strongly disagree” to “strongly agree”; see FIG. 14. While numerous researchers have developed a multitude of scale variations over the past 70 years and almost every Internet survey platform offers an online version of the Likert scale, the basic methodology remains little changed.
A Likert scale is an ordered scale from “most” to “least” of some attribute, generally presented as a five or seven point scale with the mid-point presented as the neutral point on the scale, “neither like nor dislike”. Respondents are asked to rate a number of stimuli. An approximation of the rank order of a set of test stimuli can be inferred from Likert scale rating scores.
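As an illustration of how a rank order can be inferred from rating scores, the following is a minimal sketch that ranks stimuli by their mean rating. The function name, the example stimuli, and the choice of the mean as the summary statistic are assumptions made for illustration; they are not taken from any particular survey platform. Stimuli with equal means share a rank, which shows why the inferred order is only an approximation.

```python
from statistics import mean

def rank_from_likert(ratings):
    """Rank stimuli by mean rating, highest first.

    ratings: dict mapping stimulus name -> list of integer scale scores.
    Returns a list of (rank, stimulus, mean_score); tied means share a rank.
    """
    means = {stimulus: mean(scores) for stimulus, scores in ratings.items()}
    # Sort by descending mean; break ties alphabetically for a stable listing.
    ordered = sorted(means.items(), key=lambda kv: (-kv[1], kv[0]))
    result = []
    prev_mean, prev_rank = None, 0
    for position, (stimulus, score) in enumerate(ordered, start=1):
        rank = prev_rank if score == prev_mean else position
        result.append((rank, stimulus, score))
        prev_mean, prev_rank = score, rank
    return result

# Hypothetical five-point ratings from three respondents.
example = {
    "price": [5, 4, 5],
    "warranty": [4, 4, 4],
    "color": [4, 5, 3],
    "brand": [2, 3, 2],
}
ranking = rank_from_likert(example)
```

In this hypothetical data set, "warranty" and "color" have the same mean rating, so the ratings alone cannot say which of the two is more important.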
Likert scales are likely the most commonly used research question design for collecting data for ranking the importance of test variables. The popularity of Likert scales for marketing research is illustrated by the American Marketing Association publication of the Marketing Scales Handbook. The most recent handbook, Volume III, published in 2001, presents 941 Likert scales selected from articles published in the top marketing journals between 1994 and 1997. Combining this volume with the previous two, researchers have easy access to nearly 2000 Likert scales.
Marketing Scales Handbook: A Compilation of Multi-Item Measures, Volume III. Gordon C. Bruner II, Karen E. James, and Paul J. Hensel, editors.
Limitations of Likert Scale Ratings:
Level of precision: A key limitation of Likert scale ratings is that the level of precision is limited by the number of respondent choices on the scale (generally 5 to 7). Many stimuli may be rated at the same level, thus providing no insight for comparisons among those equally rated stimuli. This limitation is particularly vexing for respondents who answer all rating questions at the high end of the scale or at the low end of the scale, the so-called yea-sayers and naysayers.
Number of rating questions may result in respondent fatigue: A second limitation of Likert scales is the well-recognized potential for diminished quality of response that can result from respondent fatigue from answering a long list of scale questions. Most professional researchers limit the number of Likert scale questions to fewer than 20 in an effort to guard against respondent fatigue. Respondent fatigue is a problem in two respects: 1) respondent fatigue diminishes the quality of response and 2) the reduced quality of response cannot be readily detected, so response sets with reduced quality cannot be purged from the study results.
Rating frame drift and order bias: There is a tendency for a respondent's evaluations to drift higher or lower as they progress through a bank of rating scales. This is a form of order bias. A typical strategy to moderate the effect of order bias is to rotate the order of presentation of questions so that the overall rating of each stimulus is equally affected. While this can effectively neutralize order bias for the composite sample, the options for measuring order bias for individual respondents are limited. The lack of a mechanism to detect and measure order bias for individual respondents limits the utility of Likert scale data for segmentation analyses in which individuals' opinion profiles are the focus of the analysis.
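The rotation strategy described above can be sketched as a simple cyclic shift of the question list, keyed to the respondent's position in the sample. This is one possible rotation scheme among several (full randomization is another); the function name and the example stimuli are hypothetical.

```python
def rotated_order(stimuli, respondent_index):
    """Return the stimuli cyclically shifted by the respondent's index.

    Across any block of len(stimuli) consecutive respondents, each stimulus
    appears exactly once in each presentation position.
    """
    if not stimuli:
        return []
    shift = respondent_index % len(stimuli)
    return list(stimuli[shift:]) + list(stimuli[:shift])

# Hypothetical rating questions.
stimuli = ["price", "warranty", "color", "brand"]
orders = [rotated_order(stimuli, i) for i in range(len(stimuli))]
```

A rotation of this kind balances position effects across the composite sample only. Each individual respondent still sees a single order, so the sketch does nothing to measure order bias at the individual level.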
Quality assurance practices: Since it is difficult to measure the quality of response to a set of Likert scales, some online research services eliminate yea-sayers and naysayers as a standard quality assurance practice. Some researchers also eliminate respondents whose answer sets are significantly different from the typical respondent in the study. These respondents, often labeled as “outliers”, are assumed to be invalid because they are different. These practices can eliminate some valid respondents and in so doing skew the overall conclusions of a study.
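The yea-sayer and naysayer screening described above could be sketched as follows. The strict rule used here (every answer at the scale maximum or every answer at the scale minimum) and the five-point default are assumptions for illustration; actual research services may apply looser or different criteria. The sketch only flags response sets and does not delete them, since a flagged respondent may still be valid.

```python
def flag_straight_liners(responses, scale_min=1, scale_max=5):
    """Flag respondents whose answers all sit at one end of the scale.

    responses: dict mapping respondent id -> list of integer ratings.
    Returns a dict mapping flagged respondent id -> "yea-sayer" or "naysayer".
    """
    flags = {}
    for respondent_id, answers in responses.items():
        if not answers:
            continue  # nothing to evaluate
        if all(a == scale_max for a in answers):
            flags[respondent_id] = "yea-sayer"
        elif all(a == scale_min for a in answers):
            flags[respondent_id] = "naysayer"
    return flags

# Hypothetical response sets on a five-point scale.
responses = {
    "r1": [5, 5, 5, 5],
    "r2": [1, 1, 1, 1],
    "r3": [4, 5, 3, 5],
}
flags = flag_straight_liners(responses)
```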
3. Q-Sort Ranking: The Q-Sort technique, see the example in FIGS. 15A and 15B, is a forced sort, i.e., the respondent places a prescribed number of cards under each point on the continuum. The distribution of the piles usually follows a modification of a (flattened) normal curve 1507. Piles may be prescribed for as few as five points and as many as 11 points 1502/1512. Procedures often call for the respondent to begin by sorting the cards into three piles (disagree, neutral, and agree) and then to spread the cards across the continuum.
While the attention required of the respondent to sort cards in a manual sort, or to sort attributes listed on the screen in an online version of the methodology, may improve the accuracy of the final sort, the rigor of the sort requirements detracts from ease of use for respondents. Further, the Q-Sort methodology presumes that any list of attributes or characteristics will be about half positive and half negative, which may not be an accurate framework for a benefit test or concept test in which the objective is to include only positive attributes.
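To make the forced-sort requirement concrete, the following minimal sketch checks whether a respondent's sort places the prescribed number of cards under each point. The five-point continuum, the pile quotas (1, 2, 3, 2, 1 for nine cards), and the card labels are hypothetical values chosen for illustration and are not taken from FIGS. 15A and 15B.

```python
from collections import Counter

def check_forced_sort(assignment, quotas):
    """Compare a completed sort against the prescribed pile sizes.

    assignment: dict mapping card -> scale point chosen by the respondent.
    quotas: dict mapping scale point -> prescribed number of cards.
    Returns a dict of point -> (actual, expected) for every pile that does
    not match; an empty dict means the sort satisfies the forced distribution.
    """
    counts = Counter(assignment.values())
    problems = {}
    for point in set(quotas) | set(counts):
        expected = quotas.get(point, 0)
        actual = counts.get(point, 0)
        if actual != expected:
            problems[point] = (actual, expected)
    return problems

# Hypothetical flattened-normal quotas for nine cards on a five-point continuum.
quotas = {-2: 1, -1: 2, 0: 3, 1: 2, 2: 1}
valid_sort = {"A": -2, "B": -1, "C": -1, "D": 0, "E": 0,
              "F": 0, "G": 1, "H": 1, "I": 2}
invalid_sort = dict(valid_sort, F=2)  # card F moved from the middle pile to +2
```

A check of this kind reflects the rigor that the technique imposes: a respondent who wishes to rate more cards at one extreme than the quota allows must move other cards to comply.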
The greatest limitation of an online Q-Sort is that consideration of attributes for ranking is limited by the number of options the respondent can see on the screen and simultaneously consider for a sort.
4. Conjoint analysis: The defining structure of a conjoint analysis is the evaluation of utility functions of variables within a range of values for each variable by constructing stimuli that are combinations of values for the subject variables. Respondents are asked to weight their preference for these constructed stimuli. A structured sampling of these constructed stimuli provides a basis to mathematically infer the importance weight of the values within the range tested for each variable in the test. This approach is generally limited to three to five values of three to ten variables.
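The construction of stimuli from combinations of variable values, and the inference of importance weights from ratings of those stimuli, can be sketched as below. This is a deliberately simplified illustration that assumes a full factorial design rated by a single respondent and estimates each level's part-worth as its mean rating minus the grand mean. Practical conjoint studies typically use a structured (fractional) sample of profiles and regression-based estimation. The attribute names, levels, and ratings are hypothetical.

```python
from itertools import product
from statistics import mean

def build_profiles(attributes):
    """Build every combination of attribute levels (full factorial design)."""
    names = list(attributes)
    return [dict(zip(names, combo))
            for combo in product(*(attributes[n] for n in names))]

def part_worths(profiles, ratings):
    """Estimate each level's part-worth as its mean rating minus the grand mean."""
    grand_mean = mean(ratings)
    worths = {}
    for name in profiles[0]:
        by_level = {}
        for profile, rating in zip(profiles, ratings):
            by_level.setdefault(profile[name], []).append(rating)
        worths[name] = {level: mean(rs) - grand_mean
                        for level, rs in by_level.items()}
    return worths

def importance(worths):
    """Relative importance: each attribute's part-worth range over the total range."""
    ranges = {n: max(w.values()) - min(w.values()) for n, w in worths.items()}
    total = sum(ranges.values())
    return {n: (r / total if total else 0.0) for n, r in ranges.items()}

# Hypothetical study: two variables with two values each, so four profiles.
attributes = {"price": ["$10", "$20"], "warranty": ["1 yr", "3 yr"]}
profiles = build_profiles(attributes)
ratings = [7, 9, 3, 5]  # one hypothetical preference rating per profile, in order
worths = part_worths(profiles, ratings)
weights = importance(worths)
```

In this hypothetical example the price range accounts for two thirds of the total part-worth range and the warranty range for one third. The number of profiles grows multiplicatively with each added variable or value, which is one reason the approach is limited to a small number of variables.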
Limitations of Conjoint Analysis: While conjoint analysis provides valuable insights regarding the importance of the value of variables, it is limited in the number of variables that can be practically evaluated in a study, generally fewer than 10. Further, conjoint analysis is largely limited to variables that can be expressed in degrees of value.