“The trouble with measurement is its seeming simplicity.”
-Unknown
With the various types of scales that are used for survey and 360 measurement, how does one determine which one is best? Is there even a “best” response scale?
Many studies suggest that response scales have a large impact on 360-degree feedback data, and that some response scales are preferable to others.
For example, Bracken and Rose (2011) ((Bracken, D. W., & Rose, D. S. (2011). When does 360-degree feedback create behavior change? And how would we know it when it does? Journal of Business and Psychology, 26, 183-192.)) suggest that commonly used frequency scales (e.g., never to always) are inferior to others. They are quick to point out, however, that the majority of research has focused on the anchors themselves and that additional research is needed to identify optimal response choices and anchor format.
A recent meta-analysis by Heidemeier and Moser (2009) ((Heidemeier, H., & Moser, K. (2009). Self-other agreement in job performance ratings: A meta-analytic test of a process model. Journal of Applied Psychology, 94, 353-370.)) suggests that social comparison scales (scales with relative rather than absolute anchors) reduce leniency in self-ratings, and the authors recommend that such scales be employed much more often than in the past.
Common rating scales used by vendors in most 360 assessments include:
• Effectiveness
• Potential
• Ranking/Comparison
• Frequency
Effectiveness scales ask participants and raters to provide judgments about how effectively the individual demonstrates specific competencies and underlying behaviors.
Potential scales are commonly used for succession planning systems and ask raters to predict how well the participant might perform in the future or what potential the participant has to succeed.
Ranking scales typically ask raters to compare the participant to some type of standard (e.g., evaluate the participant compared to the most effective leader the rater has experienced within his or her organization).
Frequency and extent scales typically ask how often, or to what extent, the participant has demonstrated or expressed specific behaviors. Kaiser and Overfield (2011) suggest the use of “too little/too much” frequency rating scales to distinguish when a manager does an appropriate amount of a behavior from when he or she overdoes it. Their findings suggest that raters are able to make these distinctions and that such scales are a reliable and valid method for measuring overused strengths.
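To make the “too little/too much” idea concrete, here is a minimal Python sketch of how ratings on such a scale might be summarized. Everything in it is an assumption for illustration: the five-point scale running from -2 to +2, the anchor labels, the 0.5 cutoff, and the `summarize_item` helper are hypothetical and are not taken from Kaiser and Overfield or from any vendor's tool.

```python
from statistics import mean

# Hypothetical five-point "too little / too much" scale (an assumption,
# not a published instrument). Zero is the optimum, not the top of the range.
SCALE = {
    -2: "much too little",
    -1: "too little",
    0: "the right amount",
    1: "too much",
    2: "much too much",
}


def summarize_item(ratings, cutoff=0.5):
    """Summarize one item's ratings on the hypothetical scale above.

    The cutoff is an arbitrary illustrative threshold for calling a
    behavior underused or overused.
    """
    avg = mean(ratings)
    if avg <= -cutoff:
        direction = "underused"
    elif avg >= cutoff:
        direction = "overused"
    else:
        direction = "about right"
    # Distance from zero, not the raw mean, indicates how far from optimal.
    return {"mean": avg, "distance_from_optimal": abs(avg), "direction": direction}
```

The point of the sketch is that on this kind of scale the best score sits in the middle, so feedback reports would compare each item to zero rather than treating higher averages as better.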
The goal of any 360-degree feedback assessment is to provide targeted feedback on critical success factors that will be included as part of a coaching, training, or development program. If the program is successful, it can be hoped that talent will be more effective in practicing and demonstrating specific technical, leadership, task/project management, or communication competencies and behaviors. Feedback that is less ambiguous and more behaviorally-oriented will be most helpful to the talent using the 360-degree feedback process.
When questions measuring competencies are written to reflect effective or desired behavior, frequency or extent scales provide more clarity about what strengths should be leveraged and the potential areas for improvement. In general, response or rating scales are important to consider when developing customized 360-degree feedback assessments and interpreting off-the-shelf tools available from vendors.
Coach’s Critique:
Choosing an appropriate rating scale is one of the bigger issues when considering which 360-degree assessment to design or utilize. I believe the type of scale can actually make a difference in the quality of results. I have generally preferred surveys that have frequency scales because they allow raters to assess what they have actually seen, rather than how they perceive something subjectively (e.g., effective or ineffective).
At the same time, I am also a big believer that demonstrating too much of a behavior isn’t necessarily beneficial. With that said, I believe that one thing to keep in mind is not only the type of scale, but also how the behaviors are described in the items and how the range of the scale is interpreted. If behaviors are described in a way that “too much” of them is disadvantageous, then a high score isn’t necessarily preferable. Take, for example, the item “provides constructive criticism when necessary.” A high score on this item would mean that, “to an extremely large extent,” this person is seen as providing constructive criticism when necessary. A high score on this item can reflect a strength that is overused. So interpreting the range of a frequency scale from low to high as equivalent to bad to good can be misleading for the purpose of providing effective feedback. Therefore, it might be a good idea to interpret the results of frequency-scale items in terms of how much strengths are used or overused.