How Can You Measure Whether Raters Agree with Each Other in 360-Degree Feedback?

December 30, 2016 by Sandra Mashihi

“Qui tacet consentit.” (Translation: Silence means agreement)

– Latin maxim, attributed variously to Cicero, Ovid, and Seneca

Dispersion, agreement, or variance within rater groups is important to measure and report back to participants in 360-degree feedback reports, because current research suggests only moderate correlation among rater scores (Nowack, 2009). At Envisia Learning, Inc., we provide up to three different metrics of rater agreement within each of our reports, including:

  1. The range of scores. This indicates a band of responses, from the highest to lowest scores on a specific competency by all raters.
  2. The distribution of scores. This indicates a visual way to discern the spread of scores by raters on specific questions.
  3. The statistical measure of rater agreement. This is based on the standard deviation of ratings and is expressed on a scale from 100 percent agreement down to no agreement; agreement below 50 percent is statistically meaningful and indicates enough variability among raters that the average score could be misleading if used to highlight strengths or potential development areas.
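The three metrics above can be sketched in code. The agreement formula below is a simple illustration, not Envisia Learning's proprietary metric: it scales the observed standard deviation against the maximum possible standard deviation on the rating scale (which occurs when raters split evenly between the two endpoints), so identical ratings yield 100 percent agreement and a maximal split yields 0 percent.

```python
import statistics

def rater_agreement(ratings, scale_min=1, scale_max=5):
    """Summarize dispersion for one competency across raters.

    Returns the range of scores, a distribution of scores, and a
    hypothetical agreement index: 100% when all raters agree, shrinking
    toward 0% as the standard deviation approaches the maximum possible
    on the rating scale.
    """
    lo, hi = min(ratings), max(ratings)
    # How many raters chose each point on the scale
    distribution = {s: ratings.count(s) for s in range(scale_min, scale_max + 1)}
    sd = statistics.pstdev(ratings)
    # Largest possible SD: raters split evenly between the two endpoints
    max_sd = (scale_max - scale_min) / 2
    agreement_pct = 100 * (1 - sd / max_sd)
    return {"range": (lo, hi),
            "distribution": distribution,
            "agreement_pct": round(agreement_pct, 1)}

# Five raters scoring one competency on a 1-5 scale
summary = rater_agreement([4, 4, 5, 2, 4])
```

In this example the agreement index lands near the 50 percent threshold described above, so a coach would flag the average as potentially misleading and explore the single low rating rather than treat the mean as settled.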

Coach’s Critique: 

A typical question in my coaching practice concerns who said what in the 360-degree feedback report. The individuals I coach try very hard to identify the extent to which there is consensus among the ratings and to spot the so-called "outliers." The question often comes from a place of defensiveness, or a need to justify a fault in the system. However, when I walk them through the report and show them the various rater dispersion displays, I help them grasp the details behind the ratings while keeping raters anonymous. I find that reviewing the range and distribution of scores with them immediately decreases their defensiveness and frustration.

For this reason, I prefer tools that provide the maximum number of metrics for interpreting rater agreement. Because participants tend to blame low average scores on the extreme ratings of outliers, breaking ratings down with a consensus level or rater agreement index can be very helpful. For instance, if a score has a very low consensus level, or a high level of variability, it is important to explore the reasons behind the variation, since that variability places less weight on the average score. Those scores should not be prioritized when creating a development plan; it is better to build the plan on scores where there is a sufficient level of agreement.

What has been your experience with interpreting rater agreement for participants?


Dr. Sandra Mashihi is a senior consultant with Envisia Learning, Inc. She has extensive experience in sales training, behavioral assessments and executive coaching. Prior to working at Envisia Learning, Inc., she was an internal Organizational Development Consultant at Marcus & Millichap, where she was responsible for initiatives within training & development and recruiting. Sandra received her Bachelor of Science in Psychology from the University of California, Los Angeles, and her Master of Science and Doctorate in Organizational Psychology from the California School of Professional Psychology.

