“Honest criticism is hard to take, particularly from a relative, a friend, an acquaintance, or a stranger.” Franklin P. Jones
Cigarettes in the United States all come with health warning labels on their boxes. Perhaps vendors should do the same when marketing and selling the multi-rater assessments so commonly used by coaches, consultants and organizational practitioners.
These same cautions also apply to the many multi-rater assessments that organizations develop “in-house” around their own competency models.
At least five important factors should be considered when using and interpreting multi-rater feedback interventions if the proximal and distal goals include increased awareness, behavior change, enhanced individual effectiveness and positive organizational impact:
1. Ratings between rater groups are only modestly correlated with each other.
Research consistently shows that ratings from direct reports, peers, supervisors, self and others overlap only modestly. Self-ratings are typically weakly correlated with other rater perspectives, with greater convergence between peer and supervisor ratings (Nowack, 1992). These differences reflect the distinct views that each rater group holds of the participant.
Perceptual frames of reference by different rater groups are also important when participants try to interpret feedback. In general, direct reports tend to emphasize and filter interpersonal and relationship behaviors into their subjective ratings whereas superiors tend to focus more on “bottom line” results and task-oriented behaviors (Nowack, 2002; Nowack & Mashihi, 2012).
At a practical level, this means that the clients we coach may struggle both to interpret observed differences between rater groups and to decide whether to focus their developmental “energy” on managing upward, downward and/or laterally in light of these potentially discrepant results.
2. Ratings within rater groups are only modestly correlated with each other.
In one meta-analytic study by Conway & Huffcutt (1997), the average correlation between two supervisors was only .50, between two peers, .37 and between two subordinates only .30.
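To see what these single-rater correlations imply for the averaged scores shown in feedback reports, one can apply the Spearman-Brown formula. This is a standard psychometric result, not something taken from the study cited above. The sketch below assumes that the correlation between two raters can be treated as the reliability of a single rater, that raters within a group are interchangeable, and that .70 is an acceptable reliability target; the function names and the .70 target are illustrative choices.

```python
import math


def spearman_brown(single_rater_r: float, n_raters: int) -> float:
    """Projected reliability of the average of n_raters ratings."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    return n_raters * single_rater_r / (1 + (n_raters - 1) * single_rater_r)


def raters_needed(single_rater_r: float, target: float) -> int:
    """Smallest number of raters whose averaged rating reaches the target."""
    if not (0 < single_rater_r < 1 and 0 < target < 1):
        raise ValueError("both values must lie strictly between 0 and 1")
    k = target * (1 - single_rater_r) / (single_rater_r * (1 - target))
    # Small tolerance so an exact integer solution is not rounded up.
    return math.ceil(k - 1e-9)


if __name__ == "__main__":
    # Single-rater correlations as reported above; the .70 target is assumed.
    for group, r in [("supervisors", 0.50), ("peers", 0.37), ("subordinates", 0.30)]:
        print(group, raters_needed(r, 0.70), round(spearman_brown(r, 5), 2))
```

Under these assumptions, reaching .70 takes roughly three supervisors, four peers or six subordinates, which is more raters than many participants actually have in each group.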
Given these findings, vendors who do not give participants a way to evaluate within-rater agreement make it more likely that the average scores in their reports will be misinterpreted. This is particularly true when coaches use those averages to help clients focus on specific competencies and behaviors for developmental planning, such as reviewing the “most and least” frequent behaviors or the items seen as “most and least” effective. It’s easy to observe our own clients react to these “most/least” lists, so common in vendors’ feedback reports, and focus on only a few items without a clear understanding of whether agreement within raters is low or high.
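One simple way a report could expose within-rater agreement is to show the spread of ratings next to each average. The following is a minimal sketch, not any vendor's method: the 1 to 7 rating scale, the standard-deviation threshold of 1.0 and the function name are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def summarize_item(ratings, sd_threshold=1.0):
    """Summarize one item's ratings from a single rater group.

    sd_threshold is an assumed cut-off: a standard deviation above it
    is flagged as low agreement among the raters.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to judge agreement")
    spread = pstdev(ratings)
    return {
        "mean": round(mean(ratings), 2),
        "sd": round(spread, 2),
        "range": (min(ratings), max(ratings)),
        "low_agreement": spread > sd_threshold,
    }


if __name__ == "__main__":
    # Two hypothetical items with the same average but very different agreement.
    print(summarize_item([4, 4, 4, 4]))
    print(summarize_item([1, 3, 5, 7]))
```

Both hypothetical items average 4, yet only the first reflects a view the raters share. An item on a “most/least” list means something quite different in each case.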
3. Perceptual distortions by participants and raters make interpretation of 360-feedback results challenging.
The prevalence of self-enhancement is not hotly debated, but there is continued controversy over whether it is essentially “adaptive” or “maladaptive,” which has important implications for understanding and interpreting multi-rater feedback used for performance evaluation or development. If self-enhancement is conceptualized as seeing oneself generally more positively than others, then the outcomes (performance, health, career and life success) are frequently more favorable. If it is defined as having higher self-ratings than others who provide feedback (self-rater congruence), then the outcomes are frequently less than favorable (Taylor & Brown, 1988; Sedikides & Gregg, 2003).
Brett & Atwater (2001) found that managers who rated themselves higher than others had more negative reactions to the feedback process, lower motivation to improve and were significantly less likely to show improvement when they were reassessed.
In our own coaching practice, using diverse multi-rater assessments measuring different competency models, we have repeatedly observed that under-estimators (those whose self-ratings are meaningfully lower than others) tend to be highly perfectionist, self-critical, overly achievement striving and likely to focus on their perceived weaknesses rather than leveraging their “signature” strengths in developmental planning discussions.
Despite trying to help our clients interpret the feedback findings in a “balanced” manner, these under-estimators appear to be hypervigilant to the perceived “negative” information contained in their report and often “fixate” on the lowest average scores on ratings scales and the open-ended comments that appear to be “neutral or critical” in tone relative to other more positive comments collected within rater groups.
4. There might be limits to how much we can expect leaders to actually change and improve their effectiveness following multi-rater feedback.
Research by James Smither (2005) on 360 feedback suggests that although feedback does result in significant performance improvement, effect sizes are relatively small, suggesting that “zebras don’t easily lose their stripes.” It would appear that we must accept that all of us have some skill and ability “set points” that may place a ceiling on our growth and development. One could also use findings on the contributions of genetics to personality and leadership emergence to argue that “leaders are made” and not born, but it would appear that genetic predisposition and environment interact to shape the overall development of the key skills and abilities that affect our clients’ professional and personal successes and failures in life.
5. Feedback combined with coaching leads to better performance outcomes.
All too often vendors and some practitioners espouse the “diagnose and adios” approach to multi-rater feedback, hoping that self-directed insight alone will result in motivated behavior change efforts. In one of the few empirical studies conducted on the impact of executive coaching, Smither et al. (2003) reported that after receiving 360 feedback, managers who worked with a coach were significantly more likely to set measurable and specific goals, solicit ideas for improvement and subsequently received improved performance ratings.
Our preliminary research suggests that leaders in organizations need to give special attention to developing their own talent development coaching skills and, at a minimum, should be held accountable for tracking and monitoring the progress of their direct reports’ development plans and for holding follow-up discussions to ensure completion. Our leadership development platform called Momentor provides a link for translating insight from 360 assessments into actual behavior change, which is a requirement for enhancing effectiveness in any position (Nowack, K. & Mashihi, S., 2012. Evidence Based Answers to 15 Questions about Leveraging 360-Degree Feedback. Consulting Psychology Journal: Practice and Research, Vol. 64, No. 3, 157–182).
I’m pretty sure you now have some feedback on what vendors don’t really tell you about feedback! Be well…
Thanks for the honest and candid assessment of the effectiveness of 360 feedback.
This will be useful in setting the context with both those leaders who participate in the process, and for those leaders who push for the use of them.
Paul Carroll
360 Feedback does work and these things all have solutions and/or explanations.
1) That’s why we do 360, i.e., because groups have different perspectives and insights.
2) That research also points out that having sufficient numbers of raters improves reliability, a point many practitioners do ignore (to your point).
3) Show rating distributions so leaders can see the distribution and don’t have to guess. And require they follow up with raters to get insights. Some systems prohibit that.
4) While there may be skill limits, 360’s are better set up to measure the “how” side of performance that relates to organizational values and there is no limit to improving those behaviors.
5) Totally agree. AND coaching skills need to be instilled in managers (bosses) to perform that role when coaches leave the relationship and/or when coaches are not available at lower levels.