“The only relevant test of the validity of a hypothesis is comparison of prediction with experience.” -Milton Friedman
When trying to determine the best 360 tool to use, one of the first questions that comes up is whether the tool is valid and reliable. Whether you develop your own assessment or purchase a vendor's, you should verify the following five basic psychometric properties:
- Test-retest reliability. Scores remain stable when the assessment is administered again after a short interval. This is especially important to know if the test is intended to be retaken after a certain period of time.
- Internal consistency reliability. It is very important for the competencies or scales to have items that are highly interrelated. For example, if several items are intended to measure “Listening,” items like “Takes the time to understand and listen to others” and “Waits out silences and listens patiently without interrupting others” have high internal consistency reliability in that they are correlated with each other. Ideally, a customized 360-degree feedback questionnaire should have established scale reliability (e.g., Cronbach’s alpha greater than .70) to ensure that the questions are accurately measuring a single concept.
- Face validity. A tool has “face validity” when the competency model and questions appear to make sense for the job level or purpose of the assessment. Minimally, a customized questionnaire should have face validity, so that participants and raters tend to believe the questions and competencies are relevant to the purpose and goals of the feedback process. You can establish face validity by running a focus group with a representative group from within your organization or by piloting the 360-degree feedback questionnaire before a wider roll-out. It is important to establish whether the 360-degree feedback questions are clear and answerable, whether the questionnaire is relevant to the individuals participating in the project, and whether all of the organization’s key competencies are being measured.
- Criterion-related validity. A 360 tool meets this validity when scores on the assessment are correlated at a significant level with a relevant outcome like job performance, promotion or engagement of the work team. In other words, does the customized instrument actually predict anything meaningful like performance?
- Convergent and divergent validity. This is when a competency or scale being measured in the 360 is associated with constructs that are similar and not associated with those that are logically different. In other words, does the customized instrument correlate with similar measures and remain uncorrelated with dissimilar ones?
When seeking to use a 360-degree feedback tool, it is common and probably essential to ask whether the tool is valid and reliable. A simple answer to that question would be…YES. In fact, a vendor may claim that their tool measures what it’s intended to measure, or that it rests on a strong theoretical foundation. While this may well be an honest claim, and to a certain extent it meets some criteria of reliability, is that adequate for the tool to be considered reliable?
Vendors, consultants, and buyers of 360 assessments should be aware that there are many different types of reliability as well as validity. For a tool to be adequately valid and reliable, it needs to meet at least the criteria listed above. This becomes especially important when implementing a customized 360, where frequently there is no real evaluation of reliability or validity (perhaps only a certain level of face validity). For example, when 360s are implemented for an entire organization, has there been some kind of verification of what constitutes high performers? Has the sample been linked to a generalized population? Too often, these things are overlooked, and they can greatly affect the overall results of the 360 program.
What has your experience been with reliability and validity of 360 tools? What do you consider to be adequate (regardless of psychometric laws)?
I am extremely disappointed at these postings that pose “musts” that have no basis. In this case, for starters, criterion-related validity makes little sense. 360s are like appraisals, i.e., descriptors of behavior, not predictors. 360s are designed to create behavior change, not predict future behavior.
That’s a good point, David. What this blog is trying to suggest is that when 360 assessments are used for appraisal, promotion, and succession, we want to make sure the tool has adequate psychometric properties, including various forms of validity, to ensure that participants scoring high are really performing at a high level in the organizations using them.
Thanks, Sandra. In my mind, what you describe is concurrent validity, not predictive validity. I would expect a 360, like a performance appraisal, to be a good performance measure and not necessarily a good predictor (which research shows is the case for appraisals).
One Trackback
[…] you will permit a digression, I will bring to your attention a recent blog by Sandra Mashihi (http://results.envisialearning.com/5-criteria-a-360-degree-feedback-must-meet-to-be-valid-and-reliab…) where one of her lists of “musts” (arrrgh!) is criterion related validity, which she defines […]