We Love Ratings!

(This article is available in PDF format at https://www.talentstrategygroup.com/publications/performance-management/we-love-ratings.)

While our article’s cover illustration may seem overly enthusiastic, it strikes an appropriate counterbalance to the hysteria of the “stop performance rating” movement that’s recently gained traction. Articles like “Kill Your Performance Ratings”[1] in strategy+business and “Reinventing Performance Management”[2] in Harvard Business Review have declared performance ratings an unnecessary evil and called for their elimination.

In these articles we read that being rated invokes a debilitating fight-or-flight response and that ratings are so biased as to not be worth gathering at all. They suggest that a “ratingless” system is far more virtuous and effective.

Both in theory and in reality these arguments fail. On the theory side, the science that the authors say supports their case clearly does not. In some cases it’s not even related to their argument and in others it ignores entire bodies of research that contradict their findings.

As for reality, their “tail wagging the dog” approach tries to isolate ratings as the central problem with performance management, ignoring that ratings are part of a larger process. They ignore that businesses that have dropped ratings continue to rate through their compensation processes. They ignore that structured differentiation is necessary to avoid bias and to smartly invest in talent.

The cracks in the ratingless approach are starting to show as businesses that tried going ratingless slink back to ratings.

Our position is that ratings are neither good nor bad. Ratings are simply a tool that may be appropriate depending on your company’s business objectives. Given that this level of rationality is missing from the current dialogue, we thought we’d provide a more objective view of the science and a counterpoint describing the strong business benefits of ratings.

Starting with the Science

Readers of One Page Talent Management[3] know that we start any discussion about HR practices by reviewing the relevant science. In this case, it’s more helpful to see if the science cited by the opponents of ratings actually supports their claims.

Science Claim #1: The act of being rated invokes a fight-or-flight response that creates negative emotions and reactions in individuals, reducing productivity and commitment. This is the neuroscience argument cited to support the claims in “Kill Your Performance Ratings,” and it’s based on a variety of neuroimaging experiments. Those experiments use functional magnetic resonance imaging (fMRI) machines to see how the brain “lights up” when it processes information.

There are two fundamental flaws with claiming that neuroscience findings suggest we eliminate ratings. First, it’s correct that when we interact with other people our limbic brains generate either an “approach response” (more please!) or an “avoid response” (run away!). It’s also correct that these subconscious processes can drive our behaviors without our being fully aware of it.

But there’s no science that says being rated automatically creates a negative response. Highly rated people, or those rated consistently with their self-evaluations, are likely to have either positive or neutral reactions. Even negative feedback is proven to be more acceptable when the source is credible and the feedback is high quality and delivered in a considerate way.[4] A bad performance conversation may trigger a negative reaction, but that’s independent of whether you use a rating or not.

Second, the neuroscience claim suggests that this subconscious “avoid” process will dominate our reactions to feedback. This ignores the fact that a conscious process is also taking place during feedback, and that we have the power to control our reactions to it.[5] In short, we’re able to intelligently evaluate the information we hear even if our limbic brain is sending us an “avoid” message.

The reality: We’re still a long way from conclusively understanding what mental process is occurring when certain parts of the brain light up in an MRI. Claiming that we know this is called ‘reverse inference,’ and leading Stanford University neuroscientist Russ Poldrack warns against drawing that type of conclusion from neuroimaging data.[6]

Scientists are still learning about the interrelatedness of mental processes and we should support continued neuroscience research into this. Right now, however, it’s incorrect to extrapolate from a bright spot on a brain scan to a design element of performance management (ratings).

Science Claim #2: Ratings aren’t accurate, so don’t use them.

In “Reinventing Performance Management,” Marcus Buckingham and Deloitte’s Ashley Goodall write about Deloitte’s former performance management system, which, based on their descriptions (“creating the ratings consumed close to 2 million hours a year”), sounds ridiculously complex and bureaucratic. It’s understandable why they felt a redesign was necessary, but the logic they used to eliminate ratings is far less understandable.

In their article they cite, under the heading “The Science of Ratings,” research that they say shows that rater bias (anything unrelated to one’s actual performance) accounts for most of the differences in performance ratings. They state that the research says that “actual performance” accounts for only 21% of a rating. They wanted to redesign their performance management process to avoid this type of error.

The reality: The article they cite, “The Latent Structure of Ratings,”[7] is an interesting read for those passionate about I/O psychology, but it in no way supports their argument about ratings. In fact, the study had nothing to do with actual performance ratings or a real company’s performance management process! The research used development ratings from a Personnel Decisions International database to model what performance might be, given various ratings on a Profilor assessment tool.

Citing this study as “The Science of Ratings” is, at best, highly misleading and ignores the significant body of academic research that directly addresses the topic of ratings.[8] Even the article’s authors state, “because true performance levels are unknown, none of the validities can be determined with certainty.”[9] Since nothing in that article relates to an actual performance review, it’s challenging to see how this study in any way suggests using or not using ratings.

What the research actually said: The research found that managers were the most accurate assessors of their employees, compared to peers or direct reports. There was variance in how two managers assessed the same person, but the open question was whether different managers rated that person differently because they had observed the subordinate in different situations or because they were projecting personal biases. That’s an interesting question, but it has nothing to do with whether ratings are helpful or unhelpful.

The loudest, least logical reason for a ratingless approach

In articles about performance ratings, HR leaders will generally describe their process as universally disliked. They’ll say that those who perform well are forced into lower performance categories and that those who are not highly rated regard the process as unfair.

It’s not surprising that if you have a poorly designed and poorly run process, the focal point of that process – the review conversation with ratings – will feel the heat. However, it seems a rather twisted journey from “we have a complex process and are horrible at setting goals and coaching employees” to “ratings are the reason for our failure.” It may be helpful to first intelligently design the entire performance management process (see “The Hard Truth About Effective Performance Management”[10]) and then evaluate whether ratings add to or detract from it.

A Moderate Defense of Ratings

So with a more objective view of the science on the table, let’s explore both the benefits and practical realities of having ratings.

You are constantly being rated: You were accepted (or not accepted) into your preferred college because of your SAT rating. You got (or didn’t get) the house you wanted because of your FICO rating. You got (or didn’t get) the date you wanted because of your Tinder “rating.”

At least the first two of those ratings likely had an impact on your life that was much greater than your recent performance rating. Yet you didn’t object to those ratings even though the stakes were higher, you didn’t set the measurement standard, you were emotionlessly evaluated against others, and you had no choice about how the process worked.

Each of those ratings allowed someone to make a smarter decision (admit/no admit, lend/reject, date/drop) because they had a reference point about you. A performance rating is no different. It reflects a collective judgment about your performance relative to others. It allows a company to make a smarter – not objectively perfect – assessment to inform a choice they need to make (how to pay you, what feedback to provide, whether to promote).

Your manager, peers, direct reports, the on-campus Starbucks barista and everyone else you interact with are rating you every day at work. A performance rating is simply one summarized data point of those evaluations.

Ratings provide helpful differentiation: If your company can’t accurately differentiate its investments in people, it will by definition mis-invest. Some team members will get more than they need and others will get less. Without a standardized way to differentiate (i.e., ratings), you’ll be stuck trying to allocate resources among 10,000 people described in your now-qualitative process as “pretty darned good.”

But can’t you differentiate without a rating? Yes, just inaccurately. The science is clear that individuals and managers are delusional about their own and others’ performance. The classic article “Unskilled and Unaware of It”[11] describes what’s become known as the Dunning-Kruger effect. The authors’ repeated experiments have shown that we’re not just unaware of our own lack of competence; we also fail to recognize genuine ability in others and refuse to admit we were previously wrong even when our performance is corrected.

Managers’ overrating their team is an enduring, scientifically proven fact in companies. It’s most pronounced where performance ratings are used to determine compensation,[12] where it’s difficult to assess an employee’s true competence,[13] and where the manager and employee have a strong relationship.[14] So, hoping that managers will naturally and accurately differentiate without a rating is, to put it kindly, a pipe dream. 

Ratings provide a structure for assessing people against a consistent standard in a consistent way, but they don’t eliminate upward rating inflation. What does? Forced rankings. There’s only room in the top 10% for your top 10%.
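To make the mechanics concrete, here is a minimal sketch of a forced distribution in Python. All names, scores, and quotas are hypothetical, and no vendor’s tool is implied: the point is simply that band assignments are driven by rank and fixed quotas, so a manager can’t inflate everyone into the top band.

```python
# Illustrative sketch of a forced distribution (hypothetical data; not
# any specific company's process). The quota constraint, not manager
# goodwill, is what prevents upward rating inflation.

def force_distribution(scores, quotas):
    """Assign rating bands by rank so each band holds a fixed share.

    scores: dict of employee -> raw manager score (higher is better)
    quotas: (band_label, share) pairs ordered best to worst; shares sum to 1.0
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    bands, start = {}, 0
    for label, share in quotas:
        end = start + round(share * len(ranked))
        for employee in ranked[start:end]:
            bands[employee] = label
        start = end
    for employee in ranked[start:]:  # rounding leftovers land in the lowest band
        bands[employee] = quotas[-1][0]
    return bands


# Hypothetical team of ten: with a 10% quota, only one person can land
# in the top band, no matter how generous the raw scores are.
team = {"A": 92, "B": 88, "C": 75, "D": 74, "E": 60,
        "F": 55, "G": 51, "H": 48, "I": 40, "J": 33}
print(force_distribution(team, [("Top", 0.10), ("Middle", 0.70), ("Bottom", 0.20)]))
```

The design choice is the point: the constraint lives in the process rather than in each manager’s judgment.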

Ratings limit conscious/unconscious bias: Goodbye ratings? Hello conscious and unconscious bias. Without at least the imperfect crutch of performance ratings, there’s no way to analyze if personnel decisions are being made based on performance or managers’ personal preferences.
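As an illustration of the kind of analysis ratings make possible, here is a minimal sketch of a within-band comparison, using fabricated records and group labels: with a rating on file, you can check whether equally rated employees from different groups see the same outcomes.

```python
# Illustrative adverse-impact check (fabricated data and field names):
# compare outcomes for employees who hold the SAME rating. Without a
# rating, there is no performance measure to hold constant.
from collections import defaultdict

# (rating on a 1-5 scale, demographic group, promoted?)
records = [
    (5, "group_a", True), (5, "group_b", False),
    (4, "group_a", True), (4, "group_b", True), (4, "group_b", False),
    (3, "group_a", False), (3, "group_b", False),
]

by_cell = defaultdict(lambda: [0, 0])  # (rating, group) -> [promotions, headcount]
for rating, group, promoted in records:
    by_cell[(rating, group)][0] += int(promoted)
    by_cell[(rating, group)][1] += 1

for (rating, group), (wins, total) in sorted(by_cell.items()):
    print(f"rating {rating}, {group}: {wins}/{total} promoted ({wins / total:.0%})")

# Gaps between groups *within the same rating band* are the signal that
# decisions may be tracking something other than performance.
```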

Ratingless systems reduce transparency: The Achilles’ heel of ratingless systems clearly emerges at bonus time. Any organization that says it has a ratingless process but still differentiates bonus amounts is fooling itself. It does have ratings – they’re called bonuses. It has simply removed the transparency between performance and pay.

Somewhat inexplicably, Deloitte is choosing not to be transparent with its employees about how they are being assessed in its “radically redesigned” process. The firm says it’s looking for a better answer, but hiding information from employees feels like a gigantic step backward rather than the radical transformation that Harvard Business Review advertised this move to be.

Ratings put the data in big data: You lose a key variable in most HR analytics exercises when you eliminate performance ratings. In an age where we want to better understand what drives or is driven by high performance, eliminating the metric of performance seems incredibly shortsighted and naïve.
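A minimal sketch of what is lost, using fabricated numbers: almost any people-analytics question (“does engagement track performance?”, “do high performers quit more?”) needs a performance measure on one side of the equation.

```python
# Fabricated illustration: performance ratings alongside engagement
# survey scores. Requires Python 3.10+ for statistics.correlation.
from statistics import correlation

ratings = [5, 4, 4, 3, 3, 2, 5, 1, 3, 4]
engagement = [82, 75, 78, 60, 65, 50, 90, 45, 70, 72]

# Pearson's r between performance and engagement. Remove the ratings
# list and this question can no longer be asked at all.
print(f"rating vs. engagement: r = {correlation(ratings, engagement):.2f}")
```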

Is this just an HR concern?

It was telling that The Wall Street Journal recently featured an article that cited Intel’s attempt to eliminate ratings.[15] It reported that Intel’s HR group tested a ratingless performance management process with its 1,700 HR employees and received positive reviews. When the HR group suggested to Intel’s executives that ratingless performance management be rolled out across the company, the executives said “No thanks.”

It’s fair to ask whether the noise about ratings is generated purely by some in HR, external consultants, and lower performers. It’s possible that everyone else just wants a simpler, easier-to-use, more value-adding process.

Do We Love Ratings?

Yes, but we’re not in love with them. They serve a valuable purpose when used to help accurately differentiate levels of performance so we can more intelligently invest our organization’s resources. They’re a tool – nothing more, nothing less. They should be used if they add more value to decision-making than they add complexity or effort. 

The recent hysteria around ratings would be humorous if organizations weren’t choosing to abandon them based on a combination of questionable science and an unwillingness to acknowledge the benefits (and occasional pain) of differentiation. Driving high performance means that we must take a broader and more accurate look at the science, apply far less dogma, and understand how the pieces of performance management actually fit together. If that means that not using ratings is the best choice for you – great. But please make that decision because you understand the facts, not because it’s the latest fashion.

---------------------

[1] David Rock, Josh Davis, and Beth Jones, “Kill Your Performance Ratings,” strategy+business, Issue 76 (Autumn 2014), published online August 8, 2014.

[2] Marcus Buckingham and Ashley Goodall, “Reinventing Performance Management,” Harvard Business Review, April 2015.

[3] Marc Effron and Miriam Ort, One Page Talent Management: Eliminating Complexity, Adding Value, Harvard Business Review Press, 2010.

[4] Lisa A. Steelman and Kelly A. Rutkowski, “Moderators of Employee Reactions to Negative Feedback,” Journal of Managerial Psychology 19, no. 1 (2004): 6-18.

[5] Angelo J. Kinicki, Gregory E. Prussia, Bin Joshua Wu, and Frances M. McKee-Ryan, “A Covariance Structure Analysis of Employees’ Response to Performance Feedback,” Journal of Applied Psychology 89, no. 6 (2004): 1057.

[6] Russell A. Poldrack, “Can Cognitive Processes Be Inferred from Neuroimaging Data?,” Trends in Cognitive Sciences 10, no. 2 (2006): 59-63.

[7] Steven E. Scullen, Michael K. Mount, and Maynard Goff, “Understanding the Latent Structure of Job Performance Ratings,” Journal of Applied Psychology 85, no. 6 (2000): 956.

[8] There’s a lot. Look it up in Google Scholar.

[9] Scullen, Mount, and Goff, “Understanding the Latent Structure of Job Performance Ratings.”

[10] Marc Effron, “The Hard Truth About Effective Performance Management,” The Talent Strategy Group, accessed at https://www.talentstrategygroup.com/publications/performance-management.

[11] Justin Kruger and David Dunning, “Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments,” Journal of Personality and Social Psychology 77, no. 6 (1999): 1121.

[12] I. M. Jawahar and Charles R. Williams, “Where All the Children Are Above Average: The Performance Appraisal Purpose Effect,” Personnel Psychology 50, no. 4 (1997): 905-925.

[13] Jasmijn C. Bol, “The Determinants and Performance Effects of Managers’ Performance Evaluation Biases,” The Accounting Review 86, no. 5 (2011): 1549-1575.

[14] Aharon Tziner, Kevin R. Murphy, and Jeanette N. Cleveland, “Contextual and Rater Factors Affecting Rating Behavior,” Group & Organization Management 30, no. 1 (2005): 89-98.

[15] Rachel Feintzeig, “The Trouble With Grading Employees,” The Wall Street Journal, April 22, 2015, retrieved at https://www.wsj.com/articles/the-trouble-with-grading-employees-1429624897

Choo Huat, Billy Teoh

"C-Level Executive Coach"

7y

In a performance-driven platform, ratings can be normative (bell curve) or criterion-based (plateau). However, reducing rater bias, increasing rater consistency, and clarifying performance criteria are the areas where improvement is most needed. Self ratings, team ratings, 360-degree or even 720-degree ratings, or other hybrid ratings could be experimented with. Another option is consensus rating, where the rating is pre-determined by the outcomes and results rather than by the performers. The performance criterion then moves from the person to the performance. For example: performance is 'monetized' based on achievement at 25%, 50%, 75%, and 100% (or any other permutation). So if the end result is, say, 50%, and it is a team effort, everyone on the team gets their performance rating at the 50% level. Even if the team performs 'tremendously,' it is the performance result that is measured and monetized rather than the team members. The focus is on the performance outcome/result and not the person per se. This is a simplified version.

Kate McCourt

VP CIS Leadership Development, Change Management & Facilitation at DHL Express

7y

Ratings are useful when there are clear roles, responsibilities and defined goals. Lacking these, the rating discussion is ineffective. The "get rid of performance reviews" movement is a reaction to the VUCA environments we face inside our organizations as well. The question we need to ask is: "How do we support managers and employees to have better conversations, in order to drive business performance, enhance organizational capabilities AND engage our workforce at the same time?" The rating itself is not the issue.

Gordon (Gordy) Curphy, PhD

Managing Partner at Curphy Leadership Solutions

7y

So Joanna, is discrimination only related to ratings or can it happen without ratings? And if there is no way to measure "discrimination" would it get worse or better in a no-rating environment?

Joanna Wilde

Organisation and Community Psychologist, PhD, C. Psychol., FBPsS, FAcSS, C. Sci

7y

Disappointing - it does not address the issue of discrimination at all effectively... it only covers the neurohype and not the much more extensive social psychology research evidencing how people behave in competitive environments (where sabotage is an option) - and no evidence is given for the opening claim that orgs with ratings do better than those without... it also doesn't examine the different impacts of how comparisons are structured or what outcomes ratings are used to support.
