Challenge #6 – How to spot dodgy concepts in OCM – Part 2

Spotting a dodgy concept: Part 2

In Challenge #5 – How to spot dodgy OCM concepts – Part 1, I argued that concepts which seem highly intuitive are sometimes neither clearly defined nor supported by rigorous research. This matters because once we believe in these myths and they become embedded in our language and mental models, they become hard to weed out of our practice – something that has become increasingly important in this 'post-truth' age. Using an evidence-based approach to counter these concepts should be a priority. So, in this Part 2 I aim to dive deeper into the ideas of validity and reliability.

New wine in old bottles or old wine in new bottles?

In organisational research there is a real problem with 'construct redundancy': people making up 'new' constructs (concepts) that are just repackaged old ones – new wine in old bottles, or old wine in new bottles? Take for instance job satisfaction or employee engagement. Both are highly correlated with organisational commitment. If they have no more explanatory power than organisational commitment, why do we need a new construct? In fact, organisational commitment used to suffer from its own identity crisis. And what about ethical and authentic leadership? Both are highly correlated with transformational leadership. And emotional intelligence (EQ) is arguably just IQ + conscientiousness + emotional stability. Is there really any difference between fixed and growth mindsets and, say, Douglas McGregor's Theory X and Theory Y, or learning goal orientation – the desire to develop and master new situations? In the case of EQ, you could argue that it neatly packages concepts and makes them easier to understand, i.e. it is easier to say EQ than IQ + conscientiousness + emotional stability. This is fine until people confuse things by saying EQ is more important than IQ, when one is nested in the other. So, doesn't creating new words for existing concepts potentially create confusion and narrow our understanding?
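To make the redundancy point concrete, here is a minimal sketch (my own illustration, not taken from any of the studies mentioned above) of a standard psychometric check: Spearman's correction for attenuation. If two scales correlate near 1.0 once their measurement error is accounted for, the 'new' construct probably adds nothing beyond the old one. The correlation and reliability figures below are purely hypothetical.

```python
import math

def disattenuated_correlation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Spearman's correction for attenuation: estimates the correlation between
    two constructs after removing measurement error, given the observed
    correlation (r_xy) and each scale's reliability (r_xx, r_yy)."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical numbers, purely for illustration:
observed_r = 0.72      # observed correlation, e.g. engagement vs commitment scores
reliability_x = 0.85   # Cronbach's alpha of the 'new' scale
reliability_y = 0.80   # Cronbach's alpha of the established scale

true_r = disattenuated_correlation(observed_r, reliability_x, reliability_y)
print(f"Estimated construct-level correlation: {true_r:.2f}")
# A value approaching 1.0 suggests construct redundancy: new wine in old bottles.
```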

More extraordinary extrapolations

Currently it is 'en vogue' to prefix concepts with the word 'neuro' – so we have neurochange, neuroleadership, neurosuccess, neurocoaching, neuroentrepreneurship, etc. But are the claims these 'neuro' management theories make valid, or are they getting ahead of the evidence? 'Neuro' may make them sound 'sciency', but the reality is that many of these claims are based on just a few studies which are not related to their application. Take for example the neuroscience of change. Rock & Schwartz (2006) extrapolate evidence from neuroscience studies on stress to OCM. They assume that physical pain and psychological stress are the same, when they are potentially two completely different things. So, can we really compare experiments on rats, or on humans playing card games, with people's reactions to OCM interventions? There is real 'brain overclaim' here, which potentially overshadows the three most important factors in OCM – context, context, context. Neuroscience potentially reduces OCM to a subjective brain–body experience. What sense does it make to study mental processes out of the context of their social environment? Using neuroscience as evidence for OCM is not only an extraordinary extrapolation from one discipline to another, but also from lab to real life. Such overclaims potentially expose OCM practice to ridicule. Similarly, can we equate people's emotional experience of losing a loved one with losing a beloved system in an IT upgrade at work? This is the premise of using the Kübler-Ross curve to explain change. Making these extraordinary extrapolations from studies completely unrelated to OCM can potentially do more harm than good.

Are we taking the right measures?

Once we know the construct is well defined, unique and applicable to OCM, we can then try to see whether we can measure it. Going back to engagement: should we be measuring the preconditions of engagement or the outcomes? A question that often occurs in employee engagement surveys is 'I would recommend someone to work at this company'. Is this question testing that person's engagement, loyalty, commitment, satisfaction, or just the fact that they can get away with not doing anything at work? This question lacks face validity as a measure of engagement because we can't be sure what it is measuring. And this is probably because we don't really know what engagement is!

Similarly, there are at least 43 different instruments for measuring change readiness, of which 7 have reasonable levels of validity. Arguably the best for measuring individual readiness to change is Herscovitch and Meyer (2002). But maybe we want to measure organisational rather than individual readiness, in which case we could use Bouckenooghe and Devos (2009). We can also find measures for concepts that appear effective but are operationally meaningless. Take for example learning goal orientation (you could call it a 'growth mindset'). When you measure it subjectively, i.e. people think they have or haven't improved, you see a strong correlation between performance and learning goal orientation. Use an objective measure instead, e.g. actual improvement in task performance, and the correlations disappear.
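As a purely illustrative sketch (simulated data, not figures from the learning goal orientation literature), the pattern described above – a 'strong' finding against a subjective criterion that vanishes against an objective one – can be demonstrated like this:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 200

lgo = rng.normal(size=n)                          # learning goal orientation score
perceived_gain = 0.5 * lgo + rng.normal(size=n)   # self-reported improvement, coloured by the belief itself
actual_gain = rng.normal(size=n)                  # objectively measured task improvement, unrelated in this simulation

r_subj, p_subj = stats.pearsonr(lgo, perceived_gain)
r_obj, p_obj = stats.pearsonr(lgo, actual_gain)
print(f"LGO vs perceived improvement: r = {r_subj:.2f} (p = {p_subj:.3f})")
print(f"LGO vs actual improvement:    r = {r_obj:.2f} (p = {p_obj:.3f})")
# The same construct looks 'effective' or meaningless depending on the criterion chosen.
```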

So, when we try to measure constructs, we really need to understand exactly what we are trying to measure. We could be measuring something reliably, but it might not be what we intended to measure, i.e. it lacks validity. Recently I was trying to measure resilience within a large human rights organisation. You could use the Resilience Scale, but this measures the coping resources needed to deal with potentially stressful situations (e.g. making plans); it doesn't measure 'bounce back'. So, what if someone is already suffering (as many human rights activists are) and is low on available 'resources'? Would that person still be able to bounce back? I ended up using the Brief Resilience Scale, which just asks directly: 'I tend to bounce back quickly after hard times'. So, what we want to measure (the definition of the concept or the problem we are trying to understand) determines the measure we choose.

This is why using generic surveys (like most engagement surveys) to try to understand very specific problems will tell you more about less and leave you with unactionable data.

Can we predict future performance?

So, assuming you can define a unique construct and it is applicable and measurable, the next step is to test it to see whether it does what it says on the tin. Take the Kübler-Ross curve. If it were a valid construct, you would expect it to explain people's reactions to death at least, say, 50% of the time. The curve actually explains people's reactions to death only about 11–17% of the time. But what about MBTI? It may have test–retest reliability (which is debatable), but then again so does using foot size as a measure of intelligence. My foot size will not vary day to day, and neither will my intelligence, so foot size will be a perfectly reliable measure – but is it a valid measure of intelligence? Of course it is not, and neither is MBTI a valid measure of personality type, because we don't know whether 'types' exist, let alone in the way Myers-Briggs defines them.
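The foot-size analogy is easy to check with a toy simulation (entirely made-up data, included only to separate the two ideas): test–retest reliability can be near perfect while validity against the criterion is essentially zero.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 500

foot_day1 = rng.normal(loc=26.0, scale=2.0, size=n)       # foot size in cm, a stable trait
foot_day2 = foot_day1 + rng.normal(scale=0.1, size=n)      # re-measured next day with tiny noise
iq = rng.normal(loc=100, scale=15, size=n)                 # intelligence, unrelated to foot size

test_retest, _ = stats.pearsonr(foot_day1, foot_day2)      # reliability: close to 1.0
criterion_validity, _ = stats.pearsonr(foot_day1, iq)      # validity: close to 0.0

print(f"Test-retest reliability of foot size: {test_retest:.2f}")
print(f"Correlation with IQ (validity):       {criterion_validity:.2f}")
# Perfectly reliable, completely invalid as a measure of intelligence.
```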

Yet MBTI is widely used to try to build high-performing teams, even though there is no evidence to suggest it predicts high performance. MBTI also suffers from construct redundancy because it overlaps with the Big Five personality model and has inferior predictive validity. Moreover, MBTI does not account for neuroticism (poor ability to manage psychological stress), which is critical in helping people manage their wellbeing. And it is not just MBTI. In 2004, Coffield et al. immersed themselves in learning styles for 16 months and found that, of 13 popular learning-style models, only one met all four validity and reliability criteria – and that one measures cognitive styles, not learning styles.

Learning styles are so poorly defined they have no practical value apart from confirming any beliefs we have about ourselves (confirmation bias) or contributing to a belief in what we want to be (the Pygmalion/Rosenthal effect) – the more I believe I want to be extroverted, the more I act in an extroverted way.

And then there is the concept of Growth Mindset – a popular concept that has apparently tripled Microsoft's value. A study by Sisk & Burgoyne (2018), covering 49 studies with a combined sample of 57,155, of which 43 were randomized controlled trials (RCTs – very high up the evidence hierarchy, since they can establish causation rather than just correlation), found that 86% of the results (effect sizes) were not significantly different from zero, 12% showed a positive impact of mindset interventions, and 1 had a negative effect. A recent replication of Mueller & Dweck's famous 1998 study by Li & Bates (2019) concluded that 'Mindsets and mindset manipulation effects on both grades and ability, however, were largely nonsignificant, or even reversed from the theorized direction'. Dweck responded with a study in Nature in which she pre-registered the hypothesis (a great way to ensure that negative results get published and to stop people skewing their hypotheses to fit the results) – predicting a 'very small (near zero) positive' effect – and gave the dataset blind to statisticians to check her results. This is all good practice, and indeed Dweck's hypothesised very small positive effect was found: an average 0.05-point higher GPA, or 0.10 for lower-achieving students. But this is a long way off Dweck's claim that beliefs about mindset 'has profound effects on their (student's) motivation', and it seriously questions the Microsoft claim – or indeed any claim organisations make about the impact of adopting a growth mindset.

How much proof is enough?

Now we may have a well-defined and testable construct. But what happens when we test it again and again and again? Do we get the same results? If we do, then the concept is also reliable. But replication is not as easy as it sounds. Firstly, there are the institutional issues – why would an academic simply repeat someone else's study when they could be (re)inventing a new theory of their own? And why would a journal publish negative findings – no one is interested in 'failed' research. Eventually people lose interest in disputes between academics anyway and just move on to the next theory. There are also theoretical issues. Gómez et al. (2010) identified 18 different types of replication, so replication has its own replication problem. Should the method be exactly replicated, or should we just seek to validate the underlying theory? A further complication is that academics make 'auxiliary' or underlying assumptions, such as that emotions can be measured (as in the case of the positivity ratio), or (as in the case of MBTI and Jungian theories of personality generally) that people should fit into 'types', e.g. they must be either Extrovert or Introvert, Sensing or Intuitive, Thinking or Feeling, and Judging or Perceiving.

So, falsification through replication is tricky and academia is not really set up for it. But we cannot avoid falsification ad infinitum. At some point people need to question their beliefs given an overwhelming body of evidence. Aren’t 37 studies (86% of the sample) showing that a growth mindset has no significant effect on educational achievement enough?

In defense of evidence-based OCM

For those avoiding falsification, the last line of defense is that no two companies are the same – concepts can be proved 'right' in one organisational context and 'wrong' in another, or even within the same organisation. This may be true, but it doesn't mean practitioners can just pluck any idea out of the sky because, on the face of it, it seems to make sense. This is the third image at the top of the post: when we apply invalid, unreliable, generic solutions (e.g. the Kübler-Ross curve or the latest leadership fad) to poorly defined problems (the blurry target in the image), we can never know whether we are having an impact.

Equally, we might have a well-defined problem (a poorly performing team) and be using something that is reliable (such as MBTI), but it can never solve the problem because the concept isn't valid (this is the second image above). Because, like the foot-size and IQ analogy, MBTI is not a valid measure of team capability. In this case we keep using the same solution and hitting the same spot, but consistently missing the target, because it is the wrong solution (or, in the case of MBTI, not a valid solution at all!).

So, as practitioners, we need to thoroughly understand the problem we are trying to solve and the context in which we are working, and draw on valid theories (ones that pass the validity steps outlined above) that might apply in that context. This way we may go some way towards finding solutions to problems (the first image above). This is the essence of evidence-based OCM – we take the best available evidence and ultimately make a judgement call.

We could also take a perfectly valid concept that is reliable in some situations and not others. Take for example resilience training. Research shows that for it to be effective it needs to be tailored to individual needs – people need to be aware of their own daily stressors to be able to apply interventions effectively. So, what might be valid and reliable in one context may not be in another (the fourth picture in the image above). In fact, even in one training session you may find an intervention (e.g. Cognitive Behavioural Therapy) works well for one person and not another. And in extreme situations some interventions (e.g. psychological debriefing) may do more harm than good.

So how can we reduce this 'judgement call' risk? We can test interventions by controlling their delivery (e.g. limiting them to a certain part of the organisation or a single team) and measuring their impact before rolling them out. We can also re-use interventions that have already worked in the organisation and amplify them. In doing so we are testing and assuring their reliability within a specific organisational context. Testing reliability does not need to be conclusive to be informative: inquiring into the differences in reliability gives a practitioner a much richer contextual picture of the organisation than a 'sheep dip' blanket intervention across the whole organisation. And imagine what would happen if practitioners had a list of interventions they thought might work (based on the 'conscientious, explicit and judicious use of the best available evidence from multiple sources') and allowed managers to choose which one they thought best fitted their team. After all, no matter how valid a theory is, it cannot be reliably deployed without support 'on the ground'.
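As a rough sketch of what 'test before you roll out' might look like in practice (simulated survey scores and a simple difference-in-differences comparison; the scale, team sizes and effect below are all hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical 1-5 survey scores (e.g. a resilience scale) before and after the pilot period
pilot_pre = rng.normal(3.2, 0.6, 40)
pilot_post = pilot_pre + rng.normal(0.3, 0.5, 40)       # pilot team received the intervention
control_pre = rng.normal(3.2, 0.6, 40)
control_post = control_pre + rng.normal(0.0, 0.5, 40)   # comparison team carried on as usual

# Compare change scores between the two teams (a simple difference-in-differences)
pilot_change = pilot_post - pilot_pre
control_change = control_post - control_pre
t_stat, p_value = stats.ttest_ind(pilot_change, control_change)

extra_improvement = pilot_change.mean() - control_change.mean()
print(f"Extra improvement in pilot team: {extra_improvement:.2f} scale points (p = {p_value:.3f})")
# Even a null result is informative: it tells you something about this organisation's context.
```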

Enemas and eugenics – are our practices ethical?

And last – but probably the issue that gets the least attention – is whether, even given everything above, the concept is ethical. Take for example 'neuroleadership'. What if we used neuroimaging and brain profiling to try to transform someone into an 'inspirational' leader using neurofeedback therapies? Is it ethical to use these techniques even though there is nothing strictly pathological about their behaviour? Like enemas or eugenics, what are the ethical implications of trying to categorise people through their MBTI profile or learning styles when these have little validity or predictive power? And particularly if they potentially have unethical, ethnocentric origins – learning styles were used in the 1960s to reach supposedly 'culturally deficient' inner-city African American youth. We may start out with noble aims, but if we don't dig deep into the validity and reliability of our concepts and claims, things can soon go horribly wrong.

Reimar Paschke

Coaching, Change support, Team- and Organizational Development, Leadership development

4y

Wow, what an impressive pulling apart of all these "this is the one and only" theories. Doesn't it all boil down to looking at organizations as unique living organisms, just like our kids? Whatever you do with (hopefully) the best intentions can't be right or wrong, but only suitable or unsuitable for that client/kid in that specific situation. And we need to be humble and self-aware enough to realize that whatever we do can only be an impulse that does something in/to the system – or not. And if it does (even the worst theory) and results improve (whether causal or not): hurray! I guess with this article I understood better than ever before that the map is not the territory... Thanks Bernd Zimmermann!

Howard Wiener, MSIA, CERM

Author | Educator | Principal Consultant | Enterprise Architect | Program/Project Manager | Business Architect

4y

Well said, Alex. In particular, one line caught my attention: ". . . why would an academic just repeat someone else’s study when they could be (re)inventing a new theory of their own?" It is exactly this sort of thinking that drives business consultants to create 'new' approaches, methodologies and vocabularies in attempts to differentiate themselves and, thus, sell services. We end up with muddied concepts that require enormous effort to apply and which produce questionable results. As you point out, it's quite difficult to scientifically prove the truth of a lot of this stuff and where it's applied in substantially different situations and circumstances, it's nearly impossible. Ultimately, there's ample evidence for and against much of this stuff so we, in a cloud of confirmation bias, can cherry-pick evidence and get to believe whatever we want.

Eduardo Muniz

GM/Strategic Change Consulting Practice Lead at The Advantage Group, Inc.

4y

Alex Boulting. Great article. In the end, the only evidence out there is that despite the proliferation of OCM gurus, approaches, training providers and the McDonaldization of OCM – which has mass-produced tens of thousands of online OCM certificates without making any significant difference – the vast majority of change/business transformation initiatives are still wrongly deployed and unsustainable. Another fact is that OCM training providers and book authors keep making lots of money and getting richer, especially now, offering OCM software due to COVID-19. The OCM profession is badly commoditized. Thank you for sharing.
