I've made a lovely theory of change, but now what?
What use is a theory of change or causal map to an evaluator? What kind of questions can we ask and answer with it? At our new startup Causal Map we have some suggestions, which we will set out here. Some of them are already implemented in our web app, some are just for discussion; either way, get really trigger happy with the comments button below.
Evaluators often make use of causal maps (networks of boxes linked by arrows showing what causes or caused what), for example in the form of theories of change, whether established intentionally in a planning phase or as the result of research. In this post we'll focus more on their use as detailed tools for programme evaluation and research, rather than for planning (theories of change for planning purposes are usually best kept simple).
There is a long tradition of capturing causal maps through exercises in which individual experts or groups of experts are asked "what causes (or caused) what in this area" and the information is pieced together as a consensus. Alternatively, large numbers of stakeholders are asked systematically about "what causes what" and the information is aggregated separately by expert analysts into a combined causal map in which the links come from the different stakeholders. This latter method is the one adopted by our sister organisation Bath SDR for the Qualitative Impact Protocol (QuIP).
Unfortunately the literature about what evaluators can actually do once they've got a causal map is much thinner than the literature about how to construct them. As causal maps are a kind of network, we often see suggestions about how to assess the causal map in terms of network characteristics like centrality or cohesion. These can give interesting insights into the structure of a network and can help in for example identifying nodes which are important in the sense that a lot of causal influences flow through them. But structural metrics tend to be disappointing for us as evaluators in the sense that they answer interesting questions but not the ones we really wanted the answer to.
Evaluators are usually more interested in questions like "did this intervention plausibly make much difference to these outcomes?" or "what kind of consequences might this intervention have?". With quantitative causal maps such as structural equation models there is a long tradition and well established routes to answering questions like this, and recently statisticians and in particular Judea Pearl have been teaching themselves to use causal maps as a central, or even the central, way to answer questions about what causes what. You put in the numbers and press the button and some algorithm can tell you what is the indirect influence of one factor on another expressed as a number. It's usually possible to compare these numbers to say, for example, that the effect of intervention B on outcome E was larger, or smaller, or five times as much as the effect of intervention C.
But whatever kind of causal map we have in front of us as evaluators it's very unlikely that most of it is expressed in purely numerical form; more often the links encode information like "a little" or "a lot", or "positive/increasing" versus "negative/decreasing" influence. And we have no hope of unequivocally capturing a factor like "Ethnic tolerance" or "Family support" as a number.
Some approaches like Systems Dynamics have applied arbitrary numbers, for example between 0 and 1 or -1 and +1, to the levels of the factors and the strengths of the links between them, and indeed we can have a lot of fun calculating what one factor does to another. However, following Rick Davies, we'd agree that most of these procedures are overly sensitive to our weak ability to model to any degree of accuracy, and errors are compounded too quickly, such that they are pretty useless for making predictions. Systems Dynamics maps can indeed be useful for exploring and thinking about things like feedback loops: identifying such loops and how they make the network chaotic and hard to predict.
The central question that we want to address here is, are we stuck between a rock and a hard place as evaluators? Is it the case that if we don't have really accurate measurements, good numbers and accurate models we can't do anything at all with a theory of change except paste it in as an appendix to an evaluation report?
At Causal Map we have some suggestions. Later I'll add some references, but it is worth noting that these issues were addressed as far back as 1976 by Axelrod.
First of all we can formalise the kind of questions we want to ask:
- we are interested in the influence of upstream factors called inputs or drivers (possibly contrasted with one or more alternative sets of such drivers for comparison) on another set of downstream factors called outcomes, which can either be pre-defined, as in the case of a summative evaluation, or be inductively and retrospectively identified, for example when we identify unintended effects.
We see a causal map as a semi-formal model of a qualitative world. It's formal in the sense that we want to be able to ask questions like how can I correctly synthesise this map to its five most important factors, or is the influence of B on E larger than the influence of C on E. It's easy (too easy) to do this with numerical information, and in most cases it's much more difficult to do it with qualitative information. But it's not impossible, at least in some cases, to make some kinds of deductions even with qualitative information (Michael Scriven has a lot to say on this).
For a start we can help simply by visualising. So we can define the traced map from a set of drivers to a set of outcomes as simply all the paths that lead from the former to the latter. Simply displaying this map in the jungle of other influences and consequences can be useful.
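To make the idea of a traced map concrete, here is a minimal sketch in Python. It is not the code behind our app; the function name `traced_map` and the representation of links as `(source, target)` pairs are assumptions made for illustration. It keeps every link that is both downstream of some driver and upstream of some outcome.

```python
from collections import defaultdict

def traced_map(links, drivers, outcomes):
    """Keep the links that lie on some route from a driver to an outcome.

    `links` is a list of (source, target) pairs; this format is an assumption.
    """
    forward = defaultdict(set)
    backward = defaultdict(set)
    for source, target in links:
        forward[source].add(target)
        backward[target].add(source)

    def reachable(starts, neighbours):
        # Plain depth-first search collecting every factor we can get to.
        seen = set(starts)
        stack = list(starts)
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    downstream = reachable(drivers, forward)   # everything the drivers can influence
    upstream = reachable(outcomes, backward)   # everything that can influence the outcomes
    return [(s, t) for s, t in links if s in downstream and t in upstream]

# Hypothetical example: X -> Z and C -> E fall outside the traced map from B to E.
example_links = [("B", "X"), ("X", "E"), ("C", "E"), ("X", "Z")]
example_trace = traced_map(example_links, {"B"}, {"E"})
```

If the map contains loops, a kept link may sit on a route that revisits a factor, so this sketch is slightly more generous than a strict "simple paths only" definition.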
There are already some questions we can ask, for example:
- are there any paths at all from the drivers to the outcomes?
- and if so how long are they?
- are there any paths from alternative sets of drivers, for example competing interventions or external factors?
- are there any paths from our designated drivers (haha) to alternative sets of outcomes, for example to unforeseen and possibly negative consequences which might have been identified during an evaluation?
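The questions in this list can be answered by enumerating paths. The following is a rough sketch, assuming links are stored as `(source, target)` pairs; the name `find_paths` and the `max_steps` cap of five are illustrative choices, not a description of any particular tool.

```python
from collections import defaultdict

def find_paths(links, drivers, outcomes, max_steps=5):
    """List paths (as tuples of factors) from any driver to any outcome.

    A path never visits the same factor twice and has at most `max_steps` links.
    """
    graph = defaultdict(list)
    for source, target in links:
        graph[source].append(target)

    found = []

    def extend(path):
        node = path[-1]
        if node in outcomes and len(path) > 1:
            found.append(tuple(path))
        if len(path) - 1 >= max_steps:
            return
        for nxt in graph[node]:
            if nxt not in path:
                extend(path + [nxt])

    for driver in sorted(drivers):
        extend([driver])
    return found

# Hypothetical map: intervention B, rival driver C, intended outcome E,
# and an unforeseen outcome labelled "harm".
example_links = [("B", "X"), ("X", "E"), ("B", "E"), ("C", "Y"), ("B", "harm")]
paths_b_to_e = find_paths(example_links, {"B"}, {"E"})
path_lengths = [len(p) - 1 for p in paths_b_to_e]        # number of links in each path
rival_paths = find_paths(example_links, {"C"}, {"E"})    # paths from the alternative driver
side_effects = find_paths(example_links, {"B"}, {"harm"})
```

An empty list answers "are there any paths at all?" with a no; the lengths answer "how long are they?"; swapping in other drivers or outcomes covers the last two questions.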
As well as just showing the existence of a path, we can also detail its properties (quite apart from the possibility of doing any calculations with them). For example we can note:
- all of the paths have been noted as weak or hypothetical or negative.
- no paths exist which don't include at least one link which is marked as weak.
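Statements like these two can be checked mechanically once each link carries a label. Here is a small sketch, assuming one label per link stored in a dictionary; the labels "weak" and "strong" and the function names are hypothetical.

```python
def links_of(path):
    """Split a path such as ("B", "X", "E") into its links."""
    return list(zip(path, path[1:]))

def every_path_has(paths, labels, label):
    """True if there is at least one path and each path contains a link with `label`."""
    return bool(paths) and all(
        any(labels.get(link) == label for link in links_of(path))
        for path in paths
    )

def paths_without(paths, labels, label):
    """Return the paths in which no link carries `label`."""
    return [
        path for path in paths
        if all(labels.get(link) != label for link in links_of(path))
    ]

# Hypothetical coding of two paths from B to E.
example_paths = [("B", "X", "E"), ("B", "E")]
example_labels = {("B", "X"): "strong", ("X", "E"): "weak", ("B", "E"): "weak"}
all_include_weak_link = every_path_has(example_paths, example_labels, "weak")
fully_non_weak_paths = paths_without(example_paths, example_labels, "weak")
```

Nothing here puts a number on "weak": the labels are only compared for equality.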
We can do this kind of thing in an impromptu way before we have any formal definition of, or calculus for, words like "weak" or "strong".
We can even use comparison:
- there are four separate strong paths from intervention B to outcome E, but from the alternative influence C there is only one path, and that is marked as weak.
This kind of summary can be understood as a narrative without formally addressing the question of how we know that four strong paths are better than one weak path.
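A comparison like the one above can be produced as a sentence rather than a score. The sketch below rests on one loudly labelled assumption: a path counts as "weak" if any of its links is weak, and as "strong" otherwise. That rule, and the function names, are ours for illustration only.

```python
from collections import Counter

def classify_path(path, labels):
    """Assumption: a single weak link makes the whole path weak."""
    link_labels = [labels[link] for link in zip(path, path[1:])]
    return "weak" if "weak" in link_labels else "strong"

def describe(driver, outcome, paths, labels):
    """Summarise the paths from one driver to one outcome in words."""
    counts = Counter(
        classify_path(p, labels)
        for p in paths
        if p[0] == driver and p[-1] == outcome
    )
    if not counts:
        return f"No paths from {driver} to {outcome}."
    parts = [f"{n} {kind}" for kind, n in sorted(counts.items())]
    return f"Paths from {driver} to {outcome}: " + ", ".join(parts) + "."

# Hypothetical data mirroring the example: four strong paths from B, one weak path from C.
example_paths = [("B", "E"), ("B", "P", "E"), ("B", "Q", "E"), ("B", "R", "E"), ("C", "S", "E")]
example_labels = {link: "strong" for p in example_paths for link in zip(p, p[1:])}
example_labels[("C", "S")] = "weak"
summary_b = describe("B", "E", example_paths, example_labels)
summary_c = describe("C", "E", example_paths, example_labels)
```

The output is a pair of counts the reader can weigh for themselves; the code deliberately does not decide which driver "wins".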
We can also refine these kinds of questions and answers by adding conditions or filters:
- are there any paths from intervention to outcome in which a link was mentioned at least once by a woman?
- or in which the entire path was mentioned at least once by at least one woman?
- some paths are mentioned by both genders and other paths are mentioned exclusively by one or the other.
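Filters of this kind need each mention of a link to remember who made it. A minimal sketch follows, assuming each mention is a tuple of (source, target, respondent, gender); the field layout, the codes "f" and "m", and the function names are all hypothetical.

```python
from collections import defaultdict

def respondents_by_link(mentions, gender):
    """Map each link to the set of respondents of the given gender who mentioned it."""
    result = defaultdict(set)
    for source, target, respondent, g in mentions:
        if g == gender:
            result[(source, target)].add(respondent)
    return result

def any_link_mentioned(path, mentions, gender):
    """True if at least one link on the path was mentioned by someone of `gender`."""
    by_link = respondents_by_link(mentions, gender)
    return any(by_link.get(link) for link in zip(path, path[1:]))

def whole_path_by_one_respondent(path, mentions, gender):
    """True if a single respondent of `gender` mentioned every link on the path."""
    by_link = respondents_by_link(mentions, gender)
    per_link = [by_link.get(link, set()) for link in zip(path, path[1:])]
    return bool(per_link) and bool(set.intersection(*per_link))

# Hypothetical mentions from three respondents.
example_mentions = [
    ("B", "X", "r1", "f"),
    ("X", "E", "r1", "f"),
    ("B", "X", "r2", "m"),
    ("B", "E", "r2", "m"),
    ("X", "E", "r3", "f"),
]
woman_told_whole_story = whole_path_by_one_respondent(("B", "X", "E"), example_mentions, "f")
```

The two functions correspond to the first two bullets; running them once per gender and comparing the results gives the third.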
All these are examples of causal reasoning using causal maps; the derivation of true narrative statements (or simpler causal maps) on the basis of a causal map.
Even here it is a somewhat open question how to interpret the fact that one traced map has more links than another.
- It might mean that in the real world there are literally more causal connections and/or mechanisms. But this begs the question about the real existence of mechanisms in the specific sense that there is one true way to count mechanisms. You'd have to be a pretty aggressive Realist to claim that... For example, is the way that the members of a school class all responded as predicted to a teacher intervention an instantiation of 1 causal mechanism or 30? This problem is particularly acute if we are (by accident or design) much more granular in one part of a causal map than another.
- This question also overlaps with information about how many respondents or sources mentioned a particular link. We are sure to take more seriously claims that all 30 respondents spontaneously mentioned a direct link from the new seeds to increased crop yields than if only three did. In a visual representation of causal maps this link, or these links, might be represented by 3 or 30 separate paths or by a single arrow with the number 3 or 30 attached to it. But we would want to avoid literally amalgamating all the three or 30 links into one, because then we would lose information about the different sources and possibly lose any characterisations as weak or strong etc.
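One way to show a single arrow with a count while keeping the individual mentions is to bundle rather than merge. This is only a sketch, assuming each mention is a small dictionary; the keys "from", "to", "source" and "strength" are hypothetical.

```python
from collections import defaultdict

def bundle(mentions):
    """Group mentions by link without discarding who said what or how strongly."""
    bundles = defaultdict(list)
    for mention in mentions:
        bundles[(mention["from"], mention["to"])].append(mention)
    return dict(bundles)

def arrow_label(link, bundles):
    """Text for one drawn arrow: the link plus the number of mentions behind it."""
    return f"{link[0]} -> {link[1]} ({len(bundles[link])})"

# Hypothetical data: three farmers mention the same link.
example_mentions = [
    {"from": "new seeds", "to": "crop yields", "source": "farmer 1", "strength": "strong"},
    {"from": "new seeds", "to": "crop yields", "source": "farmer 2", "strength": "weak"},
    {"from": "new seeds", "to": "crop yields", "source": "farmer 3", "strength": "strong"},
]
example_bundles = bundle(example_mentions)
example_arrow = arrow_label(("new seeds", "crop yields"), example_bundles)
example_sources = [m["source"] for m in example_bundles[("new seeds", "crop yields")]]
```

The drawing gets one arrow labelled with a 3, but the sources and the weak/strong codes behind it remain available for later filtering.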
What we need, and what we are implementing at Causal Map, is a formal way of asking and answering the kinds of question sketched out above without falling into the trap of simply representing everything as a number and then relying on ordinary arithmetic to trick us into thinking that we are able to generate super accurate predictions which were never really warranted by the quality of the data we have.
Another critical issue is how to distinguish between the theoretical part of a causal map as encoded in the links (this causes that) and any historical information about what actually happened. So we might want to report on the one hand that factor C has many strong links to factor E, but on the other hand that factor C was not in fact activated (for example, remained in an "off" or "low" position).
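The distinction can be kept explicit by storing the theory (the links) and the history (what state each factor was actually in) separately. A sketch under stated assumptions: the states "on" and "off", the field names in the report, and the rule that a driver "plausibly contributed" only if it has at least one link and was "on" are all ours for illustration.

```python
def activation_report(driver, outcome, link_strengths, observed_state):
    """Report the theory and the history side by side for one driver and outcome.

    `link_strengths` maps a link to the list of strength codes recorded for it;
    `observed_state` maps a factor to "on", "off", or is missing (unknown).
    """
    strengths = link_strengths.get((driver, outcome), [])
    state = observed_state.get(driver, "unknown")
    return {
        "links": len(strengths),
        "strong_links": strengths.count("strong"),
        "driver_state": state,
        # Assumption: links alone are not enough; the driver must also have been active.
        "plausibly_contributed": bool(strengths) and state == "on",
    }

# Hypothetical data: C is strongly linked to E in theory but was never switched on.
example_links = {("C", "E"): ["strong", "strong", "strong"], ("B", "E"): ["weak"]}
example_states = {"C": "off", "B": "on"}
report_c = activation_report("C", "E", example_links, example_states)
report_b = activation_report("B", "E", example_links, example_states)
```

Here C has three strong links but did not act, while B has one weak link and did.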
Finally, we also allow (and encourage) users to nest more specific causal factors (e.g. "wellbeing: perceived good health") into more general ones (e.g. "wellbeing") and use this hierarchical structure to simplify the causal map and provide more general answers to questions where required.
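As an illustration of how nesting can simplify a map, the sketch below assumes the hierarchy is written into the factor label with a colon, as in "wellbeing: perceived good health". Treating the colon as the separator, and the function names `parent` and `roll_up`, are assumptions made for this example.

```python
from collections import Counter

def parent(factor, separator=":"):
    """Return the more general factor: the text before the first separator."""
    return factor.split(separator, 1)[0].strip()

def roll_up(links):
    """Rewrite each link between its general factors and count how often each occurs."""
    return Counter((parent(source), parent(target)) for source, target in links)

# Hypothetical detailed links.
example_links = [
    ("training", "wellbeing: perceived good health"),
    ("training", "wellbeing: less stress"),
    ("wellbeing: less stress", "income: new job"),
]
simplified = roll_up(example_links)
```

Links between two specific factors under the same parent become a link from the general factor to itself; whether to show or hide those is a design choice this sketch leaves open.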
So, in a nutshell, what we are trying to do at Causal Map is:
Allow users to code ideas like the strength or trustworthiness of a link, as well as information (where available) about the characteristics of the source of the information (e.g. gender) and use this information to provide useful answers to relevant queries without necessarily trying to compute answers in terms of a (probably spurious) number. These answers can be provided in terms of a simpler, perhaps filtered, map, and/or in terms of a narrative report ("there are six links from B to E and all of them are strong").
[I should add that this post was partly written in response to Prof. MacKay's paper on trophic analysis of (causal) networks, which is a way of establishing a left-to-right layout or causal flow even when a network has (many) loops. I haven't addressed the issue of how the presence of loops can make the problems I mention here harder. Our Causal Map app isn't bothered by loops for the simple reason that our current algorithm for tracing paths through a network is deliberately limited at four or five steps. We believe that it is rarely useful or credible in real-life evaluation studies to try to predict or even really conceptualise paths longer than this. So the wicked problems which loops might introduce are sidestepped: loops are followed, but only as part of paths which are in total no longer than four or five steps.]
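To illustrate why a step limit takes the sting out of loops, here is a rough sketch of a bounded search in which factors may be revisited but every route stops after `max_steps` links. It is one possible reading of the idea and not the app's actual algorithm; the function name `bounded_walks` and the data format are assumptions.

```python
from collections import defaultdict

def bounded_walks(links, start, goal, max_steps=4):
    """List every route from `start` to `goal` using at most `max_steps` links.

    Loops may be followed, but the search always terminates because each
    round adds exactly one link and there are only `max_steps` rounds.
    """
    graph = defaultdict(list)
    for source, target in links:
        graph[source].append(target)

    found = []
    frontier = [(start,)]
    for _ in range(max_steps):
        next_frontier = []
        for walk in frontier:
            for nxt in graph[walk[-1]]:
                extended = walk + (nxt,)
                if nxt == goal:
                    found.append(extended)
                next_frontier.append(extended)
        frontier = next_frontier
    return found

# Hypothetical map with a feedback loop between A and B.
example_links = [("A", "B"), ("B", "A"), ("B", "E")]
routes = bounded_walks(example_links, "A", "E", max_steps=4)
```

With the loop between A and B there are infinitely many routes to E in principle; the cap keeps only the short ones.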
-------
Footnote: why do we prefer the phrase "causal map" to the currently more popular "systems map/diagram"? The only reason we can really see for calling a causal map a systems map is when it happens to include cycles or loops and may (therefore; arguably) exhibit behaviour which is hard to predict, which might arguably be a reason to claim it is a system. But, if I use approaches like participatory systems mapping and the resulting map happens not to include any interesting loops, does that mean I suddenly have to refer to it as a non-systems map? Do I need to demonstrate that the thing modelled by my map has any or some of the specific properties of a system? And if not, why call it a systems map? If we are prepared to call any factors which causally influence one another a "system" then just about everything is a system. The various theories about systems have a lot to offer evaluators and social sciences, but they do apply to systems (variously defined) and not just anything.
Director, Foundation for the Advancement of Social Theory (FAST)
4 years ago: Excellent article! We've found that evaluators are increasingly dissatisfied with linear logic models - and have a strong interest in creating and using more systemic maps. We've also found that clients and participants find it easier to read the maps when they are the ones who make them. Like you, we've also been working to make the process more accessible. Lots of resources (including our new "plain language" book on "Practical Mapping") here: https://projectfast.org/resources/ That said, there are very good reasons for understanding the systemic structure of a causal map. And, importantly, there are easy ways to measure that structure. Briefly, structure is about connectedness. Theories with more connections are more useful in practical application for guiding decisions, predicting results, and reaching desired goals. For teams, more structured maps are more useful for communication and collaborative decision making. A recent paper here that applies the method to a strategic plan: https://onlinelibrary.wiley.com/doi/pdf/10.1002/sres.2599 That paper shows how to evaluate the structure of a causal map using Integrative Propositional Analysis (IPA - the methodology, not the beer). It also shows how to find "leverage points" to have the maximum sustainable impact with the least effort. Another way to evaluate structure is to measure the loops: Wallis, S. E. (2020). Evaluating and improving theory using conceptual loops: A science of conceptual systems (SOCS) approach. Cybernetics and Human Knowing, 27(3). An important point here is that loops should "balance." That is, if your map is entirely made of "positive" causality, you do not have an effective representation of how the world works (perhaps a tad optimistic). Current thinking is that there should be a ratio of two positive to one negative... but that is a tentative result. We really need to study this more!
Typically, we find that theories (along with national policies and strategic plans) have about a 20% structure. And, they tend to reach their stated goals about 20% of the time. The map by Steve Powell is clearly much better than average! Importantly, using systemic structure as a guide, we can easily double the effectiveness of our theories - and so double their usefulness to our clients. Finally, yes, many causal connections are hypothetical or tentative... questionable. In creating maps with stakeholders, we ask them to confirm their understanding of each causal connection. In a sense, gaining consensus as to the validity of the map. Then too, from a broader view, it is an iterative process. So the map is put into use... change happens... the client learns... and the map is refined. So... what do you do with a lovely theory? Put it into practice... and then improve it.
causalmap.app. Mad about causal mapping & evaluation.
4 years ago: I just added a footnote to the post (the bracketed paragraph above about Prof. MacKay's paper on trophic analysis and how our path tracing handles loops).
Slovenian Evaluation Society
4 years ago: So glad to read your post after some time, Steve. Your contribution to evaluation theory is obviously progressing well awarding us with new insights. I especially appreciate your emphasis on non-systemic aspect of evaluation, reaching beyond quantification and empiricism. Should one understand a causal map as a sort of an alternative to experimental approach to evaluation? Causal constructivism vs econometric modeling? Double disposition in relativist manner, instead of double blinding in rationalist manner? I praise your work as a critic of causal comprehension of evaluation. Causality is invaluable only when dealing with simple (and sometimes forcefully simplified) problems. But for these only simple explanations are needed – complicated explanations revive systemic thinking (see Stacey, ACM matrix), which one aims to avoid. This probably constrains explanatory potential of causal constructivism for dealing with complex situations. My experience is that the most productive would be for an evaluator to establish interaction between rationalist and relativist thinking from the middle between them, being part of both but none of them. Stay well.
Professor at University of Warwick
4 years ago: Interesting questions! I'd like to follow up with you.