Part 1: What Should the Evaluation (M&E/MERL/MERLIN) Budget Look Like?

An evaluation question I am often asked – and one where, in truth, my answer has been the most disliked by those who asked – is: what should the monitoring and evaluation (#M&E) budget look like? Or, as I prefer to put it, what should the #MERL (adding in Research and Learning) or #MERLIN (adding in "innovations" from USAID) budget look like?

Three-Part Answer. I am sharing my answer in three parts, as I've discovered that donor-based and implementer-based evaluators are equally mystified about evaluation budgets. This first part discusses the overall size of the evaluation budget – which can range from less than 5% to 25% or more of the overall project budget – and why it matters to decide that overall level at the outset, based on the Foundations for Evidence-Based Policymaking Act (Evidence Act) of 2018. The second part focuses on how to translate the overall evaluation budget into a direct cost budget. And the third will highlight five important questions donors should be asking implementers when reviewing budgets.

Proportionate Size Matters. My "disliked" answer ranges toward the higher end and recommends including evaluation and evaluators at the design table. Post-Evidence Act, this inclusion is necessary for compliance. Then, at the inception phase, you need to be generous with the evaluation budget, using quarantined funds accountable only to an evaluation team (and donors). This needs to be an evaluation team that is independent of the project team and the project budget, yet one that remains close (likely internal-independent) to inception, implementation, and close-out so that project managers using adaptive management can learn from the results. You need to first decide the level of the overall budget, then turn those dollar amounts into a (mostly) direct cost budget; throughout, both implementers and donors need to be careful with the details to ensure rigor and adaptiveness as well as accountability and transparency.

Evidence Act Changes the Game. Following the Evidence Act, deciding what an adequate to well-resourced evaluation budget should be at proposal and/or inception is an essential skill for both program donors and program implementers to develop. This has been true for foreign assistance since the Foreign Aid Transparency and Accountability Act (FATAA) of 2016, but the Evidence Act not only extends accountability requirements to all federal agencies, it also requires grants and contracts, as well as agency work, to be evidence-based, learning-focused, and data-driven. U.S. federal agencies are now:

  • Treating M&E as a standard and required element of managing any project;
  • Expecting all M&E costs to be included in your funding request;
  • Requiring specific program design practices, as well as monitoring, evaluating, and reporting on the performance of U.S. foreign assistance and its contribution to the policies, strategies, program goals, and priorities of the Federal Government;
  • Encouraging requests for additional M&E funding to improve and strengthen capabilities to carry out project design, monitoring, and evaluation practices; and
  • Expecting that evaluations and data collected as part of M&E are registered using open data principles on publicly available websites (unless exempted for specific reasons).

Research Activity Budgets Are Different. Coming from #academia, calculating the budget is relatively straightforward. This is because it is an activity-based budget where most of the potential unknowns are known by the evaluation/research team. For example, you know (or can readily calculate) the anticipated costs of the research activities, and you can estimate with sufficient specificity how much time it will take you (or the research assistants you have trained and directly supervise) to do quality work. When data collection approaches (e.g., surveys and key informant interviews (number of interviews completed per hour/day), focus groups (number of days)) or methodologies (e.g., time/cost of most significant change methods) are selected in advance, the level of effort in staff days can be estimated. Further, you are expert in the problem to be addressed and thus can predict the influence of, and/or control for, contextual factors. You also have direct control over the inevitable trade-offs entailed in quality assurance checks and the methodological adjustments you need to make in the field to produce cost-efficient data that is also valid and reliable and can yield defensible, credible results in public fora.

These qualifications may also exist for experienced third-party evaluator firms, in domestic U.S. evaluations, or in well-researched content areas (such as the education or public health fields) where evaluation best practices and approaches are more well-defined and established registries are available about "what works."
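
As a minimal illustration of the activity-based estimate described above (the completion rates and daily rate below are hypothetical assumptions for the sketch, not figures from any cited guideline):

```python
# Activity-based level-of-effort estimate for a data collection plan.
interviews_planned = 200
interviews_per_day = 4        # assumed completion rate per interviewer
focus_groups_planned = 10
focus_group_days_each = 2     # assumed prep + facilitation + write-up
daily_rate = 400              # assumed blended staff day rate in USD

staff_days = (interviews_planned / interviews_per_day
              + focus_groups_planned * focus_group_days_each)
print(f"Estimated level of effort: {staff_days:.0f} staff days, "
      f"roughly ${staff_days * daily_rate:,.0f} in labor")
```

Because every input is known in advance by the research team, the bottom-up total falls out directly; the rest of this post deals with the harder case where it does not.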

Traditional Donor/Implementer Approaches Give Short Shrift to Evaluation Budgeting. But in my recent experience in international development, these conditions rarely hold. Most M&E, MERL, and MERLIN budgets are an afterthought, with evaluation given the leftovers after the program budget has been allocated. This results in a great deal of confusion among both donors and implementers about what level of overall project or portfolio budget should be invested in monitoring and evaluation. Traditionally, the emphasis is on routine monitoring, conducted largely by project staff who engage in ongoing collection of project-level data on inputs, activities (processes), outputs, and direct outcomes based upon a previously agreed list of indicators (usually drawn from the proposal and encapsulated in the award). But post-Evidence Act, the need is higher level: to align routine monitoring with evaluation to track trends, learn from results, and manage for impact. The evaluation budget gap widens in the "name that tune in five notes" competition, where the lowest budget for the most content-rich program activities wins. Here evaluation budgets get short shrift as implementers rely on donors to flag evaluation gaps at award launch meetings, while donors in turn magnify the gaps by not looking too closely at the risks posed when competing proposals promise evaluation with the fewest resources (discussed in a previous post).

The Traditional Medium and Small Project Loophole. In costing evaluations, one of the confounding factors is the artificial separation between external (third-party) evaluations and impact evaluations. While the Evidence Act as interpreted by the Office of Management and Budget now accepts internal independent evaluations as equally rigorous (provided that the evaluation team is separate from the project teams and has the appropriate skills and resources to conduct an adequate design), traditionally U.S. federal agencies have limited external evaluations to large projects. This rule allowed medium and small projects to meet guidelines with narrow indicator monitoring alone, because the requirement for external evaluations applied only to large projects. Large projects were defined as those extending two years or longer whose budgets exceed the median agency project budget (this amount varies by agency, but often means projects with budgets in excess of $750,000).

What are the new rules provided by donors or adopted by large implementers and/or results-focused evaluation approaches? These are rapidly changing, and the answer(s) will surprise you. Most donors and some implementers provide standard estimates based on the overall project budget. These estimates range from a low of 1% to 25% or more of the total project budget.

Standard Minimum Percent Approach. One option is a standard low minimum percentage across all projects. A decade ago, USAID had a 3% minimum standard (i.e., as a proportion of the overall project budget) (Charney study 2013). Often, when there is a low standard percentage across all projects, there is a separation between the monitoring budget and the evaluation budget. For example, the United Nations (UN)-affiliated International Labour Organization (ILO) has established Policy Guidelines for Results-based Evaluation (4th Edition), recommending a minimum of 3% of the project budget for monitoring, review, and internal evaluations, with an additional minimum of 2% (for projects over $500,000) reserved for independent evaluations – totaling a minimum of 5% of the total project budget (p. 32).
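
Here is a minimal sketch of how that split plays out in dollar terms (the percentages and the $500,000 threshold are from the ILO guidelines above; the function name and the sample budget are my illustrative assumptions):

```python
def ilo_minimum_evaluation_budget(total_project_budget: float) -> dict:
    """Apply the ILO minimums: 3% for monitoring, review, and internal
    evaluations, plus 2% for independent evaluations on projects over
    $500,000 (a 5% combined minimum)."""
    monitoring = 0.03 * total_project_budget
    independent = 0.02 * total_project_budget if total_project_budget > 500_000 else 0.0
    return {
        "monitoring_and_internal": monitoring,
        "independent_evaluation": independent,
        "combined_minimum": monitoring + independent,
    }

# A $2,000,000 project: $60,000 monitoring + $40,000 independent = $100,000 (5%).
print(ilo_minimum_evaluation_budget(2_000_000))
```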

Evaluation Tiers. Another option is tiers of evaluation rigor based on what you want to achieve with evidence. Given how the emphasis on evidence blurs the distinction between attribution and contribution, as well as between monitoring and evaluation, the focus is increasingly on what kind of evaluation, rather than what size of project, deserves evaluation. Thus, there is growing agreement with renowned evaluators Michael Bamberger and Linda Mabry, who divide evaluation budgets into three general categories or tiers based on the overall program budget. These categories, which rank evaluation costs (beyond routine monitoring) by type of evaluation (see the appendix to the third (2020) edition of Real World Evaluation, p. 6), are as follows (a classification sketch appears after this list):

  • Small (e.g., less than 5% of program budget)
  • Moderate (e.g., up to 15% of program budget)
  • Generous (e.g., more than 15%—for example, a major purpose is research to test a new intervention)
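
A minimal sketch of that tiering as a lookup (the thresholds come from the Bamberger and Mabry categories above; the function name and the sample figures are illustrative assumptions):

```python
def evaluation_tier(evaluation_budget: float, program_budget: float) -> str:
    """Classify an evaluation budget into the three Bamberger/Mabry tiers
    by its share of the overall program budget."""
    share = evaluation_budget / program_budget
    if share < 0.05:
        return "small"
    elif share <= 0.15:
        return "moderate"
    else:
        return "generous"

# A $150,000 evaluation on a $1,000,000 program is a 15% share -> "moderate".
print(evaluation_tier(150_000, 1_000_000))
```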

Serious Evaluation Budgets Have a Broader Range. A growing number of donors and implementers expect all projects to take evaluation seriously, providing for investments in evaluation well above the minimum percentages. For example (a sketch comparing these benchmarks follows the list):

  • The William and Flora Hewlett Foundation review of evaluation budgeting notes that "conventional wisdom long held that a serious commitment to evaluation required spending on the order of 5 to 10 percent of programmatic budgets," although the total may vary depending on the size of a project (Benchmarks for Spending on Evaluation (2014)).
  • The Kellogg Foundation distinguishes between performance monitoring and/or a process or formative evaluation ("allocate 5 to 10 percent of your total program cost") and an outcome or summative evaluation, which would require more ("consider allocating 15 to 20 percent of your total program cost to evaluation") (Step-by-Step Guide to Evaluation (2017), p. 134).
  • The Donor Committee for Enterprise Development (DCED) has a signature approach to results-based evaluation that includes both home office evaluation leadership and embedded evaluators trained in its method. The DCED approach emphasizes both results and the ability of the evaluation to support attribution of project results to the intervention, and it includes a standard of budgeting between 5-10% of the overall project budget for results measurement (DCED Attribution in Results Measurement (2017)).
  • The Government of Western Australia recommends "quarantining evaluation budget between 5 – 20% of total program costs" (Evaluation Guide (2015)).
  • USAID guidance has moved from a "one-size-fits-all" approach to a broader range that "could require 10%, or maybe even more, to undertake monitoring and evaluation (e.g. impact evaluations of activities/IMs or interventions in remote geographic areas) [while] others could require 1% or less of the total budget" (USAID Evaluation Toolkit, p. 36 (2015)).
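
To see how differently these benchmarks price the same project, here is a minimal sketch applying each quoted range to one hypothetical budget (the percentages are from the sources above; the dictionary and the $3M figure are my illustrative assumptions):

```python
# Low/high evaluation shares drawn from the benchmarks quoted above.
BENCHMARKS = {
    "Hewlett (conventional wisdom)": (0.05, 0.10),
    "Kellogg (outcome/summative)": (0.15, 0.20),
    "DCED (results measurement)": (0.05, 0.10),
    "Western Australia (quarantined)": (0.05, 0.20),
    "USAID (broad range)": (0.01, 0.10),
}

project_budget = 3_000_000  # hypothetical $3M project

for source, (low, high) in BENCHMARKS.items():
    print(f"{source}: ${low * project_budget:,.0f} - ${high * project_budget:,.0f}")
```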

Specialized Design and Impact Evaluation Funding. The most rigorous designs anticipate the "generous" evaluation budgets. Others have provided guidance on where a larger proportion of the overall project budget is needed for evaluation. Higher-level evaluation budgets (e.g., over 20% of the project budget) are more typical for experimental research designs (often termed randomized control trials (RCTs), as championed in international development by the Abdul Latif Jameel Poverty Action Lab (J-PAL) and the International Initiative for Impact Evaluation (3ie)). They can also apply to intensive participatory qualitative designs such as outcome harvesting used in developmental and adaptive management projects. For example, one large project I saw presented, which used the most significant change qualitative design developed by Rick Davies, reported an evaluation budget of about 25% of the total project budget due to the extensive coding and data analysis drawing on a series of participatory sessions.

The U.S. Corporation for National and Community Service (CNCS), created in 1993 and rebranded in 2020 as AmeriCorps, has developed specific standards for evaluation budgeting due to its early adoption of RCTs and evidence-based impact and implementation programming. According to the Social Innovation Fund Evaluation Budgeting Quick Guide, CNCS has concluded that "the rule of thumb ratios in use to date (i.e., between 5% and 10% of the total budget allocated for evaluation) result in serious underbudgeting of evaluation." Instead, when assessing impact, CNCS finds that "between 15% and 20% is more realistic for single site quasi-experimental designs (QEDs) and randomized control trials (RCTs), with some designs (e.g., multisite RCTs, designs with intensive implementation studies) requiring 25% or more."
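
A minimal sketch of those CNCS figures as a design-to-share lookup (the percentages are quoted above; the mapping structure, function name, and sample budget are illustrative assumptions):

```python
# Recommended evaluation shares by design rigor, per the CNCS figures above.
# None marks an open-ended upper bound ("25% or more").
CNCS_SHARES = {
    "single-site QED or RCT": (0.15, 0.20),
    "multisite RCT / intensive implementation study": (0.25, None),
}

def cncs_minimum(design: str, total_budget: float) -> float:
    """Return the low end of the CNCS-recommended evaluation budget for a design."""
    low, _high = CNCS_SHARES[design]
    return low * total_budget

# A $4M single-site RCT impact evaluation: at least $600,000 (15%).
print(cncs_minimum("single-site QED or RCT", 4_000_000))
```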

There is Also Likely a Minimum "Floor" and a Maximum "Ceiling" Evaluation Budget When Considering Economies of Scale. While it is useful to consider overall project budgets as a rough guide for the size of different evaluation budgets, it is also useful to consider a "floor."

  • First, while different folks may place the floor at different levels, most will find it difficult to locate an experienced firm that will agree to collect original data for a third-party evaluation for less than $50,000 or thereabouts. Yes, you may find individual consultants or firms willing to do so for less, but you will likely not get much more than a desk review of existing documents and some reflections that lack any credible external data or expertise.
  • Second, a number of data collection methods, such as surveys, and specialized assessments, such as needs assessments and gender analyses, can be costly even though they comprise just one component of an evaluation. Fielding a survey alone could cost $20,000 or more, for example. And USAID, to give another example, has estimated that the typical budget for a country-level gender analysis might range from $115,000 to $240,000.

Ceilings also likely exist, even for very large projects. The reasons a project budget may balloon are different from the factors that affect evaluation budgets. Staffing and data collection are the two biggest costs for any evaluation. Evaluation teams can be more efficient by mixing staff with different levels of expertise who are paid at different rates. If some staff are embedded in the field, travel costs are also reduced. And there can be reasonable cost-cutting measures even on very large projects, such as sampling procedures for data collection that yield sufficiently precise data at lower cost.
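
Putting the pieces together, one minimal sketch of a first-pass estimate clamps a percentage-based figure between a floor and a ceiling (the $50,000 floor comes from the discussion above; the ceiling value, the 10% share, and the function name are my illustrative assumptions):

```python
def estimate_evaluation_budget(
    project_budget: float,
    share: float = 0.10,          # assumed percentage benchmark
    floor: float = 50_000,        # minimum for credible original data collection
    ceiling: float = 2_000_000,   # assumed economies-of-scale cap
) -> float:
    """First-pass evaluation budget: a share of the project budget,
    clamped between a floor and a ceiling."""
    return min(ceiling, max(floor, share * project_budget))

# A $300,000 project at 10% would be $30,000 -> raised to the $50,000 floor.
print(estimate_evaluation_budget(300_000))
# A $50M project at 10% would be $5M -> capped at the assumed $2M ceiling.
print(estimate_evaluation_budget(50_000_000))
```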

For these reasons, a transparent and accountable evaluation budget should be turned into a direct cost budget to the extent possible. Now that you can guesstimate the likely size of the overall evaluation budget, how to do that will be the topic of my next blog.
