According to the leading evaluation resource BetterEvaluation, “evaluation, in its broadest sense, refers to any systematic process to judge merit, worth or significance by combining evidence and values”. Just as evaluations are a key part of any behaviour change intervention, they are also key in traditional international development projects and programmes. Virtually all international development organisations commission (external) evaluations. In theory, these project or programme evaluations are supposed to offer insights into what worked during implementation, for whom, and under what circumstances. In practice, however, they are often delivered to the commissioning organisation, passed on to the donor, and rarely looked at again.

My suspicion is that one of the main reasons evaluation reports are rarely used, or not used well, is that they do not ask the right questions. In my (very biased) opinion, evaluations should at a minimum include behaviour change questions. Ideally, however, they should start from a behavioural perspective.

The OECD Development Assistance Committee (DAC) evaluation criteria are now standard in almost every evaluation. In 2020, the OECD updated its evaluation criteria to add a sixth criterion, coherence, alongside relevance, effectiveness, efficiency, impact and sustainability. Coherence focuses on how well a project fits in with or adds to other ongoing projects, as well as local policies.

The OECD calls for the ‘thoughtful application’ of the criteria, but most people in the sector will admit that the evaluation questions are usually copied and pasted without much thought. Part of the reason is that evaluations are seen as a tick-box exercise rather than a valuable part of implementing an intervention and designing the next one.

If we look at criteria such as relevance, effectiveness and sustainability, there are often missed opportunities in how these criteria are interpreted. For example, when assessing relevance, evaluators will often look at whether an intervention is relevant to the needs of a community from a technical standpoint. In other words, if community members have a problem generating income, offering them goods to sell as a means of income generation would count as a relevant solution. However, if one were to return some time after the evaluation was completed, one might find that the community members had used the majority of the goods for their own sustenance, because that was their main priority at the time. If, on the other hand, a behavioural perspective were considered first, an evaluation would assess the extent to which the intervention was co-designed with the intended audience to produce a nuanced solution that addresses both their current and their future needs.

Effectiveness is another tricky criterion to evaluate because NGO outcome statements are notoriously vague and leave a lot of room for interpretation. Conclusions about the effectiveness of an intervention end up being vague because the defined objectives themselves are vague. One of the most innovative solutions to this problem that I have come across is E. Jane Davidson’s technique of evaluation rubrics.

A rubric is a framework that sets out criteria and standards for different levels of performance and describes what performance would look like at each level. Rubrics have often been used in education for grading student work, and in recent years have been applied in evaluation to make transparent the process of synthesising evidence into an overall evaluative judgement (Oakden, 2013).

This adapted approach helps us to dig into, for example, what has really changed in people’s behaviour as a result of what we introduced. Our outcome might read ‘partner organisation builds capacity’, but the real questions are: what in their behaviour will tell us the extent to which their capacity was built, and what value will we assign to the varying degrees of that change?

Sustainability questions also often leave a lot to be desired. A quick Google search shows just how unsustainable many development programmes are. Again, if we ask the right questions, we can do better in the next iteration of an intervention. For example, a hallmark of sustainability should be the extent to which adopting the new tools, services, training, stipends, or whatever else the intervention offers follows, at a minimum, the principles of the EAST framework: Easy, Attractive, Social and Timely.

One way to make many evaluations more informative and user-friendly would be to incorporate behavioural intervention design frameworks as a way of assessing the extent to which an intervention truly made a difference in a community. For the novice in behavioural science, using the socio-ecological model as an entry point to understanding social change will offer a lot of depth and nuance, and move us away from solely descriptive facts towards evaluative conclusions that effectively guide better programme design.

On that note, I leave you with the evaluation questions that Davidson (2012) recommends for better-quality, to-the-point and usable evaluation reports:

  1. Was the programme needed and is it still needed? How well does it address the most important root causes? Is it still the right solution?
  2. How well designed and implemented is the programme?
  3. How valuable are the outcomes for the intended groups?
  4. What works best for whom, under what conditions, why and how?
  5. How worthwhile was it overall? Which parts or aspects of the programme generated the most valuable outcomes for the time, money and effort invested?

Do you have other ideas about how we can make evaluations more useful? Share your thoughts with me here!

 Davidson, E.J. (2012). Actionable evaluation basics: Getting succinct answers to the most important questions. Real Evaluation.  

 Oakden, J. (2013). Why rubrics are useful in evaluations.