If we’ve ever chatted about research and reporting, you know my gripe about evaluation reports that are never revisited. Each shelved report is a missed opportunity to learn something genuinely useful about whether your organisation is actually changing anything.
Across the impact sector, evaluation is often treated as a reporting obligation rather than a strategic tool. The result is a sector that spends significant resources measuring its work without fundamentally improving it. That cycle can be broken. But it requires starting with a different framing.
The OECD Development Assistance Committee (DAC) criteria, covering relevance, coherence, effectiveness, efficiency, impact, and sustainability, have become the standard evaluation framework not just in development, but increasingly across the broader impact space. Foundations, social investors, and impact-first organisations often borrow the same criteria, either directly or in adapted form.
In principle, these criteria are sound. In practice, they are applied on autopilot. The OECD itself calls for their “thoughtful application”, but anyone who has sat on a programme team or reviewed a terms of reference will recognise the reality: the evaluation questions are largely copy-pasted, the criteria become a checklist, and the reports that emerge tell you what happened without telling you why.
The missing ingredient, almost universally, is a behavioural perspective.
Consider how relevance is typically assessed. An evaluator asks whether a programme addresses a genuine need. The answer is usually yes, because someone conducted a needs assessment before designing the programme. Box ticked.
But relevance is not only about whether a problem exists. It is about whether the proposed solution fits how people actually think, decide, and act in context. A financial inclusion programme offering savings accounts to low-income entrepreneurs may be technically relevant. But if the design fails to account for things such as irregular income patterns, social pressure to share resources with family, or distrust of formal financial institutions, the uptake will be low and the report will call it an implementation challenge rather than a design failure.
A behavioural lens on relevance asks: was this co-designed with the people it is meant to serve? Did it account for competing priorities, cognitive load, and the layered pressures shaping people’s decisions day to day?
Effectiveness is the criterion that is often measured least rigorously. Impact organisations tend to write outcomes that are aspirational rather than behavioural. “Communities are empowered.” “Young people access opportunities.” “Smallholder farmers increase resilience.” These statements describe desirable states of the world, not observable changes in how people behave.
When objectives are vague, evaluative conclusions are vague.
One of the most useful innovations for addressing this problem is the evaluation rubric, developed by E. Jane Davidson. A rubric sets out explicit criteria and standards for different levels of performance, making transparent what would otherwise be a highly subjective judgement call. Rather than asking “did this work?”, a rubric forces you to define what behaviour would indicate it worked, at a high standard, a moderate standard, or barely at all. This transforms evaluation from description into genuine appraisal.
For organisations embedding behavioural science into their work, rubrics are a natural fit. They operationalise behaviour change by asking not whether an outcome was reached, but precisely what shifted in how people act, engage, or relate to one another as a result of the intervention.
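To make this concrete, here is a minimal sketch of what a behavioural rubric could look like when encoded as a simple scoring tool. Everything in it is hypothetical and invented purely to illustrate the structure: the programme, the indicator (share of participants making monthly deposits), the thresholds, and the level descriptions are assumptions, not a real instrument.

```python
# Minimal sketch of a behavioural evaluation rubric.
# All indicators, thresholds, and descriptions are hypothetical,
# for illustration only.

# Each performance level is defined by an observable behaviour,
# not an aspirational state ("communities are empowered").
RUBRIC = [
    ("excellent", 0.60, "Most participants make regular deposits unprompted"),
    ("adequate",  0.30, "A sizeable minority deposit, but only after reminders"),
    ("poor",      0.00, "Accounts were opened but sit largely dormant"),
]

def appraise(share_depositing_monthly: float) -> str:
    """Map an observed behaviour (share of participants making
    monthly deposits) to an explicit performance standard."""
    for level, threshold, description in RUBRIC:
        if share_depositing_monthly >= threshold:
            return f"{level}: {description}"
    return "poor: no observable behaviour change"

# Example: 42% of participants deposited monthly -> "adequate".
print(appraise(0.42))
```

The point of the structure is that the judgement call happens up front, when the standards are written, rather than after the data arrive.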
Evaluations typically note sustainability challenges across dimensions (environmental, financial, and so on) without probing any deeper.
A behavioural framing changes the sustainability question entirely. Rather than asking “is there a plan for continuation?”, it asks whether the behaviours introduced by the programme are ones people can and will maintain. The EAST framework from the Behavioural Insights Team, for example, which asks whether a behaviour is Easy, Attractive, Social, and Timely, provides a useful lens. If the behaviour required effort, offered no clear reward, ran against prevailing social norms, or was only relevant in a narrow window of time, it will not persist. Evaluation can and should surface this, and those findings should directly inform the redesign of future programmes.
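As an illustration, a retrospective EAST audit can be as simple as a structured checklist. The sketch below is a toy built on assumptions: the questions are paraphrases of the four EAST dimensions, and the example answers are invented, not an official instrument.

```python
# Illustrative EAST checklist for an evaluated behaviour.
# Question wording and example answers are assumptions for
# illustration, not the official framework text.

EAST_QUESTIONS = {
    "easy":       "Could people perform the behaviour with minimal effort?",
    "attractive": "Was there a clear, salient reward for doing it?",
    "social":     "Did it align with prevailing social norms?",
    "timely":     "Was it prompted when people were ready to act?",
}

def east_audit(answers: dict[str, bool]) -> list[str]:
    """Return the EAST dimensions a behaviour fails on; each failure
    is a plausible reason the behaviour will not persist."""
    return [dim for dim, ok in answers.items() if not ok]

# Hypothetical finding: saving was easy and timely, but unrewarding
# and counter-normative, so persistence is unlikely.
gaps = east_audit({"easy": True, "attractive": False,
                   "social": False, "timely": True})
print("Sustainability risks:", gaps)  # ['attractive', 'social']
```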
Behavioural design frameworks are most commonly used at the design stage of a project. But there is a strong and underutilised case for deploying them retrospectively, as a framework for evaluation itself.
Models such as the Behavioural Drivers Model, the Socio-Ecological Model, or the COM-B model (Capability, Opportunity, Motivation, Behaviour) offer a structured way of asking whether an intervention genuinely engaged with the multiple layers of influence shaping behaviour, from individual decision-making, to community norms, to structural and systemic conditions. This kind of analysis moves evaluation away from descriptive summaries of activities toward evaluative conclusions that guide better programme design in the next iteration.
For teams newer to behavioural science, the Socio-Ecological Model is a good entry point. It immediately expands the frame of analysis across the system.
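As a sketch of what that expanded frame could look like in practice, here is a toy layer-by-layer audit. The layers follow the model; the questions, the helper function, and the example findings are all illustrative assumptions, not a validated evaluation instrument.

```python
# Illustrative sketch: using the Socio-Ecological Model retrospectively
# by asking, at each layer, whether the intervention engaged it. The
# layers are standard; the questions and example programme findings
# are invented for illustration.

LAYERS = {
    "individual":    "Did the design fit how people actually decide and act?",
    "interpersonal": "Did it work with family and peer dynamics?",
    "community":     "Did it engage prevailing social norms?",
    "structural":    "Did it address systemic and policy constraints?",
}

def layer_audit(engaged: dict[str, bool]) -> None:
    """Flag layers the intervention never engaged; each gap is a
    candidate explanation for weak or short-lived behaviour change."""
    for layer, question in LAYERS.items():
        status = "engaged" if engaged.get(layer, False) else "GAP"
        print(f"{layer:14} {status:8} {question}")

# Hypothetical finding: strong individual-level design, but norms and
# structures untouched -- a common reason outcomes fade.
layer_audit({"individual": True, "interpersonal": True})
```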
Giving evaluation a behavioural perspective transforms it from a compliance exercise into one of the most powerful tools an impact organisation has for learning, adaptation, and ultimately for doing less harm and more good.
The sector invests enormous resources in evaluation. It is time we demanded that those evaluations actually change how we work.
👉🏾 How skilled are you at designing for change? Start with the FREE assessment: https://lnkd.in/dK7YPKgR
👉🏾 Join my mailing list for exclusive insights and content! subscribe.osmanadvisoryservices.com