The Goldilocks Challenge
Right-Fit Evidence for the Social Sector
Mary Kay Gugerty and Dean Karlan
Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and certain other countries.
Published in the United States of America by Oxford University Press 198 Madison Avenue, New York, NY 10016, United States of America.
© Oxford University Press 2018
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by license, or under terms agreed with the appropriate reproduction rights organization. Inquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.
You must not circulate this work in any other form and you must impose this same condition on any acquirer.
Library of Congress Cataloging-in-Publication Data
Names: Gugerty, Mary Kay, author. | Karlan, Dean S., author.
Title: The Goldilocks challenge : right-fit evidence for the social sector / Mary Kay Gugerty and Dean Karlan.
Description: New York, NY : Oxford University Press, [2018] | Includes bibliographical references.
Identifiers: LCCN 2017043942| ISBN 9780199366088 (hardcover : alk. paper) | ISBN 9780199366101 (epub) | ISBN 9780199366095 (updf)
Subjects: LCSH: Nonprofit organizations—Evaluation. | Organizational effectiveness--Evaluation—Methodology.
Classification: LCC HD62.6 .G84 2018 | DDC 658.4/08—dc23 LC record available at https://lccn.loc.gov/2017043942
9 8 7 6 5 4 3 2 1
Printed by Sheridan Books, Inc., United States of America
CONTENTS
Author’s Note vii
PART I: The CART Principles: Building Credible, Actionable, Responsible, and Transportable Evidence Systems
1. Introduction 3
2. Introducing the CART Principles 15
3. The Theory of Change 30
4. The CART Principles in More Detail 49
5. Monitoring with the CART Principles 66
6. The CART Principles for Impact Evaluation 90
7. Collecting High-Quality Data 118
PART II: Case Studies
8. Educate! Developing a Theory of Change for “Changemakers” 147
9. BRAC: Credible Activity Monitoring for Action 166
10. Salama SHIELD Foundation: The Challenge of Actionability 181
11. Invisible Children Uganda: An Evolving Monitoring and Evaluation System 199
12. Deworm the World: From Impact Evidence to Implementation at Scale 211
13. Un Kilo de Ayuda: Finding the Right Fit in Monitoring and Evaluation 227
PART III: The Funder Perspective and Concluding Thoughts
14. The Institutional Donor Perspective 245
15. The Retail Donor Perspective 254
16. Concluding Thoughts and (Hopefully) Helpful Resources 263
Acknowledgments 267
Acronyms 269
Notes 271
Glossary 275
Further Resources 277
Works Cited 279
Index 283
AUTHOR’S NOTE
This book began with a question. It was a rainy day in Seattle in 2010, and we were presenting to a room full of nonprofit organizations. Dozens of small- and medium-sized Seattle-based organizations came to learn about measuring their impact. Donors often ask for evidence of impact, so the room was packed. Of course, many also wanted to measure impact because they were genuinely motivated to achieve their mission.
In the session, we introduced the basics of using randomized controlled trials (RCTs) to test the impact of social programs. RCTs are the method used in medical research to test drug and treatment efficacy. We use this method to measure impact in our own research because it is the best way, when feasible, of finding out how programs are affecting the lives of the poor. However, a lot rides on the words “when feasible.” A big push in development economics that began in the 1990s has shown that “when feasible” is indeed fairly often. Our focus in this book, though, is not on when RCTs are feasible, but on what to do when they are not.
At the training session, clicking through a set of slides, we explained how impact evaluation via RCTs can help organizations get good answers to important questions. We discussed a series of technical issues, such as the importance of control groups, how to craft experimental designs, and what data to collect. Then we used case studies to show that RCTs do require certain conditions, such as a large enough sample size, in order to work.
As the session wore on, a sense of frustration seemed to be growing in the room. “I totally get it,” said one attendee. “I get the setup you need to be able to randomize, and the sample size and everything, and yet I’m never going to be able to do an impact evaluation like that.”
She was right. And as more people in the audience spoke about their work, it was clear that most of them were not going to be able to conduct a reliable impact evaluation even though they all wanted one. Perhaps they could instead learn from other, larger organizations that can pave the way by creating knowledge. But will the lessons from these larger organizations hold for the smaller ones, working in different settings?
The audience members at that workshop wanted to use their own data and undertake their own impact measurement. Yet their organizations were either too small for an RCT—some were running programs in just one or two towns—or their work focused on advocacy or institutional reform, which typically cannot be evaluated with an RCT.
Then came the question that triggered a long conversation and, ultimately, this book: “So if I can’t do an impact evaluation, what should I do?”
At the time, we did not have a set of slides for that. It was awkward. We gave some broad generalities and then a few examples. We had no manageable framework, usable by organizations, to think through the alternatives. We did notice that many of the questions the audience posed were not actually impact questions. Rather, they were management questions. Accountability. Quality control. Client satisfaction. These are important concepts, yet not always part of the “impact evaluation” movement.
As we talked after the event, we discussed how many organizations—large or small, based in Kenya or Seattle—also lacked a guiding framework for what to measure and how.
Soon after the conference, we spoke with someone who represented a donor that was dealing with this same question. The donor supported a number of small organizations in Uganda that were struggling to collect the right data on their programs. “How do you think we could raise the bar on [monitoring and evaluation] for our grantees?” he asked.
Over the next few years, we set out to investigate that question—to find out what organizations were doing right in monitoring and evaluation and what they could do better.
We found a system that is out of balance: the trend to measure impact has brought with it a proliferation of poor methods of doing so, resulting in organizations wasting huge amounts of money on bad “impact evaluations.” Meanwhile, many organizations are neglecting the basics. They do not know if staff are showing up, if their services are being delivered, if beneficiaries are using services, or what beneficiaries think about those services. In some cases, they do not even know whether their programs have realistic goals and make logical sense.
To correct this imbalance and strengthen programs, we propose a framework for building right-fit monitoring and evaluation systems. This framework does not just apply to organizations that are ill-suited or too small for impact evaluations; it applies to all organizations—including nonprofit organizations, governments, and social enterprises—aiming to do good around the world.
As we wrote this book, we each brought different, though complementary, perspectives to the subject. We have known each other since our graduate school days at Harvard and M.I.T., where we studied development economics. Since then, we have both landed at universities and conducted field research internationally, but our paths and research interests have diverged.
Mary Kay studies nonprofit performance and accountability systems and teaches public policy to master’s students at the Evans School of Public Policy at the University of Washington. She focuses on the management and accountability issues facing US nonprofits and global nongovernmental organizations (NGOs). Many of her students hope, one day, to run a social-sector organization. While writing this book, she often saw things from the perspective of the people inside organizations trying to improve their programs. They struggle to identify and collect data that help them track their performance. They strive to develop decision-making systems that turn that data into programmatic improvements. They feel pressure to evaluate their impact but worry about the expense of doing so and whether it would yield any actionable information.
Dean conducts his research around the world, in Africa, Asia, and Latin America, as well as the United States. In 2002, upon finishing graduate school, he founded an international research and policy advocacy nonprofit, Innovations for Poverty Action (IPA). In 2015, he co-founded a charity rating and feedback nonprofit, ImpactMatters. Dean has focused his research on measuring the impact of programs. He tests development and behavioral theories that have implications for how to best implement programs or run businesses with social goals. Through his research and the work of the team at IPA, he has seen such impact evaluations identify successful programs to fight poverty and then seen the evidence influence policy for hundreds of millions of families. Hence IPA’s motto: More Evidence, Less Poverty. But he also has seen organizations try to measure their impact before they knew if they were implementing their program as intended. And he has seen countless impact evaluations, good and bad, go unused because they were not designed to address an organization’s key questions or did not come at the right time in the program cycle. The result is wasted money, misspent staff time, and lost opportunities to learn.
Both of us are passionate about making programs work better for the poor. Although we envision this book being used in classrooms, this is not a textbook per se. Whether you are a student, practitioner, or funder, we hope this book helps you use data to drive decisions that make the world a better place.
PART I
The CART Principles
Building Credible, Actionable, Responsible, and Transportable Evidence Systems
CHAPTER 1
Introduction
In 2011, a woman named Lucia moved to Kampala, Uganda, to work for a nonprofit aiming to help families with children improve their food security. On her first day on the job in the monitoring and evaluation group, she found an extensive system in place for tracking changes in nutrition and income for beneficiaries of the organization’s food security program. She began working with her team to compile and analyze these data about the program’s progress—data that surveyors had collected through multiple rounds of detailed interviews with people in far-flung villages. Every day, Lucia spent most of her time converting 303 indicators from two project sites and 25 partner organizations into a vast, organized system of spreadsheets.
As months passed and Lucia continued her work, however, she realized the organization was not using—or learning from—most of the data she and others laboriously collected and compiled. Instead, the data were sitting on hard drives and on shelves at headquarters gathering dust. Even when the information Lucia compiled got used, she worried program managers were drawing faulty conclusions by using monitoring data to assess the program’s impact. For example, by comparing income data from parents before they enrolled in the program to their income after they “graduated” from it, staff claimed that the program had lifted thousands of children out of poverty. But they were unable to say whether the income change they saw was the result of the program’s food assistance, its cash-for-work component, or, on the more pessimistic side, merely the result of broader economic shifts that had nothing to do with the program.
The organization Lucia worked for was not fulfilling its responsibility to learn. It was monitoring and evaluating projects, but the program’s leaders were not learning from these efforts. In the end, they had no idea which activities, if any, actually helped people they served. As one staff member lamented, “Currently the [monitoring and evaluation] system helps us write reports, but it does not actually teach us what’s working best.”
A few miles away, the Director of Programs at another nonprofit was dealing with an entirely different problem. The organization had been operating in Uganda for about eight years, working to help conflict-affected young men and women in Northern Uganda, but it had never assessed whether its programs were working as intended or achieving their goals. The organization raised funds largely from small donations from the public and the sale of merchandise; as a result, it had never been asked to demonstrate accountability to specific donors. But, over time, the organization had become better known and its model had started generating broader interest. In light of this increased attention, the Director of Programs felt that the organization had the responsibility to demonstrate—to its staff, donors new and old, and those looking to learn from its model—that the program was well run and was changing people’s lives in the way it promised. This would mean gathering data on exactly how the program was being implemented and proving the impact it was having on people’s lives.
But that would be no easy feat. Since the organization had never tracked or evaluated its programs, it had no data to share with donors or the public on its participants. Nor did it have much administrative or operational data that could help articulate its model and what exactly the staff did every day. To address these information gaps, the organization established a monitoring and evaluation team and began tracking program activities.
Around the same time, the organization also hired an external firm to conduct an impact evaluation of its work. Midway through the evaluation, though, the Director of Programs made a frustrating discovery. The evaluation firm was taking a lot of shortcuts, such as simply comparing participants’ lives after the program to their lives before the program. Yet people’s lives change, for better or worse, for many reasons. So how could the organization know if the changes observed were because of what it did, rather than other factors? Moreover, the firm had sent surveyors into the field who did not even speak the local language. How could they possibly gather good data in face-to-face interviews? It became rapidly clear that the “impact evaluation” would not actually provide any information about the program’s impact and that any data collected would be of suspect quality. It was just a huge waste of money—not at all what the Director had wanted.
These accounts are based on interviews conducted in Kampala, Uganda. We are grateful to the organizations for allowing us to share their stories publicly, and we have kept their names anonymous. Their identities are not important to the point we are making; the people we met are trying to help the poor, and their data collection efforts are well intentioned.
These two organizations have different problems with their monitoring and evaluation systems, but both systems share one commonality: they do not provide the information that stakeholders really want. In the first case, Lucia’s organization collects more monitoring data than it can analyze or use to improve operations. Beyond that, the organization uses these monitoring data to make claims of impact but, unfortunately, that is not a credible way of demonstrating the effect of the program. In short (and we will talk more about this later), their data fail to include a comparison group—a group that shows how participants would have fared without the program. Without that, it is quite challenging to know whether the program caused the observed changes to occur. The cause could have been good economic conditions, some other government or nongovernmental organization (NGO) program, or the drive and ambition of the participants. In the second case, the organization has too few data on the implementation of its program to know how it is actually being run or how to manage and improve day-to-day operations. It also ended up measuring impact poorly, in this case with a poorly designed and managed evaluation.
Are these organizations unique? Not at all. Countless organizations around the world face similar problems every day. (“Organizations” is the term we use in this book to refer to nonprofits, government, and social enterprises; see the glossary for our definition of a social enterprise.) Motivated to prove that their programs work, many organizations have developed systems that are too big, leaving staff with more data than they can manage. And the data that are available often fail to provide the information needed to support operational decisions, program learning, and improvement. For other organizations, data collection efforts are too small, providing little to no information about program performance, let alone their impact on people’s lives.
The struggle to find the right fit in monitoring and evaluation systems resembles the predicament Goldilocks faces in the fable “Goldilocks and the Three Bears.” In the fable, a young girl named Goldilocks finds herself lost in the forest and takes refuge in an empty house. Inside, Goldilocks finds an array of options: comfy chairs, bowls of porridge, and beds of all sizes. She tries each, but finds that most do not suit her: the porridge is too hot or too cold, the bed too hard or soft—she struggles to find options that are “just right.” (See the box at the end of the chapter for a longer version of the fable.) Like Goldilocks, organizations must navigate many choices and challenges to build monitoring and evaluation systems. How can organizations develop systems that work “just right”?
Answering that question is the goal of this book. We tackle the challenge using a new framework of four basic principles: credibility, actionability, responsibility, and transportability. We call these the CART principles. And because evaluation and monitoring are concepts often taught in the abstract, this book will present case studies of actual organizations and their struggles to develop the right monitoring and evaluation systems. In the process, we hope you come to agree with us that high-quality monitoring is necessary for sound implementation, learning, and improvement. It should be a bedrock of every social sector organization. And we hope to convince you that while impact evaluation is important for accountability and learning, it is not the right fit for every organization or every stage of a program’s life. It should be undertaken only when certain conditions are met. Sometimes less is better.
THE EVIDENCE CHALLENGE
Let’s remember how we arrived at this dilemma. Unlike for-profit companies with no stated social impact goal, nonprofits and social enterprises claim to make a positive impact on the world. And they raise money, either from donors or investors, based on this premise.
Often, organizations trying to produce social impact have marketed their work to donors through stories about specific individuals who benefitted from their programs. Stories about how a program has changed the life of a specific individual can be persuasive.1 We have all seen ads highlighting how an organization’s program can help a particular child attend school or help a particular young mother pull herself and her family out of poverty.
Even though these stories may be compelling, they do not tell us the impact of the program on people’s lives—whether the program is actually working, whether the program caused those changes to happen. One person’s success story does not mean the program caused that success. And one person’s success story does not mean that everyone in the program succeeded, or even that, on average, the program succeeded. How did we arrive in a situation where stories are often seen as a substitute for impact?
New Pressures, New Trends
A confluence of factors has contributed to this push for impact.
Data today are radically cheaper to collect, send, and analyze. Twenty years ago, organizations could only dream of collecting data at the scale they can now. In the past, it was simply too expensive to gather information on programs. The Information Age, and mobile technology in particular, has changed that. Cellphones, GPS devices, satellite imagery, wi-fi, and many more technological innovations have made it less expensive to gather and transmit data, while a myriad of software innovations has made information easier to analyze and use. Previously, organizations might have said, “We’d like to get data on results, but it is too time-consuming and expensive.” Today, organizations can collect data on almost anything they can imagine and do so with relatively little expense. Naturally, cheaper data also make donors more willing to demand them: “no more money without evidence of impact.”
Meanwhile, we have seen calls for more accountability in the public sector, particularly the development aid sector. Calls for concrete development aid targets and more proof of aid’s general effectiveness have been steadily increasing. In 2000, the United Nations adopted the Millennium Declaration, establishing international development goals to reach by 2015.2 In the mid-2000s, 91 countries and 26 major development organizations came together to improve the quality of aid and increase its impact with the Paris Declaration on Aid Effectiveness.3 One of the declaration’s five main principles, “Results,” asked that “developing countries and donors shift focus to development results and [make sure] results get measured.”4 Other international conferences in the 2000s solidified this agenda, giving birth to a culture of measuring outcomes.
This push to ensure accountability by measuring results is not limited to the development sector; organizations in the US face similar pressures. In recent years, the US government has introduced a number of initiatives that promote the use of evidence in policy-making.5 One such program is the Social Innovation Fund (SIF), which directs federal grant money to evidence-based organizations and requires all grantees to conduct a rigorous impact evaluation to quantify the results they produce. The SIF also implements a “Pay for Success” program that uses results-based contracting. In this model, the government only pays for services when a program has produced the promised results.6 Similar trends exist elsewhere. In the UK, for example, the government has initiated more than 30 social impact bonds, which are a way of committing public and private expenditures that rely on similar “pay for results” contracts.7
At the same time, philanthropic culture has changed. According to a 2013 report based on a national survey of 310 major donors aged 21 to 40, young and wealthy philanthropists are different from their parents and grandparents: they want to make sure their dollars are having a measurable impact. “They see previous generations as more motivated by a desire for recognition or social requirements, while they see themselves as focused on impact, first and foremost,” the report summarizes. “They want impact they can see. . . . They want to use any necessary strategies, assets, and tools—new or old—for greater impact.”8
Technological advancement and the push for results increase the supply of information available to both organizations and donors. Organizations now have data and information that can be used to make decisions large and small, from how to improve a program model to whom to promote within their organization. They also have access to data and tools that, if used correctly, can rigorously measure their programs’ impact. And donors now have a growing body of evidence on the impact of many kinds of programs. These abundant data give them the ability to direct resources to programs that work and to avoid ones that do not, injecting a results orientation into philanthropic giving.
But the push to estimate impact is fraught with challenges. Many organizations fall into one of three traps in their monitoring and evaluation efforts:
• Too few data: Some organizations do not collect enough appropriate data, which means they cannot fulfill what should be their top priority: using data to learn, innovate, and improve. The solution is often collecting more data on what an organization is actually doing and on whether people are actually using its services.
• Too much data: Other organizations collect more data than they actually have the resources to analyze, wasting time and effort that could have been spent more productively elsewhere.
• Wrong data: Many organizations track changes in outcomes over time, but not in a way that allows them to know if the organization caused the changes or if they just happened to occur alongside the program. This distinction matters greatly for deciding whether to continue the program, redesign it, or scrap it in favor of something more effective.
Ultimately, poorly done monitoring and evaluation drains resources without giving us the information we think we need—be it useful information for managing programs or the evidence of impact that donors (and organizations themselves) desire. Misdirected data collection carries a steep cost. By monitoring activities that do not help staff learn how to improve programs, or conducting poorly designed evaluations that do not accurately estimate the impact of a project, organizations take resources away from program implementation.
In short, the push for more data on impact has often led organizations to develop “wrong-fit” systems, depleting resources but failing to actually measure impact or provide useful data for decision-making. It is time for a change.
Just in case one doubted how big a fan we are of measuring impact: Karlan’s first book, More Than Good Intentions,9 is entirely about measuring the impact of efforts to alleviate poverty in low- and middle-income countries, and he founded a nonprofit organization, Innovations for Poverty Action (IPA), which has conducted more than 500 randomized controlled trials of poverty programs since its inception in 2002.
A key punchline of this book: there is a time and place to measure impact. But in many situations, the best questions to address may be “Did we do what we said we would do?” (accountability) and “How can data help us learn and improve?” (performance management) instead of “Did we change the world in the way we set out to?” (impact).
OUR APPROACH: THE CART PRINCIPLES
How can organizations find “right- fit” monitoring and evaluation systems that support learning and improvement? As with Goldilocks’ search for the best “fitting” porridge, chair, and bed, the key is to find the right data. More is not always better. Nor is less. And simply “in between” is not always the answer either (that is where we deviate a bit from the Goldilocks fairytale). What is the right balance?
The number of different approaches to data collection and management is enough to make anyone’s head spin. Organizations need a framework to help them wade through the decisions they encounter—whether they are setting up a whole monitoring and evaluation system from scratch; reforming an old, tired, and poorly fit system; or simply designing a small survey.
After working with many organizations to design their monitoring and evaluation systems, we identified four key principles of a right-fit system. We call them the “CART” principles:
• Credible: Collect high-quality data and analyze them accurately.
• Actionable: Collect data you can commit to use.
• Responsible: Ensure the benefits of data collection outweigh the costs.
• Transportable: Collect data that generate knowledge for other programs.
Sounds simple, right? And in some ways it is. But building right-fit systems sometimes means turning some current practices on their head and learning when to say “yes” and when to say “no” to data. This book aims to take you on that journey.
THE ROAD AHEAD
Part I of this book focuses on how organizations can build a strong base for their programs using the CART principles and a well-constructed theory of change.
Chapter 2 discusses the differences between program monitoring and impact evaluation; demonstrates why all organizations should monitor themselves, even if they do not evaluate their impact; and introduces the heart of this book, the CART principles.
Chapter 3 introduces a process called “theory of change” and explains why it is a critical underpinning for designing a monitoring system or impact evaluation. We then get a bit less abstract and walk through the process of creating a theory of change using a hypothetical supplemental feeding program for malnourished children.
Chapter 4 presents the CART principles in detail, providing a foundation for right-fit, action-oriented measurement and impact evaluation.
Chapters 5 through 7 dive into monitoring and impact evaluation in greater detail. Chapter 5 covers the role that monitoring plays in strengthening organizations and programs: the types of monitoring data all organizations should collect, how the CART principles can help organizations build right-fit monitoring systems, and the management issues related to designing and using data to improve operations.
Chapter 6 explores common biases that get in the way of measuring impact and explains what it means to conduct credible, actionable, responsible, and transportable impact evaluation. Chapter 7 then dives into the details of the data collection process, exploring some of the mechanics of gathering high-quality, credible data.
Part II of the book presents six real-world case studies from a range of social sector organizations—large and small, well-established and new—that provide concrete examples of the Goldilocks approach to right-sized monitoring and evaluation:
Educate!—Educate!, an NGO based in Uganda, aims to teach entrepreneurial skills to young people. As its program rapidly grew, Educate! saw the need to demonstrate how its program connected to its intended outcomes. Yet staff did not agree either on the problem they were addressing or on the type of change they expected to see from the program. This case study illustrates how to find a common vision and guiding framework using a theory of change.
BRAC—A global development organization with operations in 11 countries, BRAC has been operating in Uganda since 2006. To improve its operations on the ground, BRAC wanted to find out if a key component of its theory of change—an incentive structure for field staff— was working. This case study breaks down the steps it took to find out, focusing on the process of collecting credible, actionable data and actually turning that data into action.
Salama Shield Foundation—The Salama Shield Foundation, an organization focused on community capacity-building in Africa, had a microcredit program that boasted 100% repayment rates. To the staff, this reflected the success of the program. But staff had two main questions they hoped to answer through data collection: first, were repayment rates really 100%? And second, if so, what motivated people to pay on time? This case study explores how Salama Shield went about answering those questions and provides a lesson on collecting actionable data.
Invisible Children—Invisible Children, a Uganda-based NGO best known for its media and advocacy efforts (in particular, the Kony 2012 campaign), also implemented a set of traditional antipoverty programs. Invisible Children did not have a monitoring and evaluation system in place for these programs and wanted a way to prove to institutional donors that its programs were being implemented according to plan and making an impact. This case study illustrates the drawbacks of having too little data and offers a warning about “impact evaluations” that do not actually measure impact at all.
Deworm the World—Deworm the World, which develops and implements national school-based deworming programs, helps administer a program in Kenya that reaches 5 million students per year. This case study sheds light on how to monitor a program of such massive size and scale.
Un Kilo de Ayuda—Un Kilo de Ayuda, a Mexican NGO that is working to end child malnutrition, collected data tracking the progress of all 50,000 children in its program. A massive task! Although the system provided actionable monitoring data that fed back into the program, it was costly and time-consuming to enter it all, raising the question: How much data is too much? This case study explores how Un Kilo de Ayuda answered that question and examines the challenges involved in designing a credible impact evaluation for the program.
Part III of the book then approaches these issues from the donor perspective. Chapter 14 focuses on large institutional donors, including government and government-funded donors, and Chapter 15 focuses on the perspective of individual donors, that is, those without staff employed to set philanthropic strategies and metrics.
Chapter 16 presents online tools that complement this book and that we will continue to develop. These are publicly available at no cost on the IPA website at https://www.poverty-action.org/goldilocks/toolkit. We also provide additional resources to help design and implement CART-adherent monitoring and evaluation systems. And we discuss some related areas that we are eager to tackle but that are outside the scope of this first book on the subject. We are keen to hear from readers about ways we can expand these tools.
IS THIS BOOK FOR ME?
The Goldilocks Challenge aims to help people across the social sector use better information to make better decisions. As you read this book, we hope you agree that we manage to hit the “right-fit” balance between engaging and actionable (although to get there, we do have to present some critical theoretical concepts). While many of our examples come from the field of international development, our framework and lessons can be broadly applied to all mission-based organizations.
If you are a “doer” in a nonprofit or a social enterprise, we hope the framework and principles this book offers can help you build an evidence strategy from scratch, revamp an existing strategy, or better understand what others are doing.
If you are a “giver,” we hope this book can guide you to support and advance the work of the organizations you fund. Funders often lead the push for more evidence without full awareness of what that pressure means for the organizations they support. The Goldilocks Challenge will help funders better understand what data collection is appropriate for the organizations they fund. By asking for useful and appropriate data, funders can steer organizations to collect information that furthers learning and advances their mission. This book will also help funders relieve the pressure on organizations to collect data that do not advance their learning needs.
We also address the particular data needs of funders themselves—needs that require their own lens. For example, what should funders do if they want to compare grants within their portfolio, but different grantees have different right-fit data needs? Should they demand second-best data from an organization, taxing the organization’s resources, in order to be able to compare one to another? For funders that want to pay for performance, what metrics should they use and how should they validate them?
Ultimately, we seek to make data collection more useful, approachable, and feasible. We cover both the science (theories and principles) and the art (applying those theories and principles) of monitoring and evaluation. We hope that, after reading this book, organizations can build stronger monitoring and evaluation systems and that donors can support organizations most committed to learning and improvement. While impact evaluations are wonderful when feasible, there are many other types of evaluations that are important and often overlooked in the never-ending quest to measure impact. At minimum, we hope we can help organizations avoid wasteful data collection that uses money that could be more productively spent delivering services. More ambitiously, we aim to guide the reader through complementary data and analysis that can help improve operations, even if they do not provide answers regarding impact. In the end, by providing a framework for the messy field of monitoring and evaluation, we aim to help each organization find its ideal path.
Box 1.1 THE STORY OF GOLDILOCKS AND THE THREE BEARS
Once upon a time there were three bears—a big father bear, a medium-sized mother bear, and a little baby bear. They lived in a charming cottage in the forest. One morning, mother bear made some porridge for breakfast, but it was too hot to eat. While it cooled, the three bears went for a walk in the woods.
Not far away lived a little girl named Goldilocks. That very same morning, she was wandering through the woods picking flowers. When she came upon the three bears’ cottage, she knocked on the door and called “Anyone home?” Nobody answered, so she opened the door and went on in.
Goldilocks came to the table and saw the three chairs. She sat in the great big chair, but it was too hard. She tried the medium-sized chair, but it was too soft. Then she tried the little chair, which was just right, but it broke when she sat on it!
Then Goldilocks spied the porridge. “I sure am hungry,” she said, and began tasting the porridge. The porridge in the big bowl was too hot. The porridge in the medium-sized bowl was too cold. The porridge in the little bowl was just right, so she ate it all up!
Then Goldilocks went upstairs and tried the beds. The big bed was too hard, and the medium-sized bed was too soft. But the little bed was just right, so Goldilocks lay down and fell fast asleep.
Just then, the bears came home from their walk, hungry and ready for their porridge. Father bear looked around and noticed that something was amiss. “Someone has been sitting in my chair!” said father bear. “Someone has been eating my porridge!” said mother bear. Little bear ran upstairs and cried, “Someone has been lying in my bed, and she’s still there!”
At that very moment Goldilocks awoke and saw the three bears peering at her. Terrified, she jumped out of bed, ran through the door, and escaped into the woods. She ran all the way home, and promised never to wander through the forest again.
CHAPTER 2