An Approach To Applying Statistics For Social Change

It has proven difficult to find places where data-driven technologies can be applied to social good in a way that weighs risk against benefit while maximizing gains in the world’s cumulative quality of life.

My current approach is to do need-finding within charities that GiveWell has praised, in hopes of finding the most pressing problem that exists across initiatives within the world poverty space. The ideal would then be to create a product or service in response to this need. Right now the poster-child product that has resulted from a similar perspective is SurveyCTO, a computer-assisted personal interviewing (CAPI) tool, which provides a solid foundation for effective data collection. My current goal is to find ways of helping effective world poverty initiatives leverage this collected data as an asset.

However, the starting point for this is to try to find the best projects, which has been difficult.

There are several reasons why this is difficult.
First of all, there are three main difficulties with data advocacy in development: (1) Data Interpretation: language barriers, incomplete metadata that leads to unknown biases, and the need for the dataset to be interpreted by people with local knowledge (or a means of creating the local context). (2) Organizational Challenges: at what point should an organization invest in data capacity, and what indicators show that it will pay off? Proprietary restrictions on datasets often arise, and the transparency that results from data capacity can challenge existing power dynamics. (3) Use & Dissemination: organizations must take responsibility for the data they are using and disseminating. Data in an advocacy context cannot be regarded as neutral, and organizations should consider the risks involved in a given analysis and its potential impacts on the people in question. (Bellagio Big Data Workshop Participants, 2014)

A second difficulty is that several biases keep us from choosing effective social causes. When looking for a cause, our decisions tend to be guided by powerful perceptual biases (i.e. guided towards issues that have an element of self-interest, issues that involve others who are similar to us, and issues that are popular either in general society or in our social group).

This perspective is important to recognize because ineffective aid has caused “the West to spend $2.3 trillion on foreign aid over the last 5 decades while still not managing to get twelve-cent medicines, or four-dollar bed nets, to children to prevent half of all malaria deaths…It’s a tragedy that so much well-meaning compassion did not bring these results for needy people.” (William Easterly)

A third difficulty is that the more we invest in something, the more resistant we are to evidence that we’re not on the right track. Nick Cooney sums this up perfectly:

“the more we’ve emotionally invested in something, the more highly we value it and believe it to be right. This makes it very difficult to abandon campaigns and strategies even when (as would be obvious to an impartial outsider) they’re deeply flawed. If we are strongly invested in a cause, as many activists are, it becomes tied up with our identity to such an extent that any criticism of our advocacy methods feels like a criticism of us; any thought that the campaign we’re working on is flawed carries the implication that we ourselves are flawed.”

A fourth difficulty results from how most charities use biased evidence to measure impact. The two main types of evidence are: (1) Anecdotes, pictures and testimonials: this kind of evidence provides affirmation rather than information about the program’s impact. Odds are that any sufficiently large program will have at least a couple of success stories, just by chance. (2) Simple comparisons of program participants and non-participants: this comparison may pick up other differences that correlate with the program’s selection process, thus creating a bias; correlation does not imply causation. Randomized Controlled Trials (RCTs), commonly used in rigorous impact evaluations, are an attempt to avoid this selection bias. The trade-off is that RCTs are expensive to execute. Mass Nonprofit provides a compelling example of this lack of data-driven methodologies in the nonprofit space:

A study of 155 foundations with more than one hundred million dollars in assets found that only 8% could describe the specific types of information or data that led them to believe they were likely to achieve some of their goals. The study, conducted by the Center for Effective Philanthropy, found that instead of hard data most foundations used anecdotal evidence to demonstrate the effectiveness of their programs. Only 39% used any tools or indicators whatsoever in assessing even a portion of their work, with even less (26%) using indicators or tools to assess all of their work (Mass Nonprofit 2010).
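To make the selection-bias point about simple participant comparisons concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the function name simulate, the outcome model, and the numbers (a true program effect of 2.0, a baseline trait that raises outcomes and also makes enrollment more likely). It is not drawn from any charity's data.

```python
import random
import statistics


def simulate(n=20000, true_effect=2.0, seed=0):
    """Compare a naive participant/non-participant gap with a randomized one.

    Hypothetical model: each household has a baseline trait (say, starting
    resources) that raises its outcome whether or not it is in the program.
    """
    rng = random.Random(seed)
    baseline = [rng.gauss(0.0, 1.0) for _ in range(n)]

    def outcome(b, treated):
        # Outcome depends on the baseline trait, the program, and noise.
        effect = true_effect if treated else 0.0
        return 10.0 + 3.0 * b + effect + rng.gauss(0.0, 1.0)

    def difference_in_means(assignments):
        treated, control = [], []
        for b, t in zip(baseline, assignments):
            (treated if t else control).append(outcome(b, t))
        return statistics.mean(treated) - statistics.mean(control)

    # Self-selection: households with a higher baseline are more likely to enroll.
    self_selected = [b + rng.gauss(0.0, 1.0) > 0.0 for b in baseline]
    # Randomization: a coin flip decides enrollment, independent of the baseline.
    randomized = [rng.random() < 0.5 for _ in baseline]

    return difference_in_means(self_selected), difference_in_means(randomized)


naive_estimate, randomized_estimate = simulate()
print(f"naive comparison:      {naive_estimate:.2f}")
print(f"randomized comparison: {randomized_estimate:.2f}")
```

Under these assumptions the naive comparison overstates the program's effect because participants were already better off, while the randomized comparison lands close to the assumed true effect of 2.0.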

Evidence-driven evaluations of impact are useful because these are difficult problems.
Many of these difficulties stem from the three I’s: Ideology, Ignorance, and Inertia.

“The failures of the World Bank, the International Monetary Fund, the United Nations, and USAID result from grandiose ambitions, top-down planning, and a lack of accountability” (William Easterly)

According to GiveWell, “many of the problems charities aim to address are extremely difficult problems that foundations, governments and experts have struggled with for decades. Many well-funded, well-executed, logical programs simply haven’t had the desired results.”

Since these are such difficult problems, GiveWell has shifted from the common prior assumption that aid is working to the prior assumption that it isn’t.
So what does the null hypothesis look like when it comes to aid?

GiveWell’s default assumption for charities is that they “fall far short of what they promise donors, and are very likely failing to accomplish much of anything (or even doing harm). This doesn’t mean we think all charities are failing – just that, in the absence of strong evidence of impact, this is the appropriate starting-point assumption.”
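One way to picture "no impact" as a starting-point assumption is a permutation test: if the program did nothing, the treated and control labels are interchangeable, so we can ask how often shuffled labels produce a gap at least as large as the observed one. The sketch below is illustrative only. The function name permutation_p_value and the outcome numbers are hypothetical, and this is not a description of GiveWell's own method.

```python
import random
import statistics


def permutation_p_value(treated, control, n_permutations=2000, seed=0):
    """One-sided permutation test for a difference in means.

    Null hypothesis: the program has no effect, so group labels are arbitrary.
    """
    rng = random.Random(seed)
    observed = statistics.mean(treated) - statistics.mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)

    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign labels at random
        diff = statistics.mean(pooled[:n_treated]) - statistics.mean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_large += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


# Made-up outcome scores for eight treated and eight control households.
treated_scores = [12.0, 14.0, 15.0, 13.0, 16.0, 15.0, 14.0, 17.0]
control_scores = [9.0, 10.0, 8.0, 11.0, 10.0, 9.0, 12.0, 10.0]
print(permutation_p_value(treated_scores, control_scores))
```

A small p-value means the observed gap would be rare if the program did nothing, which is the kind of evidence needed to move away from the default assumption. A large p-value leaves that assumption in place.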


“Many others seem to have the opposite prior: they assume that a charity is doing great things unless it is proven not to be…When someone asks them to give to a charity, they usually give.” (GiveWell)

“Charities raise money based on their ability to market themselves and fundraise, as opposed to their ability to change lives. Because charities aren’t being held accountable based on impact, there are probably a lot of charities that continue to raise and spend money but don’t make any difference at all. Does that mean that a given charity’s programs don’t work? Not necessarily. But, it does mean that charities bear the burden of proof to convince you, the donor, that your donation will make a difference.” (GiveWell)

This burden of proof has created a pool of charities committed to data-driven development
GiveWell holds very high standards for evidence of impact when considering a recommendation. These high standards have created a pool of charities that have already committed to developing this Evaluation & Monitoring (E&M) capacity to a significant extent.

Therefore, I think it would be useful to connect domain experts and data partners with concrete social problems within these initiatives that can be addressed with effective data analysis and with statistical and computational techniques.

In collaboration with Stanford’s Statistics For Good, Galvanize University, and DataKind, I have reached out to 30% of the charities that GiveWell has praised. Of those, 25% responded (about 7.5% of the praised charities overall). I expect E&M projects to dominate the results of need-finding with these charities. Becoming familiar with the tools and services that non-profits could benefit from when transitioning to E&M practices has proven relatively easy, and I have begun documenting them here. It is useful to know these foundational tools so that I can point charities that lack the data capacity for effective data analysis to tools and services that will help them get there. However, it is useful to keep in mind that the biggest need among these charities in terms of data methodologies may not be solely related to E&M.

This is a good place to start. However, this perspective is limited.
First, “the technology development process is only around 7% of the work.” (Ushahidi) This figure does not include data analysis, but it is useful for bringing Silicon Valley engineers down off their pedestals. More importantly, technology, no matter how well designed, is only a magnifier of human intent and capacity. It is not a substitute. Kentaro Toyama provides a great explanation of this:

“In every one of our projects, a technology’s effects were wholly dependent on the intention and capacity of the people handling it. The success of PC projects in schools hinged on supportive administrators and dedicated teachers. Microcredit processes with mobile phones worked because of effective microfinance organizations. Teaching farming practices through video required capable agriculture-extension officers and devoted nonprofit staff. In our most successful ICT4D projects, the partner organizations did the hard work of real development, and our role was simply to assist, and strengthen, their efforts with technology.”

A second reason this initiative is limited is that GiveWell’s perspective of room for more funding doesn’t necessarily correspond with areas where technology may have utilitarian impact.

Third, this approach tends to filter out aid projects that are more difficult to evaluate, such as Oxfam’s perspective of “capacity building” (assisting the poor to develop their skills to become self-sufficient in various ways). Evaluating these less concrete endeavors is different from what GiveWell is currently equipped to research. However, these charities are sometimes among those GiveWell has contacted. An example of this is the Malaria Consortium, which is focused on technical assistance and operations research for malaria instead of direct bed net distribution.

This quantification bias also filters out much of the larger ecosystem of using big data for social change in the developing world. While there are some great projects, the use of big data in this space is mostly premature. This is because socio-metric data has to be generated from the physical world; it’s not just given to you from the digital ether (a luxury of the data-wealthy .coms). That means that there are relatively substantial costs and trade-offs when choosing what to measure, which leaves you with an emphasis on interpretation over cutting-edge computational tricks. It is therefore also useful to develop these technologies within the corporate sector (since companies are not hindered by funding and can move faster than NGOs), with a view to bringing them to non-profits later.

The Longer Term Scope
The longer-term scope is to help magnify the effectiveness of priority world poverty programs by building and using data-driven technologies.

Promising Initiatives
1) DataKind
2) Stats For Change Partners
3) Galvanize University
4) Data Science For Social Good summer fellowship
5) BayesImpact

We can already begin to see the positive effects of this approach in mobile tech’s potential to provide accessible and scalable feedback mechanisms (such as Grameen’s AppLab). “In the next 20 years, machine learning will have more impact than mobile has” (Vinod Khosla). Nathan Eagle, the CEO of Jana, has a nice summary of this opportunity:

“Data-driven development represents an opportunity to transcend observational science, enabling us not only to learn more about the underlying dynamics driving behaviour, but to be able to use these insights to design better mechanisms, better systems, better tools that can improve the lives of these billions of people generating this data and the societies in which they live. Rather than the passive, observational roles that scientists have played in these other fields, there is an opportunity to take an active role – not collecting data – but designing more appropriate interventions.”
