Tag Archives: Science

Data Science for 100 Resilient Cities

The Rockefeller Foundation recently launched a major international initiative called “100 Resilient Cities.” The motivation behind this global project stems from the recognition that cities are facing increasing stresses driven by the unprecedented pace urbanization. More than 75% of people expected to live in cities by 2050. The Foundation is thus rightly concerned: “As natural and man-made shocks and stresses grow in frequency, impact and scale, with the ability to ripple across systems and geographies, cities are largely unprepared to respond to, withstand, and bounce back from disasters” (1).

Resilience is the capacity to self-organize, and smart self-organization requires social capital and robust feedback loops. I’ve discussed these issues and related linkages at lengths in the posts listed below and so shan’t repeat myself here. 

  • How to Create Resilience Through Big Data [link]
  • On Technology and Building Resilient Societies [link]
  • Using Social Media to Predict Disaster Resilience [link]
  • Social Media = Social Capital = Disaster Resilience? [link]
  • Does Social Capital Drive Disaster Resilience? [link]
  • Failing Gracefully in Complex Systems: A Note on Resilience [link]

Instead, I want to make a case for community-driven “tactical resilience” aided (not controlled) by data science. I came across the term “tactical urbanism” whilst at the “The City Resilient” conference co-organized by PopTech & Rockefeller in June. Tactical urbanism refers to small and temporary projects that demonstrate what could be. We also need people-centered tactical resilience initiatives to show small-scale resilience in action and demonstrate what these could mean at scale. Data science can play an important role in formulating and implementing tactical resilience interventions and in demonstrating their resulting impact at various scales.

Ultimately, if tactical resilience projects do not increase local capacity for smart and scalable self-organization, then they may not render cities more resilient. “Smart Cities” should mean “Resilient Neighborhoods” but the former concept takes a mostly top-down approach focused on the physical layer while the latter recognizes the importance of social capital and self-organization at the neighborhood level. “Indeed, neighborhoods have an impact on a surprisingly wide variety of outcomes, including child health, high-school graduation, teen births, adult mortality, social disorder and even IQ scores” (1).

So just like IBM is driving the data science behind their Smart Cities initiatives, I believe Rockefeller’s 100 Resilient Cities grantees would benefit from similar data science support and expertise but at the tactical and neighborhood level. This explains why my team and I plan to launch a Data Science for Resilience Program at the Qatar Foundation’s Computing Research Institute (QCRI). This program will focus on providing data science support to promising “tactical resilience” projects related to Rockefeller’s 100 Resilient Cities initiative.

The initial springboard for these conversations will be the PopTech & Rockefeller Fellows Program on “Community Resilience Through Big Data and Technology”. I’m really honored and excited to have been selected as one of the PopTech and Rockefeller Fellows to explore the intersections of Big Data, Technology and Resilience. As mentioned to the organizers, one of my objectives during this two-week brainstorming session is to produce a joint set of “tactical resilience” project proposals with well articulated research questions. My plan is to select the strongest questions and make them the basis for our initial data science for resilience research at QCRI.

bio

Data Science for Social Good: Not Cognitive Surplus but Cognitive Mismatch

I’ve spent the past 12 months working with top notch data scientists at QCRI et al. The following may thus be biased: I think QCRI got it right. They strive to balance their commitment to positive social change with their primary mission of becoming a world class institute for advanced computing research. The two are not mutually exclusive. What it takes is a dedicated position, like the one created for me at QCRI. It is high time that other research institutes, academic programs and international computing conferences create comparable focal points to catalyze data science for social good.

Microsoft Research, to name just one company, carries out very interesting research that could have tremendous social impact, but the bridge necessary to transfer much of that research from knowledge to operation to social impact is often not there. And when it is, it is usually by happenstance. So researchers continue to formulate research questions based on what they find interesting rather than identifying equally interesting questions that could have direct social impact if answered by data science. Hundreds of papers get presented at computing conferences every month, and yet few if any of the authors have linked up with organizations like the United Nations, World Bank, Habitat for Humanity etc., to identify and answer questions with social good potential. The same is true for hundreds of computing dissertations that get defended every year. Doctoral students do not realize that a minor reformulation of their research question could perhaps make a world of difference to a community-based organization in India dedicated to fighting corruption, for example.

Cognitive Mismatch

The challenge here is not one of untapped cognitive surplus (to borrow from Clay Shirky), but rather complete cognitive mismatch. As my QCRI colleague Ihab Ilyas puts it: there are “problem owners” on the one hand and “problem solvers” on the other. The former have problems that prevent them from catalyzing positive social change. The later know how to solve comparable problems and do so every day. But the two are not talking or even aware of each other. Creating and maintaining this two-way conversation requires more than one dedicated position (like mine at QCRI).

sweet spot

In short, I really want to have dedicated counterparts at Microsoft Research, IBM, SAP, LinkedIn, Bitly, GNIP, etc., as well as leading universities, top notch computing conferences and challenges; counterparts who have one foot in the world of data science and the other in the social sector; individuals who have a demonstrated track-record in bridging communities. There’s a community here waiting to be connected and needing to be formed. Again, carrying out cutting edge computing R&D is in no way incompatible with generating positive social impact. Moreover, the latter provides an important return on investment in the form of data, reputation, publicity, connections and social capital. In sum, social good challenges need to be formulated into research questions that have scientific as well as social good value. There is definitely a sweet spot here but it takes a dedicated community to bring problem owners and solvers together and hit that social good sweet spot.

Bio

Data Science for Social Good and Humanitarian Action

My (new) colleagues at the University of Chicago recently launched a new and exciting program called “Data Science for Social Good”. The program, which launches this summer, will bring together dozens top-notch data scientists, computer scientists an social scientists to address major social challenges. Advisors for this initiative include Eric Schmidt (Google), Raed Ghani (Obama Administration) and my very likable colleague Jake Porway (DataKind). Think of “Data Science for Social Good” as a “Code for America” but broader in scope and application. I’m excited to announce that QCRI is looking to collaborate with this important new program given the strong overlap with our Social Innovation Vision, Strategy and Projects.

My team and I at QCRI are hoping to mentor and engage fellows throughout the summer on key humanitarian & development projects we are working on in partnership with the United Nations, Red Cross, World Bank and others. This would provide fellows with the opportunity to engage in  “real world” challenges that directly match their expertise and interests. Second, we (QCRI) are hoping to replicate this type of program in Qatar in January 2014.

Why January? This will give us enough time to design the new program based on the result of this summer’s experiment. More importantly, perhaps, it will be freezing in Chicago ; ) and wonderfully warm in Doha. Plus January is an easier time for many students and professionals to take “time off”. The fellows program will likely be 3 weeks in duration (rather than 3 months) and will focus on applying data science to promote social good projects in the Arab World and beyond. Mentors will include top Data Scientists from QCRI and hopefully the University of Chicago. We hope to create 10 fellowship positions for this Data Science for Social Good program. The call for said applications will go out this summer, so stay tuned for an update.

bio

A Research Framework for Next Generation Humanitarian Technology and Innovation

Humanitarian donors and organizations are increasingly championing innovation and the use of new technologies for humanitarian response. DfID, for example, is committed to using “innovative techniques and technologies more routinely in humanitarian response” (2011). In a more recent strategy paper, DfID confirmed that it would “continue to invest in new technologies” (2012). ALNAP’s important report on “The State of the Humanitarian System” documents the shift towards greater innovation, “with new funds and mechanisms designed to study and support innovation in humanitarian programming” (2012). A forthcoming land-mark study by OCHA makes the strongest case yet for the use and early adoption of new technologies for humanitarian response (2013).

picme8

These strategic policy documents are game-changers and pivotal to ushering in the next wave of humanitarian technology and innovation. That said, the reports are limited by the very fact that the authors are humanitarian professionals and thus not necessarily familiar with the field of advanced computing. The purpose of this post is therefore to set out a more detailed research framework for next generation humanitarian technology and innovation—one with a strong focus on information systems for crisis response and management.

In 2010, I wrote this piece on “The Humanitarian-Technology Divide and What To Do About It.” This divide became increasingly clear to me when I co-founded and co-directed the Harvard Humanitarian Initiative’s (HHI) Program on Crisis Mapping & Early Warning (2007-2009). So I co-founded the annual Inter-national CrisisMappers Conference series in 2009 and have continued to co-organize this unique, cross-disciplinary forum on humanitarian technology. The CrisisMappers Network also plays an important role in bridging the humanitarian and technology divide. My decision to join Ushahidi as Director of Crisis Mapping (2009-2012) was a strategic move to continue bridging the divide—and to do so from the technology side this time.

The same is true of my move to the Qatar Computing Research Institute (QCRI) at the Qatar Foundation. My experience at Ushahidi made me realize that serious expertise in Data Science is required to tackle the major challenges appearing on the horizon of humanitarian technology. Indeed, the key words missing from the DfID, ALNAP and OCHA innovation reports include: Data Science, Big Data Analytics, Artificial Intelligence, Machine Learning, Machine Translation and Human Computing. This current divide between the humanitarian and data science space needs to be bridged, which is precisely why I joined the Qatar Com-puting Research Institute as Director of Innovation; to develop and prototype the next generation of humanitarian technologies by working directly with experts in Data Science and Advanced Computing.

bridgetech

My efforts to bridge these communities also explains why I am co-organizing this year’s Workshop on “Social Web for Disaster Management” at the 2013 World Wide Web conference (WWW13). The WWW event series is one of the most prestigious conferences in the field of Advanced Computing. I have found that experts in this field are very interested and highly motivated to work on humanitarian technology challenges and crisis computing problems. As one of them recently told me: “We simply don’t know what projects or questions to prioritize or work on. We want questions, preferably hard questions, please!”

Yet the humanitarian innovation and technology reports cited above overlook the field of advanced computing. Their policy recommendations vis-a-vis future information systems for crisis response and management are vague at best. Yet one of the major challenges that the humanitarian sector faces is the rise of Big (Crisis) Data. I have already discussed this here, here and here, for example. The humanitarian community is woefully unprepared to deal with this tidal wave of user-generated crisis information. There are already more mobile phone sub-scriptions than people in 100+ countries. And fully 50% of the world’s population in developing countries will be using the Internet within the next 20 months—the current figure is 24%. Meanwhile, close to 250 million people were affected by disasters in 2010 alone. Since then, the number of new mobile phone subscrip-tions has increased by well over one billion, which means that disaster-affected communities today are increasingly likely to be digital communities as well.

In the Philippines, a country highly prone to “natural” disasters, 92% of Filipinos who access the web use Facebook. In early 2012, Filipinos sent an average of 2 billion text messages every day. When disaster strikes, some of these messages will contain information critical for situational awareness & rapid needs assess-ment. The innovation reports by DfID, ALNAP and OCHA emphasize time and time again that listening to local communities is a humanitarian imperative. As DfID notes, “there is a strong need to systematically involve beneficiaries in the collection and use of data to inform decision making. Currently the people directly affected by crises do not routinely have a voice, which makes it difficult for their needs be effectively addressed” (2012). But how exactly should we listen to millions of voices at once, let alone manage, verify and respond to these voices with potentially life-saving information? Over 20 million tweets were posted during Hurricane Sandy. In Japan, over half-a-million new users joined Twitter the day after the 2011 Earthquake. More than 177 million tweets about the disaster were posted that same day, i.e., 2,000 tweets per second on average.

Screen Shot 2013-03-20 at 1.42.25 PM

Of course, the volume and velocity of crisis information will vary from country to country and disaster to disaster. But the majority of humanitarian organizations do not have the technologies in place to handle smaller tidal waves either. Take the case of the recent Typhoon in the Philippines, for example. OCHA activated the Digital Humanitarian Network (DHN) to ask them to carry out a rapid damage assessment by analyzing the 20,000 tweets posted during the first 48 hours of Typhoon Pablo. In fact, one of the main reasons digital volunteer networks like the DHN and the Standby Volunteer Task Force (SBTF) exist is to provide humanitarian organizations with this kind of skilled surge capacity. But analyzing 20,000 tweets in 12 hours (mostly manually) is one thing, analyzing 20 million requires more than a few hundred dedicated volunteers. What’s more, we do not have the luxury of having months to carry out this analysis. Access to information is as important as access to food; and like food, information has a sell-by date.

We clearly need a research agenda to guide the development of next generation humanitarian technology. One such framework is proposed her. The Big (Crisis) Data challenge is composed of (at least) two major problems: (1) finding the needle in the haystack; (2) assessing the accuracy of that needle. In other words, identifying the signal in the noise and determining whether that signal is accurate. Both of these challenges are exacerbated by serious time con-straints. There are (at least) two ways too manage the Big Data challenge in real or near real-time: Human Computing and Artificial Intelligence. We know about these solutions because they have already been developed and used by other sectors and disciplines for several years now. In other words, our information problems are hardly as unique as we might think. Hence the importance of bridging the humanitarian and data science communities.

In sum, the Big Crisis Data challenge can be addressed using Human Computing (HC) and/or Artificial Intelligence (AI). Human Computing includes crowd-sourcing and microtasking. AI includes natural language processing and machine learning. A framework for next generation humanitarian technology and inno-vation must thus promote Research and Development (R&D) that apply these methodologies for humanitarian response. For example, Verily is a project that leverages HC for the verification of crowdsourced social media content generated during crises. In contrast, this here is an example of an AI approach to verification. The Standby Volunteer Task Force (SBTF) has used HC (micro-tasking) to analyze satellite imagery (Big Data) for humanitarian response. An-other novel HC approach to managing Big Data is the use of gaming, something called Playsourcing. AI for Disaster Response (AIDR) is an example of AI applied to humanitarian response. In many ways, though, AIDR combines AI with Human Computing, as does MatchApp. Such hybrid solutions should also be promoted   as part of the R&D framework on next generation humanitarian technology. 

There is of course more to humanitarian technology than information manage-ment alone. Related is the topic of Data Visualization, for example. There are also exciting innovations and developments in the use of drones or Unmanned Aerial Vehicles (UAVs), meshed mobile communication networks, hyper low-cost satellites, etc.. I am particularly interested in each of these areas will continue to blog about them. In the meantime, I very much welcome feedback on this post’s proposed research framework for humanitarian technology and innovation.

 bio

Promises and Pitfalls in the Spatial Prediction of Ethnic Violence

My colleague Nils Weidmann just published this co-authored piece with Harvard Professor Monica Toft. The paper deserves serious attention. Weidmann and Toft review this article on the spatial prediction of ethnic conflict that was authored by Lim, Metzler and Bar-Yam (LMB) and published in the prestigious journal Science.

I reviewed the article myself earlier this year and while I was highly suspicious of the findings—correlations of 0.9 (!) and above—I did not dig deeper. But Weidmann and Toft have done just this and their findings are worth reading.

The authors clearly show that the analysis by LMB “suffers from a biased selection of groups and regions, an inadequate null hypothesis and unit of analysis.” This really begs the following question: how did the LMB paper ever make it through the peer-review process?

The authors’ case selection is seriously biased as it “seems to adjust the group map as to better fit the model predictions,” for example. The isolationist policy recommendations that LMB put forward are thus founded on misleading methods and ought to be entirely dismissed.

Better yet, Science should retract the LMB paper or at least publish the commentary by Weidmann and Toft. Indeed, another question that follows from the conclusion reached by Weidmann and Toft is this: how many other below-par papers have been accepted and published by Science?

In sum, not only are the methods used by LMB questionable, but as Weidmann and Toft conclude, “the model provides little advance on prior research” in the field of crisis mapping.

On the plus side, the fact that there is push back on early articles in the field of crisis mapping is also a good sign and evidence that the field is becoming more formalized. In addition, the general approach taken by LMB still holds much promise for crisis mapping—it simply needs to be done with a lot more care and transparency. Indeed, combining agent-based models with real world empirical data and a sound understanding of ethnic conflict could become a winning strategy for crisis mapping analytics.

In closing, I look forward to following Nils Weidmann’s work at Princeton and have no doubt that he will continue to play an important role in the development of the field, and will do so with integrity and rigorous scholarship.

Patrick Philippe Meier